Fine-Tuning a Local LLM to Categorize Questions
Summary
A practical case study on fine-tuning a very small local LLM (Qwen3 0.6B) to categorize household questions. The author reports a baseline of roughly 10% accuracy with prompting alone, rising to about 92% after fine-tuning with Unsloth and QLoRA, with the model answering in fixed two-letter category codes. Errors persist where questions overlap in meaning or where categories are semantically similar.
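To make the accuracy figures concrete, here is a minimal sketch of how answers given as fixed two-letter codes could be scored against labels. The codes, category names, and helper functions below are hypothetical and chosen for illustration only; the article's actual category list and evaluation script are not shown in this summary.

```python
import re
from typing import Optional, Sequence

# Hypothetical two-letter codes (assumption: not the article's real list).
CATEGORY_CODES = {
    "FI": "finance",
    "HL": "health",
    "HM": "home maintenance",
}


def extract_code(model_output: str) -> Optional[str]:
    """Return the first standalone two-letter token that is a known code."""
    for token in re.findall(r"\b[A-Za-z]{2}\b", model_output):
        code = token.upper()
        if code in CATEGORY_CODES:
            return code
    return None  # the output contained no recognizable code


def accuracy(outputs: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of outputs whose extracted code equals the expected label."""
    if len(outputs) != len(labels):
        raise ValueError("outputs and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(extract_code(o) == l for o, l in zip(outputs, labels))
    return correct / len(labels)


if __name__ == "__main__":
    outputs = ["FI", " hm\n", "I think this is HL.", "unknown"]
    labels = ["FI", "HM", "HL", "FI"]
    print(f"accuracy = {accuracy(outputs, labels):.2f}")
```

A short, closed set of codes keeps scoring down to an exact string comparison, which is one plausible reason a fixed-code output format suits a model this small. The extraction step is deliberately lenient (case-insensitive, tolerant of surrounding text); a stricter evaluation could require the output to be exactly the code.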