
Comment by rickcarlino

5 days ago

I used fine-tuning back in the day because GPT-3.5 struggled to determine whether two sentences were equivalent. This was for grading language-learning drills: a single skill for a specific task, and I had lots of example data from thousands of spaced repetition quiz sessions. The base model struggled with the vague concept of “close enough” equivalence. Since then, the state of the art has advanced to the point that I don’t need fine-tuning anymore. I could probably still do it to save some money, but I’m pretty happy with GPT-4.1.
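
A minimal sketch of what such an equivalence check can look like with a current model, assuming the OpenAI Python SDK; the prompt wording and the YES/NO protocol here are illustrative assumptions, not the author's actual grader:

```python
# Sketch: grade a language-learning drill by asking the model whether a
# student's answer is "close enough" to the expected sentence.
# Assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

def is_equivalent(expected: str, answer: str) -> bool:
    """Return True if the model judges the two sentences equivalent."""
    resp = client.chat.completions.create(
        model="gpt-4.1",
        messages=[
            {
                "role": "system",
                "content": (
                    "You grade language-learning drills. Reply with exactly "
                    "YES if the student's sentence is equivalent in meaning "
                    "to the expected sentence (minor wording differences are "
                    "fine), otherwise reply with exactly NO."
                ),
            },
            {
                "role": "user",
                "content": f"Expected: {expected}\nStudent: {answer}",
            },
        ],
        temperature=0,  # deterministic grading
    )
    return resp.choices[0].message.content.strip().upper().startswith("YES")

print(is_equivalent("I went to the store.", "I walked to the shop."))
```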