Comment by deepsquirrelnet

10 days ago

I’m not sure what I’d use them for, except maybe tag generation? Encoders of this size usually outperform it by a wide margin on the tasks where they overlap.

I'm making an app where literally all I want to do with an LLM is generate tags. This model has failed with flying colours: it literally takes forever to parse anything and doesn't follow instructions.

Edit - I should add, currently the model I'm using is Gemini Flash Lite through the Gemini API. It's a really good combination of speed, instruction-following, correct results for what I want, and cost-effectiveness. I'd still love a small open model that can run on edge, though.
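
For anyone curious, here is a minimal sketch of that kind of tag-generation call. The SDK (google-genai for Python), the model id, and the prompt format are my assumptions, not the commenter's actual setup:

    import os
    from google import genai

    # Assumed setup: the google-genai SDK, with the API key read from the environment.
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

    def generate_tags(text: str) -> list[str]:
        # Ask for a strict comma-separated list so the reply is trivial to parse.
        prompt = (
            "Generate 3-8 short topic tags for the text below. "
            "Reply with ONLY a comma-separated list, no other words.\n\n" + text
        )
        response = client.models.generate_content(
            model="gemini-2.0-flash-lite",  # assumed model id; use whichever Flash Lite variant you have access to
            contents=prompt,
        )
        return [t.strip() for t in response.text.split(",") if t.strip()]

    print(generate_tags("Self-hosting small LLMs on CPU for tag generation"))

The strict output-format instruction is doing most of the work here; it is exactly the kind of instruction-following the small model reportedly fails at.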

  • Oof. I also had it refuse a completely harmless instruction for “safety”. So that’s another dimension of issues with operationalizing it.

  • Well, Gemini Flash Lite is at least one, and likely two, orders of magnitude larger than this model.

    • That's fair, but one can dream of simply running a useful LLM on CPU on your own server to simplify your app and save costs...