Comment by deepsquirrelnet
10 days ago
I’m not sure what I’d use them for, except maybe tag generation? Encoders of this size usually outperform it by a wide margin on the tasks they would overlap with.
I'm making an app where literally all I want to do with an LLM is generate tags. This model has failed with flying colours: it literally takes forever to parse anything and it doesn't follow instructions.
Edit - I should add that the model I'm currently using is Gemini Flash Lite through the Gemini API. It's a really good combo: fast, follows instructions, gives correct results for what I want, and cost-effective. I still would love a small open model that can run on edge though.
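For illustration, a rough sketch of that kind of tag-generation call with the google-genai Python SDK (the model id, prompt, and key handling here are assumptions, not exact values from the thread):

    from google import genai

    client = genai.Client(api_key="YOUR_API_KEY")  # assumed key handling

    def generate_tags(text):
        # Assumed prompt: ask for a plain comma-separated list of tags.
        prompt = ("Generate 3-5 short topic tags for the following text. "
                  "Return only a comma-separated list.\n\n" + text)
        response = client.models.generate_content(
            model="gemini-2.0-flash-lite",  # assumed Flash Lite model id
            contents=prompt,
        )
        return [t.strip() for t in response.text.split(",") if t.strip()]

    print(generate_tags("Apple unveils a new M-series chip for MacBooks."))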
I'm pretty sure you're supposed to fine tune the Gemma 3 270M model to actually get good results out of it: https://ai.google.dev/gemma/docs/core/huggingface_text_full_...
Use a large model to generate outputs that you're happy with, then use those inputs (including the same prompt) and outputs to teach the 270M model what you want from it.
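A minimal sketch of that distillation-style workflow, assuming Hugging Face datasets and TRL; the dataset contents, model id, and hyperparameters are illustrative:

    from datasets import Dataset
    from trl import SFTConfig, SFTTrainer

    # (prompt, completion) pairs: the same tagging prompt you send the big
    # model, plus the big model's output you were happy with. Illustrative.
    pairs = [
        {"prompt": "Generate tags for: Apple unveils a new M-series chip.",
         "completion": "apple, hardware, chips"},
        # ... many more teacher-generated examples
    ]

    trainer = SFTTrainer(
        model="google/gemma-3-270m-it",  # assumed Hugging Face model id
        train_dataset=Dataset.from_list(pairs),
        args=SFTConfig(
            output_dir="gemma-270m-tagger",
            num_train_epochs=3,
            per_device_train_batch_size=8,
        ),
    )
    trainer.train()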
Oof. I also had it refuse a completely harmless instruction for “safety”. So that’s another dimension of issues with operationalizing it.
Well, Gemini Flash Lite is at least one, and more likely two, orders of magnitude larger than this model.
That's fair, but one can dream of simply being able to run a useful LLM on CPU on your own server to simplify your app and save costs...
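A model this small should at least run on CPU; a rough sketch with llama-cpp-python, where the GGUF repo id and filename are guesses at a community build rather than verified identifiers:

    from llama_cpp import Llama

    # Repo and filename below are assumptions about a community GGUF build.
    llm = Llama.from_pretrained(
        repo_id="unsloth/gemma-3-270m-it-GGUF",
        filename="*Q8_0.gguf",
        n_ctx=2048,
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user",
                   "content": "Generate 3 short tags for: new M-series MacBook chip."}],
        max_tokens=32,
    )
    print(out["choices"][0]["message"]["content"])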