
Comment by jart

18 days ago

Mozilla and Google provide an alternative called gemmafile which gives you an airgapped version of Gemini (which Google calls Gemma) that runs locally in a single file without any dependencies. https://huggingface.co/jartine/gemma-2-27b-it-llamafile It's been deployed into production by 32% of organizations: https://www.wiz.io/reports/the-state-of-ai-in-the-cloud-2025

There's nothing wrong with promoting your own projects, but it's a little weird that you don't disclose that you're the creator.

  • It would be more accurate to say I packaged it. llamafile is a project I did for Mozilla Builders where we compiled llama.cpp with cosmopolitan libc so that LLMs can be portable binaries. https://builders.mozilla.org/ Last year I concatenated the Gemma weights onto llamafile, called it gemmafile, and it got hundreds of thousands of downloads. https://x.com/JustineTunney/status/1808165898743878108 I currently work at Google on Gemini, improving TPU performance. The point is that if you want to run this stuff 100% locally, you can (see the sketch after this sub-thread). Others and I did a lot of work to make that possible.

    • I keep meaning to investigate how I can use your tools to create single-file executables for Python projects, so thanks for posting and reminding me.

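To make the "run this stuff 100% locally" point concrete, here is a minimal sketch of querying a downloaded gemmafile from Python. It assumes llamafile's defaults: you have made the file executable and started it (e.g. chmod +x gemma-2-27b-it.llamafile && ./gemma-2-27b-it.llamafile), and the bundled llama.cpp server is listening on localhost:8080 with an OpenAI-compatible chat endpoint. The exact filename, port, and endpoint path are assumptions based on llamafile's usual defaults, not details stated in the thread.

```python
# Sketch: query a locally running gemmafile (llamafile) server.
# Assumes the llamafile was launched first and exposes llama.cpp's
# OpenAI-compatible API on localhost:8080 (llamafile's default);
# adjust the base URL if you started it with different options.
import json
import urllib.request


def ask_local_gemma(prompt: str, base_url: str = "http://localhost:8080") -> str:
    """Send one chat-completion request to the local, airgapped model."""
    payload = {
        "model": "gemma-2-27b-it",  # informational; the server loads whatever weights it was built with
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(ask_local_gemma("Translate to Rust: int add(int a, int b) { return a + b; }"))
```

Nothing leaves the machine: the weights, the server, and the client all run locally, which is the whole point of the single-file packaging.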

That is just the Gemma model. Most people want capabilities equivalent to Gemini 2.5 Pro if they want to do any kind of coding.

  • Gemma 27b can write working code in dozens of programming languages. It can even translate between languages. It's obviously not as good as Gemini, which is the best LLM in the world, but Gemma is built from the same technology that powers Gemini, and it's impressively good for something that runs entirely locally on your CPU or GPU. It's a great choice for airgapped environments, especially if you use old OSes like RHEL5.

    • It may be sufficient for generating serialized data and for some level of autocomplete, but not for serious agentic coding where you don't want to waste time. Some junior-level programmers may still find it fascinating, but senior-level programmers end up fighting bad design choices, poor algorithms, and other verbose garbage most of the time. This happens even with the best models.


    • I've spent a long time with these models; gemma-3-27b feels distilled from Gemini 1.5. I think the useful coding abilities really started to emerge with 2.5.

    • The technology that powers Gemini created duds until Gemini 2.5 Pro; 2.5 Pro is the prize.