Comment by olejorgenb

2 days ago

Homepage: https://sites.google.com/view/eurollm

Deliverables:

- A series of models of different sizes for optimal effectiveness and efficiency (1B, 9B and 22B) trained on 4T tokens

- A multimodal model which can process and understand speech or text input

- Full project codebase available to the public with detailed data and model descriptions

I can't find the codebase yet, though.

Results don't seem that bad for the 9B: https://huggingface.co/blog/eurollm-team/eurollm-9b

  • I've been running it with Ollama, and it's actually pretty good for working with text in Latvian (and other EU languages); see the minimal loading sketch at the end of this comment. I'd be hard-pressed to find another model of a similar size that's good at it, for example: https://huggingface.co/spaces/openGPT-X/european-llm-leaderb...

    This won't be relevant to most people here, but it's cool to see even the smaller languages getting some love, instead of getting garbage outputs from Qwen (some versions of which are otherwise pretty good for programming) and anything below Llama 70B, with Gemma as maybe a middle ground.
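
    If you'd rather skip Ollama, here's a minimal sketch using Hugging Face transformers instead; the exact model ID is my assumption based on the blog post linked above, so double-check it against the repo:

        # Minimal sketch: running EuroLLM-9B-Instruct via transformers.
        # Assumptions: the model ID below matches the actual HF repo, and
        # accelerate is installed so device_map="auto" works.
        from transformers import AutoModelForCausalLM, AutoTokenizer

        model_id = "utter-project/EuroLLM-9B-Instruct"  # assumed repo name
        tokenizer = AutoTokenizer.from_pretrained(model_id)
        model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

        # Example prompt in Latvian ("Translate into Latvian: ...")
        messages = [{"role": "user", "content":
                     "Iztulko latviski: The weather is nice today."}]
        inputs = tokenizer.apply_chat_template(
            messages, add_generation_prompt=True, return_tensors="pt"
        ).to(model.device)
        outputs = model.generate(inputs, max_new_tokens=100)
        # Decode only the newly generated tokens, not the prompt.
        print(tokenizer.decode(outputs[0][inputs.shape[1]:],
                               skip_special_tokens=True))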

  • "...EuroLLM-9B was trained on approximately 4 trillion tokens, using 400 Nvidia H100 GPUs on the MareNostrum5 supercomputer..."

    • Related: MareNostrum5 is part of the EuroHPC programme that some people bashed in other comments.

      Not related: it's in a church in Barcelona. I wanted to visit it, but when I went to Barcelona it was closed for the holidays, so they weren't doing guided tours. I also discovered that the 5th most powerful supercomputer is run by the Italian energy company ENI; of all the companies it could have been, I would never have expected an energy company to have such a powerful supercomputer (the first three are US national lab supercomputers, the fourth is a Microsoft supercomputer).