Comment by mrshu

1 day ago

It is worth noting there is _another_, completely unrelated project called *EuroLLM* that is also EU-funded and not only shares many of the same goals, but has already fulfilled many of them:

1. large multilingual dataset

2. open science approach

3. competitive performance

Here is the HF blogpost that introduced it in December 2024 (along with various benchmarks): https://huggingface.co/blog/eurollm-team/eurollm-9b

The project's lead has summarized the situation succinctly in their LinkedIn post [0]:

  I hope the different communities collaborate openly, share their expertise, and don't decide to reinvent the wheel every time a new project gets funded. Next what? "OpenEuroLLM with real cheese"?

[0] https://www.linkedin.com/posts/andre-martins-31476745_ai-art...

Homepage: https://sites.google.com/view/eurollm

Deliverables:

- A series of models of different sizes for optimal effectiveness and efficiency (1B, 9B and 22B) trained on 4T tokens

- A multimodal model which can process and understand speech or text input

- Full project codebase available to the public with detailed data and model descriptions

I can't find the codebase yet, though.

Thanks for the heads-up, I missed this project! However, on their page they write "Project Timeline: 1 May 2024 - 30 April 2025". April isn't far away; does anyone know what's supposed to happen afterwards?

  • That timeline is just for the preliminary hearing on potential committee members.

    No sarcasm, sorry.