Comment by zackangelo

1 day ago

Your reply adds more confusion, imo.

The inference code and model architecture ARE open source[0], and there are many other high-quality open-source implementations of the model (in many cases contributed by Google engineers[1]). To your point: they do not publish the data used to train the model, so you can't re-create it from scratch.

[0] https://github.com/google-deepmind/gemma
[1] https://github.com/vllm-project/vllm/pull/2964

If for some reason you had the training data, would it even be possible to create an exact (possibly hash-identical?) copy of the model? It seems like a lot of other pieces are missing: the training harness, the hardware it was trained on, etc.

  • To be entirely fair, that's quite a high bar even for most "traditional" open source.

    And even if you had the same data, there's no guarantee the random perturbations during training are driven by a seeded PRNG or applied in a way that is reproducible (rough sketch below).

    Reproducibility does not make something open source. Reproducibility doesn't even necessarily make something free software (under the GNU interpretation). I mean hell, most Docker containers aren't even hash-reproducible.
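
    As a rough illustration (a PyTorch-style sketch; the seed value and settings are just examples, nothing Google has published), these are the kinds of knobs you would have to pin down just to chase bit-exact training, and even then different GPUs, driver/CUDA versions, or worker counts can change the bits:

      # Hypothetical reproducibility setup for a PyTorch training run.
      import os
      import random

      import numpy as np
      import torch

      SEED = 1234  # example value; bit-exactness also assumes identical software versions

      # Seed every RNG the training loop might touch.
      random.seed(SEED)
      np.random.seed(SEED)
      torch.manual_seed(SEED)
      torch.cuda.manual_seed_all(SEED)

      # Ask for deterministic kernels; ops without a deterministic
      # implementation will raise, and throughput usually drops.
      os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
      torch.use_deterministic_algorithms(True)
      torch.backends.cudnn.benchmark = False

      # Even with all of this, floating-point reduction order differs across
      # GPU models and cluster topologies, so "same data" does not imply
      # "same weights", let alone the same file hash.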

  • Yes, this is true. Labs will often hold back the infrastructure pieces that let them train huge models reliably and on a practical timescale. For example, many have custom alternatives to Nvidia's NCCL library for fast collective communication between GPUs, such as the gradient all-reduces in distributed training (rough sketch below).

    DeepSeek published a lot of its work in this area earlier this year, and as a result the barrier isn't as high as it used to be.
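
    To make the NCCL point concrete, the piece it supplies is communication, not the matrix math itself. Here's a minimal torch.distributed sketch of the kind of collective op NCCL (or a custom replacement) provides; the script name, tensor size, and launch command are made up:

      # Run with something like: torchrun --nproc_per_node=8 allreduce_sketch.py
      import os

      import torch
      import torch.distributed as dist

      def main():
          # The NCCL backend does the actual GPU-to-GPU communication.
          dist.init_process_group(backend="nccl")
          local_rank = int(os.environ["LOCAL_RANK"])
          torch.cuda.set_device(local_rank)

          # Stand-in for one rank's local gradient shard.
          grad = torch.randn(1024, device="cuda")

          # Every rank contributes its tensor; every rank gets back the sum.
          dist.all_reduce(grad, op=dist.ReduceOp.SUM)
          grad /= dist.get_world_size()  # averaged gradient, identical on every rank

          dist.destroy_process_group()

      if __name__ == "__main__":
          main()

    The hard part at frontier scale is doing this efficiently across thousands of accelerators, and that is exactly the kind of infrastructure that rarely ships with the weights.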

I am not sure whether this adds even more confusion. The linked library is about fine-tuning, which is a completely different process.

Their publications about how Gemma was produced are not detailed enough that, even with the data, you would get the same results.

  • In the README of the linked library, they have a code snippet showing how to have a conversation with the model.

    Also, even if it were only for fine-tuning, that would require an implementation of the model's forward pass (which is all that's necessary to run it).
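
    For illustration (this is not the README's snippet; it uses the Hugging Face transformers implementation, and the checkpoint id and prompt are just examples), running the published weights locally looks roughly like this:

      from transformers import AutoModelForCausalLM, AutoTokenizer

      model_id = "google/gemma-2b-it"  # example checkpoint id
      tokenizer = AutoTokenizer.from_pretrained(model_id)
      model = AutoModelForCausalLM.from_pretrained(model_id)

      prompt = "Explain the difference between open weights and open source."
      inputs = tokenizer(prompt, return_tensors="pt")

      # generate() is just the forward pass in a loop; no cloud API involved.
      outputs = model.generate(**inputs, max_new_tokens=64)
      print(tokenizer.decode(outputs[0], skip_special_tokens=True))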

    • That is a completely different discussion. By that logic, even Gemini 2.5 Pro would be open source, since the clients for interacting with its cloud API are open source.