Comment by BoorishBears
4 hours ago
I tested it against Gemma 4 31B and, as expected, it compares unfavorably on world knowledge.
But even against E4B it's shaky, which is surprising given how many tokens they trained on. I guess a lot of that training was on synthetic data.