Comment by refulgentis

10 months ago

I love open source and the general vibe of good vibes you're bringing, but...this isn't SOTA, or close, even on the paper's own terms. (i.e. excluding models released in the last 6 months, including their own, which is a strange, yet understandable, choice given the results they report)

Quickest way to show this:

- Table 2, top of page 7

- Gemma 2 27B, 0 interventions, has 94.1/56.6/60.2

- Gemma 2 27B, with all their interventions, has 86/64/69.

- Gemma 2 27B, with all their interventions, sampled 32 times, is at 90.4/67.2/70.3.

- Gemma 2 27B came out in...June 2024. :/
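
For anyone who wants to check the arithmetic on the three rows quoted above, here is a minimal Python sketch. The column names `col1` to `col3` and the helper `deltas` are placeholders of mine: the comment does not say which benchmark each column is, and the numbers are copied as quoted here, not re-read from the paper.

```python
# Rows as quoted in the comment above (Gemma 2 27B, Table 2).
# Column names are placeholders: the benchmarks are not named in the comment.
COLUMNS = ("col1", "col2", "col3")
ROWS = {
    "baseline": (94.1, 56.6, 60.2),           # 0 interventions
    "interventions": (86.0, 64.0, 69.0),      # all interventions
    "interventions_x32": (90.4, 67.2, 70.3),  # all interventions, 32 samples
}


def deltas(after, before):
    """Per-column difference in points (after minus before), rounded to one decimal."""
    return tuple(round(a - b, 1) for a, b in zip(ROWS[after], ROWS[before]))


if __name__ == "__main__":
    pairs = [
        ("interventions", "baseline"),
        ("interventions_x32", "interventions"),
    ]
    for after, before in pairs:
        print(after, "vs", before, dict(zip(COLUMNS, deltas(after, before))))
```

On these numbers, sampling 32 times adds between 1.3 and 4.4 points per column over the single-sample intervention row, and the first column still sits below the no-intervention baseline.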

Quick heuristics employed here:

- What models did they compare against? (This isn't strictly an issue; the big screaming tell is "What models did they compare against, compared to their last N papers?")

- How quickly does the paper have to move towards N samples, and how big does N get before they're happy enough to conclude? (32). How much does that improve performance on their chosen metric? (1.8%)

ALLTaken 10 months ago

Good vibes, I mean yeah, we need more breakthroughs, and AI isn't here to take our jobs if WE can own the AI too and not just a super-corporation.

I think what we are all really excited about is finally having AI at home, and being unchained and freed from a central SaaS controlling everything the AI is ever going to tell you.

So, 6-7 years ago Google had these AI chats internally and never intended to release them, a friendly Googler told me.

Then ChatGPT came along and locked you into their SaaS. That was fantastic in the beginning, but the more you used the AI, the more you felt helpless, swound by anyone who may have access to an AI at OpenAI that is unfiltered and uses the full power of the model. Then came the jailbreak and accounts being banned for using it.

Then came the freedom brought by Llama and DeepSeek and waves of others. It rolled onto your laptop real quick, and this freedom is priceless! We should be really thankful that it happened, and support more OSS.

Google and Facebook would never share their trove of data with us, and very few people have enough storage and compute to even attempt to replicate them. But their Data Dominance doesn't protect them anymore. Once the models became intelligent enough to slurp up large chunks of the web, they became a better search, a better teacher and a better experience than a results page of sponsored ads, then ads for internal Google/Bing products, then SEO websites, and somewhere hidden among them what we were really looking for. Or often.. it had just been deleted for copyright and other reasons.