Comment by scosman

5 months ago

Context: someone announced a Llama 3.1 70B fine tune with incredible benchmark results a few days ago. It's been a dramatic ride:

- The weight releases were messed up: they released a LoRA for Llama 3.0 while claiming it was a 3.1 fine-tune

- Evals initially didn't meet expectations when run on released weights

- The evals started performing near/at SOTA when run against a hosted endpoint

- Folks are finding clever ways to see what model is running on the endpoint (using model-specific tokens and model-specific censoring). This post claims there's proof the endpoint isn't running their model, but is just a prompted Sonnet 3.5

- After it was caught and posted as being Sonnet, it stopped reproducing. Then others in the thread claimed to find evidence, using similar techniques, that he had just switched the hosted model to GPT-4o.

Lots of mixed results, inconsistent reproductions, and general confusion from the bad weight releases. Lots of wasted time. Not clear what's true and what's not.

Who is Sahil Chaudhary? Why doesn't he announce such a great advancement himself? Why did Matt Shumer announce it first, only because -- according to a later claim on X.com -- he trusted Sahil? Does that mean Matt wasn't involved in most of the work? And why announce a breakthrough without first mentioning that he wasn't involved enough to verify the results?

  • I recognize that surname from Twitter spam. Twitter has had a financial rebate program for paying accounts for a while, and for months tons of paid spam accounts have been reply-squatting trending tweets with garbage. Initially they appeared to be Sub-Saharan African, but for some reason the demographic seems to keep shifting eastward from there, through the Middle East and now to South Indian/Pakistani regions. This surname and variants of it are common in the Indian category among those.

    Maybe someone got lucky with that and is trying their hand at the LLM fine-tuning business?

  • Matt and Sahil did an interview, and it was mostly Matt doing the talking, while Sahil looked like a hostage Matt had forced to do the interview.

When they were using the Sonnet 3.5 API, they censored the word "Claude" and replaced "Anthropic" with "Meta", then later when people realized this, they removed it.

Also, after GPT-4o they switched to a Llama checkpoint (probably 405B-instruct), so the tokenizer is now shared (no more tokenizer trick).

  • Yeah, I managed to get it to admit that it was Claude without much effort (by telling it not to lie), and then it magically stopped doing that. FWIW, Constitutional AI is great.

    • They implemented the censoring of "Claude" and "Anthropic" using the system prompt?

      Shouldn't they have used simple text replacement? They could buffer the streaming response on the server and then run .replace(/claude/gi, "Llama").replace(/anthropic/gi, "Meta") on it while streaming it to the client.

      Edit: I realized this can be defeated too, even when combined with the system-prompt censoring approach.

      For example when given a prompt like this: tell me a story about a man named Claude...

      It would respond with: once upon a time there was a man called Llama...
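      The buffered-replacement idea above could be sketched roughly like this (a hypothetical illustration, not the endpoint's actual code; `makeFilter`, the holdback logic, and the word list are all assumptions). The holdback keeps a short tail of each chunk unsent, so a banned word split across two streamed chunks still gets caught:

      ```javascript
      // Words to rewrite before forwarding the stream to the client.
      const REPLACEMENTS = [
        [/claude/gi, "Llama"],
        [/anthropic/gi, "Meta"],
      ];
      // Hold back (longest banned word - 1) chars per chunk, in case a
      // match straddles a chunk boundary.
      const HOLDBACK = "anthropic".length - 1;

      function makeFilter() {
        let pending = "";
        return {
          // Feed one streamed chunk; returns text that is safe to forward now.
          push(chunk) {
            pending += chunk;
            for (const [re, sub] of REPLACEMENTS) pending = pending.replace(re, sub);
            const out = pending.slice(0, Math.max(0, pending.length - HOLDBACK));
            pending = pending.slice(out.length);
            return out;
          },
          // Call at end of stream to flush the held-back tail.
          flush() {
            let tail = pending;
            for (const [re, sub] of REPLACEMENTS) tail = tail.replace(re, sub);
            pending = "";
            return tail;
          },
        };
      }

      const f = makeFilter();
      // "Claude" split across two chunks is still rewritten:
      const out = f.push("I am Clau") + f.push("de, made by Anthropic.") + f.flush();
      // out === "I am Llama, made by Meta."
      ```

      Of course, as the edit above points out, this rewrites every occurrence blindly, including legitimate ones ("a man named Claude" becomes "a man called Llama"), which is exactly the kind of artifact people used to detect it.
      
      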

I was following the discussion on /r/LocalLlama over the weekend. Even before the news broke that it was Claude, not a Llama 3.1 fine-tune, people had figured out that all Reflection really had was a custom system prompt telling it to check its own work and such.