Comment by jsw97

9 days ago

Given the high rate of false positives people are reporting for the non-silent cybersecurity, biological, etc., safeguards, there is a strong likelihood that you will encounter silently nerfed behavior even if you are _not_ violating their TOS.

Ultimately this will be evident in the way customers / external benchmarkers experience Fable. Hopefully competition will drive future models toward a lower false positive rate. Until that happens, Mythos and Fable users seem likely to have pretty divergent experiences.

15 comments

nsingh2 9 days ago

It's such an obviously bad policy, it's mind-boggling that they thought this was a good idea. It just breeds paranoia and mistrust, especially when people are already a bit paranoid about silent model quantification for cost cutting reasons.

SXX 9 days ago

It's not paranoia when the entity you are dealing with can't be trusted and will do everything to abuse your trust.
llelouch 9 days ago
What's the alternative? Not release the model at all?
"Make the guardrails better" is very hard and probably not worth the effort.
- hagbarth 9 days ago
  
  The alternative is to be explicit when you nerf, so users know what they are working with.
  
- schnitzelstoat 9 days ago
  
  That seems to be working well for Mythos. Just never release it and keep talking about how 'dangerous' it is to pump up the IPO price.
SamvitJ 9 days ago
Do you mean "quantization" not quantification?
- nsingh2 9 days ago
  
  Yup, I meant to write quantization there.
KennyBlanken 9 days ago

Another "knob" is reducing the thinking time...

azalemeth 9 days ago

I'm a medical physicist. I use the word nuclear a lot. Opus is fine (well, 99% of the time - I've certainly hit the CBRN filters a few times and even been invited to email Anthropic about the false positives).

Fable has literally refused to work on any of my problems (even those about fluid dynamics!) and just tells me that I'm violating Anthropic's AUP.

jsw97 9 days ago

This problem is compounded by the fact that you can be banned (really by any provider) based on an algorithm, and the process for restoring your account doesn't seem to work very well. So be careful with your queries, basically, or you might get locked out.

imrehg 9 days ago

I encountered this when I was checking why my gluten-free bread came out of the bread machine the way it did. I guess it latched onto some yeast-related points and it fell back to Opus...

Having said that, on this query I've seen very little difference in the quality, there's nothing to be "2x as good on" for the "2x quota usage", so shrugs?

KennyBlanken 9 days ago

If a benchmark is affected the model owner will almost certainly tune it, so there will be a game of cat and mouse...

Honestly, wouldn't surprise me if the AI companies try to detect benchmarking. Most hardware companies do...

supriyo-biswas 9 days ago

I mean, the other day I got blocked from Claude for asking about releasing genetically modified sterile mosquitoes; I'm sure everything will be totally fine as Anthropic's restrictions are completely reasonable, measured and appropriate.