Comment by zulgin

5 days ago

I think you are broadly correct, but just to push back on a few points: (1) The ability to solve hard problems in days vs weeks has immense value. (2) Back-end improvements (if done right) should improve platform speed, stability, scalability etc., which should have revenue implications. (3) The ability to on-board a SWE-equivalent entity in minutes, have them work on a specific hard problem, and then off-board them in minutes can have value.

All of the above, of course, depends on Fable consistently being a 2x-3x SWE at minimum.

18 comments


gmerc 5 days ago

You're not really solving problems, you're retrieving the best match of solved problems from compressed corpus. And that corpus is available to many companies, meaning "hard" problems stop having "hard problem" value the moment they enter the weights of any model via the internet ... or get distilled from one model to another. Anthropic's business model is commoditising knowledge, but as we see with the Fable model card, they only want it done to the knowledge of other businesses; in their own field, they totally hate it.

aroman 5 days ago
I don’t think that’s an accurate or useful characterization of modern AI like Claude at all. It is not simply regurgitating knowledge. It applies its knowledge to create bespoke solutions to the problem you pose to it, and is able to self evaluate its progress towards the completion criteria. If you don’t think that counts as “problem solving”, your definition would exclude nearly all knowledge work and engineering.
- geraneum 5 days ago
  
  People underestimate the vastness of the training data (the internet) and overestimate their ability to recognize whether something is really bespoke. That's not to say that no problem solving is happening, because there are many problems that we inefficiently solve again and again, and LLMs are making the solutions more accessible to everyone with a subscription.
- computably 5 days ago
  
  > It applies its knowledge to create bespoke solutions to the problem you pose to it, and is able to self evaluate its progress towards the completion criteria.
  It imitates applying knowledge. The imitation may be uncanny, but attributing intentionality and theory of mind (ToM) to LLMs is a category error.
  
- squeegmeister 4 days ago
  
  It’s like saying you can’t make a unique sentence unless you first make unique words
naasking 5 days ago

> You're not really solving problems, you're retrieving the best match of solved problems from compressed corpus.
This is not correct. LLMs interpolate in a high-dimensional space, so you're actually composing the best matches in a compressed corpus to find novel points/paths in that space. That is problem solving.

ahtihn 5 days ago

> Back-end improvements (if done right) should improve platform speed, stability, scalability etc., which should have revenue implications

Depends entirely on the domain. If you're selling enterprise software, this kind of stuff barely matters for sales.

It can reduce operational costs, which is good, but there's a limit to how much that's worth.

UqWBcuFx6NV4r 5 days ago

Yep, there are many, many non-niche domains in which this doesn’t mean much at all.

skywhopper 5 days ago

The thing about AI-generated “solutions” is that they often go down bad rabbit holes and need to be re-run, or, since they are so “cheap” to create, they are often just thrown away and rebuilt when requirements evolve. Plus, more stuff simply gets created and needs to be maintained. So in the end, your efficiency gains go out the window.

ponector 5 days ago

In my experience, the challenge in software development is not to solve a problem, but to define the outcome, the scope, the acceptance criteria etc.

majkinetor 5 days ago

Exactly, this is the hardest part and the reason why many projects fail

fendy3002 5 days ago

20x the cost means Fable needs to be 20x better than the alternative, which is a tall order. And there are more options out there too; perhaps the 4x-cost one is enough.

This means that if the DeepSeek / under-1k alternative gives at least a 1.2x improvement, Fable needs to be 24x, which I think is very, very unreasonable. It could be worth it if it can 2x a $20k SWE, though I doubt it can do that.
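
The break-even arithmetic above can be sketched roughly as follows. This is only an illustration of the comment's reasoning: it assumes value scales linearly with the productivity multiplier, and the function name `break_even_multiplier` and its parameters are made up for this example.

```python
def break_even_multiplier(cost_ratio: float, alternative_gain: float) -> float:
    """Productivity multiplier the pricier tool needs to match the cheaper
    one on value per dollar.

    cost_ratio: how many times more the pricier tool costs (e.g. 20 for 20x).
    alternative_gain: productivity multiplier of the cheaper tool (e.g. 1.2).
    Assumes value is proportional to the multiplier (a simplification).
    """
    return cost_ratio * alternative_gain


# Figures taken from the comment: 20x the cost, alternative at 1.2x.
required = break_even_multiplier(cost_ratio=20, alternative_gain=1.2)
print(f"Required multiplier: {required:.1f}x")
```

With the numbers from the comment (20x cost, 1.2x alternative) this gives the 24x figure quoted above.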

henry2023 5 days ago

“Ability to solve hard problems in days vs weeks has immense value”. Citation needed.

LLMs are incredible, don’t get me wrong, but they are good at tiny contexts (writing a script), not at software engineering (adding features to Chrome).

AussieWog93 5 days ago

Honestly, LLMs have been OK at adding features to software since around Opus 4.5. From what I've tried of Fable, it's a decent step up from the Opus models and I can only see things getting better.

system2 5 days ago

> push back on a few points

Claude keeps telling me this when I argue with it. LMAO.

UqWBcuFx6NV4r 5 days ago

“gently push back”