
Comment by nickdothutton

2 days ago

- Claude, please optimise the project for performance.

o Claude goes away for 15 minutes, profiles nothing, and makes many code changes.

o Announces project now performs much better, saving 70% CPU.

- Claude, test the performance.

o Performance is 1% _slower_ than previous.

- Claude, can I have a refund for the $15 you just wasted?

o [Claude waffles], "no".

I’ve always found the hard numbers on performance improvement hilarious. It’s just mimicking what people say on the internet when they get performance gains.

  • > It’s just mimicking what people say on the internet when they get performance gains

probably read a bunch of junior/mid-level resumes saying they optimized 90% of the company by 80%

If you provide it a benchmark script (or ask it to write one) so it has concrete numbers to go off of, it will do a better job.

I'm not saying these things don't hallucinate constantly, they do. But you can steer them toward better output by giving them better input.
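To illustrate the point about giving the model concrete numbers: a minimal benchmark script of the sort described might look like this (the `baseline`/`optimized` function names are hypothetical stand-ins for whatever code is being compared):

```python
import timeit

# Hypothetical stand-ins for the "before" and "after" implementations.
def baseline(n=10_000):
    total = 0
    for i in range(n):
        total += i * i
    return total

def optimized(n=10_000):
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    reps = 200
    # Sanity check: both versions must produce the same result.
    assert baseline() == optimized()
    t_base = timeit.timeit(baseline, number=reps)
    t_opt = timeit.timeit(optimized, number=reps)
    # Concrete numbers the model (and you) can check any claim against.
    print(f"baseline:  {t_base:.4f}s over {reps} runs")
    print(f"optimized: {t_opt:.4f}s over {reps} runs")
    print(f"ratio:     {t_base / t_opt:.2f}x")
```

With something like this in the repo, "saves 70% CPU" becomes a checkable statement rather than a vibe.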

Since you’re making unstructured requests and expecting results, why not ask your barista to make you a “better coffee” with no instructions. Then, when they make a coffee with their own brand of creativity, complain that it tastes worse and you want your money back.

  • Both "better coffee" and "faster code" are measurable targets. Somewhat vaguely defined, but nobody is stopping the barista or Claude from asking clarifying questions.

    If I gave a human this task, I would expect them to transform the vague goal into measurable metrics, confirm that the metrics match customer (==my) expectations, then measure their improvements on these metrics.

    This kind of stuff is a major topic for MBAs, but it's really not beyond what you could expect from a programmer or a barista. If I ask you for a better coffee, what you deliver should be better on some metric you can name, otherwise it's simply not better. Bonus points if it's better in a way I care about.

    • Sure, and LLMs are pretty good at using measurable targets, such as using tests to verify their work - if you direct them to do so.

  • I was experimenting with Claude Code and requested something more CPU-efficient in a very small project. There were a few avenues to explore, and I was interested to see what path it would take. It turned out that it seized upon something which wasn't consuming much CPU anyway and was difficult to optimise further. I learned that I'd have to be more explicit in future, direct an analysis phase, and probably kick in a few strategies for performance optimisation which it could then explore. The refund request was an amusement. It was $15 well spent on my own learning.

    • Ah ok, so you just totally misrepresented your experiences for comedic effect. Good for you I guess

  • I could also argue that if a barista gets multiple complaints about their coffee, it's very much their and their employer's job to go away and figure out how to make good coffee.

    It's very much not the customer's job to learn about coffee and to direct them how to make a quality basic coffee.

    And it's not rocket science.

  • "Optimize this code for performance" is not an unstructured or vague request.

    Any "performance" axis could have been used: Number of db hits, memory pressure, cpu usage, whatever.

    The LLM chose (or whatever) to optimise for CPU performance, claimed a specific figure, and that figure was demonstrably not real.

    If you ask a barista to make you a better coffee, and the barista says "this coffee is hotter" and it just isn't, the problem is not underspecified requirements; the problem is that it makes no attempt to say only things that are correct. Technically, it can't make any attempt.

    If I tell an intern "Optimize this app for performance" and they come back having reduced the memory footprint by half, but that didn't actually matter because the app was never memory constrained, I could hem and haw about not giving clear instructions, but I could also use that as a teachable moment to help the budding engineer learn how to figure out what matters when given that kind of leeway, to still have impact.

    If they instead come back and say "I cut memory usage in half", and then you have them run the app and it has the exact same memory usage, you don't think about not giving clear enough instructions, because you should be asking the intern "Why are you lying to my face?" and "Why are you confidently telling me something you did not verify?".

The last bit, in my limited experience:

> Claude: sorry, you have to wait until XX:00 as you have run out of credit.

If you really want to do this, you should probably ask for a plan first and review it.

I can't help but notice that your first two bullets match rather closely the behavior of countless pre-AI university students assigned a project.