Comment by alganet

8 months ago

> I think a huge reason why LLMs are so far ahead in programming

Are they? Last time I checked (couple of seconds ago), they still made silly mistakes and hallucinated wildly.

Example: https://imgur.com/a/Cj2y8km (AI teaching me about the Coltrane operator, that obviously does not exist).

You're using the worst model when it comes to programming, so I'm not sure what point you're trying to prove here. That's why when someone starts ranting about how useless AI models are at coding, I always assume they're just using inferior models.

  • My question was very simple. Suitable for a simpler model.

    I can come up with prompts that make better models hallucinate (see post below).

    I don't understand your objection. This is a known fact: LLMs hallucinate shit regardless of the model size.

Are you intentionally sandbagging the LLMs to prove a point, or do you really think 4o-mini is good enough for programming?

Even 2.5 flash easily gets this https://imgur.com/a/OfW30eL

  • The point is that I can make them hallucinate quite easily. And they don't demonstrate knowing their own limitations.

    For example, 2.5 Flash fails to explain the difference between the short ternary operator (null coalescing) and the Elvis operator.

    https://imgur.com/a/xKjuoqV

    Even when I specify a language (which should supposedly clear up the confusion), it still fails to even recognize the Elvis operator by its toupee, and it mixes up the explanation (it doesn't even understand what I asked).

    https://imgur.com/a/itr87hM

    So, the point I'm trying to make is that they're no better at programming than they are at chemistry.
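    [Editor's sketch of the distinction being argued about. The screenshots aren't reproduced here, but assuming the prompts concerned PHP, where `$a ?: $b` (the short ternary, nicknamed the Elvis operator) falls back whenever `$a` is falsy, while `$a ?? $b` (null coalescing) falls back only when `$a` is null or unset. JavaScript's `||` vs `??` draws the same falsy-versus-nullish line, so the example uses it for the sake of a runnable illustration:]

    ```javascript
    // 0 is falsy but often a perfectly legitimate value.
    const count = 0;

    // `||` (like PHP's `?:`) treats any falsy value as "missing":
    console.log(count || 10); // → 10

    // `??` (like PHP's `??`) only falls back on null/undefined:
    console.log(count ?? 10); // → 0

    // When the value really is absent, both operators behave the same:
    const missing = null;
    console.log(missing || 10); // → 10
    console.log(missing ?? 10); // → 10
    ```

    [The trap in the prompt is exactly this edge case: for a falsy-but-present value like `0` or `""`, the two operators give different answers, which is what a correct explanation has to surface.]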

    • Flash is the wrong model for questions like that -- not that you care -- but if you'd like to share the actual prompt you gave it, I'll try it in 2.5 Pro.
