Comment by wg0

6 hours ago

It is a quantifiable thing, not a feeling.

Between ten thousand runs of:

```
const int MAX_COUNT = 10000;

printf("I'll count up to %d", MAX_COUNT);
for(int i=1; < MAX_COUNT; i++)
    printf("I'm now counting %d", i);
```

And of the following prompt:

``` You'll count to 10,000. At the start say "I'll count up to 10,000" and then for each number say "I'm now counting <number>" and do not say anything else. Do not miss numbers in between. ```

Which one is going to produce 100% correct results across 10,000 runs of each?

Now don't give me "these are different tools". We all know. I'm talking about reliability and predictability.

Well, for starters the program you wrote is wrong (very unreliable) 100% of the time (very predictable)... so you just got your answer I guess.

In any case, most, if not nearly all, of the top-100 LLMs will answer your prompt with some code that does what you intended the first program to do. Only they'll actually code it properly, of course.