Comment by vultour

5 days ago

I spent a good part of yesterday attempting to use ChatGPT to help me choose an appropriate API gateway. Over and over it suggested things that literally do not exist, and the only reason I could tell was that I had spent a good amount of time in the actual documentation. This has been my experience roughly 80% of the time when trying to use an LLM. I would like to know what the magical prompt engineering technique is that makes it stop confidently hallucinating about literally everything.

> I spent a good part of yesterday attempting to use ChatGPT to help me choose an appropriate API gateway.

If you mean the ChatGPT interface, I suspect you're headed in the wrong direction.

Try Aider with an API key. You can use whatever model you like (as you're paying per token). See my other comment:

https://news.ycombinator.com/item?id=44259900
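
For what it's worth, a minimal Aider session looks roughly like this (the model name and filename are just examples, and the install command may have changed since; check the Aider docs):

    pip install aider-chat
    export OPENAI_API_KEY=sk-your-key
    aider --model gpt-4o my_script.py

Switching models later is just a matter of changing the --model flag.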

I echo the GP's sentiment. My initial attempts using a chat-like interface were poor. Then, some months ago, prompted by many HN comments, I decided to give Aider a try. I had put my kid to bed and it was 10:45pm. My goal was "Let me just figure out how to install Aider and play with it for a few minutes - I'll do the real coding tomorrow." Fifteen minutes later, not only had I installed it, but my script was done. There was one bug I had to fix myself. It was production-quality code, too.

I was hooked. Even though I was done, I decided to add logging, command-line arguments, etc. An hour later, it was a production-grade script, with a very nice interface and excellent logging.

Oh, and this was a one-off script. I'll run it once and never again. Now all my one-off scripts have excellent logging, because it's almost free.

There was no going back. For small scripts that I've always wanted to write, AI is the way to go. That script had literally been in my head for years. It was not a challenging task, but it had always been low on my priority list. How many ideas do you have in your head that you'll never get around to for lack of time? Well, now you can do 5x more of those than you would have without AI.

  • Just wanted to add to your post with my anecdote.

    I was at the "script epiphany" stage a few months ago, and I got cool Bash scripts (with far more bells and whistles than I would normally implement) just by iterating with Claude via its web interface.

    Right now I'm at the "Gemini (with Aider) is pretty good for knock-offs of already existing functionality" stage (in a Go/HTMX codebase).

    I've yet to reach the "wow, that thing can add brand new functionality, using code I'm happy with, just by clever context management and prompting" stage; but I'm definitely looking forward to it.

I'm having a very good experience with ChatGPT at the moment. I'm mostly using it for little tasks where I don't remember the exact library functions. Examples:

"C++ question: how do I get the unqualified local system time and turn into an ISO time string?"

"Python question: how do I serialize a C struct over a TCP socket with asyncio?"

"JS question: how do I dynamically show/hide an HTML element?" (I obviously don't write a lot of JS :-D)

ChatGPT gave me the correct answers on the first try. I have been a sceptic, but I'm now totally sold on AI-assisted coding, at least as a replacement for Google and Stack Overflow. For me there is no point anymore in wading through all the blog spam and SEO crap just to find a piece of information. Stack Overflow is still occasionally useful, but the writing is on the wall...

EDIT: Important caveat: stay critical! I have been playing around asking ChatGPT more complex questions where I actually know the correct answer, or where I can immediately spot mistakes. It sometimes gives me answers that would look correct to a non-expert, but are hilariously wrong.

  • The problem with this approach is that you might lose important context which is present in the documentation but doesn't surface through the LLM. As an example, I just asked GPT-4o how to access the Nth character in a string in Go. Predictably, it answered str[n]. This is a wildly dangerous suggestion because it works correctly for ASCII but not for other UTF-8 characters. Sure, if you know about this and prompt it further it tells you about the limitation, but that's not what 99% of people will do.
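
    A quick sketch of the trap in plain Go (indexing a string yields bytes, not characters):

        package main

        import "fmt"

        func main() {
            s := "héllo"
            // s[1] is only the first byte of the two-byte UTF-8 encoding of 'é'
            fmt.Println(string(s[1])) // prints "Ã"
            // converting to []rune first indexes by character (code point)
            fmt.Println(string([]rune(s)[1])) // prints "é"
        }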

    • > The problem with this approach is that you might lose important context which is present in the documentation but doesn’t surface through the LLM.

      Oh, I'm definitely aware of that! I mostly do this with things I have already done, but can't remember all the details. If the LLM shows me something new, I check the official documentation. I'm not into vibe coding :) I still want to understand every line of code I write.

Sure, this was exactly how I felt three weeks ago, and I could have written that comment myself. The agentic approach, where the model works out that it made something up by looking at the errors the type checker generates, is what makes the difference.

Which model did you use?

I find using o3 or o4-mini and prompting "use your search tool" works great for having it perform research tasks like this.

I don't trust GPT-4o to run searches.

Did you use search grounding? o3 or o4-mini-high with search grounding (which will usually come on by default for questions like this) is generally the best option.