Comment by c-linkage

6 days ago

I very much appreciate the fact that the OP posted not just the code developed by AI but also posted the prompts.

I have tried to develop some code (typically non-web-based code) with LLMs but never seem to get very far before the hallucinations kick in and drive me mad. Given how many other people claim to have success, I figure maybe I'm just not writing the prompts correctly.

Getting a chance to see the prompts shows I'm not actually that far off.

Perhaps the LLMs don't work great for me because the problems I'm working on are somewhat obscure (currently reverse engineering SAP ABAP code to make a .NET implementation on data hosted in Snowflake) and often quite novel (I'm sure there is an OpenAuth implementation on GitHub somewhere from which the LLM can crib).

This is something that I have noticed as well. As soon as you venture into somewhat obscure fields, the output quality of LLMs drastically drops in my experience.

Side note, reverse engineering SAP ABAP sounds torturous.

  • The worst part isn't even that the quality drops off; it's that the quality drops off but the tone of the responses doesn't. So hallucinations can start and you get confidently wrong or even dangerous code, and the only way to know better is to be better than the LLM in the first place.

    They might surpass us someday, but we aren't there yet.

The usual solution is a multi-tiered one.

First you use any LLM with a large context window to write down the plan, preferably in a markdown file with checkboxes ("- [ ] Task 1").

Then you can iterate on the plan and ask another LLM, one more focused on the subject matter, to do the tasks one by one, which keeps each prompt's context narrow and reduces hallucination.
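The task-by-task loop can be sketched with a small script that pulls the next unchecked item from the plan and checks it off once done (the file contents and task names here are made up for illustration; in practice you'd read them from your plan.md):

```python
import re

# Hypothetical plan contents; in practice this would be read from plan.md.
plan = """\
- [x] Task 1: map the ABAP data dictionary to .NET types
- [ ] Task 2: port the selection logic
- [ ] Task 3: write Snowflake queries for the extracted tables
"""

CHECKBOX = re.compile(r"^- \[( |x)\] (.+)$")

def next_task(plan_text):
    """Return the first unchecked task, or None if everything is done."""
    for line in plan_text.splitlines():
        m = CHECKBOX.match(line)
        if m and m.group(1) == " ":
            return m.group(2)
    return None

def mark_done(plan_text, task):
    """Check off the given task so the next iteration picks the following one."""
    return plan_text.replace(f"- [ ] {task}", f"- [x] {task}", 1)

# Each loop iteration: feed only this one task (plus the plan for context)
# to the subject-matter LLM, then record the result.
task = next_task(plan)
plan = mark_done(plan, task)
```

Keeping the plan in a plain checkbox file also means either LLM can re-read it at any point to re-anchor itself, instead of relying on a long chat history.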