Comment by briandw
7 days ago
I feel much more confident that I can take on a project in a domain that I'm not very familiar with. I've been digging into LLVM IR, and I had no prior experience with it. ChatGPT is a much better guide to getting started than the documentation, which is very low quality.
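For illustration, this is the kind of first program such a walkthrough tends to start with, here via the llvmlite Python binding (the module and function names are arbitrary placeholders, not from any actual session):

    # Build and print a trivial LLVM IR module with llvmlite.
    # Hypothetical getting-started sketch; names are made up.
    from llvmlite import ir

    module = ir.Module(name="demo")
    i32 = ir.IntType(32)
    fnty = ir.FunctionType(i32, (i32, i32))
    func = ir.Function(module, fnty, name="add")

    block = func.append_basic_block(name="entry")
    builder = ir.IRBuilder(block)
    a, b = func.args
    builder.ret(builder.add(a, b, name="sum"))

    print(module)  # dumps the textual LLVM IR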
Careful - if you're not familiar with the domain, how are you going to spot when the LLM gives you suboptimal or even outright wrong answers?
Just like anything else: Stack Overflow, advice from a coworker or an expert. If the advice doesn't work, it will become clear that it isn't fixing your problem.
If all you’re doing is ping-ponging back and forth between an expert and an LLM, then what’s your value ?
Testing
Good luck with that.
I have been exploring local AI tools for coding (Ollama + Aider) with a small stock market simulator (~200 lines of Python).
First I tried having the AI extract the dataclasses representing events into a separate file. It decided to extract some extra classes, leave others behind, and delete parts of the code.
Then I tried to make it explain one of the actors, LongVol_player_v1, around 15 lines of code. It correctly concluded that the actor does options delta hedging, but it jumped to the conclusion that it calculates the implied volatility. In fact I set implied volatility as a constant, because I'm simulating specific interactions between volatility players and option dealers. It has not yet caught the bug where the vol player buys 3000 options but accounts for only 2000.
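For context, a stripped-down sketch of what such an actor might look like; the class name, the dealer methods, and the delta value below are hypothetical stand-ins reconstructed from the description, not the real code:

    from dataclasses import dataclass

    IMPLIED_VOL = 0.20  # deliberately a constant in this simulation

    @dataclass
    class LongVolPlayer:  # stand-in for LongVol_player_v1
        options_held: int = 0

        def act(self, dealer):
            dealer.sell_options(3000)       # buys 3000 options from the dealer...
            self.options_held += 2000       # ...but books only 2000 (the bug the model missed)
            hedge = -self.option_delta() * self.options_held
            dealer.trade_underlying(hedge)  # delta-hedge the (mis-counted) position

        def option_delta(self) -> float:
            return 0.5  # placeholder; real code would compute the option delta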
When asked for improvements, it is obsessed with splitting the initialization from the execution.
So far I have wasted half of Saturday trying to make the machine do simple refactors; refactors I could have done myself in half an hour.
I have yet to see the wonders of AI.
If you are using Ollama, that suggests you are using local models - which ones?
My experience is that the hosted frontier models (o3, Gemini 2.5, Claude 4) would handle those problems with ease.
Local models that fit on a laptop are a lot less capable, sadly.
I have tried with qwen2.5-coder:3b, deepseek-coder:6.7b, deepseek-r1:8b, and llama3:latest.
All of them local, yes.
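For anyone reproducing this kind of test, a minimal way to poke at one of those models is the ollama Python client, assuming the ollama package is installed and an Ollama server is running locally (the prompt below is just an example):

    # Quick sanity check of a local model via the ollama Python client.
    import ollama

    response = ollama.chat(
        model="qwen2.5-coder:3b",
        messages=[{
            "role": "user",
            "content": "Extract the dataclasses from this file into a new module.",
        }],
    )
    print(response["message"]["content"])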
Could you link the repo and prompts? What you described seems like the type of thing I've done before with no issue, so you may have an interesting code base that is presenting some issues for the LLM.
I cannot post a link to the repo, as it contains sensitive material. The code is mostly a bunch of classes with interleaved dataclasses, and a bunch of main() and run() functions at the end.
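As a rough sketch of that shape (all names hypothetical, everything trimmed):

    from dataclasses import dataclass

    @dataclass
    class TradeEvent:  # one of several event dataclasses, interleaved with the actors
        size: int
        price: float

    class OptionDealer:
        def on_event(self, event: TradeEvent): ...

    class LongVolPlayer:
        def on_event(self, event: TradeEvent): ...

    def run():  # simulation loop
        ...

    def main():
        run()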
Some of the logs:
-----------------------------------
For what it's worth, commercial models are in a completely different league from locally runnable models. If you are really interested in seeing the state of the art right now, at least give it a whack with Opus/Gemini/o3 or something of that calibre.
You might still be disappointed, but at least you won't have shot your leg off right out of the gate!
I'm trying to use local models for privacy reasons. Also, at some point one of the employers out there will start suing people for sharing code with a commercial model, once the model trains on that code and outputs it to someone else. I want no part of that situation.