
Comment by salawat

9 days ago

Here's the thing: you're kind of dopamine-hacking yourself. Using current LLMs is something akin to using a slot machine, and one specifically tuned to prey on knowledge workers.

The fact is, while it can talk a good game, and has been RLHF'd to high heaven to validate you all the bloody time to keep you engaged and burning tokens, your brain is simply tuned to reward any semblance of progress, and getting a little bit more out of the LLM is in the same damn family as the hit you get off coding. The dangerous bit, though, is its inherently probabilistic nature: this crank on a prompt may come out different from the last one with the same inputs. Same machine, different pull of the lever.

Just remember to get out from in front of the screen, and try to experience the real-world implementations of the systems you think you're building. Without that real-world experience, no one's going to trust a bloody thing you do. You are a world model; it's a language model. It may know how to talk the lingo, but you know, or can reckon, how to do the thing.

Try running yourself a local model on a sufficiently beefy laptop. The lack of instant feedback tends to soften the feedback loop, and gives you a less "ecstasy"-coded position from which to actually objectively evaluate the efficacy of the thing at converting raw electricity -> thing. You'll find the added friction from the additional constraints (no outsourcing to a datacenter funded by someone else's money) suddenly changes the character of the thing.
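If you want to try this, a minimal sketch using Ollama (assuming it's installed; the model tag and prompt here are just illustrative examples, not a recommendation):

```shell
# Pull a quantized coding model small enough to fit on a beefy laptop.
# The tag is an example -- pick whatever fits your RAM/VRAM.
ollama pull qwen2.5-coder:7b

# One-off question. Asking for a hint rather than finished work keeps
# the slower, higher-friction loop that's the whole point here.
ollama run qwen2.5-coder:7b "Where do off-by-one bugs usually hide in a ring buffer?"
```

The one-shot `run` invocation keeps the model a reference tool you consult when stuck, rather than an always-on autocomplete, and nothing leaves the machine.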

> Try running yourself a local model on a sufficiently beefy laptop

I don't understand why you think the solution to using a well-tuned, intelligent model is to use one that is a dumbass.

  • As the parachutist says to the pilot: "I'll stop jumping out of perfectly good airplanes as soon as they start making them."

I've fiddled with enough hosted ones to know the hosted solution's value-add is almost entirely in the hardware stack it takes to serve inference traffic at scale, plus whatever that company decides "alignment" means. The major quality is found in asking the right question and getting the right answer, not vomiting out subtly wrong answers faster. I don't need the thing doing work for me; I just need it to occasionally give me hints on where I might want to look when I get stuck. A local model fulfills that purpose just fine. It may take a bit to get an answer back, but you can bet your sweet ass that by the time I get that answer back, I've thought things through enough that I can pretty much instantly separate bullshit from signal. Bonus: it doesn't add my codebases to the training dataset.

    But please, do go on. Continue to tell me how a tool perfectly good for my use case is a bad idea. I'm all ears. > /dev/null

  • When was the last time you tried a local model? You might actually be addicted to a dumb model that's just very quick at everything, which is what the GP is positing.

    I've got the QWEN3.5 coder running and it's dumb, but if the task is 'clone this component and rewrite it to testWhatever' it does it perfectly well.

One might think you're falling for the delusion being posited in this thread and that you don't have a proper benchmark. You're just getting that token hit.