Comment by throwaway7783

12 hours ago

I follow the same process. I have a design in mind for the problem at hand, but I don't reveal it to Codex. I go back and forth a bit to see if its proposals are better than mine, and to weigh the tradeoffs of the various approaches. And then I ask it to compare its proposals with mine. I "win" most of the time, but there are many times where it shows me a better or simpler approach, or makes me rethink the solution altogether.

Once this is done, the mechanical coding parts are mostly routine (for Codex).

20 comments

a_bonobo 9 hours ago

I really like this pattern of 'not showing my cards' and use it often. The second I hint to the LLM what I prefer, it becomes sycophantic and invents nonsense about why my preferred solution is better.

I'm sure there's an interesting study to be done on how users unintentionally 'leak' their preference to the LLM; perhaps when users list their options, they often put their preferred option first. Either way, not showing the cards in my hand has been very useful when thinking through a problem with LLMs.
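
One way to guard against the ordering leak is to build the prompt mechanically instead of typing the options in whatever order comes to mind. The following is only a rough sketch: `build_neutral_prompt` and its wording are hypothetical, not part of any tool mentioned in this thread, and whether shuffling actually reduces sycophancy is an assumption I have not measured.

```python
import random


def build_neutral_prompt(problem, options, seed=None):
    """Build a prompt that lists options in random order under neutral labels.

    Hypothetical helper: nothing here marks which option the author prefers.
    """
    if len(options) > 26:
        raise ValueError("this sketch only labels up to 26 options (A-Z)")

    rng = random.Random(seed)
    shuffled = list(options)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)

    lines = [f"Problem: {problem}", "", "Candidate approaches:"]
    for index, option in enumerate(shuffled):
        label = chr(ord("A") + index)  # A, B, C, ...
        lines.append(f"{label}. {option}")
    lines.append("")
    # Closing line in the spirit of "but I am not sure, evaluate."
    lines.append("Compare the tradeoffs of each approach. "
                 "I am not sure which is best; evaluate.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_neutral_prompt(
        "store audit history",
        ["event sourcing", "plain CRUD table", "append-only log"],
    ))
```

Passing a `seed` makes the ordering reproducible, which helps if you want to rerun the same question with a few different orderings and see whether the answer moves.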

cold_harbor 5 hours ago
LLMs flip positions when users push back ~70% of the time even when they were right. RLHF optimizes for approval, not correctness
- DenisM 36 minutes ago
  
  Interesting thing about sycophancy is that it’s asymmetric. If an LLM is used to train an LLM, it may not have the same level of aggressiveness that humans do when pushing back on the trainee. Human pushback has specific patterns, which we might be able to compensate for thanks to that asymmetry.
- 8cvor6j844qw_d6 4 hours ago
  
  > LLMs flip positions when users push back
  Same experience. Claude rarely pushes back once you give a plausible/logical reason for your initial decision, even if it flagged concerns at first.
  
- bitexploder 4 hours ago
  
  I almost always end with something like “, but I am not sure, evaluate.” or a similar phrase, and I avoid ever stating a preference.
  
- cdelsolar 5 hours ago
  
  Tangentially related, but I’ve been using Claude to practice interviewing on system design problems, and it’s actually pretty great. But even when it likes my answers, it always finds something, however small, to push on. Once it was actually completely wrong and admitted it after I got it to realize that. So maybe you have to prime it to be contrary and not agree with everything you say; putting it in the role of a tough interviewer seems to do this implicitly.
  
williamdclt 7 hours ago

Same. Alternatively (or in addition), I sometimes present my preferred idea as a "bad/naive/stupid option" (or a suggestion from someone who can't be trusted) to see how it holds up when the sycophancy is working against it. As expected, the LLM will usually say "yeah it's bad!" and give plausible-sounding reasons for it, but if these reasons are nonsensical, it's a good sign that I'm not missing anything.
nickcw 9 hours ago

LLMs are very prone to priming in my experience. That is the human psychology name for what you are describing; whether it should be applied to LLMs I don't know, but it describes the phenomenon perfectly.
avadodin 7 hours ago

It's not limited to arguing with LLMs, but if you want an honest opinion you should remember to push back even when it agrees with your hidden preference at first. Sometimes it is only being contrarian or supporting the underdog. Steelman the opposition.

yread 10 hours ago

> I go back and forth a bit to see if its proposals are better than mine

I find it useful to let it generate benchmarks comparing the approaches. Turns out AI is terrible at guessing what's faster or allocates less.

chris_st 6 hours ago

Yup, just like people!
puilp0502 6 hours ago

> Turns out AI is terrible at guessing what's faster or allocates less
s/AI/a human being/ would work equally well, lol.
Jokes aside, I do like the approach of letting the AI build something deterministic and make decisions based on that.
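
For what it's worth, the "something deterministic" can be very small. Below is a minimal sketch using only the Python standard library (`timeit` for time, `tracemalloc` for peak allocation). The two candidate functions are made-up stand-ins, not anything from this thread; swap in the approaches you are actually comparing.

```python
import timeit
import tracemalloc


def measure(func, repeat=5, number=100):
    """Return best-case seconds per call and peak traced bytes for one call."""
    # Take the minimum over several repeats to reduce scheduling noise.
    best = min(timeit.repeat(func, repeat=repeat, number=number)) / number

    # Measure peak Python-level allocation during a single call.
    tracemalloc.start()
    func()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {"seconds_per_call": best, "peak_bytes": peak}


# Illustrative candidates (assumptions): same result, different allocation.
def sum_squares_list():
    return sum([i * i for i in range(10_000)])  # materializes a full list


def sum_squares_generator():
    return sum(i * i for i in range(10_000))  # streams values one at a time


if __name__ == "__main__":
    candidates = [("list", sum_squares_list),
                  ("generator", sum_squares_generator)]
    for name, func in candidates:
        result = measure(func)
        print(f"{name:10s} {result['seconds_per_call']:.6f} s/call  "
              f"{result['peak_bytes']} peak bytes")
```

Numbers will vary by machine and interpreter version, which is rather the point: measure on the target instead of trusting a guess, whether the guess comes from the AI or from you.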

hackermanai 11 hours ago

I think this approach is more common for actual work than the hype suggests. I do something similar: a lot of back and forth, then I settle on something, often with the tradeoffs now known, and write it by hand as a final guard to spot issues, keep naming consistent, etc.

revv00 8 hours ago

I bet you've contributed a lot of training trajectories for those AIs.

chris_st 6 hours ago

Good!

daniel3303 6 hours ago

[flagged]