Comment by scottcha
5 months ago
3.5 Sonnet is definitely my go-to for straightforward tasks in GitHub Copilot. It seems much more effective due to its lack of verbosity and its focus on completing the task rather than explaining it. It really helps in the new agent mode too.
Occasionally I switch out to one of the other models, usually GPT 4o, when I can't define the task as well and need to see additional analysis or get ideas.
Interesting. Any reason not to use reasoning models? Is there anything 4o seems better at with respect to coding?
I typically use o1 or o3-mini, but I see that they just released an agent mode and, honestly, I think it depends on what you use it for. I don't think the agent mode is going to be useful for me. My tasks are typically quite pedestrian: I don't know how to use a certain regex format, I need a Python script to print a list of directories, etc.
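A minimal sketch of the kind of "pedestrian" task described above (the function name and default path are illustrative, not from any specific tool):

```python
# List the immediate subdirectories of a given path -- the sort of
# small, well-defined task the comment describes handing to a model.
from pathlib import Path


def list_directories(root: str = ".") -> list[str]:
    """Return the sorted names of all immediate subdirectories of `root`."""
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


if __name__ == "__main__":
    for name in list_directories("."):
        print(name)
```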
My main issue (which is not really covered in the paper) is that it's not clear which models are most aligned with my work; by this I mean not lazy, willing to put in the required work, not incentivized to cheat, etc. So I'll use them for very small tasks (like regex) or very big ones (like planning), but I still don't use them for the "medium" tasks you'd give an intern. It's not clear to me how they will operate totally unsupervised, and more benchmarking of that would be incredible.
Excited to see that hopefully change this year though!
Copilot is offering a 'Preview' version of it. Has anyone spotted any differences between the preview and non-preview versions?