Comment by girvo

2 months ago

Considering the full-fat Qwen3.5-plus is good, but only barely Sonnet 4 good in my testing (though incredibly cheap!), I doubt the quantised versions are somehow as good, if not better, in practice.

8 comments


rustyhancock 2 months ago

I think it depends on work pattern.

Many do not give Sonnet or even Opus free rein, which is where it really pushes ahead of other models.

If you're asking for tightly constrained single functions at a time it really doesn't make a huge difference.

I.e. the more vibe you do the better you need the model especially over long running and large contexts. Claude is head and shoulders above everyone else in that setting.

girvo 2 months ago

>I.e. the more vibe you do the better you need the model especially over long running and large contexts
For sure, but the coolest thing about qwen3.5-plus is the 1mil context length on a $3 coding plan, super neat. But the model isn't really powerful enough to take real advantage of it I've found. Still super neat though!

stavros 2 months ago

When you say Sonnet 4, do you mean literally 4, or 4.6?

girvo 2 months ago
It's not as capable as Sonnet 4.6 in my usage over the past couple days, through a few different coding harnesses (including my own for-play one[0], that's been quite fun).
[0] https://github.com/girvo/girvent/
- dr_kiszonka 2 months ago
  
  What is the benefit of writing your own harness? I am asking because I need to get better at using AI for programming. I have used Cursor, Gemini CLI, and Antigravity quite a bit and have had a lot of difficulty getting them to do what I want. They just tend to "know better."
  