Comment by trilogic

5 hours ago

Qwen 3.6 35B (finetuned) is so good that it has become the standard open-weights model for everyday use. It is not far at all from proprietary models if you give it tools, skills and agents etc, it can actually finish the job. (Thank you Qwen team, appreciated.) Using open source, we can now definitely rely on it to design very complicated architecture from scratch and build the full package pretty fast. I wish to see European AI unleashed; wake up.

22 comments

Aurornis 4 hours ago

> Is not far at all from proprietary models if you give it tools, skills and agents etc,

I use Qwen 3.6 27B, the dense version of this model which is slightly better.

I don't agree that it's close at all. Maybe for some small, easy tasks, but not for working on real codebases. It's amazing for something I can run at home, but the difference between it and Opus or GPT-5.5 is huge.

trilogic 4 hours ago
Really, how so? We work with codebases daily, so can you give us a concrete example? In our case we work on consumer hardware (ish), 10 million ctx (1 million output, 1 million input proven; sometimes it loops or breaks at over 500k ctx, but at ~17 tps linear). It can read the full codebase, unleash agents, and write to disk, editing and patching files, creating a full app in 3-4 minutes. It can do web search and RAG pretty fast; it understands and fixes the user query and system prompts, adapting them on the fly if needed. I am wondering, what more do you do?
- trilogic 4 hours ago
  
  Edit: Forgot to mention that it can process images, PDFs, and 100s of other file types. It can even create presentations in code, Mermaid, SVG, Chart.js, etc. Here is a basic version of it: https://hugston.com/chat
  
tedivm 4 hours ago
I've had the opposite experience, and have built multiple fantastic applications with Qwen3.6 27b. What quantization have you tested with?
- hedgehog 4 hours ago
  
  Similarly I haven't seen Qwen 27B as remotely competitive with Opus, at least Q4 hooked up to Claude Code. What harness are you using?
- trilogic 4 hours ago
  
  As funny as it may sound, a q4_k_m that is well converted and properly quantized (and finetuned, which is imperative) would do the job. The 27B may be good, but it is heavy; it burns the hardware. I personally prefer the 397B if I am stuck and can't progress; it can still run at 7 tps. Now with MTP (multi-token prediction) it nearly doubles the speed (reached 82 tps today with the 35B at 100000 ctx). I recommend you give it a try.
0xbadcafebee 1 hour ago

> not for working on real codebases
You don't pick just one model to "work on real codebases". You use a very advanced model to plan, and a not-very-advanced, cheaper, faster model to execute planned tasks. This saves money and speeds up work. This is the guidance from Anthropic & OpenAI.
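
A minimal sketch of that plan/execute split, in case it helps make the idea concrete. Everything here is an assumption for illustration: `Model` is just "a function from prompt to text", `plan_then_execute` is a hypothetical helper, and the two stub models stand in for whatever strong and cheap models you would actually call. It is not any vendor's API.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type: any function that maps a prompt string to a completion string.
Model = Callable[[str], str]


@dataclass
class StepResult:
    step: str
    output: str


def plan_then_execute(task: str, planner: Model, executor: Model) -> List[StepResult]:
    """Ask the strong model for a plan once, then run each step on the cheap model."""
    plan_text = planner(f"Break this task into short numbered steps:\n{task}")
    # Keep only lines that look like "1. do something" and strip the numbering.
    steps = [
        line.split(".", 1)[1].strip()
        for line in plan_text.splitlines()
        if line.strip() and line.strip()[0].isdigit() and "." in line
    ]
    return [
        StepResult(step, executor(f"Task: {task}\nDo this step: {step}"))
        for step in steps
    ]


# Stub models so the sketch runs offline; swap in real model calls here.
def fake_planner(prompt: str) -> str:
    return "1. Write parser\n2. Add tests"


def fake_executor(prompt: str) -> str:
    return "done: " + prompt.splitlines()[-1]


results = plan_then_execute("Build CLI", fake_planner, fake_executor)
for r in results:
    print(r.step, "->", r.output)
```

The planner is called once per task and the executor once per step, which is where the cost and speed savings would come from if the executor is the cheaper model.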

storus 3 hours ago

It's 3.7-max; max was never open-weighted before. I don't see any smaller models in that tweet.

b3ing 4 hours ago

For coding it’s really bad. Writing is ok, chat is good. It’ll get better but it’s not that close yet

jedisct1 2 hours ago
Depends on the language and harness, I guess.
It works really well for me, at least for Python and JavaScript, with swival.dev as a harness.
- kajecounterhack 2 hours ago
  
  You should probably disclose that you're the author of swival.dev, but nice project :)

ethanpil 1 hour ago

Can you share the GGUF for this specific success story? I'd like to try it for myself.

mettamage 5 hours ago

Do you have a good resource on how to finetune a model like Qwen? I am curious to try it out.

trilogic 5 hours ago

Here is a dataset you can choose from: https://huggingface.co/datasets/Avtrkrb/combined-reasoning-o... Take 10,000 samples from it according to your needs and go for it. The key (in my opinion) is not cutting the sequence length, among other things. Any traditional finetuning repo will do; if your hardware supports it, Unsloth is faster.
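
One way to read "take a subset without cutting the sequence length" is sketched below: keep only samples that already fit in the training sequence length, then draw the subset at random, so nothing gets truncated. This is an assumption about the intent, not the poster's actual pipeline. The `"text"` field name, the `pick_samples` helper, and the whitespace token counter are all hypothetical; a real run would use the model's tokenizer, and raising the sequence length instead of filtering is the other obvious option.

```python
import random
from typing import Dict, List


def approx_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer (assumption): whitespace word count.
    return len(text.split())


def pick_samples(
    rows: List[Dict[str, str]], n: int, max_seq_len: int, seed: int = 0
) -> List[Dict[str, str]]:
    """Randomly pick up to n rows that fit in max_seq_len, so none need truncating."""
    fits = [r for r in rows if approx_tokens(r["text"]) <= max_seq_len]
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    return rng.sample(fits, min(n, len(fits)))


# Toy data: rows with 1 to 100 "tokens" each.
rows = [{"text": "word " * k} for k in range(1, 101)]
subset = pick_samples(rows, n=10, max_seq_len=50)
print(len(subset), max(approx_tokens(r["text"]) for r in subset))
```

For the thread's numbers you would call it with `n=10000` and whatever sequence length your hardware can hold.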
verdverm 5 hours ago

Unsloth has good resources