Comment by PAndreew

1 month ago

Critics are (rightly) pointing out that these models are not on par with SOTA for complex coding tasks. But many seem to forget that a large part of white-collar office work is Excel crunching, file moving, translating dry legal documents, e-mail drafting, PPT drudgery, etc. These are absolutely doable with 30-35b+ models, with the added benefit of keeping company data private.

I think the conclusion is flawed here? Sure, qwen3.5 9b is nowhere near the SOTA models. It's 9b and was made a year ago? Everyone talking about local models is pumped about the models released in April this year: Qwen 3.6 27b, and qwen 35b a3b if you have a sad GPU. Those are comparable to SOTA models, seriously.

Arguably Excel and legal work are much worse than code, because catching the mistakes can be much harder.

Case in point: the JPMorgan London Whale incident, a roughly $6 billion loss in which an Excel error played a part...

  • Yes... I mean organisations have to adapt to this new way of working. First they need new processes (maybe borrowed from SW development) that enable them to triage work products on a risk/reward scale. For example, my wife works on medical device tenders. It is obligatory to translate every frikkin Word document into our native language, which in the end no one will read. Do we use LLMs to do the translation? Hell yeah. For a critical legal document? Eeee. Also, I think enablers like special harnesses should be developed/improved with these folks in mind, for example by building hooks into the harness that force the LLM to test/review/sample its output. So yes, it's a complex topic, but my point was rather that the inherent capabilities of medium-to-large-ish open LLMs are sufficient for, let's say, 70-80% of such office work, and it's a huge market.
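
To make the hook idea a bit more concrete, here is a rough Python sketch of a harness that runs checks on the model's output, retries with the failures as feedback, and routes anything high-risk or still failing to a human. Everything in it is hypothetical: `run_with_hooks`, `numbers_preserved`, the `risk` labels and the `model` callable are illustrative stand-ins, not the API of any existing harness.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional

# A check inspects (source, output) and returns a problem description, or None if it passes.
Check = Callable[[str, str], Optional[str]]
# Hypothetical model interface: (task, source, feedback from failed checks) -> output text.
Model = Callable[[str, str, List[str]], str]


@dataclass
class Result:
    output: str
    attempts: int
    needs_human: bool
    problems: List[str]


def not_empty(source: str, output: str) -> Optional[str]:
    """Fail if the model returned nothing useful."""
    return None if output.strip() else "empty output"


def numbers_preserved(source: str, output: str) -> Optional[str]:
    """Fail if a number from the source does not appear in the output."""
    numbers = re.findall(r"\d+(?:[.,]\d+)*", source)
    missing = [n for n in numbers if n not in output]
    return f"missing numbers: {missing}" if missing else None


def run_with_hooks(task: str, source: str, model: Model, checks: List[Check],
                   risk: str, max_attempts: int = 3) -> Result:
    """Call the model, run every check on its output, and retry with feedback on failure."""
    problems: List[str] = []
    output = ""
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        output = model(task, source, problems)
        problems = []
        for check in checks:
            message = check(source, output)
            if message is not None:
                problems.append(message)
        if not problems:
            break
    # Triage: high-risk work always gets a human review; low-risk only if checks still fail.
    needs_human = risk == "high" or bool(problems)
    return Result(output, attempts, needs_human, problems)


def fake_model(task: str, source: str, feedback: List[str]) -> str:
    """Stand-in for a real LLM call: the first draft drops a number, the retry keeps it."""
    if feedback:
        return "Angebot 42 gilt bis 2025"
    return "Angebot gilt bis 2025"


if __name__ == "__main__":
    result = run_with_hooks("translate", "Tender 42 valid until 2025",
                            fake_model, [not_empty, numbers_preserved], risk="low")
    print(result)
```

The checks here are deliberately dumb and cheap. The point is only that the harness, not the user, is responsible for running them, and that the risk label decides whether a clean result can skip human review.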