Comment by verdverm

6 hours ago

We are exiting a hype cycle, well into the adoption curve. Subscriptions were never going to last.

My next step is going to be evaluating open and local models to see if they are sufficiently close to par with frontier models.

My hope is that the end of seat-based pricing comes with this tech cycle. I was looking for a document signing provider that doesn't charge a monthly fee; I only need to sign a few docs a year.

7 comments

alifeinbinary 5 hours ago

I'm developing software in this area right now, so I try a lot of the new models. They're not even close for coding tasks. It basically comes down to 26B parameters vs 1T parameters, plus quantisation and smaller context sizes; there's no comparison. However, for agentic work, tool calling, and text summarisation, local LLMs can be quite capable. Workloads that run as background tasks, where you're not concerned about TTFB, cold starts, tok/s, etc., are where local AI is useful.

If you have an M-series processor, I would recommend ditching Ollama because it performs slowly. We get double or triple the tok/s using omlx or vmlx, respectively, but vmlx doesn't have extensive support for some models like gpt-oss.
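
To put the 26B vs 1T gap above in rough numbers, here is a back-of-the-envelope sketch of how much memory the weights alone would need at a few bit widths. It assumes memory is simply parameters times bits per weight, and ignores KV cache, activations, and the per-block overhead that real quantised formats add, so treat the figures as lower bounds. The function name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights only, in GB (1e9 bytes).

    Assumption: every weight is stored at exactly `bits_per_weight` bits,
    with no extra overhead for quantisation scales or runtime buffers.
    """
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # Parameter counts taken from the comment above: 26B vs 1T.
    for name, params in [("26B", 26e9), ("1T", 1e12)]:
        for bits in (16, 8, 4):
            gb = weight_memory_gb(params, bits)
            print(f"{name} @ {bits}-bit: ~{gb:,.0f} GB")
```

Under these assumptions a 26B model at 4-bit fits in roughly 13 GB, while a 1T model at the same bit width still needs around 500 GB, which is one way to see why the two aren't comparable on a single local machine.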

AstroBen 5 hours ago

Kimi K2.5 (as an example) is an open model with 1T params. I don't see a reason it has to be local for most use cases; the fact that it's open is what's important.

verdverm 4 hours ago

First session with gemma4:31b looks pretty good; it may actually be up to coding tasks at gemini-3-flash levels. You can tell gemma4 comes from gemini-3.

__mharrison__ 5 hours ago

I recently experimented with creating a Python library from scratch with Codex. After I was done, I took the PRD and task list that were generated and fed them to opencode with Qwen 3.5 running locally.

Opencode was able to create the library as well. It just took about 2x longer.

selectodude 5 hours ago

Which version of Qwen 3.5 did you use?

verdverm 5 hours ago

Which quant as well?