Comment by petercooper

5 hours ago

I've been using the Q4 version on my Mac Studio over my local network and it's been good. Indeed, I had the first ever experience where I was playing with it alongside my various other agents and forgot it was a local model as it was doing such a good job.

I do wonder, though, if another agent is really needed. I've been driving it with Pi (Claude Code's system prompt is far too heavy given the prefill speeds) and it's been great. OpenCode is another good option. Is there anything else to gain from another similar tool specific to DeepSeek 4?

There is no need for another agent, functionally. But if you follow the idea of DS4 itself: the API that agents use forces you to do odd things, like translating the DSML stanzas to JSON, with all the canonicalization / KV cache checkpointing problems that result from that. Does it really have to be this way? What about also providing a sane alternative? Also, I'm not sure why people don't try to write more stuff in that area in C/Go/Rust to have more control / speed / fewer dependencies.

Also there is a lot more to imagine, TUI side. The problem is that most projects just copy what they have already seen. For instance I just did this in 20 minutes: https://x.com/antirez/status/2055190821373116619 Now that code is cheap, ideas have more value. Are we sure that today it still makes sense to think in terms of "Is another XYZ needed"? It may be worth it just to explore new ideas. I don't like the JavaScript / Node ecosystem for my code, so if I have to explore a new TUI or agent workflow and I do it with the tools I'm happier to use, the result and the iterations are different.
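
To make the canonicalization / KV cache point above concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `<call ...>` tag syntax is a placeholder and not the real DSML format, and `rerender_via_json` and `common_prefix_len` are hypothetical helpers, not part of DS4 or of any harness. Characters stand in for tokens. It only shows the general effect: once a tool call the model generated is parsed into JSON and serialized again, the text may no longer match what is already in the cache, so the reusable prefix ends at the first differing character.

```python
import json


def common_prefix_len(a: str, b: str) -> int:
    """Length of the shared prefix of two strings (characters stand in for tokens)."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def rerender_via_json(text: str) -> str:
    """Parse the JSON arguments inside the text and serialize them again.

    This mimics a JSON-based API round trip: the harness receives parsed
    arguments, sends them back later, and the server has to render them to
    text again without knowing the original spacing or key order.
    """
    start = text.index("{")
    end = text.rindex("}") + 1
    args = json.loads(text[start:end])
    return text[:start] + json.dumps(args, sort_keys=True) + text[end:]


# Hypothetical model output that is already in the KV cache.
# The tag syntax is made up for this example; it is NOT real DSML.
GENERATED = 'Let me check.\n<call name="read_file">{"path":"src/main.c",  "limit": 50}</call>'

rerendered = rerender_via_json(GENERATED)
reusable = common_prefix_len(GENERATED, rerendered)

print("same text after round trip:", rerendered == GENERATED)
print("reusable prefix:", reusable, "of", len(GENERATED), "characters")
```

The arguments are semantically identical after the round trip, but the cache only matches up to the point where key order and whitespace diverge. Keeping the model's original text around, instead of re-rendering it from JSON, avoids the mismatch in this toy setup.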

  • > ...I'm not sure why people don't try to write more stuff in C to have more control / speed / fewer dependencies.

    Codex CLI is written in Rust, which should give raw performance comparable to C/C++. Of course you can still care about the "fewer dependencies" point, but this is somewhat less of a concern on a properly maintained project like Codex. Its dependencies are less the "wild, out of control" third-party kind and closer to the old ideal of proper software componentry.

    > Also there is a lot more to imagine, TUI side. The problem is that most projects all copy what they already saw. For instance I just did this in 20 minutes.

    This mockup is really nice and the sidebar display gives you a natural way to expose running multiple thinking flows in parallel, at least if you keep them from stepping on each other's toes with code edits (keep them all in read-only "plan" mode or working on completely separate directories/files). That's not so helpful on a 128GB MacBook where a single agentic flow brings you to thermal/power limits already, but it suddenly becomes useful on other hardware (DGX Spark, Strix Halo, lower-RAM machines with SSD offload, multiple nodes with pipeline parallelism) where you have more compute than you could use for single-stream decode.

  • For Golang, I highly recommend yzma to explore this surface. I've used it for embedding and summarization (with small models) and for just mucking around with an integrated LLM BubbleTea TUI idea (with bigger models).

    https://github.com/hybridgroup/yzma

    And thank you antirez for using your rep and quality output to push this line of evangelism; it is even more important than the software itself.

DS4 is an inference engine, not a harness. It provides an inference API server and you point your coding harness to it.

  • You misunderstood the OP. I hinted, in my blog, at my interest in also putting an agent harness inside.