Comment by embedding-shape

9 hours ago

Spark is fun and cool, but it isn't some revolution. It's a different workflow, but not suitable for everything that you're use GPT5.2 for with thinking set to high, for example, it's way more dumb and makes more mistakes, while 5.2 will carefully thread through a large codebase and spend 40 minutes just to validate the change actually didn't break anything, as long as you provide prompts for it.

Spark on the other hand is a bit faster at reaching a point when it says "Done!", even when there is lots more it could do. The context size is also very limiting, you need to really divide and conquer your tasks, otherwise it'll gather files and context, then start editing one file, trigger the automatic context compaction, then forget what it was doing and begin again, repeating tons of time and essentially making you wait 20 minutes for the change anyways.

Personally I keep codex GPT5.2 as the everyday model, because most of the stuff I do I only want to do once, and I want it to 100% follow my prompt to the letter. I've played around a bunch with spark this week, and been fun as it's way faster, but also completely different way of working, more hands-on, and still not as good as even the gpt-codex models. Personally I wouldn't get ChatGPT Pro only for Spark (but I would get it for the Pro mode in ChatGPT, doesn't seem to get better than that).

3 comments

embedding-shape

bluegatty 9 hours ago

Spark is the 'same model and harness' but on Cerebras.

Your intuition may be deceiving you, maybe assuming it's a speed/quality trade-off, it's not.

It's just faster hardware.

No IQ tradeoff.

If you toy around with Cerebras directly, you get a feel for it.

Edit: see note below, I'm wrong. Not same model.

striking 8 hours ago
> Today, we’re releasing a research preview of GPT‑5.3‑Codex‑Spark, a smaller version of GPT‑5.3‑Codex, and our first model designed for real-time coding.
from https://openai.com/index/introducing-gpt-5-3-codex-spark/, emphasis mine
- bluegatty 7 hours ago
  
  You're right. It's funny because I kind of noticed that, but with all of these subtle model issues, I'm so used to being distraught by the smallest thing I've had to learn to 'trust the data' aka the charts, model standings, performance, etc. and in this case, I was under the assumption 'it was the same model' clearly it's not.
  Which is a bummer because it would be nice to try a true side-by-side analysis.