Comment by causal

2 hours ago

I can't even get Claude or GPT-5 to consistently produce good flows for common use cases, much less domain-specific shit. They have deep vocabulary though, which makes them sound better informed than they are.

They are very good at writing code and debugging visible errors, but that's like 50% the harness.