Comment by ghshephard

17 days ago

I use cursor 8+ hours/day at work, and have full (and effectively unlimited) access to Claude Code and Codex - tools which I also use personally. I suspect that your "constant popups" were when you were using the editor - a mode that I'll confess I haven't touched in 3+ months.

Workflow in Cursor is actually awesome - I'm a little outdated in how I use it - I still establish goals/objectives, rather than managing the loop which does so - but if you can think broadly enough - I find it's pretty efficient.

Key things I like about Cursor (and I recognize I'm dating myself a bit here):

  - Plan Mode is really solid - I shift-tab, have it go create the plan using whatever insanely expensive SOTA model is available - I will usually spend 5-10 minutes on the Plan - review it, maybe even tweak it a little (though 90% of the time it's fine out of the gate).

  - Ability to select any model for every task - I'll switch between Opus 4.8 High/xHigh/...  I'll even switch to 1M context for the planning phase upfront.   

  - It does an *excellent* job managing permissions and looping the agents and spinning up sub-agents for you - you set the goal, run the plan mode - and then let it churn for however long is required - pretty common to have a 30-45 minute run and come back to a fully created/tested product.   

   

The nice thing about Cursor (and honestly Claude Code, Codex) - there isn't really any "prompt engineering" involved. You just say, "Go Build me x - it should have y,z features - and build it in golang for me" - and that's it - the 3-4 page Plan comes back - usually pretty credible - and then you click "build."

> there isn't really any "prompt engineering" involved

You should run an experiment: take someone who has never used any LLMs or agents, have them use one for the first time in front of you, and tell them to build something like a calculator program or whatnot. Bonus points if they're ICs or at least not-managers.

I think there is a lot that we engineers take for granted when it comes to communicating via text: how to state things clearly, and what we think/reason when we read things. A lot of people don't have those "skills" innately, and the first time they use LLMs, they basically don't know how to interact with them, until they realize what the models are able to do and not. Then they also learn what to say to steer the model the right way; this is quite literally a "prompt engineering" skill they're now learning.

  • You don't even have to go outside engineers. I have teammates that get very little out of Claude Code because the way they integrate their own knowledge doesn't allow them to think of what Claude might not know. They'd say a task was impossible with the tooling, and I'd get instant answers, because I understand what is weird internal business logic sitting 6 repos away, and what is knowledge Claude has by default. I can commit Claude.md files for them, but I have to include EVERYTHING, because otherwise they'll let Claude make assumptions and waste minutes, if not hours.

    It's a big part of what, in my experience, is separating the very good engineer from the iffy one: Do you have a good mental model, and can you put yourself in the shoes of people sitting in a different mental model? It makes you a better dev, and even more so when it comes to AI tools, which have their own kind of alien brain.

    • Coding LLMs are distilling developers. It's like the old experiment where you have someone write down the steps to make pancakes and they don't tell you to crack the eggs before adding them to the batter: it takes a particular mindset to be able to make a model of what is supposed to happen and deconstruct that to the level appropriate for implementation.

      Until now, the actual act of writing code (terminology, syntax, etc.) was a significant hurdle, and that underlying mindset was a very useful skill, but one missing in a surprisingly large number of developers.

      Now with LLMs doing the work of "translate this into code," increasingly the only thing that matters is that exact ability. And developers that don't have it or can't develop it won't be developers for long.

    • Thanks for putting into words what I have been seeing a lot at work and haven't been able to put my finger on. We tend to have quite diverse _workflows_ between devs at my company, and success seems to correlate with injecting better context earlier in the process.

      I like to chat with Claude about how to approach a given problem, bring in extra context, etc, before even really drafting up a plan, while other people dive into implementation immediately and go on wild goose chases.

      90% of the time we end up in the same place in roughly the same amount of time, and there are obviously tradeoffs to spending more time planning vs implementing. I'm oversimplifying as well.

    • I couldn't agree more. Socratic methodology, domain modelling, systems thinking, pipes-and-arrows problem solving etc. These are the skills that get real work done in coding agents these days.

  • This makes a lot of sense and explains why some people are so captivated by modern models, while others see progress as merely incremental.

    • I'm sure that explains some of it but I really don't think it explains most of the people who have been AI-pilled in the last nine months. There was no amount of context I could give GPT-4o that would make it a net benefit to use that for agentic development. I tried it with quite sophisticated prompt systems and much simpler ones, compendiums of code & business analysis and sparser ones. Yet it just wasted my time - still, there were people using Cursor with that model and saying it was life changing. I didn't have that experience until Opus 4.5 - it's possible I could have had it earlier but that was when I happened to try it again.

  • By that same logic (and I’m agreeing with you as of now), engineers shouldn’t get too comfortable treating “being good at text communication” as a lasting edge. With how quickly agentic coding is evolving, it’s worth considering the possibility that many of the prompting and steering skills we view as valuable today could become far less important in a matter of weeks or months.

  • Recently I have the SEO guy governing the mostly static, public site with Claude Code. He loves it but you would never imagine the level of mental illness Claude comes up with. If it were an employee I’d literally throw him out the front door, labor laws be damned. And as always, every insane thing it does is some direct echo of its concept and training.

But what's the $60B differentiator here? There are so many similar tools out there. I generally use Opencode, but also Claude Code, Antigravity and sometimes Kilo Code on VS Code. How can Cursor be worth even 10% of 60B?

  • I don't know what Cursor's market share is but it feels like 20-25% to me. That is not worth nothing. Then:

    1) The data they have flowing through the system that enabled them to build Composer (which is much better than stock Kimi K2.5) and is presumably allowing the training of a new model on SpaceX's compute.

    2) Cursor's new 'GitHub' replacement.

    3) Enterprise sales/traction

    If you look at all of these together, it's not implausible that they end up mostly 'owning' coding in 5 years' time, if they replace GitHub with something more compatible with agentic coding and bring it into their whole ecosystem, providing cloud and local agents, PR review and their own frontier coding model.

    It's specialised vs 'borg' isn't it. One way of thinking is that the world is owned by Anthropic/OpenAI and coding is just one of many things their model and software does. Another view is we have a 'coding with LLMs' company that specialises in this field of endeavour. Hard to say which wins, but I think they have a shot.

    Personally my only objection to Cursor is that it's more expensive. That's it; otherwise it is great to be able to choose say GPT-5.5 when I want to work on backend and Opus when I want to work on front end. Great to have PR review built in. If they were able to get Composer 3 to be as good as GPT-5.5 / fable at the price of Composer 2.5 they'd be winning on price again.

    • > If you look at all of these together, it's not implausible that they end up mostly 'owning' coding

      They really need to change their trajectory then?

      And regardless, being owned by xAI, a failed AI company which turned into a datacentre operator, probably won't help them achieve that.

      > Hard to say which wins, but I think they have a shot.

      The market for "coding harnesses" and "AI IDEs" is already oversaturated and they are effectively a commodity at this point, you can use any of them with any provider more or less interchangeably.

    • I agree, Composer Fast 2.5 is getting really good. I started using it for a personal project after I had to switch from Sonnet because I hit the API limits, and I was surprised by how good it has become.

    • Have you looked at gitlab lately? They have a ton of ai features built in.

      I'm not a gitlab user, just learning it, so I can't say how half baked they are or not.

      At a high level though it seems like a huge step forward from GitHub

  • I believe they have some very good training data because of all the data generated by people using the service.

    This is the same data they used to finetune Kimi K2.5 to make their newer Composer models, which benchmark substantially better than Kimi K2.5.

    I've heard they also want to build their own base models, which will also benefit from their large amount of high-quality training data. Which will solve Grok's model quality problem.

    This is all unsourced conjecture of course. But it's what I've heard.

    • Also from what I understand (not my day job) we're now at the point where the post-training tuning (RLHF etc.) is increasingly important since pre training no longer scales.

      So it's not really fair to call it "fine tuning", it's an important part of building a coding model in 2026, and cursor have done a pretty good job with Composer

  • they are paying for marketshare/customer base. Cursor has a good chunk of it.

    xAI overbuilt their data centers - they can't find paying customers for them, and that's the reason they made deals with other companies like Google to use xAI's datacenters.

    Cursor has the opposite problem of not having enough capacity. So this works well for them together.

    Whether it's worth it - if you believe that AI will solve every problem then having a piece of the pie early on might be worth it.

    Remember how, when Google bought YouTube for $1.65 billion, people thought they were crazy? Or when Facebook bought Instagram?

    60B is a crazy number but might be worth it for someone fighting for world dominance :)

    • you are completely equivocated on most points.

      xAI is on the line to deliver capacity they already sold to Google and most analysts think they are 50/50 on actually meeting it.

      the only proof they have capacity is that Musk claims all the money they are burning is going to datacenters and GPUs (mostly because if he put it on anything else the lie would be obvious)

    • > Remember how, when Google bought YouTube for $1.65 billion, people thought they were crazy? Or when Facebook bought Instagram?

      I think these are good examples: in both of those cases the buyer had a plan to monetize.

      If you are a user of Cursor, expect to pay more for it or switch.

    • > they are paying for marketshare/customer base

      Or are they paying for talent? It seems like xAI is sorely lacking in talent, most likely due to the CEO and folks' aversion to him. By throwing around some SpaceX monopoly money he can trap some talent with retention clauses and try to invigorate his failed AI business.

  • I think the argument for Cursor is that it's the dominant tool that enterprises are using for coding, so the theory is Cursor wins that as the "model agnostic" option, and it has a phenomenal Enterprise Sales Team.

    From a valuation model - $4B ARR with rapid growth, and the ability to shift traffic to internal models (honestly, a massive amount of the time "composer", their internal model, is fine, and it's obviously going to get better). Say 17x Multiple, which isn't unheard of for a rapidly growing Startup with solid future structural profit elements (moving to internal model) - that gets you to $68B.
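
    The arithmetic behind that number is easy to check. Here is a rough sketch in Python, using only figures quoted in this thread ($4B ARR, a 17x multiple, the $60B price); the function names are made up for illustration and this is back-of-the-envelope math, not a real valuation model.

```python
# Back-of-the-envelope check of the revenue-multiple figures quoted in the thread.
# The helper names are hypothetical; the inputs are the numbers from the comments.

def implied_valuation(arr_billions: float, multiple: float) -> float:
    """Valuation (in $B) implied by annual recurring revenue times a multiple."""
    return arr_billions * multiple


def implied_multiple(price_billions: float, arr_billions: float) -> float:
    """Revenue multiple implied by a purchase price and annual recurring revenue."""
    return price_billions / arr_billions


print(implied_valuation(4, 17))  # $4B ARR at 17x
print(implied_multiple(60, 4))   # multiple implied by a $60B price on $4B ARR
```

    Run the other way, a $60B price on $4B ARR works out to a 15x multiple, so the 17x guess is in the same ballpark as the reported number.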

    • The fact it's agnostic has to be useful.

      Being able to compare outcomes for workflows involving competitors will obviously be v v v v useful.

    • > so the theory is Cursor wins that as the "model agnostic"

      But there are many model agnostic harnesses out there: OpenCode, Roo, Cline, and many others. And even Claude Code can be set up to use non-Anthropic models.

    • Terminal is also model agnostic. Does it matter where you enter your prompt text?

    • > $4B ARR

      If you resell something worth $5 for $5 while having to pay for R&D and operating expenses that's not exactly comparable with a company that's selling actual products.

      > Say 17x Multiple

      On an extremely low-margin business it is unheard of. Then again, that wouldn't be the stupidest thing in today's market.

  • >How can cursor be worth even 10% of 60B?

    It can't as long as there is plenty of AI without it.

    The real differentiator is that if $60B today turns out to be all thrown away in a worst-case scenario, it would be easily more affordable and there would be less negative impact than $47B at the time if it was all thrown away on Twitter.

  • Their revenue is 3B, and 20x is pretty typical.

    We’re in the new era where startups boast about, and are bought based on, revenue and not just a number of users with an unclear path to monetization, as had been the case for the previous couple of decades.

    We can also note that we see Thrive Capital (Kushner) again in a win.

  • Where else are you going to get access to a real-time fresh high quality stream of human intelligence to grow your baby AGI? You can’t buy Codex, Claude, Copilot, so what’s left?

  • How are you switching between like 5 different editors lol. Bro sloppers will do anything to get their fix. Like the old people at the casino switching slot machines all day based on some occulted understanding that only they think they have.

There is most certainly still prompt engineering involved. How can there be responsivity to different cues like "plan this", "write this", "analyze this", "defend this", "poke holes in this", but not responsivity to the terminology you provide in your explanations of "this": where to get information about specs/standards/requirements, which details I care about and therefore can't compromise on, vs. which details I'm willing to settle with whatever the top reddit post from 4 years ago recommends?

I don't see how these systems can have the ability to be effectively expressive about all of the minutiae, and not have all of the various different possible expressions lead to vastly different outcomes.

  • I think all of the cues that you just described are in the plan.

    For example - I might (real world example from this morning):

    "Create a script that installs hashicorp vault and consul, store the data on consul. Then create a helper script that will fill the vault server with sample data. Add HTTPS support. Now write a framework that reads and decrypts the encrypted data in consul. Support old (pre 1.3) and new (post 1.3 vault)."

    That generates a 6 page plan using Opus 4.8 w/1mm context, including notes on what to prioritize, what format to create the scripts in, etc... (My cursor guidance already has a couple months of hints as to what I want in terms of scaffolding unit tests, canonical linux, performance, security, etc...)

    That 6 page plan is the "Prompt" - but it's entirely generated by Cursor/Opus. It's there to tweak if you want to emphasize, or provide some taste - but, honestly - it probably does a better job than I would - so ~90% of the time I just accept the plan as is.

  • I would say prompt engineering, in the sense of people claiming you need to include magic incantations in every prompt, like "You are a senior engineer from a superintelligent alien species" and "take a deep breath and make no mistakes", doesn’t really do that much for everyday work, I feel. Or maybe they are all already included in the system prompt. I reckon it can still eke out a few percentage points in automation.

    What actually matters is the ability to communicate well in general, not anything LLM-specific. Being able to state what you want clearly and unambiguously, and having a sense for what additional information you need to dump, even when the other side claims they already have everything they need.

Yes, I tried to use Cursor as an editor. Terrible idea in hindsight.

So your workflow now looks like mine except I prefer a different editor and only use the latest and greatest model so Cursor basically offers nothing over Codex.

I disagree about prompt engineering, but it's one of those things that probably varies because of what language you use, what problems you solve, and the degree to which you care about the output. Unless I'm writing tests, I keep AI on a very short leash because I'm writing critical code used by a very large number of users. I have noticed big differences in output quality depending on how I steer AI. Without steering, it will happily leave in dead code, change the use of variables so they need to be renamed, assume or fail to assume invariants, etc. As I said in another comment, I think we won't need to do that for very much longer, but right now it seems essential.

> You just say, "Go Build me x - it should have y,z features - and build it in golang for me" - and that's it - the 3-4 page Plan comes back - usually pretty credible - and then you click "build."

What you're describing seems like a workflow for building toys only. There's currently no reality in which someone would actually know what the y,z features are before making them. A plan generated in 5min would likely suggest a suboptimal solution compared to what a good solution would look like (which might take a year or two to figure out, for a human, so still a week or so for SOTA models if at all possible). Building something in golang is cute, but hard to be convinced until more novel applications are being generated from prompts.

The data submitted by Cursor's users tho, that seems to be very valuable.

But that sounds like the same workflow as Codex or Claude, except Cursor is only a harness without its own model? (Or do they have their own model?)

  • You nailed it - in fact, most of Anthropic's early revenue came from Cursor - many of Claude Code's programming components are essentially a feature copy of Cursor, so it makes sense they are similar.

    Cursor does have its own model - it's a heavily reworked version of Kimi K2, called "composer" - that I use a lot of the time when I have fairly straightforward tasks that don't require a lot of exploration or independent thought. It's a lot cheaper - the Input/CacheWrite/CacheRead/Output costs of Opus 4.8 are $5/$6.25/$0.5/$25 per mm tokens, vs $0.5/-/$0.2/$2.5 for composer.
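
    To make the price gap concrete, here is a small Python sketch that prices one agent run under both rate cards. The per-million-token rates are the ones quoted above; the token counts in WORKLOAD are invented purely for illustration, and the Pricing class and run_cost helper are hypothetical names, not anything from Cursor or Anthropic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pricing:
    """Dollar cost per million tokens, as quoted in the comment above."""
    input: float
    cache_write: Optional[float]  # None where the comment lists "-"
    cache_read: float
    output: float


OPUS = Pricing(input=5.0, cache_write=6.25, cache_read=0.5, output=25.0)
COMPOSER = Pricing(input=0.5, cache_write=None, cache_read=0.2, output=2.5)


def run_cost(p: Pricing, input_m: float, cache_write_m: float,
             cache_read_m: float, output_m: float) -> float:
    """Cost in dollars of a run; all token counts are in millions."""
    if cache_write_m and p.cache_write is None:
        raise ValueError("no cache-write price listed for this model")
    cost = input_m * p.input + cache_read_m * p.cache_read + output_m * p.output
    if cache_write_m:
        cost += cache_write_m * p.cache_write
    return cost


# Assumed workload (made up): 2M fresh input tokens, 10M cached reads,
# 0.5M output tokens, and no cache writes so both rate cards apply.
WORKLOAD = dict(input_m=2.0, cache_write_m=0.0, cache_read_m=10.0, output_m=0.5)

opus_cost = run_cost(OPUS, **WORKLOAD)
composer_cost = run_cost(COMPOSER, **WORKLOAD)

print(f"Opus 4.8: ${opus_cost:.2f}")
print(f"composer: ${composer_cost:.2f}")
print(f"ratio:    {opus_cost / composer_cost:.1f}x")
```

    With that made-up workload the run costs $27.50 on Opus 4.8 versus $4.25 on composer, roughly 6.5x. The exact ratio depends heavily on the mix of cached reads and output tokens, so treat it as a ballpark only.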

> Key things I like about Cursor (and I recognize I'm dating myself a bit here)

What a world we live in - "dating oneself" is measured in weeks/months! :)

Not trying to be funny but seriously, if these tools can produce a tested 'product' in 45m, shouldn't we be seeing millions of them out there? I mean how far are we from a fully AI built Oracle ERP or even a notepad or helix?

  • It's a solid question - and to some degree what https://programbench.com/ tries to measure.

    Some of the issues (off the top of my head):

    - Note - that my "product" was about 3,000 lines of code - so tiny. But https://metr.org/ should give you some insight into the complexity the models are capable of.

    - you have to be able to imagine the product. If I have the time, and energy, to imagine what I want - the model will build it. Here is an example of a much better programmer than I and something he wanted built - https://www.boatbomber.com/blog/claude-fable-5

    - These are the first drafts. On average - any complex system needs about 10 years and at least 1000 active users who are enthusiastic about reporting bugs to really get robust code. Writing it via LLM doesn't (at least so far in my experience) help that much in reducing bugs if you were previously following any semblance of TDD. Lots of bugs in the code - the products you listed above have literally tens of millions of years of user experiences and bug reports that got them to where they are today. No silver bullet yet - just faster, less effort - and it enables non-technical people to create (still buggy) products.

  • Have you ever heard "I can do that in a weekend"? And they usually can. The difficult part is not building the product, it's selling and marketing, the business part. It's quite a common business tactic to outright copy someone else's product or business.

  • Millions of produced verified software engineered products in 45 minutes in the likeness of Oracle ERP or notepad++, helix are small potatoes when you see the unbounded ambitions of SpaceX in full.

    The end point may squeeze quality of operations at the subminute time span for ground control environment seriously launching Starship rockets one an hour, for example.

I think I do this with Claude every day. I don’t see why I need to pay for cursor to get this too.

  • You absolutely don't. I use all three products. My preference is Claude Code for my personal project. The one at work is kind of sandboxed off - but does have the benefit of an MCP for every enterprise service we have (Kibana, Victoria Metrics, Grafana, Jira, etc...) - which is nice.

    Over time - I expect Composer will be cheaper than Opus 4.8 - but the nice thing about Cursor - you can flick between models.

    And (this is purely a personal thing) - I really like the extensive collection of "Plans" that cursor tracks - there isn't really a similar thing in Claude Code - but I really like the Claude.AI interface for everything else. It's also a much better general knowledge agent - the Cursor Chat interface isn't as nice.

    • I’m not sure what you’re on about. I had Claude doing swarm engineering using different models. It would write specs that haiku would implement, it would check itself, etc. With a simple phrase it goes into planning, multi agent mode, and chews on a problem until it’s done. It’s pretty autonomous.

      Maybe you haven’t looked deeper into what modern Claude can do?
