Comment by dangus

16 hours ago

Perhaps the real takeaway is that there really is only one product, two if you count image generation.

Perhaps the only reason Cursor is so good is because editing code is so similar to the basic function of an LLM without anything wrapped around it.

Like, someone prove me wrong by linking 3 transformative AI products that:

1. Have nothing to do with "chatting" to a thin wrapper (couldn't just be done inside a plain LLM with a couple of file uploads added for additional context)

2. Don't involve traditional ML that has existed for years and isn't part of the LLM "revolution."

3. Have nothing to do with writing code

For example, I recently used an AI chatbot that was supposed to help me troubleshoot a consumer IoT device. It basically regurgitated steps from the manual and started running around in circles because my issue was simply not covered by documentation. I then had to tell it to send me to a human. The human had more suggestions that the AI couldn't think of but still couldn't help because the product was a piece of shit.

Or just look at Amazon Q. Ask it a basic AWS question and it'll just give you a bogus "sorry I can't help with that" answer where you just know that running over to chatgpt.com will actually give you a legitimate answer. Most AI "products" seem to be castrated versions of ChatGPT/Claude/Gemini.

That sort of overall garbage experience seems to be what is most frequently associated with AI. Basically, a futile attempt to replace low-wage employees that didn't end up delivering any value to anyone, especially since any company interested in eliminating employees just because "fuck it why not" without any real strategy probably has a busted-ass product to begin with.

Putting me on hold for 15 minutes would have been more effective at getting me to go away and no compute cycles would have been necessary.

Outside of coding, Google's NotebookLM is quite useful for analysing complex documentation - things like standards and complicated API specs.

But yes, an AI chatbot that can't actually take any actions is effectively just regurgitating documentation. I normally contact support because the thing I need help with is either not covered in documentation, or requires an intervention. If AI can't make interventions, it's just a fancy kind of search with an annoying interface.

  • I don’t deny that LLMs are useful; I’m just saying they represent one product that does a small handful of things well, where the industry-specific applications don’t really involve a whole lot of extra features beyond “feed in data, chat with the LLM, and get stuff back.”

    Imagine if, during the SaaS or big data or containerization technology “revolutions,” the application being run just didn’t matter at all. That’s kind of what’s going on with LLMs. Almost none of the products are all that much better than going to ChatGPT.com and dumping your data into the text box/file uploader and seeing what you get back.

    Perhaps an analogy to describe what I mean would be if you were comparing two SaaS apps, like let’s say YNAB and the Simplifi budget app. In the world of the SaaS revolution, the capabilities of each application would be competitive advantages. I am choosing one over the other for the UX and feature list.

    But in the AI LLM world, the difference between competing products is minimal. Whether you choose Cursor or Copilot or Firebase Studio, you’re getting the same results because you’re feeding the same data to the same AI models. The companies that make the AI technologies don’t have much of a moat themselves; they’re basically just PaaS data center operators.

Everything where structured output is involved, from filling in forms based on medical interview transcripts / court proceedings / calls, to an augmented chatbot that can do things for you (think hotel reservations over the phone), to directly generating forms / dashboards / pages in your system.
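
For concreteness, here's the shape of that pattern as a minimal sketch. The `call_llm` helper is a hypothetical stand-in for whatever model API you use, and the form fields are invented for the example:

```python
import json

# Hypothetical stand-in for a real model API call; returns a canned
# reply here just so the sketch runs end to end.
def call_llm(prompt: str) -> str:
    return ('{"patient_name": "Jane Doe", "chief_complaint": "headache", '
            '"medications": [], "follow_up_date": null}')

# Invented example fields for a medical-interview form.
FIELDS = ["patient_name", "chief_complaint", "medications", "follow_up_date"]

def extract_form(transcript: str) -> dict:
    prompt = (
        f"Fill in a JSON object with exactly these keys: {FIELDS}. "
        "Use null for anything the transcript doesn't mention. "
        "Respond with JSON only.\n\n"
        f"Transcript:\n{transcript}"
    )
    # Real code would validate against a schema and retry on malformed JSON.
    data = json.loads(call_llm(prompt))
    return {k: data.get(k) for k in FIELDS}

print(extract_form("Patient Jane Doe reports a headache since Tuesday..."))
```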

  • If that’s the best current LLMs can do, my job is secure till retirement

    • The best that current LLMs can do is PhD-level science questions and getting high scores in coding contests.

      Your job? Might be secure for a lifetime, might be gone next week. There is no way to tell: "intelligence" isn't yet well enough understood to be just an engineering challenge, but it is well enough understood that the effect on jobs may be the same.

Two off the top of my head:

- https://www.clay.com/

- https://www.granola.ai/

There are a lot of tools in the sales space which fit your criteria.

  • Granola is the exact kind of product I’m criticizing as extremely basic and barely more than a wrapper. It’s just a meeting transcriber/summarizer that barely provides more functionality than leaving the OpenAI voice mode on during a call and then copying and pasting your written notes into ChatGPT at the end.

    Clay was founded three years before GPT-3 hit the market, so I highly doubt that the majority of their core product runs on LLM-based AI. It is probably built on traditional machine learning.

I have used LLMs for some simple text generation for what I’m going to call boilerplate, e.g. “why $X is important” at the start of a reference architecture. But maybe it saved me an hour or two on a topic I was already fairly familiar with. Not something I would have paid a meaningful sum for; I’m sure I could have searched and found an article on the topic.

> Perhaps the only reason Cursor is so good is because editing code is so similar to the basic function of an LLM without anything wrapped around it.

I think this is an illusion. Firstly, code generation is a big field: it includes code completion, generating entire functions, and even agentic coding and the newer vibe-coding tools, which are mixes of all of these. Which of these is "the natural way LLMs work"?

Secondly, a ton of work goes into making LLMs good for programming. Lots of RLHF on it, lots of work on extracting code structure / RAG on codebases, many tools.
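
To make the "RAG on codebases" half concrete, the retrieval step is roughly the sketch below. The `embed` function here is a toy character-frequency stand-in, not a real embedding model:

```python
import math

# Toy stand-in for a real embedding model, just so the sketch runs.
def embed(text: str) -> list[float]:
    vec = [0.0] * 128
    for ch in text:
        vec[ord(ch) % 128] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Rank code chunks by similarity to the query; the winners get pasted
# into the prompt ahead of the user's question.
def top_k_chunks(query: str, chunks: list[str], k: int = 3) -> list[str]:
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = ["def save_user(user): ...", "def load_config(path): ...", "class Cache: ..."]
print(top_k_chunks("where is configuration loaded?", chunks, k=1))
```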

So, I think there are a few reasons that LLMs seem to work better on code:

1. A lot of work on it has been done, for many reasons, mostly the monetary potential and the fact that the people who build these systems are programmers.

2. We here tend to have a lot more familiarity with these tools (and this goes to your request above which I'll get to).

3. There are indeed many ways in which LLMs are a good fit for programming. This is a valid point, though I think it's dwarfed by the above.

Having said all that, to your request, I think there are a few products and/or areas that we can point to that are transformative:

1. Deep Research. I don't use it a lot personally (yet); I have far more familiarity with the software tools, because I'm also a software developer. But I've heard from many people now that these are exceptional. And they are not just "thin wrappers on chat", IMO.

2. Anything to do with image/video creation and editing. It's arguable how much these count as part of the LLM revolution - the models that do these are often similar-ish in nature but geared towards images/videos. Still, the interaction with them often goes through natural language, so I definitely think these count. These are a huge category all on their own.

3. Again, not sure if these "count" in your estimate, but AlphaFold is, as I understand it, quite revolutionary. I don't know much about the model or the biology, so I'm trusting others that it's actually interesting. It is some of the same underlying architecture that makes up LLMs so I do think it counts, but again, maybe you want to only look at language-generating things specifically.

  • 1. Deep Research (if you are talking about the OpenAI product) is part of the base AI product, so everything building on top of that is still a wrapper. In other words, nobody besides the people making the base AI technology is adding any value. An analogy for how pathetic the AI market is: imagine if, during the SaaS revolution, nobody needed to buy applications and could use AWS PaaS products like RDS directly, with very similar results compared to buying SaaS software. OpenAI/Gemini/Claude/etc. are basically as good as a full-blown application that leverages their technology, and there’s very limited need to buy wrappers that go around them.

    2. Image/video creation is cool but what value is it delivering so far? Saving me a couple of bucks that I would be spending on Fiverr for a rough and dirty logo that isn’t suitable for professional use? Graphic designers are already some of the lowest paid employees at your company so “almost replacing them but not really” isn’t a very exciting business case to me. I would also argue that image generation isn’t even as valuable as the preceding technology, image recognition. The biggest positive impact I’ve seen involves GPU performance for video games (DLSS/FSR upscaling and frame generation).

    3. Medical applications are the most exciting application of AI and ML. This example demonstrates what I mean: the normal, steady pace of AI innovation has been “disrupted” by LLMs, which have added unjustified hype and investment to the space. Nobody was so unreasonably hyped up about AI until it was packaged as something you can chat with, since finance-bro investors can understand that, but medical applications of neural networks had been developing since long before ChatGPT hit the scene. The current market is just a fever dream of crappy LLM wrappers getting outsized attention.

LLMs make all sorts of classification problems vastly easier and cheaper to solve.

Of course, that isn't a "transformative AI product", just a regular old product that improves your boring old business metrics. Nothing to base a hype cycle on, sadly.
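
As a sketch of what that looks like in practice: a classifier that used to need a labeled training set becomes a prompt. `call_llm` is a hypothetical stand-in for a real model call, and the label set is invented:

```python
# Hypothetical model call; canned reply so the sketch runs.
def call_llm(prompt: str) -> str:
    return "billing"

LABELS = ["billing", "shipping", "returns", "other"]  # invented taxonomy

def classify_ticket(ticket: str) -> str:
    prompt = (
        f"Classify this support ticket as exactly one of {LABELS}. "
        "Reply with the label only.\n\n" + ticket
    )
    label = call_llm(prompt).strip().lower()
    return label if label in LABELS else "other"  # guard against off-list replies

print(classify_ticket("I was charged twice for my order last month."))
```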

  • Agree 100%.

    We built a very niche business around data extraction & classification of a particular type of documents. We did not have access to a lot of sample data. Traditional ML/AI failed spectacularly.

    LLMs have made this super easy and the product is very successful thanks to it. Customers love it. It is definitely transformative for them.

Is Cursor actually good though? I get so frustrated at how confidently it spews out the completely wrong approach.

When I ask it to spit out Svelte config files or something like that, I end up having to read the docs myself anyway because it can’t be trusted. For instance, it will spew out tons of lines configuring every parameter to something that looks like the default, when all it needs to do is follow the documentation, which just uses the defaults.

And it goes out of its way to “optimise” things, actually picking the wrong options versus the defaults, which are fine.

This challenge is a little unfair. Chat is an interface, not an application.

  • Generating a useful sequence of words or word-like tokens is an application.

    • I would describe that as a method or implementation, not as an application.

      Almost all knowledge work can be described as "generating a useful sequence of words or word-like tokens", but I wouldn't hire a screenwriter to do the job of a lawyer, or a copy editor to do the job of a concierge, or an HR director to do the job of an advertising consultant.

LLMs in data pipelines enable all sorts of “previously impossible” stuff. For example, this creates an event calendar for you based on emails you have received:

https://www.indexself.com/events/molly-pepper

(That’s mine, and it’s due a bugfix/update this week. Message me if you want to try it with your own emails.)
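
For a sense of the general shape (a sketch, not the actual indexself implementation; `call_llm` is a hypothetical stand-in and the sample data is invented):

```python
import json

# Hypothetical model call; canned reply so the sketch runs.
def call_llm(prompt: str) -> str:
    return '[{"title": "Molly Pepper live", "date": "2025-06-01", "venue": "Example Hall"}]'

def events_from_email(body: str) -> list[dict]:
    prompt = (
        "List any events mentioned in this email as a JSON array of objects "
        'with keys "title", "date", and "venue". Return [] if there are none.\n\n'
        + body
    )
    return json.loads(call_llm(prompt))

# Map over the mailbox, then dedupe and sort the results into a calendar.
inbox = ["Hi! Molly Pepper is playing Example Hall on June 1st, doors at 8..."]
calendar = [event for message in inbox for event in events_from_email(message)]
print(calendar)
```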

I have a couple more LLM-powered apps in the works, due in the next few weeks, that aren’t chat or code. I wouldn’t call them transformative, but they meet your other criteria, I think.

  • What part of this can't be done by a novice programmer who knows a little pattern matching and has enough patience to write down a hundred patterns to match?

    • Long tail, coping with typos, and understanding negation.

      If natural language was as easy as "enough patience to write down a hundred patterns to match", we'd have had useful natural language interfaces in the early 90s — or even late 80s, if it was really only "a hundred".
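
      A toy illustration of those failure modes:

      ```python
      import re

      # A naive intent matcher of the "hundred hand-written patterns" variety.
      CANCEL_PATTERNS = [
          re.compile(r"\bcancel\b", re.IGNORECASE),
          re.compile(r"\bunsubscribe\b", re.IGNORECASE),
      ]

      def wants_cancellation(message: str) -> bool:
          return any(p.search(message) for p in CANCEL_PATTERNS)

      # Negation: the pattern fires, but the user wants the opposite.
      print(wants_cancellation("Please do NOT cancel my subscription"))  # True (wrong)

      # Long-tail typos: no pattern fires, so the intent is missed.
      print(wants_cancellation("how do i cancle my acount"))  # False (missed)
      ```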