
Comment by TeMPOraL

16 days ago

> The problem then shifts from "can we extract this data from the PDF" to "how do we teach an LLM to extract the data we need, validate its performance, and deploy it with confidence into prod?"

A smart vendor will shift into that space - they'll use that LLM themselves, and figure out some combination of fine-tunes, multiple LLMs, classical methods and human verification of random samples that lets them not only "validate its performance, and deploy it with confidence into prod", but also sell that confidence with an SLA on top of it (the sampling part could look like the sketch at the end of this comment).

That's what we did with our web scraping SaaS - with Extraction API¹ we shifted web-scraped data parsing to support both predefined models for common objects like products, reviews, etc., and direct LLM prompts that we further optimize for flexible extraction.

There's definitely space here to help the customer realize their extraction vision because it's still hard to scale this effectively on your own!

1 - https://scrapfly.io/extraction-api
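
The "human verification of random samples" part doesn't need to be fancy to back an SLA. A minimal sketch of the idea (not our actual pipeline; the helper names, the 5% sample rate and the 99% target are made up for illustration):

    import random

    def sample_for_review(extractions, rate=0.05, seed=42):
        """Pick a random subset of LLM extractions to route to human reviewers."""
        rng = random.Random(seed)
        k = max(1, int(len(extractions) * rate))
        return rng.sample(extractions, k)

    def measured_accuracy(reviewed):
        """reviewed: list of (record, human_says_correct) pairs from the sample."""
        if not reviewed:
            return 0.0
        return sum(ok for _, ok in reviewed) / len(reviewed)

    SLA_TARGET = 0.99  # only advertise (and price) what the sampled accuracy supports

    def meets_sla(reviewed):
        return measured_accuracy(reviewed) >= SLA_TARGET

Everything interesting - the fine-tunes, model routing, classical fallbacks - happens upstream of this; the sampling loop is just how you keep the SLA honest.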

What's the value for a customer in paying a vendor that is only a wrapper around an LLM when they can leverage LLMs directly? I can imagine such tools making this accessible for certain types of users, but for customers like those described here, you're better off replacing any OCR vendor with your own LLM integration.

Software is dead; if it isn't a prompt now, it will be a prompt in 6 months.

Most of what we think of as software today will just be a UI. But UIs are also dead.

  • I wonder about these takes. Have you never worked in a complex system in a large org before?

    OK, sure, we can parse a PDF reliably now, but now we need to act on that data. We need to store it and make sure it ends up with the right people, who need to be notified that the data is available for their review. They then need to make decisions on that data, possibly requiring input from multiple stakeholders.

    All that back and forth needs to be recorded and stored, along with the eventual decision and all the supporting documents, and that whole bundle needs to be made available across multiple systems, which requires a bunch of ETLs and governance.

    An LLM with a prompt doesn't replace all that.

    • We need to think in terms of light cones, not dog-and-pony takedowns of whatever system you are currently running. See where things are going.

      I have worked in large systems, both in code and in people: compilers, massive data processing systems, 10k business units.

      2 replies →

  • Can you prompt a Salesforce replacement for an org with 100,000 employees?

    • Yesterday I read an /r/singularity post in awe over a screenshot of a lead management platform from OAI at a convention in Japan that supposedly meant a direct threat to Salesforce. Like, yeah, sure buddy.

      I would say most accelerationists/AI bulls/etc. don't really understand the true essential complexity in software development. LLMs are being seen as a software development silver bullet, and we know what happens with silver bullets.

      6 replies →

  • Software without data moats, vendor lock-in, etc. sure will. All the low-hanging-fruit SaaS is going to get totally obliterated by LLM-built software.

    • If I'm an autobody shop or some other well-served niche, how unhappy with them do I have to be before I decide to find a replacement - either a competitor of theirs that used an LLM, or bringing it in house and going off to find a developer to LLM-acceleratedly make me a better Shopmonkey? And there are the integrations. I don't own a low-hanging-fruit SaaS company, but it seems very sticky, and since the established company already exists, they can just lower prices to meet their competitors.

      B2B is different from B2C, so if one vendor has a handful of clients and they won't switch away, there's no obliterating happening.

      What's opened up is even lower hanging fruit, on more trees. A SaaS company charging $3/month for the left-handed underwater basket weaver niche now becomes viable as a lifestyle business. The shovels in this could be supabase/similar, since clients can keep access to their data there even if they change frontends.

      2 replies →

    • The only thing that will be different for most is that the vendor lock-in will be to LLM vendors.

> A smart vendor will shift into that space - they'll use that LLM themselves

It's a bit late to start shifting now since it takes time. Ideally they should already have a product on the market.

  • There's still time. The situation in which you can effectively replace your OCR vendor by hitting LLM APIs via a half-assed Python script ChatGPT wrote for you (roughly the sketch at the end of this comment) has existed for maybe a few months. People are only beginning to realize LLMs got good enough that this is an option. An OCR vendor that starts working on the shift today should easily be able to develop, tune, test and productize an LLM-based OCR pipeline way before most of their customers realize what's been happening.

    But it is a good opportunity for a fast-moving OCR service to steal some customers from its competition. If I were working in this space, I'd be worried about that, and also about the possibility that some of the LLM companies realize they could actually break into this market themselves right now and secure some additional income.

    EDIT:

    I get the feeling that the main LLM suppliers are purposefully sticking to general-purpose APIs and refraining from competing with anyone on specific services, and that this goes beyond just staying focused. Some of the potential applications, like OCR, could turn into money printers if they moved on them now, and they could all use some more cash to offset what they burn on compute. Is it because they're trying to avoid starting an "us vs. them" war until after they've made everyone else dependent on them?
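
    To be concrete, the "half-assed Python script" really is about this much code. A rough sketch, assuming the OpenAI Python SDK and pdf2image (the model name, prompt and field list are placeholders, not a recommendation):

        # Render a PDF page to an image and ask a vision-capable LLM to pull
        # out fields as JSON. Assumes `pip install openai pdf2image` (pdf2image
        # also needs poppler installed) and OPENAI_API_KEY in the environment.
        import base64
        import io

        from openai import OpenAI
        from pdf2image import convert_from_path

        client = OpenAI()

        def extract_fields(pdf_path, fields):
            # First page only, for brevity; a real pipeline would loop over pages.
            page = convert_from_path(pdf_path, dpi=200)[0]
            buf = io.BytesIO()
            page.save(buf, format="PNG")
            b64 = base64.b64encode(buf.getvalue()).decode()

            resp = client.chat.completions.create(
                model="gpt-4o",  # placeholder; any vision-capable model
                messages=[{
                    "role": "user",
                    "content": [
                        {"type": "text",
                         "text": "Extract these fields as JSON: " + ", ".join(fields)
                                 + ". Use null for anything you can't find."},
                        {"type": "image_url",
                         "image_url": {"url": "data:image/png;base64," + b64}},
                    ],
                }],
            )
            return resp.choices[0].message.content

        # e.g. extract_fields("invoice.pdf", ["invoice_number", "total", "due_date"])

    That's the baseline an OCR vendor now has to beat - on accuracy, on edge cases, and with an SLA on top.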

    • To the point after your edit, I view it like the cloud shift from IaaS to PaaS/SaaS. Start with a neutral infrastructure platform that attracts lots of service providers. Then take your pick of which ones to replicate with a vertically integrated competitor or managed offering once you are too big for anyone to really complain.

  • Never underestimate the power of the second mover. Since the development is happening in the open, someone can quickly cobble together the information and cut directly to 90% of the work.

    Then your secret sauce will be your fine tunes, etc.

    Like it or not, AI/LLMs will be a commodity, and this bubble will burst. Moats are hard to build when there's at least one open-source copy of what you just did.

    • And next year your secret sauce will be worthless because the LLMs are that much better again.

      Businesses that are just "today's LLM + our bespoke improvements" won't have legs.