Comment by pera

5 days ago

It's fascinating how over the past year we have had almost daily posts like this one, yet from the outside everything looks exactly the same. Isn't that very weird?

Why haven't we seen an explosion of new start-ups, products or features? Why do we still see hundreds of bug tickets on every issue tracking page? Have you noticed anything different on any changelog?

I invite tptacek, or any other chatbot enthusiast around, to publish project metrics and show some actual numbers.

"Why haven't we seen an explosion of new start-ups, products or features?"

You're posting this question on a forum hosted by YC. Here's a story from March 2024: "YC’s latest W24 batch includes 240 companies. A significant portion of the companies have some AI component, with 63% tagged as “Artificial Intelligence” — a notable increase from 51% in the preceding S23 batch and 29% before that.". https://jamesin.substack.com/p/analysis-of-ycs-latest-w24-ba...

I've not seen the same analysis for more recent batches.

  • I don't think that refutes the parent's point. So many AI companies, but where are the companies _using_ the AI?

    • It's also interesting how when you look at the websites for the new wave of AI B2B SaaS startups, most of the customers they list are other AI B2B SaaS startups. It makes me wonder how much of the "AI industry" is just startups sending VC money back and forth to each other.

      2 replies →

  • Sorry I don't follow, would you mind clarifying your point?

    • Not the person you are asking to clarify, but I can read it in two ways:

      1. A huge part of the demographic group visiting HN is biased in favor of AI, given the sort of startups YC decides to fund.

      2. The large number of AI-related start-ups funded by YC should answer your question.

      I am slightly leaning towards the first, combined with a bit of the second. A lot of people working in startups are used to building up a structure from scratch, where incorporating the latest "thing" is not that big of a deal. It also means they rarely see the long-term impact of the code they write.

      They have a huge blind spot for the reality of existing code bases and company structures where introducing these tools isn't as easy and code needs to be maintained for much longer.

    • You said "Why haven't we seen an explosion of new start-ups?" so I replied by pointing out that a sizable percentage of recent YC batches are new AI startups.

      I categorize that as "an explosion", personally. Do you disagree?

      5 replies →

Most likely there’s a slight productivity increase.

The enthusiasts experience cognitive dissonance because they are pretty sure this is huge and we’re living in the future, so they go through various denial strategies when the execs ask them where the money is.

In this case it’s blame. These darned skeptics are ruining it for everyone.

This is an important question. The skepticism tracks with my personal experience - I feel 10-20% more productive, but certainly not 5x when measured over a long period of time (say, the last 6 months or more).

I’m nonetheless willing to be patient and see how it plays out. If I’m skeptical about some grandiose claims, I must be equally open to the possibility of large-scale effects happening but not yet being apparent to me.

  • There were many similar transformations in recent decades. I remember the first Windows with a true graphical user interface was a big wow: a productivity boost, you could have all those windows and programs running at the same time! Compare that with DOS, where you normally had just one active user-facing process.

> Why haven't we seen an explosion of new start-ups, products or features? Why do we still see hundreds of bug tickets on every issue tracking page? Have you noticed anything different on any changelog?

In my personal experience (LLMs and code suggestions only), it's because I use LLMs to code unimportant stuff. Actually thinking about what I want to do with the business code is exhausting, and I'd rather play a little with a fun project. Also, the unit tests that LLMs can now write (and which were too expensive to write myself) were never important to begin with.

Simply put, if we’re living during such a major technological revolution, why does using software suck in such disastrous ways that were unthinkable even ten years ago?

Your argument relies on the idea of an "actual product". What is happening (and I'm seeing it firsthand both in my company's codebase and in my personal projects) is that AI is contributing more and more to product development. If this trend continues, we may reach a point where 90% of a product is written by AI.

At that stage, the real value will lie in the remaining 10%—the part that requires human judgment, creativity, or architectural thinking. The rest will be seen as routine: simple instructions, redundant CRUD operations, boilerplate, and glue code.

If we focus only on the end result, humans will inevitably write less code overall. And writing less code means fewer programming jobs.

  • You said a bunch without saying much. It also doesn't track. If the majority of AI work is supposed to be done by agents capable of handling the entire process, including making PRs, then why isn't there an explosion of such PRs across a large number of open source projects? Even more so, why am I not seeing these PRs on AI-related open source projects? To target it even more directly, why am I not seeing hints of this being applied on coding agent repositories?

    Call me naive, but you'd think these projects specifically would want to demonstrate how well their product works, making an effort to distinguish PRs that are largely the work of their own agents. Yet I am not seeing that.

    I have no doubt that people find use in some aspects of these tools, though I personally subscribe more to the interactive rubber-ducky use of them. But 90%, from where I am standing, seems like a very, very long way off.

    • From what I've heard anecdotally, there have been a bunch more PRs and bug reports generated by AI. But I've also heard they're generally trash and just wasting the project maintainers' time.

    • More than likely, loads of the PRs you see _are_ mostly AI work; you just don’t know that because the developers cleaned them up and posted them as their own. Most PRs where I work are like this, from what I see when speaking to the developers.

      1 reply →

    • > Then, why isn't there an explosion in such PRs on a large amount of open source projects?

      People don't like working for free, either by themselves or with an AI agent.

      2 replies →

Maybe because these companies are smaller and fly under the radar. They require less funding, the team size is small, and they're probably bankrolled by the founders.

At least that's what I do and what I see among friends.

I don't know - I agree we haven't seen changes to our built environment, but as for an "explosion of new start-ups, products", we sort of are seeing that?

I see new AI-assisted products every day, and a lot of them have real usage. Beyond the code assistant/codegen companies, which are very real examples, here's an anecdote.

I was thinking of writing a new story and found http://sudowrite.com/ via an ad, an AI assistant for helping you write. It's already used by a ton of journalists and serious writers, and I am trying it out.

Then I wanted to plan a trip - I tried Google but saw nothing useful, then asked ChatGPT and now have a clear plan.

  • > I was thinking of writing a new story and found http://sudowrite.com/ via an ad, an AI assistant for helping you write. It's already used by a ton of journalists and serious writers, and I am trying it out.

    I am not seeing anything indicating it is actually used by a ton of journalists and serious writers, and I highly doubt it is; the FAQ is also paper thin as far as substance goes. I highly doubt they are training/hosting their own models, yet I see only vague third-party references in their privacy policy. Their pricing is less than transparent, given that they don't really explain how their "credits" translate to actual usage. They blatantly advertise this to students, which is problematic in itself.

    This ignores all the other issues that come with depending so heavily on LLMs for your writing. Here is an interesting quirk for starters: https://www.theguardian.com/technology/2024/apr/16/techscape... . But there are many more issues like it.

    So this example, to me, is actually exemplifying the issue of overselling capabilities while handwaving away any potential issues that is so prevalent in the AI space.

    • Hey co-founder of Sudowrite here. We indeed have thousands of writers paying for and using the platform. However, we aim to serve professional novelists, not journalists or students. We have some of both using it, but it's heavily designed and priced for novelists making a living off their work.

      We released our own fiction-specific model earlier this year - you can read more about it at https://www.sudowrite.com/muse

      A much-improved version 1.5 came out today -- it's preferred 2-to-1 vs Claude in blind tests with our users.

      You're right on the FAQ -- alas, we've been very product-focused and haven't done the best job keeping the marketing site up to date. What questions do you wish we'd answer there?

      3 replies →

So, as an example of what this could look like that would be convincing to me: I started out pretty firmly believing that Rust was a fad.

Then Mozilla and Google did things with it that I did not think were possible for them to do. Not "they wrote a bunch of code with it", stuff like "they eliminated an entire class of bugs from a section of their codebase."

Then I watched a bunch of essentially hobby developers write kernel drivers for a brand new architecture, and watched them turn brand new MacBooks into one of the best-in-class ways to run Linux. I do not believe they could have done that with their resources at that speed, using C or C++.

And at that point, you kind of begrudgingly say, "okay, I don't know if I like this, but fine, heck you, whatever. I guess it might genuinely redefine some parts of software development, you win."

So this is not impossible. You can convince devs like me that your tools are real and they work.

And frankly, there are a billion high-impact problems in modern computing - stuff like GNOME accessibility, competitive browser engines, FOSS UX, and collaboration tools. Entire demographics have serious problems that could be solved by software if there were enough expertise, time, and resources to solve them. Often, the issue at play is that there is no intersection between the people who are well acquainted with those communities and understand their needs and the people who have experience writing software.

In theory, LLMs help solve this. In theory. If you're a good programmer, and suddenly you have a tool that makes you 4x as productive as a developer, you could have a very serious impact on a lot of communities right now. I have not seen it happen. Not in the enterprise world, but also not in the FOSS world, not in communities with lower technical resources, not in the public sector. And again, I can be convinced of this; I have dismissed tools that I later changed my opinion on because I saw the impact and couldn't ignore it: Rust, NodeJS, Flatpak, etc., etc.

The problem is people have been telling me that Coding Assistants (and now Coding Agents) are one of those tools for multiple years now, and I'm still waiting to see the impact. I'm not waiting to see how many companies pick them up, I'm not waiting to see the job market. I'm waiting to see if this means that real stuff starts getting written at a higher quality significantly faster, and I don't see it.

I see a lot of individual devs showing me hobby projects, and a lot of AI startups, and... frankly, not much else.

One of our AI-enabled internal projects is moving ~135x faster than before. Of course you can't perfectly compare: new framework, better insights, updated libraries, etc.

  • Bless your metric.

    If you end up finishing it in 6 months, are you going to revise that estimate, or celebrate the fact that you don't need to wait until 2092 to use the project?

    • Yeah, these 100x numbers being thrown out are pretty wild. It dawned on me during the Shopify CEO post a while back. 100x is unfathomable!

      You did a year's worth of work in 3 days? That is what 100x means.
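
      A quick back-of-the-envelope on those multipliers (illustrative only; the 2025 start year and the 250-working-day year are my assumptions, not numbers from the thread):

          # Rough sanity check on what the multipliers in this subthread would imply.
          speedup = 135
          months_with_ai = 6
          years_without_ai = months_with_ai / 12 * speedup  # 67.5 years
          finish_year = 2025 + int(years_without_ai)        # ~2092, hence the joke above
          print(f"135x, done in 6 months -> ~{years_without_ai:.1f} years without AI (around {finish_year})")

          # And what a plain 100x means for a single year's worth of work:
          working_days_per_year = 250
          print(f"100x -> a year of work in ~{working_days_per_year / 100:.1f} working days")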