Comment by a4isms

1 day ago

Short reply:

I agree, it only goes half-way.

Elaboration:

I like the "horseless carriage" metaphor for the transitionary or hybrid periods between the extinction of one way of doing things and the full embrace of the new way of doing things. I use a similar metaphor: "Faster horses," which is exactly what this essay shows: You're still reading and writing emails, but the selling feature isn't "less email," it's "Get through your email faster."

Rewinding to the 90s, Desktop Publishing was a massive market that completely disrupted the way newspapers, magazines, and just about every other kind of paper was produced. I used to write software for managing classified ads in that era.

Of course, Desktop Publishing was horseless carriages/faster horses. Getting rid of paper was the revolution, in the form of email over letters, memos, and facsimiles. And this thing we call the web.

Same thing here. The better interface is a more capable faster horse. But it isn't an automobile.

> You're still reading and writing emails, but the selling feature isn't "less email," it's "Get through your email faster."

The next logical step is not using email (the old horse and carriage) at all.

You tell your AI what you want to communicate and to whom. Your AI connects to their AI, and their AI writes or speaks a summary in the format they prefer. Both AIs can take action on the contents. You skip the Gmail/Outlook middleman entirely, at the cost of putting an AI model in the middle. Ideally the AI model runs locally, not in the cloud, but we all know how that will turn out in practice.
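A rough sketch of that flow, just to make it concrete. Everything here is hypothetical: the `Intent` and `Agent` names, the `preferred_format` field, and the string formatting that stands in for what a real model would do when it writes the summary.

```python
from dataclasses import dataclass


@dataclass
class Intent:
    """What the sender wants to get across, independent of any email format."""
    sender: str
    recipient: str
    goal: str


@dataclass
class Agent:
    """Hypothetical personal AI; a real one would call a model instead of formatting strings."""
    owner: str
    preferred_format: str  # assumed options: "bullet" or "sentence"

    def compose(self, recipient: str, goal: str) -> Intent:
        # The owner states the goal; the agent packages it for the other agent.
        return Intent(sender=self.owner, recipient=recipient, goal=goal)

    def receive(self, intent: Intent) -> str:
        # The receiving agent renders the intent the way its owner prefers.
        if self.preferred_format == "bullet":
            return f"- From {intent.sender}: {intent.goal}"
        return f"{intent.sender} wants to tell you: {intent.goal}"


alice_ai = Agent(owner="alice", preferred_format="sentence")
bob_ai = Agent(owner="bob", preferred_format="bullet")

summary = bob_ai.receive(alice_ai.compose("bob", "move Friday's review to Monday"))
print(summary)
```

The point of the sketch is that no mailbox appears anywhere: the sender hands over a goal, and the recipient's side decides the presentation.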

Contact me if you want to invest some tens of millions in this idea! :)

  • Taking this a step further: both AIs also deeply understand and advocate for their respective 'owner', so rather than simply exchanging a formatted message, they're evaluating the purpose and potential fit of the relationship writ large (for review by the 'owner', of course). Sort of a preliminary discussion between executive assistants or sales reps -- all non-binding, but skipping ahead to the heart of the communication, not just a single message.

> > Seems like many, if not all, AI applications, when taken to the limit, reduce the need for interaction between humans to 0.

> Same thing here. The better interface is a more capable faster horse. But it isn't an automobile.

I'm over here in the "diffusion / generative video" corner scratching my head at all the LLM people making weird things that don't quite have use cases.

We're making movies. Already the AI does things that used to cost too much or take too much time. We can make one-minute videos of scale, scope, and consistency in just a few hours. We're in pretty much the sweet spot of the application of this tech. This essay doesn't even apply to us. In fact, it feels otherworldly alien to our experience.

Some stuff we've been making with gen AI to show you that I'm not bullshitting:

- https://www.youtube.com/watch?v=Tii9uF0nAx4

- https://www.youtube.com/watch?v=7x7IZkHiGD8

- https://www.youtube.com/watch?v=_FkKf7sECk4

Diffusion world is magical and the AI over here feels like we've been catapulted 100 years into the future. It's literally earth shattering and none of the industry will remain the same. We're going to have mocap and lipsync, where anybody can act as a fantasy warrior, a space alien, Arnold Schwarzenegger. Literally whatever you can dream up. It's as if improv theater became real and super high definition.

But maybe the reason for the stark contrast with LLMs in B2B applications is that we're taking the outputs and integrating them into things we'd be doing ordinarily. The outputs are extremely suitable as a drop-in to what we already do. I hope the LLM side can learn something from what we do, but perhaps the problems we have are just so wholly different that the office domain needs entirely reinvented tools.

Naively, I'd imagine an AI PowerPoint generator or an AI "design doc with figures" generator would be so much more useful than an email draft tool. And those are incremental adds that save a tremendous amount of time.

But anyway, sorry about the "horseless carriages". It feels like we're on a rocket ship on our end and I don't understand the public "AI fatigue" because every week something new or revolutionary happens. Hope the LLM side gets something soon to mimic what we've got going. I don't see the advancements to the visual arts stopping anytime soon. We're really only just getting started.

  • You make some very strong claims and present material to back them. I hope I am not out of line if I give you my sincere opinion. I am not doing this to be mean, to put you down or to be snarky. But the argument you're making warrants this response, in my opinion.

    The examples you gave as "magical", "100 years into the future", "literally earth shattering" are very transparently low effort. The writing is pedestrian, the timing is amateurish and the jokes just don't land. The inflating tea cup with magically floating plate and the cardboard teabag are... bad. These are bad, man. At best recycled material. I am sorry, but as examples of why we should use automatically generated art, they make the opposite argument from the one you think you're making.

    I categorically do not want more of this. I want to see crafted content where talent shines through. Not low effort, automatically generated stuff like the videos in these links.

    • I appreciate your feedback.

      If I understand correctly, you're an external observer who isn't from the film or media industry? So I'll reframe the topic a little.

      We've been on this ride for four years, since the first diffusion models and "Will Smith eating spaghetti" videos. We've developed workflows such as sampling diffusion generations, putting them into rotational video generation, and creating LoRAs out of synthetic data to scale up points in latent space. We've used hundreds of ControlNet modules and Comfy workflows. We've hooked this up to Blender and depth maps and optical flow algorithms. We've trained models, Frankensteined schedulers, frozen layers, lobotomized weights, and read paper after paper. I say all of this because I think it's easy to underappreciate the pace at which this is moving unless you're waist deep in the stuff.

      We're currently using and demonstrating workflows that a larger studio like Disney is absolutely using with a larger budget. Their new live-action Moana film uses a lot of the techniques we're using, just with a larger army of people at their disposal.

      So then if your notion of quality is simply how large the budget or team making the film is, then I think you might need to adjust your lenses. I do agree that superficial artifacts in the output can be fixed with more effort, but we're just trying to move fast in response to new techniques and models and build tools to harness them.

      Regardless of your feelings, the tech in this field will soon enable teams of one to ten to punch at the weight of Pixar. And that's a good thing. So many ideas wither on the vine. Most film students never get the nepotism card or the "right time, right place, right preparation" break that lets them make the films of their dreams. There was never enough room at the top. And that's changing.

      You might not like what you see, but please don't advocate to keep the written word as a tool reserved only for the Latin-speaking clergy. We deserve the printing press. There are too many people who can do good things with it.
