Comment by CodingJeebus

1 day ago

Curious to see how this works out. The flight booking example is interesting because it’s one of the last purchase powers I’d want to hand over to an AI.

If it gets a major travel detail wrong, purchases a business class ticket by accident, etc. and I need to adjust the booking by calling the airline, then I’m way less happy than I would have been if I had just bought the ticket myself. Not to mention what happens when Google Flights gets a UI refresh and knocks the accuracy rate of the agent down even 10%.

Digital criminals are gonna love it, though.

I’m personally much more interested in automating browser tasks that aren’t economically valuable because that mitigates the risk.

wujerry2000 1 day ago

UI refreshes knocking down simulator realism is a real issue that we're still trying to solve.

I think this will probably be a mixture of automated QA/engineering and scale.

Another interesting path is actually partnering directly with software providers to offer their platforms as simulators IF they see there is a competitive advantage to training agents to perform well on their UI.

We're really excited about this idea, but it would require a company to see real revenue potential in enabling agentic access vs not. I'd say we're still in the "block them out" phase of the internet (e.g. see Cloudflare's recent post about bot detection: https://blog.cloudflare.com/perplexity-is-using-stealth-unde...)

mousetree 1 day ago

Why are flight bookings always the go-to example? For most people, booking a flight happens infrequently, is a non-trivial expense (to your point), and is not that burdensome to do yourself.

wujerry2000 1 day ago

We agree that as a demo flight booking is probably overused.
However, in talking with many AI labs, their perspective on flight booking is a little different. "Solving" flight booking requires the AI agent to solve a LOT of hard problems. Namely, personalization, context, weighing multiple options, interacting with the UI, math, then wrapping that all up into a coherent response. The thought process is IF a computer use agent is able to solve flight booking well, then we will have developed many other powerful primitives that will scale to other problems.
So as a standalone use case, I'm inclined to agree this might not be where the most agent traction is seen. However, as a research/capability goal, there are some generalizations that could apply to other very important use cases.
jedberg 15 hours ago
> and is not that burdensome to do yourself.
I don't know about you, but it takes me hours to book a flight if it's for my family, because I'm usually booking a flight, a car, and a hotel, and I have to constantly min-max the costs between hotels on certain days, flights on certain days, and cars on certain days.
If it's not burdensome for you, then you're either taking very simple trips or you're so rich that you don't care.
- mandeepj 12 hours ago
  
  > I have to constantly min-max the costs between hotels on certain days, flights on certain days, and cars on certain days.
  I agree it's a burdensome chore!
  Just wondering - your hotel stay can't be shorter than the days between your flights. For the car, one can try to cut costs with Uber/public transport, but that still turns out to be more expensive than a rental car.
  
fragmede 1 day ago
It's because most people have done it, and it's infrequent and expensive enough to be a pain point, which makes for a good example. Because it's infrequent, most people don't have a rigorous, well-practiced system for getting the optimal ticket for their particular circumstances on that flight, and because it can be somewhat expensive, there's a bit of a burden taken on in order to optimize for price as well, especially given all the shenanigans airlines play with pricing.
If you're rich, you can just look for the ticket at the time you like on your preferred airline and buy a first class ticket, whatever the price, for whenever you want to fly, even if it's tomorrow. For the rest, that's not practical. So the flight search has to begin a few months out, with the burden of doing multiple searches (in incognito mode) across various airlines and/or aggregators, in order to optimize various factors. This takes a non-trivial amount of time. Add in looking for hotels and rental cars, and for some it's fun, for others it's an annoying burdensome chore that stands in the way of being on vacation.
It's just an example use case though. Similar to how a "robot maid" that folds clothes isn't the be-all and end-all for robotics: if an AI is able to perform that task, it's going to have the capabilities necessary for performing a wide variety of other tasks.
- mandeepj 12 hours ago
  
  > (in incognito mode)
  I used to do that, but when I cross-compared with normal mode, the prices were the same.

superb_dev 16 hours ago

Airlines will love it too. How long until an AI company gets paid to prefer a certain company?

wujerry2000 16 hours ago

I think this is totally going to be the case!
AI vibe coding tools already prefer some solutions over others, probably because of training data distribution/post-training preferences. This is leading to massive differences in revenue and growth between the preferred companies and those that have not optimized to be AI agent preferred/in the training data distribution.
I imagine something similar will happen over time, where companies who are in the training data distribution get used by agents more, while others who neglect this get slowly phased out because systems don't know how to use them (out of distribution).