Comment by aurareturn
1 day ago
In an agentic world, the OS needs to be completely rethought. For example, every single app functionality should be exposable via an API while remaining human friendly.
I think OpenAI designing their own phone is the next logical step. I hope they succeed which should bring major competition to Apple and Android.
This will not happen. None of the existing apps people use daily on their phones have any incentive to support this. Social media wants people to doomscroll; shopping apps and booking sites want to use their own dark patterns to make people believe they get a special discount if they buy _now_; and everything else just wants users to see the ads. Why on earth would they offer convenient hooks for AI chatbots?
It's even more fascinatingly dumb to have this discussion like 2 or so years after every major platform decided to kill any notion of 3rd party clients they used to support.
Yes, in an ideal world, that'd be great for both humans and LLMs, but we are about as far from that ideal world as we could be. You can't even do some of the "advanced actions" as a human with human-level reflexes without encountering a captcha, but sure, all of a sudden, everyone will just decide to make their bread and butter that is data easier to explore via an LLM.
Competition. If I ask my OS-level AI assistant to find a social media reel about an elephant dancing, the social media app that exposes a set of APIs for an AI agent might get used more.
Watch how fast Meta adds this if a new hot shot social media app succeeds by designing for AI agents controlled by users.
>Competition. If I ask my OS-level AI assistant to find a social media reel about a elephant dancing, the social media app that exposes a set of APIs for an AI agent might get used more.
This is the exact opposite of what will happen (and in fact what has happened). Reddit is suing Perplexity right now for scraping.
Meta will not serve content to some other app for free - for what benefit? They will not see advertising data.
Having used a chatbot to find a reel Meta was censoring from search in the past... I'm not sure how well the incentives align
Actually this more or less describes how accessibility APIs work.
Not really. For the most part, accessibility APIs provide programmatic interfaces to user interfaces, application APIs provide semantically meaningful interfaces to application functionality.
A closer analogue would be AppleScript, or rather, the underlying Apple Event and Open Scripting Architecture functionality supplied by the OS to support AppleScript, that allowed applications to expose these interfaces along with metadata documenting them, and for external tools to record manually performed tasks across applications as programs expressed in terms of these interfaces to make them easier to use (this last bit, while not strictly required, is convenient, and especially useful for less technical users).
If you're familiar with VBA in Microsoft Office applications, sort of like that, except with support provided by OS APIs that could be used by any application that chose to implement scripting support, official guidance from Apple suggesting that all well-designed applications should be scriptable and recordable, and application design patterns and frameworks designed with scriptability and recordability in mind.
Note that I use the past tense here, despite AppleScript still being available in macOS, because it is not well-supported by modern applications.
https://dl.acm.org/doi/epdf/10.1145/1238844.1238845
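To make the distinction concrete, here is a minimal Python sketch of the scriptable-and-recordable idea: an app exposes semantic commands along with metadata describing them, and every invocation is logged so a session can be replayed as a program. All names here (ScriptableApp, expose, dictionary, invoke, replay, send_message) are hypothetical and only illustrate the pattern; this is not how Apple Events or the Open Scripting Architecture are actually implemented.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Command:
    """One scriptable command plus the metadata that documents it."""
    name: str
    doc: str
    handler: Callable[..., object]


@dataclass
class ScriptableApp:
    """Hypothetical app that exposes semantic commands, not UI widgets."""
    name: str
    commands: Dict[str, Command] = field(default_factory=dict)
    recording: List[Tuple[str, dict]] = field(default_factory=list)

    def expose(self, doc: str):
        """Decorator: register a function as a documented command."""
        def register(fn):
            self.commands[fn.__name__] = Command(fn.__name__, doc, fn)
            return fn
        return register

    def dictionary(self) -> Dict[str, str]:
        """Metadata an external tool (or agent) can read to discover commands."""
        return {c.name: c.doc for c in self.commands.values()}

    def invoke(self, command: str, **kwargs):
        """Run a command by name and log it so the session can be replayed."""
        result = self.commands[command].handler(**kwargs)
        self.recording.append((command, kwargs))
        return result


mail = ScriptableApp("Mail")
outbox: List[dict] = []


@mail.expose("Send a message to one recipient.")
def send_message(to: str, subject: str) -> int:
    # Stand-in for real application functionality.
    outbox.append({"to": to, "subject": subject})
    return len(outbox)


def replay(app: ScriptableApp, script: List[Tuple[str, dict]]) -> None:
    """Re-run a recorded session as a program."""
    # Copy first: invoke() appends to app.recording while we iterate.
    for command, kwargs in list(script):
        app.invoke(command, **kwargs)
```

An accessibility-style interface would instead expose something like "the button labelled Send in window 2", which says how the UI is laid out rather than what the app can do. The recording list is the part that helps less technical users: perform the task once by hand, then replay it.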
because the social media sites that do will outcompete once people get personal AI coaches that tell them to use technology that is better for them.
How is an AI posting on your social media better for you?
These people are delusional and want to build a world that's convenient for them to accomplish things lazily with LLMs.
There are no shortcuts in life and it's just expensive text autocomplete.
"Let's spin up $750k in GPUs full throttle to scrape a web page with my $200.00 CC subscription."
Everyone is delusional.
At the beginning of the internet we were promised the free flow of digital information between computers, peer-to-peer. What we got was silos of content each fighting each other to make sure that the silos stay intact with DRM.
I could imagine an AI future where agentic shopping companies who promise me the best deal are pitted against Walmart and Amazon, trying to algorithmically squeeze me for $2 more- just two bots playing a cat and mouse game to save me a few bucks.
For some reason a lot of tech ends up in these antagonistic monopolies- Apple wants to sell privacy-aware devices as a product feature, Google wants to give you mail and maps, but sell your data. Despite any appearances, neither gives a shit about you, even if you benefit from the dynamic.
I still have to understand what my AI agents could do that I don't want to do myself. Buy stuff? No thanks, I want to see what I buy. I think that they are 99% a solution in search of a problem.
My family (unfortunately) uses InstaCart and probably 15% of items are a shitty "replacement" that is "not what I wanted". For time-sensitive items, having the shitty replacement item NOW is better than having to wait for the "item I actually wanted", so we often just accept the inferior product. This is a dark pattern that I could see AI adopting -- it buys tons of cheap crap you didn't want, some of it was right, and you're left with a mess of returns to sort out, especially if those returns require you to take some physical action, like returning the item to the store.
Same. Well the biggest thing I don't want to do that they could help with is work. But in the cases where it can do that for me, there's no world where that benefit goes to me rather than my employer.
Well, that's the very nature of the employer / employee relationship. In my case I write software for my customers and I trade time for money. If I use an AI to write code two times faster, my daily rate doesn't double. However, I can keep my customers.
That's only another step in the path I experienced since the 80s, when I had to type every single character because there was no auto complete, no command line history, very few libraries. I was very good at writing trees, hash tables, linked lists and so was everybody else. Nobody would hire me if I were that slow at writing code today.
Everything exposed programmatically would have been great even without agents—the NixOSes and Emacses of the world show just how amazing a fully flexible and programmable world would be—but I'm glad that the advent of AI is getting people invested in this vision :P
> I think OpenAI designing their own phone is the next logical step. I hope they succeed which should bring major competition to Apple and Android.
This is not going to happen, or if it does it will just be Android (like Samsung reskins/modifies it) and it will certainly use Google Play Services.
OpenAI should not design a phone... They should try making money first.
Nonsense. Don't you know how bubbles work? Everyone does massive rushes for all the low-hanging and medium-hanging fruit. Then the bubble pops and the randomized carnage of companies big and small being destroyed is sifted through by the next wave of companies actually intended to make money.
The good ideas and the bad ideas don't signal success in a bubble, nor does making money or not. It's random, and any notion of "this was a good business model and that was bad" is post-hoc rationalization. The number of people who make fun of pets.com but order from chewy.com is a prime example of this.
> In an agentic world, the OS needs to be completely rethought
Isn't that what Apple is doing with its Foundation Models Framework?[0] Developers can integrate Apple's on-device LLM, which includes things like tool calling. I don't write Apple-specific apps so I'm not sure what can actually be done with it, but it looks promising and it's where Apple already seems to think things are headed.
> I think OpenAI designing their own phone is the next logical step
ChatGPT is already integrated into Apple Intelligence for those that want to use that instead of Apple's model -- I don't see OpenAI trying to change lanes into phone making when they can focus on doing what they know while collecting a large check from Apple.
[0] https://developer.apple.com/documentation/foundationmodels
"In an agentic world, the OS needs to be completely rethought" - if AI is progressing as fast as we think it is, I don't think we'll be interested in waiting for the world to rebuild all the legacy tooling from the OS up. For new stuff, that'd be great.
I imagine the AIs will get a lot better at intercepting things at an intermediate level - API calls under the hood, etc. Probably much better (and cheaper) vision abilities, and perhaps even deeper integration into the machine code itself. It's really hard to anticipate what an advanced model will be capable of 5 years from now.
Maybe not the same approach, but start with a kernel that acts as a governed membrane for everything else?
Open source research/project I have been exploring on the topic: https://aevum.build/learn/architecture/
> In an agentic world, the OS needs to be completely rethought. For example, every single app functionality should be exposable via an API while remaining human friendly.
So, like a Unix system?
We used to have this. It was called OLE Automation.
yep, and applescript...
i'm really not sure companies will allow their apps to be automated so easily, and the reason is api abuse (think of a saas where you can upload file attachments for example); you'd either end up banned or throttled pretty fast, and in the end the company will be like "cost > opportunity" and just close it off (and it's like this already, llms just make this worse)
This is like insisting - after the problem turns out to be harder than thought - that the world's roads need to be completely redone to make them self-driving friendly, so self-driving can work.
Isn’t the whole ‘promise’ of AI that it doesn’t need any of those things?
We have a much better chance of an ai-addressable Harmony OS version than of OpenAI making a serious competitor.
It doesn't need to be mobile. The AI-first OS will be headless, undoubtedly.
Humans would be the second-class users of said OS, which can generate UIs on demand as needed.
I've thought about this quite a bit. Started implementing as a side project, but I have too many side projects at the moment...
In other words, AppleScript in the late '90s.
That’s actually what the Reflex plugin behind the APIs in the benchmark does. It creates APIs from your app’s event handlers, thereby providing a stateful way for agents to navigate apps.
It’s why we did this benchmark :) - reflex team member
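For readers wondering what "APIs from your app's event handlers" could look like, here is a rough Python sketch of the general idea: take the public handler methods on a state object and expose them as named, callable endpoints, so the agent works against the same state the UI does. This is NOT Reflex's actual API; CounterState, handlers_as_api, and call are hypothetical names made up for illustration.

```python
import inspect
from typing import Callable, Dict


class CounterState:
    """Hypothetical app state; public methods are its UI event handlers."""

    def __init__(self) -> None:
        self.count = 0

    def increment(self, amount: int = 1) -> None:
        self.count += amount

    def reset(self) -> None:
        self.count = 0


def handlers_as_api(state: object) -> Dict[str, Callable[..., object]]:
    """Map each public method on the state object to a callable endpoint."""
    return {
        name: method
        for name, method in inspect.getmembers(state, inspect.ismethod)
        if not name.startswith("_")  # skip __init__ and private helpers
    }


def call(state: object, api: Dict[str, Callable[..., object]],
         name: str, **kwargs) -> dict:
    """Invoke a handler by name and return a snapshot of the state afterwards."""
    api[name](**kwargs)
    return dict(vars(state))
```

Returning the state after each call is what makes this stateful from the agent's side: it sees the result of its last action without having to read the screen.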
Android is working on it. See AppFunctions.
https://developer.android.com/ai/appfunctions
We'll just close the loop with a systemd MCP, set the shell to /usr/bin/codex, and find some other way to pay the bills.
Perfect.
One of the most seductive (and destructive) forces in software is the desire to rewrite from scratch because rewrites never, ever, ever go as planned. With AI, we're now thinking it's a good idea to rewrite the entire platform from the ground-up. Wild.
except every single piece of progress that we have is the result of trying to do things a different way. so unless you really think we've reached the pinnacle of operating system design, there has to be some room for this?
There's a very big difference between building onto an existing system and rewriting from the ground up. I'm not opposed to making progress and trying things differently, but saying things like "we need to completely rethink the operating system" is like saying "we need to completely redesign New York City". The most effective progress is incremental, not throwing the old system away wholesale.
The modern JavaScript ecosystem is a perfect example of what happens when everyone tries to rebuild from scratch, and it's a nightmare.
Presumably on Linux at least apps could just expose a DBus API? The machinery for this is already in place as far as I can tell.
Why not use the same accessibility features?
And when the agent fucks up badly (as we've seen over and over again) who will be held accountable? The user?
The GUI was a mistake. Long live the shell!
Ah yes. The trains everywhere approach to self driving cars.
The future is "dark OSs" - OSes with no human users.
Launched to nuclear fanfare on August 29th.
Lots of apps actually do have all their functionality exposable via an API - but it's an internal API that's hidden from the user.