Comment by raw_anon_1111

1 month ago

According to everything that has been reported, both the Google Assistant and Alexa are less reliable now that they are LLM based.

I don’t know why. In my much smaller-scale experience, converting from the intent-based approach to an LLM “tools”-based approach has been much more reliable.

Siri was behind pre LLM because Apple didn’t throw enough monkeys at the problem.

Everything that an assistant can do is “hardcoded” even when it is LLM based.

Old way: voice -> text -> pattern matching -> APIs to back end functionality.

New way: voice -> text -> LLM -> APIs to back end functionality.

How often have you come across a case where Siri understood something and said “I can’t do that”? That’s not an AI problem. That’s Apple not putting people on the intent -> API mapping. An LLM won’t solve the issue of exposing the APIs to Siri.
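A minimal sketch of the two pipelines above, assuming a tiny hypothetical tool registry (`set_timer`, `get_weather`) and a keyword-based `stub_model` standing in for a real LLM call so it runs offline. None of these names come from Siri, Alexa, or Google Assistant. The point it illustrates is that both routers end at the same hardcoded registry:

```python
import re
from typing import Callable, Optional

# Hypothetical back-end APIs. In both designs someone has to write and register these.
def set_timer(minutes: int) -> str:
    return f"timer set for {minutes} min"

def get_weather(city: str) -> str:
    return f"weather lookup for {city}"

TOOLS: dict[str, Callable[..., str]] = {
    "set_timer": set_timer,
    "get_weather": get_weather,
}

ToolCall = Optional[tuple[str, dict]]

# Old way: text -> pattern matching -> (tool name, arguments)
INTENT_PATTERNS = [
    (re.compile(r"set a timer for (\d+) minutes?", re.I), "set_timer",
     lambda m: {"minutes": int(m.group(1))}),
    (re.compile(r"what'?s the weather in (.+)", re.I), "get_weather",
     lambda m: {"city": m.group(1).strip(" ?")}),
]

def route_by_intent(text: str) -> ToolCall:
    for pattern, tool_name, extract in INTENT_PATTERNS:
        match = pattern.search(text)
        if match:
            return tool_name, extract(match)
    return None

# New way: text -> LLM -> (tool name, arguments)
def stub_model(text: str, tool_names: list[str]) -> ToolCall:
    # Stand-in for a real model call (an assumption for this sketch):
    # it only looks for the word "timer" and a number.
    if "timer" in text.lower() and "set_timer" in tool_names:
        digits = re.search(r"\d+", text)
        if digits:
            return "set_timer", {"minutes": int(digits.group())}
    return None

def route_by_llm(text: str, model=stub_model) -> ToolCall:
    # The model only ever sees the names of tools that were registered.
    return model(text, sorted(TOOLS))

def dispatch(call: ToolCall) -> str:
    # Same last step for both pipelines: no registered tool, no action.
    if call is None or call[0] not in TOOLS:
        return "I can't do that"
    name, args = call
    return TOOLS[name](**args)
```

With this sketch, "Could you start a 10 minute timer" misses the regex but is handled by the model-style router, while "Book me a flight" fails in both because nobody registered a tool for it.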

sodafountan 1 month ago

I don't really want to continue on with this discussion, as AI in general can be absolutely infuriating. It's one of those buzzwords that's just being thrown around without a care in the world at this point, but do you have any links to those reports? I'd be willing to bet that if Google Assistant and Alexa were being run properly, then they shouldn't be less reliable when working with an LLM.

I don't think the problem was that Apple had too few people working on Siri; I think they had too many people working on the wrong problems. If they had kept an eye on the industry like they did in their heyday, when Jobs was at the helm, they would've been all over LLMs like Sam Altman was with OpenAI. This report of Siri using Gemini going forward is one of the biggest signs that Apple is failing to innovate, let alone the constant rehashing of the iPhone and iOS. They haven't been innovative in years.

And yes, that's the point I was trying to make: AI assistants shouldn't be hardcoded to do certain things. That's not AI. But Apple's marketing would have you believe that Siri is what AI should be, except now everyone's wiser; everyone and their grandmother has used ChatGPT, which is really what Siri should have been. Changes to the iOS API should roll out, and an LLM-backed AI assistant should be able to pick up on those changes automatically. Siri should be an LLM trained on Apple data, its APIs, your personal data (emails, documents, etc.), and a whole host of publicly available data. This would actually make Siri useful going into the future.

Again, if Apple's marketing team were to be believed, Siri would be the most advanced LLM on the planet, but from a technical standpoint, they haven't even started training an LLM at all. It's nonsense.

raw_anon_1111 1 month ago
AI assistants can’t magically “do stuff” without “tools” exposed. A tool is always an API that someone has to write and expose to the orchestrator, whether it’s AI or just a dumb intent system.
And ChatGPT can’t really “do anything” without access to tools.
You don’t want an LLM to have access to your total system without deterministic guardrails and limits on what the tools are permitted to do, just like you wouldn’t expose your entire database with admin privileges to the web.
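One way to picture a deterministic guardrail is a permission check that runs outside the model, before any tool executes. This is a rough sketch under stated assumptions: the `Tool` record, the scope strings such as `calendar:read`, and the two example tools are all hypothetical and not taken from any real assistant.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Tool:
    name: str
    func: Callable[..., str]
    scopes: frozenset  # permissions this tool requires (hypothetical scope names)

class PermissionDenied(Exception):
    pass

def read_calendar(day: str) -> str:
    return f"events for {day}"

def delete_all_mail() -> str:
    return "mailbox wiped"

REGISTRY = {
    tool.name: tool
    for tool in (
        Tool("read_calendar", read_calendar, frozenset({"calendar:read"})),
        Tool("delete_all_mail", delete_all_mail, frozenset({"mail:delete"})),
    )
}

def call_tool(name: str, args: dict, granted: frozenset) -> str:
    # Plain code decides what may run; the model's output is only a request.
    tool = REGISTRY.get(name)
    if tool is None:
        raise PermissionDenied(f"unknown tool: {name}")
    missing = tool.scopes - granted
    if missing:
        raise PermissionDenied(f"{name} needs {sorted(missing)}")
    return tool.func(**args)
```

Here a session granted only `calendar:read` can call `read_calendar`, and a request for `delete_all_mail` raises `PermissionDenied` no matter how the model phrased it.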
You also don’t want to expose too many tools to the system. For every tool you expose, you also have to have a description of what the tool does, the parameters it needs, etc. Too many of them will both blow up your context window and make the model start hallucinating. I suspect that’s why Alexa and Google Assistant got worse when they became LLM based, and why my narrow use cases didn’t suffer those problems when I started implementing LLM based solutions.
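A back-of-the-envelope sketch of the context cost, assuming each tool is described by a small JSON-schema-like object and that a token is roughly 4 characters. Both are assumptions for illustration; real tool formats and tokenizers differ.

```python
import json

def describe_tool(i: int) -> dict:
    # Hypothetical tool description: name, prose description, parameter schema.
    return {
        "name": f"tool_{i}",
        "description": f"Hypothetical tool number {i}; does one narrow thing.",
        "parameters": {
            "type": "object",
            "properties": {
                "target": {"type": "string", "description": "What to act on."}
            },
            "required": ["target"],
        },
    }

def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): about 4 characters per token.
    return len(text) // 4

def prompt_overhead(n_tools: int) -> int:
    # Every exposed tool's description is sent along with each request.
    return estimate_tokens(json.dumps([describe_tool(i) for i in range(n_tools)]))
```

Comparing `prompt_overhead(5)` with `prompt_overhead(200)` shows the overhead growing roughly linearly with the number of tools, before the user has said anything.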
And I am purposefully yada yada yadaing some of the technical complexities and I hate the entire “appeal to authority” thing. But I worked at AWS for 3.5 years until 2 years ago and I was at one point the second highest contributor to a popular open source “AWS Solution” that almost everyone in the niche had heard of dealing with voice automation. I really do know about this space.
- sodafountan 1 month ago
  
  Yeah, there's just really nothing left to discuss. Apple could have been a real leader in the AI space had they hired the right researchers to implement LLMs and beaten OpenAI to the punch.
  I understand that AI assistants need access to tools in order to do anything on a computer. I've been working with AI-augmented development for a few months now, and every time a prompt needs to run a tool, it asks me for permission first or just straight up gives me the command to paste into a terminal.
  Ideally this would have been abstracted away if Siri were an LLM, with Apple controlling which APIs Siri has access to and bypassing user confirmation altogether.
  It would have been neat if I were able to say, "Hey, Siri: send a text to John Smith in playfully angry prose thanking him for not inviting me to the party", which would have the LLM automatically craft the message and send it upon confirmation, perhaps with a "made with AI" disclaimer at the bottom of the text or something along those lines.
  "Hey, Siri: What's the weather in Los Angeles, California" would fall back to a web API endpoint.
  "Hey, Siri: How do I compile my C# application without Visual Studio" would provide step-by-step instructions on working with MSBuild.
  Different prompts would fall back on different APIs that only Apple would expose, obviously without allowing the user to gain root access to the system, which is what you would expect from Apple.
  I guess from a purely technical standpoint, you'd train two models, one "Safe" and the other "Unsafe". "Safe" is what would be used by the end user, allowing them to access safe data, APIs, messaging, web... you name it. "Unsafe" would be used internally at Apple and would have system-wide access, access to unlimited user data, root privileges, perhaps unsafe image generation and web search... basically no limit to what an LLM could achieve.
  