Comment by mmasu

20 hours ago

We recently tried to build something similar for outbound calls (simple reminders to partners) and ran into massive issues using gpt-4o-realtime-audio: noise detection, turn detection, random telephony issues (we were using Twilio too), the prompt not holding together, and more.

We dropped the project because it would have resulted in a terrible experience for the person on the other end of the phone. Building these things is nontrivial.

The plan was to A/B test it and measure the response (watching NPS and uplift in business metrics). Human handoff was always part of the plan in case things got too tricky for the LLM to handle.

I see some hostility here towards this project, and while I share many of the concerns, it is very naive to think that these services won’t be massively leveraged going forward. An AI agent can handle things as well as humans (not in our case, but there are good services out there, e.g. Parloa), and the key elements are the same as in all other agentic workflows:

- narrow use cases

- human in the loop ready to pick up/steer/correct

We will see a lot more of this, and as LLM capabilities improve it will only get better - it is inevitable at this point and might (_might_) result in a better experience for customers in some cases.

Nevertheless, I also see the possibility that we will go full circle and always reach for a human, maybe showing up in person at a physical office to make sure cases or requests are handled well… or not :-)

I've read this comment twice and I genuinely can't understand it.

Uh, so your own attempt at a similar project didn't work and was a terrible experience and the fundamentals of the system are specific and still require babysitting. But it's inevitable (???) that it'll get better... and this improvement only MIGHT make things better for people, only some of the time?

I'm not alone in being unimpressed by this, right? Nothing about what was written here sounds... good? Even the most optimistic part is "well, maybe it might be good, sometimes". Like, this sucks. This is a bad system that doesn't work and makes things worse.

  • What I mean is: building these systems is nontrivial, but if done well it can help. Imagine not being stuck in an endless phone queue when you need to do a simple task through a customer service center, or getting a phone reminder with more information and less noise than a written notification. The fact that I failed at it (for lack of experience and resources) does not mean it should be shrugged off as useless or impractical. Some companies offer this service and it works just fine for narrow use cases.