Comment by ericmcer

3 days ago

People want what LLMs can do; they just don't want "AI". I think having it pervade products in a much more subtle way is the future, though.

For example, if you close a youtube browser tab with a comment half written it will pop up an `alert("You will lose your comment if you close this window")`. It does this if the comment is a 2 page essay or "asdfasdf". Ideally the alert would only happen if the comment seemed important but it would readily discard short or nonsensical input. That is really difficult to do in traditional software but is something an LLM could do with low effort. The end result is I only have to deal with that annoying popup when I really am glad it is there.

That is a trivial example but you can imagine how a locally run LLM that was just part of the SDK/API developers could leverage would lead to better UI/UX. For now everyone is making the LLM the product, but once we start building products with an LLM as a background tool it will be great.

It is actually a really weird time. My whole career we wanted to obfuscate implementation and present a clean UI to end users; we wanted them peeking behind the curtain as little as possible. Now everything is like "This is built with AI! This uses AI!".

> if you close a youtube browser tab with a comment half written it will pop up an `alert("You will lose your comment if you close this window")`. It does this if the comment is a 2 page essay or "asdfasdf". Ideally the alert would only happen if the comment seemed important but it would readily discard short or nonsensical input. That is really difficult to do in traditional software but is something an LLM could do with low effort.

I don't think that's a great example, because you can evaluate the length of the content of a text box with a one-line "if" statement. You could even expand it to check for how long you've been writing, and cache the contents of the box with a couple more lines of code.

An LLM, by contrast, requires a significant amount of disk space and processing power for this task, and it would be unpredictable and difficult to debug, even if we could define a threshold for "important"!
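To make the comparison concrete, here is a minimal sketch of the non-LLM approach described above: a length check, a time-spent check, and a draft cache. The names `shouldWarn`, `saveDraft`, and `restoreDraft` and both thresholds are hypothetical, not anything YouTube is known to use.

```javascript
// Hypothetical thresholds; tune them for the product.
const MIN_CHARS = 100;
const MIN_TYPING_MS = 30000;

// Warn only when the draft is long or the user spent a while writing it.
function shouldWarn(text, typingMs) {
  return text.trim().length > MIN_CHARS || typingMs > MIN_TYPING_MS;
}

// Cache the draft so it can be restored after an accidental close.
// `storage` is anything with setItem/getItem, e.g. window.localStorage.
function saveDraft(storage, key, text) {
  storage.setItem(key, text);
}

// Return the cached draft, or an empty string when nothing was saved.
function restoreDraft(storage, key) {
  const saved = storage.getItem(key);
  return saved === null ? "" : saved;
}
```

In a browser, `shouldWarn` would be called from a `beforeunload` listener and `storage` could be `window.localStorage`; the wiring is left out so the logic stays testable on its own.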

  • I think it's an excellent example to be honest. Most of the time whenever someone proposes some use case for a large language model that's not just being a chat bot, it's either a bad idea, or a decent idea that you'd do much better with something much less fancy (like this, where you'd obviously prefer some length threshold) than with a large language model. It's wild how often I've heard people say "we should have an AI do X" when X is something that's very obviously either a terrible idea or best suited for traditional algorithms.

    Sort of like how most of the time when people proposed a non-cryptocurrency use for "blockchain", they had either re-invented Git or re-invented the database. The similarity to how people treat "AI" is uncanny.

    • > It's wild how often I've heard people say "we should have an AI do X" when X is something that's very obviously either a terrible idea or best suited for traditional algorithms.

      Likewise when smartphones were new, everyone and their mother was certain that random niche thing that made no sense as an app would be a perfect app and that if they could just get someone to make the app they’d be rich. (And of course ideally, the idea haver of the misguided idea would get the lion’s share of the riches, and the programmer would get a slice of pizza and perhaps a percentage or two of ownership if the idea haver was extra generous.)

  • I don't want every interaction or partial thought sent to OpenAI or Anthropic for them to determine if it was "important" or not. That sounds terrifying and dystopic.

  • The difference between "AI" and "linear regression" is whether you are talking to a VC or an engineer.

> Ideally the alert would only happen if the comment seemed important but it would readily discard short or nonsensical input. That is really difficult to do in traditional software but is something an LLM could do with low effort.

I read this post yesterday and this specific example kept coming back to me because something about it just didn't sit right. And I finally figured it out: Glancing at the alert box (or the browser-provided "do you want to navigate away from this page" modal) and considering the text that I had entered takes... less than 5 seconds.

Sure, 5 seconds here and there adds up over the course of a day, but I really feel like this example is grasping at straws.

  • It’s also trivially solvable with, idk, a length check, or any number of other things which don’t need 100B parameters to calculate.

    • This was a problem at my last job. Boss kept suggesting shoving AI into features, and I kept pointing out we could make the features better with less effort using simple heuristics in a few lines of code, and skip adding AI altogether.

      So much of it nowadays is like the blockchain craze, trying to use it as a solution for every problem until it sticks.

  • The problem isn't so much the five seconds, it is the muscle memory. You become accustomed to blindly hitting "Yes" every time you've accidentally typed something into the text box, and then that time when you actually put a lot of effort into something... Boom. It's gone. I have been bitten before. Something like the parent described would be a huge improvement.

    Granted, it seems the even better UX is to save what the user inputs and let them recover if they lost something important. That would also help for other things, like crashes, which have also burned me in the past. But tradeoffs, as always.

    • > You become accustomed to blindly hitting "Yes" every time you've accidentally typed something into the text box, and then that time when you actually put a lot of effort into something... Boom. It's gone.

      Wouldn't you just hit undo? Yeah, it's a bit obnoxious that Chrome for example uses cmd-shift-T to undo in this case instead of the application-wide undo stack, but I feel like the focus for improving software resilience to user error should continue to be on increasing the power of the undo stack (like it's been for more than 30 years so far), not trying to optimize what gets put in the undo stack in the first place.

    • Which is fine! That's me making the explicit choice that yes, I want to close this box and yes, I want to lose this data. I don't need an AI evaluating how important it thinks I am and second guessing my judgement call.

      I tell the computer what to do, not the other way around.

    • > You become accustomed to blindly hitting "Yes" every time you've accidentally typed something into the text box, and then that time when you actually put a lot of effort into something... Boom. It's gone.

      I'm not sure we need even local AIs reading everything we do for what amounts to a skill issue.

> Ideally the alert would only happen if the comment seemed important but it would readily discard short or nonsensical input.

That doesn't sound ideal at all. And in fact highlights what's wrong with AI product development nowadays.

AI as a tool is wildly popular. Almost everyone in the world uses ChatGPT or knows someone who does. Here's the thing about tools - you use them in a predictable way and they give you a predictable result. I ask a question, I get an answer. The thing doesn't randomly interject when I'm doing other things and I asked it nothing. I swing a hammer, it drives a nail. The hammer doesn't decide that the thing it's swinging at is vaguely thumb-shaped and self-destruct.

Too many product managers nowadays want AI to not just be a tool, they want it to be magic. But magic is distracting, and unpredictable, and frequently gets things wrong because it doesn't understand the human's intent. That's why people mostly find AI integrations confusing and aggravating, despite the popularity of AI-as-a-tool.

  • > The hammer doesn't decide that the thing it's swinging at is vaguely thumb-shaped and self-destruct.

    Sawstop literally patented this and made millions and seems to have genuinely improved the world.

    I personally am a big fan of tools that make it hard to mangle my body parts.

  • But... A lot of stuff you rely on now was probably once distracting and unpredictable. There are a ton of subtle UX behaviors a modern computer is doing that you don't notice, but if they all disappeared and you had to use Windows 95 for a week you would miss them.

    That is more what I am advocating for: subtle background UX improvements based on an LLM's ability to interpret a user's intent. We had limited abilities to look at an application's state and try to determine a user's intent, but it is easier to do that with an LLM. Yeah, like you point out, some users don't want you to try and predict their intent, but if you can do it accurately a high percentage of the time it is "magic".

    • > subtle UX behaviors

      I'd wager it's more likely to be the opposite.

      Older UIs were built on solid research. They had a ton of subtle UX behaviors that users didn't notice were there, but helped in minor ways. Modern UIs have a tendency to throw out previous learning and to be fashion-first. I've seen this talked about on HN a fair bit lately.

      Using an old-fashioned interface, with 3D buttons to make interactive elements clear, and with instant feedback, can be a nicer experience than having to work with the lack of clarity, and relative lagginess, of some of today's interfaces.

    • Serious question: what are those things from windows 95/98 I might miss?

      Rose tinted glasses perhaps, but I remember it as a very straightforward and consistent UI that provided great feedback, was snappy and did everything I needed. Up to and including little hints for power users like underlining shortcut letters for the & key.

    • I remember seeing one of those "kids use old technology" videos, where kids are confused by rotary phones and the like.

      One of the episodes had them using Windows 98. As I recall, the reaction was more or less "this is pretty ok, actually". A few WTFs about dialup modems and such, but I don't recall complaints about the UI.

    • > But... A lot of stuff you rely on now was probably once distracting and unpredictable.

      And nobody relied on them when they were distracting and unpredictable. People only rely on them now because they are not.

      LLMs won't ever be predictable. They are designed not to be. A predictable AI is something different from an LLM.

    • > There are a ton of subtle UX behaviors a modern computer is doing that you don't notice, but if they all disappeared and you had to use Windows 95 for a week you would miss.

      Like what? All those popups screaming that my PC is unprotected because I turned off Windows Firewall?

  • I want magic that works. Sometimes I want a tool to interrupt me! I know my route to work so I'm not going to ask how I should get there today, but 1% of the time there is something wrong with my plan (accident, construction...) and I want the tool to say something. I know I need to turn right to get someplace, but sometimes as a human I'll say left instead, confusing me and the driver when they don't turn right; an AI that realizes who made the mistake would help.

    The hard part is the AI needs to be correct when it does something unexpected. I don't know if this is a solvable problem, but it is what I want.

    • Magic in real life never works 100% of the time. It is all an illusion where some observers understand the trick and others do not. Those that understand it have the potential to break the magic. Even the magician has the ability to fault the trick.

      I want reproducibility not magic.

>For example, if you close a youtube browser tab with a comment half written it will pop up an `alert("You will lose your comment if you close this window")`. It does this if the comment is a 2 page essay or "asdfasdf". Ideally the alert would only happen if the comment seemed important but it would readily discard short or nonsensical input. That is really difficult to do in traditional software but is something an LLM could do with low effort. The end result is I only have to deal with that annoying popup when I really am glad it is there.

The funny thing is that this exact example could also be used by AI skeptics. It's forcing an LLM into a product with questionable utility, causing it to cost more to develop, be more resource intensive to run, and behave in a manner that isn't consistent or reliable. Meanwhile, if there was an incentive to tweak that alert based off likelihood of its usefulness, there could have always just been a check on the length of the text. Suggesting this should be done with an LLM as your specific example is evidence that LLMs are solutions looking for problems.

  • I've been totally AI-pilled because I don't see why that's of questionable utility. How is a regexp going to tell the difference between "asdffghjjk" and "So, she cheated on me"? A mere byte count isn't going to do it either.

    If the computer can tell the difference and be less annoying, it seems useful to me?

    • Who said anything about regexp? I was literally talking about something as simple as "if(text.length > 100)". Also the example provided was distinguishing "a 2 page essay or 'asdfasdf'" which clearly can be accomplished with length much easier than either an LLM or even regexp.

      We should keep in mind that we're trying to optimize for user's time. "So, she cheated on me" takes less than a second to type. It would probably take the user longer to respond to whatever pop up warning you give than just retyping that text again. So what actual value do you think the LLM is contributing here that justifies the added complexity and overhead?

      Plus that benefit needs to overcome the other undesired behavior that an LLM would introduce such as it will now present an unnecessary popup if people enter a little real data and intentionally navigate away from the page (and it should be noted, users will almost certainly be much more likely to intentionally navigate away than accidentally navigate away). LLMs also aren't deterministic. If 90% of the time you navigate away from the page with text entered, the LLM warns you, then 10% of the time it doesn't, those 10% times are going to be a lot more frustrating than if the length check just warned you every single time. And from a user satisfaction perspective, it seems like a mistake to swap frustration caused by user mistakes (accidentally navigating away) with frustration caused by your design decisions (inconsistent behavior). Even if all those numbers end up falling exactly the right way to slightly make the users less frustrated overall, you're still trading users who were previously frustrated at themselves for users being frustrated at you. That seems like a bad business decision.

      Like I said, this all just seems like a solution in search of a problem.

    • Because in _what world_ do I want the computer making value judgements on what I do?

      If I want to close the tab of unsubmitted comment text, I will. I most certainly don’t need a model going “uhmmm akshually, I think you might want that later!”

    • Because the computer behaving differently in different circumstances is annoying, especially when there's no clear cue to the user what the hidden knobs that control the circumstances are.

    • What about counting words based on user's current lang, and prompting off that?

      Close enough for the issue to me and can't be more expensive than asking an LLM?

    • We went from the bullshit "internet of things" to "LLM of things", or as Sheldon from Big Bang Theory put it "everything is better with Bluetooth".

      Literally "T-shirt with Bluetooth", that's what 99.98% of "AI" stickers today advertise.
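The word-count idea suggested a few replies up can be sketched with the built-in `Intl.Segmenter`, which splits text using the word-breaking rules of a given locale. The function names and the `MIN_WORDS` threshold are assumptions for illustration. Note that it measures how much was written, not whether it is meaningful: a run of letters with no spaces should simply count as one word.

```javascript
// Count word-like segments using the locale's word-breaking rules.
// Punctuation and whitespace segments are skipped via `isWordLike`.
function countWords(text, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  let count = 0;
  for (const part of segmenter.segment(text)) {
    if (part.isWordLike) count += 1;
  }
  return count;
}

// Hypothetical threshold: prompt only for drafts with at least this many words.
const MIN_WORDS = 20;

function shouldPrompt(text, locale) {
  return countWords(text, locale) >= MIN_WORDS;
}
```

In a page, the locale could come from `navigator.language` or the document's `lang` attribute; which one better reflects the language the user is typing in is a judgment call.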

> Ideally the alert would only happen if the comment seemed important but it would readily discard short or nonsensical input

No, ideally I would be able to predict and understand how my UI behaves, and train muscle memory.

If closing a tab would mean losing valuable data, the ideal UI would allow me to undo it, not try to guess if I cared.

  • Yeah. It's the Apple OS model (we know what's right for you, this is the right way) vs the many other customisable OSes where it conforms to you.

YouTube could use AI to not recommend videos I've already watched, which is apparently a really hard problem.

  • The problem is the people like me who DO rewatch youtube videos. There are a bunch of "Comfort food" videos I turn to sometimes. Like you would rewatch a movie you really enjoy.

    But that's the real problem. You can't just average everyone and apply that result to anyone. The "average of everyone" fits exactly NO ONE.

    The US Air Force figured this out long ago, in a famous anecdote. They wanted to fit a cockpit to the "average" pilot, took a shitload of measurements of a lot of airmen, and it ended up nobody fit.

    The actual solution was customization and accommodations.

  • It just might be that a lot of users watch the same videos multiple times. They must have some data on this and see that recommending the same videos gets more views than recommending new ones.

    • Is there a way to tell if people are seeking out the same video or if they are watching it because it was suggested? Especially when 90% of the recommendations are repeats?

      There isn't even an "I've watched this" or "don't suggest this video anymore" option. You can only say "I'm not interested", which I don't want to do because it seems like it will downrank the entire channel.

      Even if that is the case, I rarely watch the same video, so the recommendation engine should be able to pick that up.

  • My favorite is the new thing where they recommend a "members only" video, from a creator that covers current events, and the video is 2 years old.

  • Try disabling the collection of history about the videos you've watched in YouTube settings. There are still some recommendations after that, but they are less cringe.

At my current work much of our software stack is based on GOFAI techniques. Except no one calls them AI anymore, they call it a "rules engine". Rules engines, like LLMs, used to be sold standalone and promoted as miracle workers in and of themselves. We called them "expert systems" then; this term has largely faded from use.

This AI summer is really kind of a replay of the last AI summer. In a recent story about expert systems seen here on Hacker News, there was even a description of Gary Kildall from The Computer Chronicles expressing skepticism about AI that parallels modern-day AI skepticism. LLMs and CNNs will, as you describe, settle into certain applications where they'll be profoundly useful, become embedded in other software as techniques rather than an application in and of themselves... and then we won't call them AI. Winter is coming.

  • Yeah, the problem with the term "AI" is that it's far too general to be useful.

    I've seen people argue that the goalposts keep moving with respect to whether or not something is considered AI, but that's because you can argue that a lot of things computers do are artificial intelligence. Once something becomes commonplace and well understood, it's not useful to communicate about it as AI.

    I don't think the term AI will "stick" to a given technology until AGI (or something close to it).

You know what that reminds me very much of? That email client thing that asks you "did you forget to add an attachment?". That's been there for 3 decades (if not longer) before LLMs were a thing, so I'll pass on it and keep waiting for that truly amazing LLM-enabled capability that we couldn't dream of before. Any minute, now.
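For reference, the pre-LLM version of that check is usually described as a keyword heuristic. A rough sketch, where the pattern and the function name are assumptions rather than any particular mail client's implementation:

```javascript
// Hypothetical keyword pattern; a real client would likely use a longer,
// localized list. No `g` flag, so `.test` keeps no state between calls.
const ATTACHMENT_HINTS = /\b(attach(ed|ment|ments|ing)?|enclosed)\b/i;

// Flag a message that talks about an attachment but carries none.
function looksLikeMissingAttachment(body, attachmentCount) {
  return attachmentCount === 0 && ATTACHMENT_HINTS.test(body);
}
```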

Using such an expensive technology to prevent someone from making a stupid mistake on a meaningless endeavor seems like a complete waste of time. Users should just be allowed to fail.

  • Amen! This is part of the overall societal decline of no failing for anyone. You gotta feel the pain to get the growth.

  • If someone from 1960 saw the quadrillions of CPU cycles we are wasting on absolutely nothing every second, they would have an aneurysm.

    • As someone from 1969, but with an excellent circulatory system, I just roll my eyes and look forward to the sound of bubbles bursting whilst billionaires weep.

> readily discard short or nonsensical input

When "asdfasdf" is actually a package name, and it's in reply to a request for an NPM package, and the question is formulated in a way that makes it hard for LLMs to make that connection, you will get a false positive.

I imagine this will happen more than not.

So, like, machine learning. Remember when people used to call it AI/ML? Definitely wasn't as much money being spent on it back then.

> The end result is I only have to deal with that annoying popup when I really am glad it is there.

Are you sure about that? It will trigger only for what the LLM declares important, not what you care about.

Is anyone delivering local LLMs that can actually be trained on your data? Or just pre made models for the lowest common denominator?

> For example, if you close a youtube browser tab with a comment half written it will pop up an `alert("You will lose your comment if you close this window")`. It does this if the comment is a 2 page essay or "asdfasdf". Ideally the alert would only happen if the comment seemed important but it would readily discard short or nonsensical input. That is really difficult to do in traditional software but is something an LLM could do with low effort.

I agree this would be a great use of LLMs! However, it would have to be really low latency, like on the order of milliseconds. I don't think the tech is there yet, although maybe it will be soon-ish.

It’s because “AI” isn’t a feature. “AI” without context is meaningless.

Google isn’t running ads on TV for Google Docs touting that it uses conflict-free replicated data types, or whatever, because (almost entirely) no one cares. Most people care the same amount about “AI” too.

Would that be ideal though? Adding enormous complexity to solve a trivial problem which would work I'm sure 99.999% of the time, but not 100% of the time.

Ideally, in my view, the browser asks you if you are sure regardless of content.

I use LLMs, but that browser "are you sure" type of integration is adding a massive amount of work to do something that ultimately isn't useful in any real way.

I want AI to do useful stuff. Like comb through eBay auctions or Cars.com. Find the exact thing I want. Look at things in photos, descriptions, etc

I don't think an NPU has that capability.

You don't need an LLM for that; a simple Markov chain can solve that with a much smaller footprint.

No. No-no-no-no-no. I want predictability. I don't want a black box with no tuning handles and no awareness of the context to randomly change the behavior of my environment.

  • I’ve seen some thoroughly unhinged suggestions floating around the web for a UI/UX that is wholly generated and continuously adjusted by an LLM and I struggle to imagine a more nightmarish computing experience.

Bingo. Nobody uses ChatGPT because it's AI. They use it because it does their homework, or it helps them write emails, or whatever else. The story can't just be "AI PC." It has to be "hey look, it's ChatGPT but you don't have to pay a subscription fee."

Hopefully, you could make a browser extension to detect whether an HTML form has unsaved changes; it should not require AI or an LLM. (This will work better if the document does not include JavaScript, but it is possible to make it work with JavaScript too.)
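One way such a check might look, as a sketch: text inputs and textareas expose both `value` (current contents) and `defaultValue` (what the page loaded with), so comparing the two detects edits without any model. The function names here are hypothetical.

```javascript
// True when a text field's current value differs from the value it loaded with.
// In the DOM, <input> and <textarea> expose both `value` and `defaultValue`.
function isDirty(field) {
  return field.value !== field.defaultValue;
}

// Accepts any iterable of field-like objects, e.g. a NodeList.
function hasUnsavedChanges(fields) {
  return Array.from(fields).some(isDirty);
}
```

In an extension's content script this could be called as `hasUnsavedChanges(document.querySelectorAll("input[type=text], textarea"))`. Checkboxes and selects would need `defaultChecked` and `defaultSelected` instead, and pages whose scripts set `value` themselves can produce false positives, which fits the caveat about JavaScript above.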

I want a functioning search engine. Keep your goofy opinionated mostly wrong LLM out of my way, please.