Comment by uxhacker

2 months ago

I have also been thinking recently about Jef Raskin’s book The Humane Interface. It feels increasingly relevant now.

Raskin was deeply concerned with how humans think in vague, associative, creative ways, while computers demand precision and predictability.

His goal was to humanize the machine through thoughtful interface design—minimizing modes, reducing cognitive load, and anticipating user intent.

What’s fascinating now is how AI changes the equation entirely. Instead of rigid systems requiring exact input, we now have tools that are themselves fuzzy and probabilistic.

I keep thinking that the gap Raskin was trying to bridge is closing—not just through interface, but through the architecture of the machine itself.

So AI makes Raskin’s vision more feasible than ever, but it also challenges his assumptions:

Does AI finally enable truly humane interfaces?

"no" .. intelligent appliance was the product that came out of Raskin's thinking..

I object to the framing of this question directly: there is no definition of "AI". Secondly, the humane interface is a genre that Jef Raskin shaped and re-thought over the years. A one-liner here definitely does not embody the works of Jef Raskin.

Off the top of my head, it appears that "AI" enables one-to-many broadcast, service interactions, and knowledge retrieval in a way that was not possible before. Jef Raskin's thinking was very much along the lines of an ordinary person using computers for their own purposes. "AI" in the supply-side format coming down the road appears to be headed towards societal interactions that depersonalize and segregate individual people. It is possible to engage "AI", whatever that means, as an appliance that enables individuals. But that is by no means certain at this time, IMHO.

> Does AI finally enable truly humane interfaces?

Perhaps, but I don't think we're going to see evidence of this for quite a while. It would be really cool if the computer adapted to how you naturally want to use it, though, without forcing you through an interface where you talk/type to it.

"Does AI finally enable truly humane interfaces?"

I think it does; LLMs in particular. AI also enables a ton of other things, many of them inhumane, which can make it very hard to discuss these things as people fixate on the inhumane. (Which is fair... but if you are BUILDING something, I think it's best to fixate on the humane so that you conjure THAT into being.)

I think Jef Raskin's goal with much of what he proposed was to connect the computer interface more directly with the user's intent. An application-oriented model organizes so much around the software company's intent and position, and that model follows us fully into (most of) today's interfaces.

A magical aspect of LLMs is that they can actually fully vertically integrate with intent. That doesn't mean every LLM interface exposes this or takes advantage of it (quite the contrary!), but it's _possible_, and it simply wasn't possible in the past.

For instance: you can create an LLM-powered piece of software that collects (and allows revision of) some overriding intent. It literally takes the user's stated intent and puts it in a slot in all subsequent prompts. This alone will have a substantial effect on the LLM's behavior! And importantly, you can ask for the user's intent, not just their specific goal. Maybe I want to build a shed, and I'm looking up some materials... the underlying goal can inform all kinds of things, like whether I'm looking for used or new materials, aesthetic or functional, etc.
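The intent-slot idea can be sketched in a few lines. This is a minimal illustration, not any particular product's design; the class and method names are made up, and the prompt wording is just one way to fill the slot:

```python
# Sketch of an "intent slot": the user's overriding intent is stored once,
# can be revised at any time, and is prepended to every prompt.

class IntentSession:
    def __init__(self, intent: str):
        self.intent = intent  # the user's stated overriding intent

    def revise_intent(self, new_intent: str) -> None:
        # The user can restate their intent whenever it changes.
        self.intent = new_intent

    def build_prompt(self, request: str) -> str:
        # Every request carries the intent, so the model can tailor its
        # answer to the larger goal, not just the immediate question.
        return (
            f"The user's overriding intent: {self.intent}\n\n"
            f"Current request: {request}"
        )

session = IntentSession(
    "Build a backyard shed cheaply, using secondhand materials where possible"
)
print(session.build_prompt("What lumber do I need for the framing?"))
```

Whatever model you send the resulting prompt to, the stated intent now colors the answer: a question about lumber gets answered with the cheap, secondhand framing of the larger goal in mind.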

To accomplish something with a computer we often thread together many different tools. Each is generally defined by its function (photo album, email client, browser-that-contains-other-things, and so on). It's up to the human to figure out how to assemble them, and at each step it's easy to lose one's place, to become distracted or confused, to lose track of context. Again, an LLM can engage with the larger task in a way that wasn't possible before.

  • Tell me, how does doing any of the things you've suggested help with the huge range of computer-driven tasks that have nothing to do with language? Video editing, audio editing, music composition, architectural and mechanical design, the list is vast and nearly endless.

    LLMs have no role to play in any of that, because their job is text generation. At best, they could generate excerpts from a half-imagined user manual ...

    • Because some LLMs are now multimodal—they can process and generate not just text, but also sound and visuals. In other words, they’re beginning to handle a broader range of human inputs and outputs, much like we do.


    • Everything has to do with language! Language is a way of stating intention, of expressing something before it exists, of talking about goals and criteria. Every example you give can be described in language. You are caught up in the mechanisms of these tools, not the underlying intention.

      You can describe your intention in any of these tools. And it can be whatever you want... maybe your intention in an audio editor is "I need to finish this before the deadline in the morning but I have no idea what the client wants" and that's valid, that's something an LLM can actually work with.

      HOW the LLM is involved is an open question, something that hasn't been done very well, and may not work well when applied to existing applications. But an LLM can make sense of events and images in addition to natural language text. You can give an LLM a timestamped list of UI events and it can actually infer quite a bit about what the user is actually doing. What does it do with that understanding? We're going to have to figure that out! These are exciting times!

    • What if you could pilot your video editing tool through voice? Have a multimodal LLM convert your instructions into structured-data instructions that the editor uses to perform actions.

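The voice-piloting idea above boils down to the LLM emitting structured data that the editor then dispatches. A minimal sketch, where the command schema (`cut`, `fade_out`, the parameter names) is entirely hypothetical and the `llm_output` string stands in for what a multimodal model might actually return:

```python
import json
from dataclasses import dataclass

# Hypothetical structured output an LLM might emit for the spoken
# instruction "cut the clip from 12 to 20 seconds, then fade out".
llm_output = """
{"actions": [
    {"op": "cut", "start_s": 12.0, "end_s": 20.0},
    {"op": "fade_out", "duration_s": 2.0}
]}
"""

@dataclass
class Action:
    op: str       # the editor operation to perform
    params: dict  # its parameters, straight from the JSON

def parse_actions(raw: str) -> list[Action]:
    # Validate and convert the model's JSON into typed actions.
    data = json.loads(raw)
    return [Action(op=a.pop("op"), params=a) for a in data["actions"]]

for action in parse_actions(llm_output):
    # A real editor would dispatch each action to its timeline engine;
    # here we just show what would be dispatched.
    print(action.op, action.params)
```

The point is that the LLM never touches the timeline directly: it translates fuzzy human speech into a narrow, verifiable command vocabulary, and the deterministic editor does the rest.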

> Does AI finally enable truly humane interfaces?

This is something I keep turning over in my head. The multimodal capabilities of frontier models right now are fantastic. Rather than being locked into a desktop with peripherals or hunching over a tiny screen and tapping with thumbs, we finally have an easy way to create apps that interact "natively" through audio. We can finally try to decipher a user's intent rather than forcing the user to interact through an interface designed to provide precise inputs to an algorithm. I'm excited to see what we build with these things.