Comment by concinds

15 hours ago

The second half of your comment is a go-to-market concern but doesn't feel so relevant for a research prototype. It could be done with a private local model too, maybe not by Google.

But I don't think the voice problem is surmountable. I closed their image editing demo when I saw it required a mic.

It would be appealing as a Spotlight-like text pop-up interface where you type instructions, which would work in social/office environments, but that might only appeal to power users.

This will sound like another brick in the paved road to dystopia, but I'm kinda bullish on equipment that can recognize subvocalization. Or at least let me have a small drawing tablet with a stylus (think Etch A Sketch or Wacom Intuos), because at this point I'd rather practice writing and do away with typing altogether (even though I enjoy typing for typing's sake via MonkeyType).

  • I've been dreaming about that for 20 years. And then use it for people to communicate while sleeping.

Yeah, I think there could be something to integrating AI into the operating system so that it can handle things going on in different applications, the same way you can already copy and paste between them.

But if it's going to require phoning home to Google/OpenAI/whoever, then forget it. I don't want one of these companies to have a constant connection to my OS.

It seems that if we ultimately want to "move at the speed of thought," it will require speech.

  • > It seems that if we ultimately want to "move at the speed of thought," it will require speech.

    Except for the large majority of people who read, type, and click way faster than they can talk. Especially for visual things, it’s way faster to drag a rectangle than to describe what you want.

    A lot of us also aren’t linear verbal thinkers. It would take minutes to hours to verbalize concepts we can grasp visually/schematically in seconds.

    Great book on the topic: https://www.goodreads.com/book/show/60149558-visual-thinking

    • Most people speak at about 150 wpm, but very few can type that fast. Reading and gesturing are fast, though, which is what TFA is about: combining reading and gesturing with speech.

  • There's the adage that writing is thinking, but even more accurately, at least for me, editing is thinking.

    Neither typing speed nor dictation speed is a true bottleneck, but editing speech seems like it'd be harder than editing text.

    Though there may be some hybrid approach that can work well.

    • > editing is thinking.

      I hadn’t realized until just now how accurate that is for me as well. Thank you.