Comment by Marsymars

2 days ago

> I'm not a particularly slow typer. I can go 70-90 WPM on a typing test. However, this speed drops quickly when I need to also think about what I'm saying. Typing that fast is also kinda tiring, whereas talking/thinking at 100-120 WPM feels comfortable.

This doesn't feel relatable at all to me. If my writing speed is bottlenecked by thinking about what I'm writing, and my talking speed is significantly faster, that just means I've removed the bottleneck by not thinking about what I'm saying.

It's often better to segregate creative and inhibitive systems even if you need the inhibitive systems to produce a finished work. There's a (probably apocryphal) conversation between George RR Martin and Stephen King that goes something like:

GRRM: How do you write so many books?... Don't you ever spend hours staring at the page, agonizing over which of two words to use, and asking 'am I actually any good at this?'

SK: Of course! But not when I'm writing.

That's fair. I sometimes find myself pausing or just talking in circles as I'm deciding what I want. I think when I'm speaking, I feel freer to use less precise/formal descriptions, but the model can still correctly interpret the technical meaning.

In either case, different strokes for different folks, and what ultimately matters is whether you get good results. I think the upside is high, so I broadly suggest people try it out.

Alternatively: some people are just better at / more comfortable thinking in auditory mode than visual mode & vice versa.

In principle I don't see why they should have different amounts of thought. That'd be bounded by how much time it takes to produce the message, I think. Typing permits backtracking via editing, but speaking permits 'semantic backtracking' which isn't equivalent but definitely can do similar things. Language is powerful.

And importantly, to backtrack in visual media I tend to need to re-saccade through the text with physical eye motions, whereas with audio my brain just has an internal buffer I can access at the speed of thought.

Typed messages might have higher _density_ of thought per token, though how valuable is that really, in LLM contexts? There are diminishing returns on how perfect you can get a prompt.

Also, audio permits a higher bandwidth mode: one can scan and speak at the same time.

It's kind of the point. If you start writing it, you'll start correcting it and moving things around and adding context and fiddling and more and more.

And your 5 minute prompt just turned into 1/2 hour of typing.

With voice you get on with it, and then start iterating, getting Claude to plan with you.

Not been impressed with agentic coding myself so far, but I did notice that using voice works a lot better imo, keeping me focused on getting on with letting the agent do the work.

I've also found it good for stopping me doing the same thing in Slack messages. I ramble my general essay to ChatGPT/Claude and get them to summarize and rewrite it as a few lines in my own voice. Stops me spending an hour crafting a Slack message and tends to soften it.

I prefer writing myself, but I could see the appeal of producing a first draft of a prompt by dumping a verbal stream of consciousness into ChatGPT. That might actually be kind of fun to try while going on a walk or something.

I don’t feel restricted by my typing speed; speaking is just so much easier and more convenient. The vast majority of my ChatGPT usage is on my phone and that makes s2t a no-brainer.