Comment by numpad0

10 days ago

Isn't that a bit much for ASR models? Humans can't handle a simultaneous multilingual dictation task either; I have to stop and reinitialize my ears before switching between English and my primary language.

In South Asia, it's quite common for people to speak a combination of their local language and English: not just alternating sentences between the two languages, but constructing individual sentences out of phrases from both.

"Madam, please believe me, maine homework kiya ha" [I did my homework].

  • This is common in the southwestern part of the US too. My partner and the friends she grew up with will have conversations that fluidly pick phrases and vocab from either Spanish or English, depending on which words happen to be the easiest to pull from their brains. It's wild to listen to.

    • Aren't those limited to specific words or phrases in specific forms? I doubt it works for arbitrary half-sentences.

Seems like the capability is already in the model somewhere, though - see my reply to clarionbell.

Isn't that exactly what interpreters do?

  • If they're anything like me, they seem to coordinate constant staggered resets of sub-systems in the language-processing pipeline while keeping internal representations of the input in a half-text state, so that the input comes back out through the pipeline in the other language's configuration.

    That's anecdotally how my own brain appears to work, so it could be different from how interpreters or actual human brains work, but as far as I can tell, professional simultaneous interpreters don't seem to be agnostic about the pair of languages involved at all.