Comment by marcus_holmes

2 years ago

The actress did impersonate Her though.

It's not just a random "voice for your chatbot", it's that particularly breathy, chatty, voice that she performed for the movie.

I would agree with you completely if they'd created a completely different voice. Even if they'd impersonated a different famous actress. But it's the fact that Her was about an AI, and this is an AI, and the voices are identical. It's clearly an impersonation of her work.

> The actress did impersonate Her though.

Did she? The article claims that:

1. Multiple people agree that the casting call mentioned nothing about SJ/her

2. The voice actress claims she was not given instructions to imitate SJ/her

3. The actress's natural voice sounds identical to the AI-generated Sky voice

I don't personally think it's anywhere near "identical" to SJ's voice. It seems most likely to me that they noticed the similarity in concept afterwards and wanted to try to capitalize on it (hence later contacting SJ), as opposed to the other way around.

  • > I don't personally think it's anywhere near "identical" to SJ's voice. It seems most likely to me that they noticed the similarity in concept afterwards and wanted to try to capitalize on it (hence later contacting SJ), as opposed to the other way around.

    So your theory is that this was completely coincidental. But after the voice was recorded, they thought, "Wow, it sounds just like the voice of the computer in Her! We should contact that actress and capitalize on it!"

    That's what you're going with? It doesn't make sense to me.

    • Listen to the side-by-side comparisons. Sky has a deeper voice overall. In the GPT-4o demo, Sky displays a wider pitch range because the omni model is capable of emotional intonation. Her voice slides quite a bit while emoting but notably doesn't break, and when she returns to her normal speaking voice you can hear a very distinct rhotic sound, almost an over-pronounced American accent. She also has a tendency towards deepening into vocal fry, especially before pauses. I'd describe her voice as mostly in her chest when speaking clearly.

      Now listen to SJ's Samantha in Her, and the first thing you'll notice is the voice breaks: they break to a higher register with a distinct breathy sound that is clearly falsetto. SJ seems to have this habit in her normal speaking voice as well, but it's not as exaggerated and seems more accidental. Her voice is very much in her head or mask. The biggest commonality I can hear is that they both have a sibilant S and their regional accents are pretty close.

    • I was thinking someone thought "oh that sounds a fair bit like SJ in Her, if we can get SJ onboard, perhaps we can fine-tune what we got to sound like SJ in Her".

    • > But after the voice was recorded, they thought,

      ... that it would be even better to have a famous voice from Her than the rather generic female voice they had, but their proposal was declined. Well oops, but SJ, famous as she is, doesn't hold a copyright on all female voices other than her own.

  • No-one had to explicitly say any of that for it to still be an impersonation. Her was a very popular film, and Johansson's voice character was very compelling. They literally could have said nothing and just chosen the voice audition closest to Her unconsciously, because of the reach of the film, and that would still be an impersonation.

    • > They literally could have said nothing and just chosen the voice audition closest to Her unconsciously, because of the reach of the film, and that would still be an impersonation

      That's a very broad definition of impersonation, one that does not match the legal definition, and one that would be incredibly worrying for voice actors whose natural voice happens to fall within a radius of a celebrity's natural voice ("their choice to cast you was unconsciously affected by similarity to a celebrity, therefore [...]")

    • SJ's voice has some very distinctive characteristics and she has distinctive inflections that she applies. None of that inflection, tonality, or characteristics are present in the chat bot voice. Without those elements, it can be said to be a voice with vaguely similar pitch and accent, but any reasonable “impersonation” would at least attempt to copy the mannerisms and flairs of the voice they were trying to impersonate.

      Listening to them side by side, the OpenAI voice is more similar to Siri than to SJ. That Sam Altman clearly wanted SJ to do the voice acting is irrelevant, considering the timings and the voice differences.

      The phone call and tweet were awkward tho.

    • I have this sinking feeling that in this whole debate, anyone's position mostly depends on whether they think it's good that OpenAI exists or not.

  • It sounds more like Rashida Jones than SJ to me.

    I think part of this PR cycle is also the priming effect, where if you're primed to hear something and then listen, you do hear it.

  • Who’s making those claims, exactly? That will tell you a lot about their likely veracity.

    • First two claims are "according to interviews with multiple people involved in the process", direct quotes from the casting call flier, and "documents shared by OpenAI in response to questions from The Washington Post". Given the number of (non-OpenAI) people involved, I think it would be difficult to maintain a lie on these points. Third claim is a comparison carried out by The Washington Post.

  • This is why things are decided by juries. You may well truly believe this all seems unrelated and above board. But very few people will agree with you when presented with these facts, and it would be hard to find them during a jury selection.

  • > The actress's natural voice sounds identical to the AI-generated Sky voice

    No it doesn't.

    • > > The article claims that: [...]

      > > 3. The actress's natural voice sounds identical to the AI-generated Sky voice

      > No it doesn't.

      That's a verbatim quote from the article (albeit based on brief recordings).

      I haven't heard the anonymous voice actress's voice myself to corroborate WP's claim, but (unless there's information I'm unaware of) neither have you to claim the opposite.

  • Also Sam had a one-word tweet: “her.” So it looks like there was something going on.

    • It’s an obvious comparison to make for the technology; I don’t think it was meant as “it sounds like ScarJo”.

> The actress did impersonate Her though

This is unclear. What is clear is that OpenAI referenced Her in marketing it. That looks like it was a case of poor impulse control. But it's a basis for a claim.

  • > What is clear is OpenAI referenced Her in marketing it.

    Because they're building a voice-mediated AI, duh.

How do you explain the many people saying that the voices do not sound especially similar?

"The pitch is kiiiiiind of close, but that's about it. Different cadence, different levels of vocal fry, slightly different accent if you pay close attention. Johansson drops Ts for Ds pretty frequently, Sky pronounces Ts pretty sharply. A linguist could probably break it down better than me and identify the different regions involved."

https://old.reddit.com/r/singularity/comments/1cx24sy/vocal_...

There is also a faction claiming that Sky's voice is more similar to Rashida Jones's than Scarlett Johansson's:

https://old.reddit.com/r/ChatGPT/comments/1cx9t8b/vocal_comp...

  • Given the breadth and range of female voices available, this is way too close to be just a coincidence.

    • There are approximately 4 billion women in the world. Given that I know a few people who sound very similar to me, I would say that there are (subjectively) perhaps 1,000 to 10,000 different types of women's voices in the world.

      This would mean that a celebrity could possess a voice similar to 0.4 million to 4 million other women (4 billion divided by 10,000 and by 1,000, respectively), and potentially claim royalties if their voice is used.

    • You first said that Sky is "clearly an impersonation" of Johansson. Now you say that it's not a coincidence they chose Sky's voice actress. These are two different claims. It may not be a coincidence in the sense that they may have chosen Sky's actress because she sounds similar to Johansson. But that alone doesn't constitute an impersonation. Impersonation means deliberately assuming the identity of another person with the intent to deceive. So you'd have to demonstrate more than a degree of similarity to make that case.

In the Ford case (Midler v. Ford), they hired an impersonator to sing one of Bette Midler's copyrighted songs, so it's clearly an impersonation.

In OpenAI's case the voice only sounds like her (although many disagree), but it isn't repeating some famous line of dialog from one of her movies, etc., so you can't really definitively say it's impersonating SJ.

> it's that particularly breathy, chatty, voice that she performed for the movie.

Good luck proving that in court.

“Your Honor, our evidence is that the audio clips both sound breathy.”

  • I may be wrong, but I believe this case would be made to a jury, not to a judge.

    I think it would be hard to seat a jury that, after laying out the facts about the attempts to hire Johansson, and the tweet at the time of release, would have even one person credulous enough to be convinced this was all an honest mix-up.

    Which is why it will never in a million years go to a trial.

Can someone explain to me the outrage about mimicking a public person's voice, while half the people on Hacker News argue that it's fine to steal open source code? I fail to see the logic here. Why is this more important?