Comment by oceanplexian

10 months ago

A lot of people assume that AI naturally produces this predictable style of writing, but as someone who has dabbled in training a number of fine-tunes, that's absolutely not the case.

You can improve things with prompting but can also fine tune them to be completely human. The fun part is that it doesn't just apply to text; you can also do it with image generation, as with Boring Reality (https://civitai.com/models/310571/boring-reality) (Warning: there is a lot of NSFW content on Civit if you click around).

My pet theory is the BigCo's are walking a tightrope of model safety and are intentionally incorporating some uncanny valley into their products, since if people really knew that AI could "talk like Pete" they would get uneasy. The cognitive dissonance doesn't kick in when a bot talks like a drone from HR instead of a real person.


palsecam 10 months ago

> My pet theory is the BigCo's are walking a tightrope of model safety and are intentionally incorporating some uncanny valley into their products, since if people really knew that AI could "talk like Pete" they would get uneasy. The cognitive dissonance doesn't kick in when a bot talks like a drone from HR instead of a real person.

FTR, Bruce Schneier (famed cryptologist) is advocating for such an approach:

> We have a simple proposal: all talking AIs and robots should use a ring modulator. In the mid-twentieth century, before it was easy to create actual robotic-sounding speech synthetically, ring modulators were used to make actors’ voices sound robotic. Over the last few decades, we have become accustomed to robotic voices, simply because text-to-speech systems were good enough to produce intelligible speech that was not human-like in its sound. Now we can use that same technology to make robotic speech that is indistinguishable from human sound robotic again.

Source: https://www.schneier.com/blog/archives/2025/02/ais-and-robot...

MichaelDickens 10 months ago

Reminds me of the robot voice from The Incredibles[1]. It had an obviously-robotic cadence where it would pause between every word. Text-to-speech at the time already knew how to make words flow into each other, but I thought the voice from The Incredibles sounded much nicer than the contemporaneous text-to-speech bots, while also still sounding robotic.
[1] https://www.youtube.com/watch?v=_dxV4BvyV2w

momojo 10 months ago

Like adding the 'propane smell' to propane.

nyanpasu64 10 months ago

That doesn't sound like ring modulation in a musical sense (IIRC it has a modulator above 30 Hz, or inverts the signal instead of attenuating?), so much as crackling, cutting in and out, or an overdone tremolo effect. I checked in Audacity and the signal only gets cut out, not inverted.
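
To make the inverted-vs-attenuated distinction concrete, here is a rough Python sketch. It is a toy, not an analysis of the clip: the 8 kHz sample rate, the 220 Hz test tone, the 50 Hz carrier, the 8 Hz tremolo rate, and all function names are made-up assumptions for illustration. It only shows that multiplying by a bipolar carrier flips the signal's polarity part of the time, while a tremolo-style gain between 0 and 1 can only make it quieter.

```python
import math

SAMPLE_RATE = 8000  # Hz; arbitrary value chosen for this sketch


def sine(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Return n_samples of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n_samples)]


def ring_modulate(signal, carrier_hz, sample_rate=SAMPLE_RATE):
    """Multiply by a bipolar carrier in [-1, 1]; the gain goes negative, so polarity flips."""
    carrier = sine(carrier_hz, len(signal), sample_rate)
    return [s * c for s, c in zip(signal, carrier)]


def tremolo(signal, rate_hz, depth=1.0, sample_rate=SAMPLE_RATE):
    """Multiply by a unipolar gain in [1 - depth, 1]; the signal is only attenuated."""
    lfo = sine(rate_hz, len(signal), sample_rate)
    return [s * (1.0 - depth * 0.5 * (1.0 + l)) for s, l in zip(signal, lfo)]


def polarity_flips(original, processed, eps=1e-9):
    """Count samples whose sign is opposite to the original's."""
    return sum(1 for o, p in zip(original, processed) if o * p < -eps)


voice = sine(220.0, SAMPLE_RATE // 10)  # hypothetical stand-in for a voiced tone
ringed = ring_modulate(voice, 50.0)      # carrier above ~30 Hz
gated = tremolo(voice, 8.0)              # slow, full-depth tremolo that cuts out

print("ring modulation polarity flips:", polarity_flips(voice, ringed))
print("tremolo polarity flips:", polarity_flips(voice, gated))
```

Overlaying the processed waveform on the original in an editor is the same check by eye: with ring modulation some stretches come out upside down, with tremolo or gating they only shrink toward zero.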

Semaphor 10 months ago

Interestingly, it's just kinda hiding the normal AI issues, but they are all still there. I think people know about those "normal"-looking pictures, but your example has many AI issues, especially with hands and backgrounds.

GuinansEyebrows 10 months ago

> but can also fine tune them to be completely human

what does this mean? that it will insert idiosyncratic modifications (typos, idioms etc)?

a2128 10 months ago

If you play around with base models, they will insert typos and slang, and they will generate curse words and pointless internet flamewars.