Totally agree. Maybe it’s just the clips they chose, but it feels overfit on the weird conversational elements that make it impressive? Like the “oh yeahs” from the other person when someone is speaking. It is cool to see that natural flow in a conversation generated by a model, but there’s waaaay too much of it in these examples to sound natural.
And I say all that completely slackjawed that this is possible.
I love the technology, but I really don't want AI to sound like this.
Imagine being stuck on a call with this.
> "Hey, so like, is there anything I can help you with today?"
> "Talk to a person."
> "Oh wow, right. (chuckle) You got it. Well, before I connect you, can you maybe tell me a little bit more about what problem you're having? For example, maybe it's something to do with..."
That's how the DJ feature of Spotify talks and it's pretty jarring.
"How's it going. We're gonna start by taking you back to your 2022 favorites, starting with the sweet sounds of XYZ". There's very little you can tweak about it, the suggestions kinda suck, but you're getting a fake friend to introduce them to you. Yay, I guess..
I'd love to see stats on disfluency rate in conversation, podcasts, and this sample to get an idea of where it lies. It seems like they could have cranked it up, but there's also the chance that it's just the frequency illusion because we were primed to pay attention to it.
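For anyone who wants to try this, here's a rough sketch of how a disfluency rate could be measured from plain-text transcripts. The filler list and the `disfluency_rate` helper are my own assumptions, not an established metric, and a real comparison would need properly annotated transcripts.

```python
import re

# Hypothetical filler list (an assumption for illustration, not a standard).
# Ambiguous words such as "like" or "right" are left out because plain
# string matching would overcount them.
FILLERS = ("you know", "i mean", "oh yeah", "um", "uh")


def disfluency_rate(transcript: str) -> float:
    """Return filler occurrences per 100 words (0.0 for empty text)."""
    words = re.findall(r"[a-z']+", transcript.lower())
    if not words:
        return 0.0
    # Rejoin the tokens so multi-word fillers match across punctuation.
    normalized = " ".join(words)
    hits = sum(
        len(re.findall(rf"\b{re.escape(filler)}\b", normalized))
        for filler in FILLERS
    )
    return 100.0 * hits / len(words)
```

Run it over a pile of real conversation transcripts, some podcast transcripts, and a transcript of the demo clips, then compare the three numbers. That would at least hint at whether the rate was cranked up or whether we're just primed to notice it.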
Hmm... Scottish, Welsh, Irish (Nor'n) or English? If English, North, Midlands or South? If North or Midlands, which city? Brummie? Scouse? If South, London? Cockney or Multicultural London English [0]?
Need to increase your granularity a bit. I live in Wexford Town, Ireland, and the other day I was chatting to someone who told me their old schoolmates from Castlebridge make fun of how their accent has changed since they moved away from their hometown.
When people outside the British Isles (esp. Americans) say "British accent", they almost invariably mean (British) English, and usually the "Received Pronunciation" accent that British media generally uses.
They do not mean Irish or Scottish accents; if they did, they would have said exactly that, because those accents are quite different from standard (British) English accents. So different, in fact, that even Americans can readily tell the difference, even though they frequently have trouble telling English and Australian accents apart.
Also, to most English speakers, "English accent" doesn't make much sense, because "English" is the language. It sounds like saying a German speaker, speaking German, has a "German accent". Saying "British accent" differentiates the language (English, spoken by people worldwide) from the accent (which refers to one part of one country that uses that language).