Sep 22, 17:48 in SoundGym Cafe
To all musicians out there!
We have 2 dry vocals.
The first one, we call it Deep.
The second one, we call it Thin.
One of the voices is a real person, the other is AI.

Debate:

In this second example we have @Alexandra Escalle vs. my trained AI voice:

PS: The AI is the same as in the first example
Marc-André Bleau
Sep 23, 12:36
It's not dry; there is reverb on one of the vocals. To compare them, let's put them in the same atmosphere. Without that it's not fair.
Tairone Magalhaes
Sep 23, 18:10
I agree with the other comments, it is hard to compare the voices when there is a lot of processing applied to them. Anyway, here are my impressions:

- In terms of EQ-ing, the first voice has no high-frequency content; it sounds like a low-pass filter was applied to it at ~8 kHz or maybe even lower. That is pretty common in AI-generated audio, because many models are trained at lower sampling rates to reduce the computational cost of the model (although this has changed a lot lately). Therefore, this characteristic might bias the listener to perceive it as the AI voice.

- The second voice spans a wider frequency range, which makes it feel more natural in terms of EQ-ing. However, there is a pronounced chorusy effect applied to it (I would guess it is an ensemble processor), which might fool a listener into thinking it was artificially generated. There is also some sibilance and a substantial amount of reverb (it could be DSP or just natural room ambience).

Although those differences make it harder to compare and judge which one was artificially generated, my immediate impression was that the first voice was obviously generated by AI. The EQ difference probably plays a significant part in my guess, but the decisive characteristics are some weird artifacts present in it, especially at 0:57 (weird breathing sound) and 1:00 (unnatural sustained vowel sound). I could be wrong, because those artifacts sound a bit like some typical artifacts produced by auto-tune plugins, but I still think they were AI-generated.

Now, to make the comparison fair I would suggest:
- matching the tonal characteristics of both voices using some basic EQ, if the AI model is not able to produce high frequencies;
- not applying any extra effect;
- avoiding recording the natural voice in an environment with significant ambience;
- keeping sibilance or pops from affecting the judgement of the listener.

With that, I believe you can obtain a fairer comparison between the voices. 😉
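The low-pass observation above can be checked with a rough measurement rather than by ear alone. The sketch below is only an illustration, not the method anyone in this thread used: it assumes mono samples already loaded as a list of floats, uses a naive DFT (slow, fine for short excerpts), and the names `nyquist_hz`, `high_band_energy_ratio` and `tone` are hypothetical helpers. The test signals are synthetic sine tones, not the vocals being discussed.

```python
import cmath
import math


def nyquist_hz(sample_rate_hz):
    """Highest frequency a signal sampled at sample_rate_hz can represent."""
    return sample_rate_hz / 2.0


def high_band_energy_ratio(samples, sample_rate_hz, cutoff_hz):
    """Fraction of spectral energy at or above cutoff_hz.

    Uses a naive O(n^2) DFT over the bins from 1 (DC is skipped) up to Nyquist.
    """
    n = len(samples)
    total = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples)
        )
        energy = abs(coeff) ** 2
        total += energy
        # Bin k corresponds to the frequency k * sample_rate / n.
        if k * sample_rate_hz / n >= cutoff_hz:
            high += energy
    return high / total if total > 0.0 else 0.0


# Synthetic stand-ins (assumption: NOT the vocals from the post).
SAMPLE_RATE_HZ = 32000
N = 320  # gives 100 Hz per DFT bin at 32 kHz


def tone(freq_hz):
    """A unit-amplitude sine tone of N samples at SAMPLE_RATE_HZ."""
    return [math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE_HZ) for t in range(N)]


dull = tone(1000)  # nothing above 8 kHz
bright = [a + b for a, b in zip(tone(1000), tone(12000))]  # adds a 12 kHz partial

print(nyquist_hz(16000))  # a model trained at 16 kHz tops out at this frequency
print(round(high_band_energy_ratio(dull, SAMPLE_RATE_HZ, 8000), 3))
print(round(high_band_energy_ratio(bright, SAMPLE_RATE_HZ, 8000), 3))
```

A ratio near zero above 8 kHz would be consistent with a low-passed signal or a model trained at a 16 kHz sampling rate, though on its own it does not prove the voice is AI-generated.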
Sep 23, 18:14
Cymatics advertised it as a dry vocal.
They probably used autotune or something similar during recording.
The Deep voice is the AI, as @Cuantas Vacas said; the Thin one is a real person.
In the second example, the AI sings in French just as well as Alexandra does 😄
Congrats @Mr. Question for winning the Diamond Ears Award!
Colin Aiken
Sep 23, 07:06
Srikkanth Mani
Sep 23, 07:53
Congratulations! Amazing work
Mr. Question
Sep 23, 12:58
Thank you, everyone in this amazing community! I finally made it after 6 months of daily training.
Shall we have a party?
Sep 23, 10:23 in SoundGym Official
Congrats @GELVS . for winning the Diamond Ears Award!
Mr. Question
Sep 23, 11:34
Maggwa Joshua
Sep 23, 11:09 in SoundGym Cafe
Looks like there is a problem today. After my training, I can't collect coins. The tab still says continue training. But I'm using the free version of SoundGym, so I've exhausted my games for the day...
Congrats @J KRRKLA for winning the Golden Ears Award!
Lio LM
Sep 22, 17:57
Congrats!! 😃
Colin Aiken
Sep 23, 07:06
