Singing Voice to 3D Model

DistillW2N: A Lightweight One-Shot Whisper to Normal Voice Conversion Model Using Distillation of Self-Supervised Features

Abstract: Whisper to Normal voice conversion (W2N) holds great promise for assistive communication and healthcare, making it an exciting area of research and development. Recent advancements in W2N ...

IEEE

Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model

Abstract: This paper introduces PFlow-VC, a conditional flow matching voice conversion model that leverages fine-grained discrete pitch tokens and target speaker prompt information for expressive ...

Microsoft

VALL-E Family

VALL-E 2 is the latest advancement in neural codec language models that marks a milestone in zero-shot text-to-speech synthesis (TTS), achieving human parity for the first time. Building upon the ...

3d ago on MSN (Opinion)

When AI recreates the female voice, it also rewrites who gets heard

Voice cloning technology platforms like ElevenLabs allow anyone to replicate a voice using just a few seconds of audio, for a ...
