Tacotron 2: NVIDIA's open-source PyTorch implementation of Tacotron 2 for faster-than-realtime neural TTS with distributed training and automatic mixed precision support.
Coqui AI: Coqui AI was an open-source TTS and voice cloning platform featuring the XTTS model, supporting 17+ languages and voice cloning from just 3 seconds of audio.
Tortoise TTS: Tortoise TTS is a free, open-source multi-voice TTS system emphasizing realistic prosody and intonation. Clone voices and generate high-quality speech with this research-grade Python library.
XTTS v2: XTTS v2 is an open-source text-to-speech and voice cloning model supporting 17 languages. Clone any voice with just a 6-second audio clip using the Coqui TTS library.
Nari Labs Dia: Dia is a 1.6B-parameter open-source text-to-speech model that generates ultra-realistic dialogue with emotion control, audio conditioning, and nonverbal sounds in one pass.
OpenVoice: OpenVoice is a free, open-source voice cloning AI by MIT and MyShell that supports zero-shot cross-lingual cloning, tone color replication, and granular voice style control.
SadTalker AI: SadTalker generates realistic talking-head videos from a single portrait and audio clip using 3D motion coefficients. Open-source CVPR 2023 research tool.
Microsoft Soundscape: Microsoft Soundscape uses binaural 3D audio to help users navigate and build spatial awareness of their surroundings. Now open-source from Microsoft Research.
Vocode AI: Vocode is an open-source platform to build, deploy, and scale hyperrealistic voice AI agents using LLMs. Supports Python and Node.js SDKs with an enterprise-grade phone call API.
AntiFake: AntiFake adds imperceptible adversarial perturbations to audio files to prevent unauthorized voice cloning and deepfake speech synthesis. Open-source, Pyth