Open-Sourcing TADA: Fast, Reliable Speech Generation Through Text-Acoustic Synchronization
Summary
Hume AI introduces TADA (Text-Acoustic Dual Alignment), an open-source approach to fast, reliable text-to-speech (TTS) that aligns text tokens with acoustic frames one-to-one. It achieves a real-time factor of 0.09, produces virtually zero content hallucinations, and supports on-device deployment. The 1B and 3B models and the audio tokenizer/decoder are released publicly.
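To put the real-time factor in concrete terms, the short sketch below assumes the conventional definition, synthesis time divided by the duration of the generated audio (the summary does not spell out which definition it uses). The function names are illustrative and are not part of any TADA release.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Conventional RTF: compute time divided by duration of audio produced.

    Assumption: this is the standard definition, not one confirmed by the source.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def synthesis_time(rtf: float, audio_seconds: float) -> float:
    """Estimated compute time needed to generate `audio_seconds` of speech."""
    return rtf * audio_seconds


REPORTED_RTF = 0.09  # figure quoted in the summary

# An RTF below 1.0 means speech is generated faster than it plays back.
speedup = 1.0 / REPORTED_RTF

print(f"Speed relative to real time: {speedup:.1f}x")
print(f"Estimated time for 60 s of audio: {synthesis_time(REPORTED_RTF, 60.0):.1f} s")
```

Under this definition, an RTF of 0.09 corresponds to generating speech roughly eleven times faster than playback.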