erm: A Local CLI That Strips Ums, Uhs, and Erms From Speech
Summary
This article introduces erm, a local CLI that automatically removes speech disfluencies (ums, uhs, ers, and their elongated variants) from audio. It locates them with speech-to-text transcription (Whisper/faster-whisper) and then edits them out of the audio. The article explains the detection passes, crossfaded splices, room-tone blending, and denoising considerations, and shows how to run and test the tool locally. It offers practical implementation details for developers and audio editors exploring AI-assisted audio cleanup.
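To make the idea of a crossfaded splice concrete, here is a minimal sketch in plain Python. It is not taken from erm's source: the function name `cut_with_crossfade`, its parameters, and the choice of a linear (equal-gain) fade are all assumptions made for illustration. It removes a span of samples and blends the audio on either side of the cut so the join does not click.

```python
from typing import List


def cut_with_crossfade(samples: List[float], start: int, end: int, fade: int) -> List[float]:
    """Remove samples[start:end] and crossfade across the join.

    Hypothetical helper for illustration only (not erm's actual API).
    The last `fade` kept samples before the cut are blended with the
    first `fade` kept samples after it, using a linear ramp.
    """
    if not (0 <= start <= end <= len(samples)):
        raise ValueError("start/end must satisfy 0 <= start <= end <= len(samples)")
    if fade < 0 or fade > start or end + fade > len(samples):
        raise ValueError("fade does not fit in the audio around the cut")

    outgoing = samples[start - fade:start]   # audio leading into the cut
    incoming = samples[end:end + fade]       # audio resuming after the cut

    blended = []
    for i in range(fade):
        t = (i + 1) / (fade + 1)             # ramps from near 0 to near 1
        blended.append(outgoing[i] * (1.0 - t) + incoming[i] * t)

    # Keep everything before the fade region, the blend, then the rest.
    return samples[:start - fade] + blended + samples[end + fade:]


if __name__ == "__main__":
    # Toy signal: quiet speech (1.0), a loud "um" (9.0), then more speech.
    audio = [1.0] * 10 + [9.0] * 5 + [1.0] * 10
    cleaned = cut_with_crossfade(audio, start=10, end=15, fade=4)
    print(len(audio), "->", len(cleaned))
```

Because the two fade regions overlap, the output is shorter than the input by the length of the cut plus the fade length. A real tool would work on arrays of PCM samples and would likely pick the fade length in milliseconds rather than samples; those details are left out of this sketch.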