A modular, end-to-end AI dubbing pipeline: WhisperX speech recognition, neural translation, and voice synthesis, all completely open source.
Professional-grade dubbing without the professional-grade price tag.
Complete workflow from video input to dubbed output with burned-in subtitles. Upload a video, pick your target languages, and get back a fully localized file. No external tools, no manual steps.
Three output modes: video dubbing with subtitles, audio-only translation, or subtitles only. Pick the one that fits your use case.
Swap ASR, translation, and TTS models independently. Use our defaults or plug in your own.
VAD-based duration alignment and pyrubberband time-stretching for seamless voice replacement.
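As a rough illustration of the duration alignment described above, the sketch below computes the tempo factor needed to fit a synthesized clip into the voiced time of the original speech. The function names and the clamp limits (0.8 to 1.25) are assumptions made for this example, not values taken from the project.

```python
from __future__ import annotations


def speech_duration(segments: list[tuple[float, float]]) -> float:
    """Total voiced time from VAD segments given as (start, end) in seconds."""
    return sum(end - start for start, end in segments)


def stretch_rate(
    synth_seconds: float,
    target_seconds: float,
    min_rate: float = 0.8,   # assumed lower bound, to avoid slurred speech
    max_rate: float = 1.25,  # assumed upper bound, to avoid rushed speech
) -> float:
    """Tempo factor that fits synthesized speech into the target slot.

    A rate above 1 shortens the clip (faster speech); below 1 lengthens it.
    The result is clamped so extreme mismatches do not distort the voice.
    """
    if synth_seconds <= 0 or target_seconds <= 0:
        raise ValueError("durations must be positive")
    rate = synth_seconds / target_seconds
    return max(min_rate, min(max_rate, rate))


# Original speaker was voiced for 2.5 s; the dubbed line came out at 3.0 s.
original_vad = [(0.0, 1.5), (2.0, 3.0)]
rate = stretch_rate(3.0, speech_duration(original_vad))

# The rate would then be handed to the time-stretcher, for example:
#   import pyrubberband
#   fitted = pyrubberband.time_stretch(samples, sample_rate, rate)
```

Clamping trades exact timing for natural-sounding speech: a clip that still overruns after the maximum stretch would need another remedy, such as a shorter translation.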
Professional subtitle rendering in multiple styles: Netflix, bold-desktop, or mobile-optimized. Subtitles are burned directly into the video with pixel-perfect typography and positioning.
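For readers curious how burned-in styling can work, here is a minimal sketch that builds an FFmpeg command using the `subtitles` filter and its `force_style` option. The preset values, the `burn_subtitles_cmd` helper, and the file names are hypothetical; the project's real style definitions may differ.

```python
from __future__ import annotations

# Hypothetical presets: the keys mirror the style names listed above, but the
# ASS style values are illustrative guesses, not the project's real settings.
STYLE_PRESETS = {
    "netflix": "FontName=Arial,FontSize=22,Outline=1,MarginV=30",
    "bold-desktop": "FontName=Arial,FontSize=26,Bold=1,Outline=2,MarginV=40",
    "mobile": "FontName=Arial,FontSize=32,Bold=1,Outline=2,MarginV=60",
}


def burn_subtitles_cmd(
    video: str, srt: str, output: str, style: str = "netflix"
) -> list[str]:
    """Build an ffmpeg argument list that renders `srt` into the video frames."""
    if style not in STYLE_PRESETS:
        raise ValueError(f"unknown subtitle style: {style!r}")
    # The subtitles filter draws the text; force_style overrides ASS fields.
    video_filter = f"subtitles={srt}:force_style='{STYLE_PRESETS[style]}'"
    # Video is re-encoded (the frames change); audio is copied untouched.
    return ["ffmpeg", "-y", "-i", video, "-vf", video_filter, "-c:a", "copy", output]


cmd = burn_subtitles_cmd("dubbed.mp4", "es.srt", "final.mp4", style="mobile")
```

The list can be passed to `subprocess.run(cmd, check=True)`. Subtitle paths containing colons, quotes, or backslashes need extra escaping for FFmpeg's filter syntax, which this sketch does not handle.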
Supports major world languages with automatic language detection, from Mandarin to Arabic and Hindi to Portuguese.
Each stage is a swappable module. Clone the repo and run the whole pipeline yourself.
WhisperX extracts speech with word-level timestamps and speaker diarization
M2M-100 or deep-translator converts text while preserving context and timing
Chatterbox voice cloning or Edge TTS generates natural speech in the target language
Intelligent audio alignment, background mixing, and subtitle burning via FFmpeg
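The stages above can be pictured as plain callables wired into one pipeline, which is what makes each of them replaceable. The sketch below is only an assumption about how such wiring might look: `Segment`, `DubbingPipeline`, and the stand-in stage functions are invented for illustration and are not the project's actual classes.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    text: str


# Each stage is just a function signature, so any model can sit behind it.
Transcriber = Callable[[str], List[Segment]]  # media path -> timed segments
Translator = Callable[[str, str], str]        # (text, target_lang) -> text
Synthesizer = Callable[[str, str], bytes]     # (text, target_lang) -> audio


@dataclass
class DubbingPipeline:
    transcribe: Transcriber
    translate: Translator
    synthesize: Synthesizer

    def run(self, media_path: str, target_lang: str) -> List[Tuple[Segment, bytes]]:
        """Transcribe, translate, and voice each segment, keeping its timing."""
        results = []
        for seg in self.transcribe(media_path):
            text = self.translate(seg.text, target_lang)
            audio = self.synthesize(text, target_lang)
            results.append((Segment(seg.start, seg.end, text), audio))
        return results


# Stand-in stages so the sketch runs without any models installed.
def fake_asr(path: str) -> List[Segment]:
    return [Segment(0.0, 1.2, "hello"), Segment(1.5, 2.4, "world")]


def fake_translate(text: str, lang: str) -> str:
    return f"[{lang}] {text}"


def fake_tts(text: str, lang: str) -> bytes:
    return text.encode("utf-8")


pipeline = DubbingPipeline(fake_asr, fake_translate, fake_tts)
dubbed = pipeline.run("input.mp4", "es")
```

Swapping a model then means passing a different function, for example a wrapper around M2M-100 in place of `fake_translate`. Alignment, mixing, and subtitle burning would follow as a further stage on the returned segments.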
Start dubbing your content in minutes. Self-hosted, completely free, no vendor lock-in.