
    Meet OmniVoice Studio: A Local, Open-Source Alternative to ElevenLabs

    May 26, 2026


    OmniVoice Studio — How to Use It

    What Is OmniVoice Studio?

    OmniVoice Studio is an open-source desktop application for voice cloning, video dubbing, real-time dictation, and speaker diarization. Everything runs locally on your machine. No API keys, no cloud account, no subscription required.

    • 646 languages supported for TTS via the default OmniVoice engine
    • 99 languages for transcription via WhisperX
    • Available on macOS, Windows, and Linux
    • GPU is optional — full pipeline runs on CPU
    • Free for personal, educational, and research use (FSL-1.1-ALv2)


    System Requirements

    A GPU is optional. Without one, TTS runs approximately 3× slower on CPU. With ≤8 GB VRAM, TTS automatically offloads to CPU during transcription — no config needed.

    Component   Minimum                               Recommended
    OS          Win 10 / macOS 12+ / Ubuntu 20.04+    Any modern 64-bit OS
    RAM         8 GB                                  16 GB+
    VRAM        4 GB (auto-offloads)                  8 GB+ (RTX 3060+)
    Disk        10 GB free                            20 GB+ SSD
    Python      3.10+                                 3.11–3.12
    GPU         Optional                              CUDA / MPS / ROCm
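
Two of the minimums above, the Python version and free disk space, are easy to check before installing. The following is a minimal sketch using only the standard library. The function names are made up for illustration and are not part of OmniVoice Studio; the thresholds are copied from the table.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)      # "Python 3.10+" from the requirements table
MIN_FREE_DISK_GB = 10     # "10 GB free" from the requirements table


def check_requirements(python_version, free_disk_gb):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {python_version[0]}.{python_version[1]} is older than 3.10"
        )
    if free_disk_gb < MIN_FREE_DISK_GB:
        problems.append(
            f"only {free_disk_gb:.1f} GB free, need at least {MIN_FREE_DISK_GB} GB"
        )
    return problems


def check_this_machine(path="."):
    """Check the running interpreter and the disk that holds `path`."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return check_requirements(sys.version_info, free_gb)
```

Calling check_this_machine() from the directory where you plan to clone the repo returns an empty list when both minimums are met. RAM and VRAM are left out because the standard library has no portable way to query them.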


    Installation

    The project recommends running from source. Install three prerequisites first: ffmpeg, Bun (JS runtime), and uv (Python package manager).

    git clone https://github.com/debpalash/OmniVoice-Studio.git
    cd OmniVoice-Studio
    uv sync
    bun install
    bun dev

    Frontend loads at http://localhost:5173  |  API runs on port 8000. Model weights download automatically on first generation.

    Pre-built installers are also available (macOS DMG, Windows MSI, Linux AppImage and .deb); see the Releases page on GitHub.
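
If you run from source, ffmpeg, Bun, and uv all need to be on your PATH before the commands above will work. The sketch below checks for them with the standard library. It assumes the executables are named ffmpeg, bun, and uv, which is their usual naming but may differ on your system, and the helper name missing_tools is made up for illustration.

```python
import shutil

# The three prerequisites named in the install steps, by assumed executable name.
PREREQUISITES = ("ffmpeg", "bun", "uv")


def missing_tools(tools=PREREQUISITES, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("ffmpeg, bun, and uv are all available.")
```

The which parameter exists only so the lookup can be swapped out when testing; in normal use the default shutil.which is what you want.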


    Voice Cloning

    Voice cloning uses zero-shot learning — it clones a voice from a clip as short as 3 seconds, without prior training on that voice. The default OmniVoice engine conditions a diffusion-based TTS model on the reference audio.

    • Go to the Voice Clone tab in the UI
    • Upload or record an audio clip of the target voice (at least 3 seconds long)
    • Enter your text and select a target language (646 available)
    • Click Generate — output is saved to your project library
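
Since the reference clip needs to be at least 3 seconds long, it can be worth checking its length before uploading. This is a minimal sketch for WAV files only, using Python's standard wave module. The function names are illustrative and not part of the app, and the app may accept other audio formats that this check does not cover.

```python
import io
import wave

MIN_CLIP_SECONDS = 3.0  # minimum reference length quoted in this section


def wav_duration_seconds(source):
    """Return the duration of a WAV given a path string or binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(source, minimum=MIN_CLIP_SECONDS):
    """True if the clip meets the minimum reference length."""
    return wav_duration_seconds(source) >= minimum


def make_silent_wav(seconds, rate=16000):
    """Build an in-memory mono 16-bit WAV of silence, for trying the check."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buffer.seek(0)
    return buffer
```

For a real recording, pass the file path instead, for example is_long_enough("reference.wav"), where reference.wav is a placeholder name.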

    Voice Gallery: Search YouTube, browse categories, and download reference clips directly inside the app to build your voice library.


    Video Dubbing

    The full dubbing pipeline runs locally: transcribe → translate → synthesize → mux. Demucs isolates vocals so the original background audio is preserved in the final export.

    • Go to the Dub tab — paste a YouTube URL or upload a local file
    • WhisperX transcribes speech with word-level alignment
    • Select a target language; translation runs automatically
    • TTS engine re-voices the transcript; Demucs preserves background audio
    • Export the final MP4 with dubbed audio mixed in

    Batch Queue: Drop up to 50 videos and walk away. Each job has its own progress bar tracking through the full pipeline.
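
The transcribe → translate → synthesize → mux flow, with per-job progress, can be pictured as an ordered list of stages applied to each job. The sketch below is purely illustrative: DubJob, run_pipeline, and the placeholder stage are invented names, and none of this is OmniVoice Studio's actual code. Each stage is a stub where the real work (WhisperX, translation, TTS, muxing) would go.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class DubJob:
    """State carried through the pipeline; all fields are illustrative."""
    source: str
    target_language: str
    completed: List[str] = field(default_factory=list)


Stage = Tuple[str, Callable[[DubJob], None]]


def placeholder(job: DubJob) -> None:
    """Stand-in for real work such as transcription or synthesis."""


STAGES: List[Stage] = [
    ("transcribe", placeholder),
    ("translate", placeholder),
    ("synthesize", placeholder),
    ("mux", placeholder),
]


def run_pipeline(job: DubJob, stages: List[Stage] = STAGES) -> DubJob:
    """Run each stage in order, recording its name once it finishes."""
    for name, stage in stages:
        stage(job)
        job.completed.append(name)
    return job


def progress(job: DubJob, stages: List[Stage] = STAGES) -> float:
    """Fraction of stages finished, suitable for driving a progress bar."""
    return len(job.completed) / len(stages)


def run_batch(sources: List[str], target_language: str) -> List[DubJob]:
    """Process several videos one after another, one job per source."""
    return [run_pipeline(DubJob(src, target_language)) for src in sources]
```

Keeping the stage order in a single list means the progress figure and the execution order cannot drift apart, which is one simple way to get a per-job progress bar.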


    Dictation & Speaker Diarization

    Dictation works system-wide from any application. Diarization identifies individual speakers in a multi-speaker audio file using Pyannote + WhisperX.

    • Press ⌘+⇧+Space (macOS) to open the floating dictation widget
    • Speech streams via WebSocket and auto-pastes into the active input field
    • Upload a multi-speaker file to the Diarization tab
    • Pyannote identifies who said what; each speaker gets an auto-extracted voice profile
    • Assign a TTS voice per speaker for per-speaker dubbing

    A Hugging Face token is required for Pyannote diarization. See docs/setup/huggingface-token.md in the repo.
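
Working out "who said what" means combining two timelines: word timestamps from the transcriber and speaker turns from the diarizer. One common way to do this is to give each word the speaker whose turn overlaps it most. The sketch below shows that general technique in plain Python. It is not the code OmniVoice Studio, WhisperX, or Pyannote use, and the tuple layouts are assumptions chosen for the example.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words, turns):
    """Label each word with the speaker whose turn overlaps it the most.

    words: list of (text, start, end) tuples, times in seconds
    turns: list of (speaker, start, end) tuples, times in seconds
    Words that overlap no turn are labelled None.
    """
    labelled = []
    for text, w_start, w_end in words:
        best_speaker, best_overlap = None, 0.0
        for speaker, t_start, t_end in turns:
            amount = overlap(w_start, w_end, t_start, t_end)
            if amount > best_overlap:
                best_speaker, best_overlap = speaker, amount
        labelled.append((best_speaker, text))
    return labelled


if __name__ == "__main__":
    words = [("hello", 0.0, 0.5), ("there", 0.6, 1.0), ("hi", 1.2, 1.5)]
    turns = [("SPEAKER_00", 0.0, 1.1), ("SPEAKER_01", 1.1, 2.0)]
    for speaker, text in assign_speakers(words, turns):
        print(speaker, text)
```

Once every word carries a speaker label, grouping consecutive words by label gives per-speaker segments, which is the shape needed to assign a different TTS voice to each speaker.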


    TTS Engines

    Six TTS engines are built in. Switch between them via Settings → TTS Engine or by setting an environment variable, for example OMNIVOICE_TTS_BACKEND=cosyvoice.

    Engine                Languages          Clone    Platform
    OmniVoice (default)   600+               ✓        CUDA / MPS / CPU
    CosyVoice 3           9 + 18 dialects    ✓        CUDA / MPS / CPU
    MLX-Audio             Multi              Varies   Apple Silicon only
    VoxCPM2               30                 ✓        CUDA / MPS / CPU
    MOSS-TTS-Nano         20                 ✓        CUDA / CPU
    KittenTTS             English            ✗        CPU only

    Custom engine: subclass TTSBackend in backend/services/tts_backend.py and add it to _REGISTRY. This takes roughly 50 lines of Python.
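
The subclass-and-register pattern can be sketched in a few lines. Only the names TTSBackend, _REGISTRY, and OMNIVOICE_TTS_BACKEND come from the project. Everything else here is an assumption made for illustration: the synthesize method, the name attribute, the get_backend helper, and the toy EchoBackend. The real base class in backend/services/tts_backend.py may look quite different, so check it before writing an engine.

```python
import os


class TTSBackend:
    """Hypothetical base class; the real interface in the repo may differ."""
    name = "base"

    def synthesize(self, text: str) -> bytes:
        raise NotImplementedError


class EchoBackend(TTSBackend):
    """Toy engine that returns the text as UTF-8 bytes instead of audio."""
    name = "echo"

    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


# Maps a backend name to its class; a new engine is registered by adding a key.
_REGISTRY = {EchoBackend.name: EchoBackend}


def get_backend(environ=os.environ, default="echo") -> TTSBackend:
    """Instantiate the backend named by OMNIVOICE_TTS_BACKEND."""
    name = environ.get("OMNIVOICE_TTS_BACKEND", default)
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown TTS backend: {name!r}") from None
```

A registry keyed by name is what lets one environment variable select the engine at startup: adding an engine means adding one subclass and one dictionary entry, with no changes to the calling code.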


    MCP Server & Resources

    OmniVoice Studio ships a built-in MCP Server, exposing voice and dubbing capabilities to any MCP-compatible client — Claude, Cursor, or your own tooling — without opening the desktop UI.

    • The MCP Server starts alongside the FastAPI backend when you run bun dev
    • Point your MCP client at the local server to access all endpoints
    • AudioSeal (Meta) embeds an invisible neural watermark in all generated audio for AI provenance
    • GitHub: github.com/debpalash/OmniVoice-Studio
    • Install docs: docs/install/ (macos / windows / linux / docker)
    • Troubleshooting: docs/install/troubleshooting.md
    • Discord: discord.gg/bzQavDfVV9


