Overview
EnConvo integrates with 13+ Text-to-Speech providers, giving you access to hundreds of natural-sounding voices across dozens of languages. You can listen to AI responses, have selected text read aloud, or convert documents to audio files.
Supported TTS Providers
OpenAI TTS
High-quality voices (alloy, echo, fable, onyx, nova, shimmer) with tts-1 and tts-1-hd models
ElevenLabs
Industry-leading voice synthesis with hundreds of voices and multiple models including flash v2.5
Microsoft Azure TTS
Enterprise-grade TTS with multilingual neural voices like EmmaMultilingualNeural
Edge TTS
Free Microsoft Edge voices — no API key required
Google Cloud TTS
Google’s neural TTS with WaveNet and Neural2 voices
Gemini TTS
Google Gemini-powered TTS with style control and multi-speaker support
MiniMax TTS
High-quality multilingual speech with turbo and HD models
xAI TTS
Expressive voices (Eve, Ara, Rex, Sal, Leo) with speech tag support for 20+ languages
Speechify TTS
Simba model family — Base, English, Multilingual, and Turbo variants
Straico TTS
Access ElevenLabs and OpenAI TTS through a single Straico API key
macOS System TTS
Built-in macOS `say` command; works offline with system voices
Kokoro TTS (Local)
Fully local, privacy-first TTS powered by the Kokoro model via FluidAudio
Provider Comparison
| Provider | Quality | Speed | Offline | Free Tier | Languages |
|---|---|---|---|---|---|
| OpenAI TTS | Excellent | Fast | No | No | 50+ |
| ElevenLabs | Excellent | Fast | No | Limited | 29+ |
| Microsoft Azure | Excellent | Fast | No | Free via Enconvo Cloud | 140+ |
| Edge TTS | Good | Fast | No | Yes (free) | 90+ |
| Google Cloud | Excellent | Fast | No | No | 40+ |
| Gemini TTS | Excellent | Medium | No | No | 24+ |
| MiniMax | Excellent | Fast | No | No | 17+ |
| xAI TTS | Good | Fast | No | No | 20+ |
| Speechify | Good | Fast | No | No | 20+ |
| macOS System | Basic | Fast | Yes | Yes (free) | 30+ |
| Kokoro (Local) | Good | Fast | Yes | Yes (free) | Multiple |
Getting Started
Open Settings
Navigate to the TTS command settings in EnConvo. You can find TTS configurations under the TTS extension.
Choose a TTS Provider
Select your preferred provider from the TTS Provider dropdown. Options include cloud providers (OpenAI, ElevenLabs, Azure) and free options (Edge TTS, System TTS).
Configure Credentials
For cloud providers, set up your API key through the Credential Provider setting. If you are on the Enconvo Cloud Plan, Microsoft TTS and MiniMax TTS are available without your own API key.
Select a Voice
Each provider offers different voices. Browse the Voice dropdown to find one that fits your needs. You can preview voices using the built-in preview feature.
Usage Methods
Read Aloud
The most common way to use TTS is to have EnConvo read text aloud in real time.
- Select any text on your screen
- Trigger the Read Aloud command from the SmartBar or PopBar
- EnConvo streams the audio as it generates, so playback starts almost immediately
Text to Audio File
Convert text into a saved audio file for later use.
- Open the TTS (Text To Speech) command
- Enter or paste the text you want to convert
- The audio file is saved to your specified output directory
- Use the Show in Finder or Save As actions to manage the file
Gemini TTS (Style-Controlled)
Gemini TTS offers unique style-controlled speech generation. Single Speaker:
Convert SRT Subtitles to Audio
Convert subtitle files to audio:
- Provide your `.srt` subtitle file
- EnConvo processes each subtitle segment with the configured TTS provider
- Outputs a synchronized audio file
Voice Configuration by Provider
OpenAI
Models: `tts-1` (fast), `tts-1-hd` (high quality)
Voices:
| Voice | Character |
|---|---|
| alloy | Neutral, balanced |
| echo | Warm, conversational |
| fable | Expressive, storytelling |
| onyx | Deep, authoritative |
| nova | Friendly, upbeat |
| shimmer | Clear, gentle |
You can also add custom voices if available through the OpenAI API.
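For reference, the OpenAI model and voice names correspond to fields in a request to OpenAI's speech endpoint. The sketch below builds such a request with Python's standard library. How EnConvo issues the call internally is not documented here; `build_speech_request` and `synthesize_to_file` are hypothetical helper names, and reading the key from an `OPENAI_API_KEY` environment variable is an assumption.

```python
import json
import os
import urllib.request

OPENAI_SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, voice="alloy", model="tts-1", speed=1.0, api_key=""):
    """Build (but do not send) a POST request for OpenAI's speech endpoint."""
    payload = {"model": model, "voice": voice, "input": text, "speed": speed}
    return urllib.request.Request(
        OPENAI_SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(text, path, **options):
    """Send the request and write the returned audio bytes to `path`."""
    # Assumption: the API key is provided via the OPENAI_API_KEY variable.
    request = build_speech_request(text, api_key=os.environ["OPENAI_API_KEY"], **options)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

For example, `synthesize_to_file("Hello from EnConvo", "hello.mp3", voice="nova", model="tts-1-hd")` would request the higher-quality model with the nova voice.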
Speed Settings
All providers support adjustable speech speed:
| Speed | Use Case |
|---|---|
| 0.5x — 0.75x | Careful listening, language learning |
| 1.0x | Normal speaking pace |
| 1.2x (default) | Slightly faster, efficient listening |
| 1.5x — 2.0x | Speed listening, familiar content |
| 2.5x — 4.0x | Scanning content, experienced listeners |
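To get a feel for what a speed setting means in practice, the sketch below estimates listening time for a given word count. The 0.5x to 4.0x range and the 1.2x default come from the table above; the 150 words-per-minute baseline for 1.0x is an assumption made for illustration, and the function names are hypothetical.

```python
MIN_SPEED = 0.5      # slowest setting listed in the table
MAX_SPEED = 4.0      # fastest setting listed in the table
DEFAULT_SPEED = 1.2  # default listed in the table


def clamp_speed(speed: float) -> float:
    """Keep a requested multiplier inside the range shown in the table."""
    return max(MIN_SPEED, min(MAX_SPEED, speed))


def estimate_minutes(word_count: int, speed: float = DEFAULT_SPEED,
                     baseline_wpm: float = 150.0) -> float:
    """Estimate listening time in minutes.

    baseline_wpm is an assumed speaking rate at 1.0x, not a measured value.
    """
    return word_count / (baseline_wpm * clamp_speed(speed))


if __name__ == "__main__":
    for speed in (0.75, 1.0, 1.2, 2.0):
        minutes = estimate_minutes(1800, speed)
        print(f"{speed}x: about {minutes:.1f} minutes for 1800 words")
```

Under that assumption, an 1800-word article takes roughly 12 minutes at 1.0x and roughly 10 minutes at the 1.2x default.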
Offline TTS Options
For privacy-first or no-internet scenarios, EnConvo offers two fully offline options:
macOS System TTS
Uses the built-in macOS `say` command. No downloads required; it works with any voice installed in System Settings > Accessibility > Spoken Content > System Voice.
- Supports M4A output format
- Adjustable speed
- Works completely offline
- Quality varies by voice; download enhanced voices for better quality
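The same `say` command can be driven directly from a script. The sketch below builds a `say` invocation with an optional voice (`-v`), speaking rate in words per minute (`-r`), and output file (`-o`), then feeds the text on standard input. The helper names are hypothetical, the voice must already be installed on the Mac, and this only illustrates the command itself, not how EnConvo invokes it.

```python
import shutil
import subprocess


def build_say_command(voice=None, words_per_minute=None, output_path=None):
    """Return the argument list for macOS `say` (text is supplied on stdin)."""
    command = ["say"]
    if voice:
        command += ["-v", voice]                       # an installed system voice
    if words_per_minute:
        command += ["-r", str(int(words_per_minute))]  # speaking rate
    if output_path:
        command += ["-o", output_path]                 # write a file instead of playing
    return command


def speak(text, **options):
    """Run `say`; raises if the command is unavailable (for example, off macOS)."""
    if shutil.which("say") is None:
        raise RuntimeError("the `say` command is only available on macOS")
    # With no message argument, `say` reads the text from standard input.
    subprocess.run(build_say_command(**options), input=text, text=True, check=True)
```

For example, `speak("Hello", voice="Samantha", words_per_minute=210, output_path="hello.m4a")` would save the speech to a file rather than play it, provided the Samantha voice is installed.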
Kokoro TTS (via FluidAudio)
A local neural TTS model bundled with EnConvo’s FluidAudio framework. Runs entirely on your Mac using Apple Silicon acceleration.
- High-quality neural speech
- No internet connection required
- No API key needed
- Your text never leaves your device
Using TTS in Chat
EnConvo integrates TTS directly into the AI chat experience:
- Manual playback: Click the speaker icon on any AI response to hear it read aloud
- Auto-TTS: Enable automatic TTS in chat settings to have every response read aloud
- Playback controls: Pause, resume, or stop playback at any time
Using TTS with Translation
The Translate command can automatically play TTS audio of the translated result:
- Open the Translator command settings
- Enable Automatically Play TTS Audio under the Text-to-Speech group
- Select your preferred TTS provider
- Every translation result is now read aloud automatically
Sound Effects Generation
EnConvo can also generate sound effects from text descriptions:
- Use the Text To Sound Effect command
- Describe the sound you want in English (up to 200 characters)
- The more detailed the description, the better the result
Troubleshooting
No audio playing
- Check that your Mac’s audio output is working (System Settings > Sound)
- Verify the TTS provider is configured with valid credentials
- Try switching to Edge TTS or System TTS to rule out API issues
- Check the console logs for error messages
Voice sounds robotic or low quality
- Switch to a higher-quality provider like ElevenLabs or OpenAI TTS HD
- If using OpenAI, switch from `tts-1` to `tts-1-hd`
- Adjust the speed; very high speeds can reduce quality
- Try a different voice; some voices perform better than others
TTS is too slow
- Use a turbo/flash model variant when available (e.g., ElevenLabs flash v2.5, MiniMax turbo)
- Edge TTS and System TTS are typically the fastest options
- Check your internet connection for cloud providers
- Try reducing the text length for faster initial playback
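If you are preparing long text yourself, one way to reduce the length of each request is to split it at sentence boundaries before sending it to TTS. The sketch below is a generic helper, not an EnConvo feature; `split_for_tts` is a hypothetical name and the 300-character limit is an arbitrary assumption.

```python
import re


def split_for_tts(text: str, max_chars: int = 300) -> list[str]:
    """Group sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut
    mid-sentence, so a chunk can exceed the limit in that case.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Sending the first chunk on its own means audio for the opening sentences can be ready before the rest of the text has been processed.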
Wrong language pronunciation
- For xAI TTS, explicitly set the language instead of using Auto Detect
- Use a multilingual voice like Azure’s EmmaMultilingualNeural
- Ensure the text language matches the voice’s supported languages
Enconvo Cloud Plan
The Enconvo Cloud Plan includes built-in access to several TTS providers without needing your own API keys:
| Provider | Via Cloud Plan |
|---|---|
| Microsoft Azure TTS | Included |
| MiniMax TTS | Included |
| xAI TTS | Included |
Cloud Plan TTS usage consumes your Enconvo points. Check Settings > Usage for your current balance.
Related Features
Dictation
Convert speech to text
AI Chat
Chat with AI and listen to responses
Translation
Translate and listen to results
Speech Recognition
Advanced speech-to-text providers