Chatterbox TTS Server

Generate Speech

Active Model:

Text to synthesize

Enter the text you want to convert to speech. For audiobooks, you can paste long chapters.

Split text into chunks

Voice Mode:

Predefined Voices | Voice Cloning (Reference)

Select Predefined Voice:

Load Example Preset:

Generation Parameters

Temperature (0.8)

Exaggeration (0.5)

CFG Weight (0.5)

Speed Factor (1.0)

Generation Seed

Enter an integer to make results reproducible. Some engines treat 0 or -1 as a request for a random seed.
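
As a rough illustration of the seed behavior described above, here is a minimal Python sketch. It is not the server's code: `resolve_seed`, `RANDOM_SENTINELS`, and `fake_generate` are hypothetical names, and Python's `random` module merely stands in for a TTS engine. It assumes that 0 and -1 both mean "pick a random seed", which the help text says is true only of some engines.

```python
import random
from typing import List

# Assumption: both 0 and -1 mean "random"; a real engine may honor only one of them.
RANDOM_SENTINELS = (0, -1)


def resolve_seed(seed: int) -> int:
    """Return the seed unchanged, or draw a fresh positive one for a sentinel value."""
    if seed in RANDOM_SENTINELS:
        return random.SystemRandom().randint(1, 2**31 - 1)
    return seed


def fake_generate(seed: int) -> List[float]:
    """Stand-in for a seeded generator: the same seed always yields the same values."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(3)]


if __name__ == "__main__":
    # A fixed seed reproduces the same output on every call.
    print(fake_generate(resolve_seed(42)) == fake_generate(42))
    # A sentinel resolves to some positive seed chosen at random.
    print(resolve_seed(-1) >= 1)
```

Recording the resolved seed alongside the output makes it possible to regenerate a result later even when the original request asked for a random seed.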

Output Format

MP3 is recommended for smaller file sizes (e.g., audiobooks).

Server Configuration

These settings are loaded from config.yaml via an API call. If you change the Host, Port, Model, or Path settings, whether here or directly in the file, restart the server to apply them.

Server Host

Server Port

TTS Device

Default Voice ID

Model Cache Path

Predefined Voices Path

Reference Audio Path

Output Path

Audio Output Format

Audio Sample Rate

Tips & Tricks

For Audiobooks, use MP3 format, enable "Split text into chunks", and set a chunk size of ~250-500.
Use Predefined Voices for consistent, high-quality output. You can import new ones.
For Voice Cloning, upload clean reference audio (.wav/.mp3). Quality of reference is key.
Experiment with Temperature and other generation parameters to fine-tune output.
Adjusting Speed Factor away from 1.0 is experimental and may cause echo.
When using the Turbo model, you can insert paralinguistic tags like [laugh] or [sigh] for natural vocal reactions.
Check the /docs endpoint for API details.
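
To make the chunking tip concrete, the sketch below shows one way long text might be split into chunks before synthesis. It is an illustration only, not the server's actual algorithm: the function name `split_into_chunks` and the parameter `max_chars` are hypothetical, and it assumes the chunk size is measured in characters and that sentence boundaries are good places to cut.

```python
import re
from typing import List


def split_into_chunks(text: str, max_chars: int = 400) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters."""
    # Split on whitespace that follows ., ! or ? so punctuation stays with its sentence.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single sentence longer than max_chars is kept whole, not cut mid-word.
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_into_chunks("One. Two. Three.", max_chars=9):
        print(repr(chunk))
```

Cutting at sentence boundaries rather than at a fixed character offset keeps each chunk a complete utterance, which tends to matter for prosody when chunks are synthesized independently and joined afterwards.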
