
Overview

ResembleAITTSService provides high-quality text-to-speech synthesis over Resemble AI’s streaming WebSocket API. It returns word-level timestamps and manages audio contexts, so it can handle multiple simultaneous synthesis requests and supports interruptions.

Resemble AI TTS API Reference

Pipecat’s API methods for Resemble AI TTS integration

Example Implementation

Complete example with interruption handling

Resemble AI Documentation

Official Resemble AI API documentation

Sign up

Sign up for a Resemble AI account

Installation

To use Resemble AI services, install the required dependencies:
pip install "pipecat-ai[resemble]"

Prerequisites

Resemble AI Account Setup

Before using Resemble AI TTS services, you need:
  1. Resemble AI Account: Sign up at Resemble AI
  2. API Key: Generate an API key from your account settings
  3. Voice Selection: Choose or create voice UUIDs from your voice library

Required Environment Variables

  • RESEMBLE_API_KEY: Your Resemble AI API key for authentication
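
As a minimal sketch, the key can be read once at startup so that a missing variable fails early instead of surfacing later as an authentication error. The helper name `load_resemble_api_key` is hypothetical and not part of Pipecat.

```python
import os


def load_resemble_api_key(env=None):
    """Return the Resemble AI API key, or raise if it is not set.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to exercise in tests. (Hypothetical helper, not a Pipecat API.)
    """
    env = os.environ if env is None else env
    # Treat an unset variable and a blank value the same way.
    key = (env.get("RESEMBLE_API_KEY") or "").strip()
    if not key:
        raise RuntimeError("RESEMBLE_API_KEY is not set")
    return key
```

The returned value could then be passed as the api_key argument when constructing the service.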

Configuration

ResembleAITTSService

  • api_key (str, required): Resemble AI API key for authentication.
  • voice_id (str, required, deprecated): Voice UUID to use for synthesis. Deprecated in v0.0.105. Use settings=ResembleAITTSService.Settings(voice=...) instead.
  • settings (ResembleAITTSService.Settings, default: None): Runtime-configurable settings. See Settings below.
  • url (str, default: "wss://websocket.cluster.resemble.ai/stream"): WebSocket URL for Resemble AI TTS API.
  • precision (str, default: "PCM_16"): PCM bit depth. Options: PCM_32, PCM_24, PCM_16, or MULAW.
  • output_format (str, default: "wav"): Audio output format (wav or mp3).
  • sample_rate (int, default: 22050): Audio sample rate in Hz. Options: 8000, 16000, 22050, 32000, or 44100.
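
The audio options above accept only a fixed set of values, so it can help to check them before constructing the service. The following is a rough sketch using the values listed on this page; `validate_audio_options` and the constant names are hypothetical, and the service itself may validate differently or not at all.

```python
# Allowed values as listed in the Configuration section of this page.
SUPPORTED_SAMPLE_RATES = (8000, 16000, 22050, 32000, 44100)
SUPPORTED_PRECISIONS = ("PCM_32", "PCM_24", "PCM_16", "MULAW")
SUPPORTED_OUTPUT_FORMATS = ("wav", "mp3")


def validate_audio_options(sample_rate=22050, precision="PCM_16", output_format="wav"):
    """Return the options as a keyword-argument dict, or raise ValueError.

    Hypothetical helper, not a Pipecat API. Defaults mirror the documented
    constructor defaults.
    """
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate}")
    if precision not in SUPPORTED_PRECISIONS:
        raise ValueError(f"unsupported precision: {precision}")
    if output_format not in SUPPORTED_OUTPUT_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format}")
    return {
        "sample_rate": sample_rate,
        "precision": precision,
        "output_format": output_format,
    }
```

The returned dict uses the same names as the constructor arguments, so it could be unpacked with ** alongside api_key and settings.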

Settings

Runtime-configurable settings passed via the settings constructor argument using ResembleAITTSService.Settings(...). These can be updated mid-conversation with TTSUpdateSettingsFrame. See Service Settings for details.
Parameter   Type             Default   Description
model       str              None      Model identifier. (Inherited.)
voice       str              None      Voice identifier. (Inherited.)
language    Language | str   None      Language for synthesis. (Inherited.)

Usage

Basic Setup

import os

from pipecat.services.resembleai import ResembleAITTSService

tts = ResembleAITTSService(
    api_key=os.getenv("RESEMBLE_API_KEY"),
    settings=ResembleAITTSService.Settings(
        voice="your-voice-uuid",
    ),
)

With Custom Settings

import os

from pipecat.services.resembleai import ResembleAITTSService

tts = ResembleAITTSService(
    api_key=os.getenv("RESEMBLE_API_KEY"),
    settings=ResembleAITTSService.Settings(
        voice="your-voice-uuid",
    ),
    sample_rate=16000,
    precision="PCM_16",
    output_format="wav",
)
The InputParams / params= pattern is deprecated as of v0.0.105. Use Settings / settings= instead. See the Service Settings guide for migration details.

Notes

  • Word-level timestamps: Resemble AI provides word-level timing information, enabling synchronized text highlighting and precise interruption handling.
  • Jitter buffering: The service buffers approximately 1 second of audio before starting playback. Resemble AI sends audio in bursts separated by 300-450 ms gaps, and the buffer absorbs those gaps so playback stays continuous.
  • Audio context management: Supports multiple simultaneous synthesis requests with proper context tracking and interruption handling.
  • Default sample rate: Defaults to 22050 Hz. Supported rates are 8000, 16000, 22050, 32000, and 44100 Hz.
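
To illustrate the jitter-buffering note, here is a simplified sketch of a prebuffer that holds audio until about one second has arrived and then passes chunks straight through. It is not the service's actual implementation: the `JitterBuffer` class is hypothetical, and it assumes mono audio with a fixed number of bytes per sample (2 for PCM_16).

```python
from collections import deque


class JitterBuffer:
    """Hold audio chunks until roughly `prebuffer_seconds` of audio is queued.

    Illustrative only; assumes mono audio and a fixed sample width.
    """

    def __init__(self, sample_rate=22050, bytes_per_sample=2, prebuffer_seconds=1.0):
        # For 16-bit mono PCM at 22050 Hz, one second is 44100 bytes.
        self.threshold_bytes = int(sample_rate * bytes_per_sample * prebuffer_seconds)
        self._chunks = deque()
        self._buffered_bytes = 0
        self._started = False

    def push(self, chunk):
        """Add a chunk and return the list of chunks ready to play (may be empty)."""
        if self._started:
            # Playback has begun: pass audio through without extra delay.
            return [chunk]
        self._chunks.append(chunk)
        self._buffered_bytes += len(chunk)
        if self._buffered_bytes < self.threshold_bytes:
            return []
        # Threshold reached: release everything queued so far.
        self._started = True
        ready = list(self._chunks)
        self._chunks.clear()
        self._buffered_bytes = 0
        return ready

    def reset(self):
        """Drop queued audio and prebuffer again, for example after an interruption."""
        self._chunks.clear()
        self._buffered_bytes = 0
        self._started = False
```

With the default settings the threshold is 44100 bytes, which is larger than a single burst, so playback starts only after several bursts have arrived. That is the trade-off: about one second of added start-up latency in exchange for playback that does not stall during the gaps.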

Event Handlers

Resemble AI TTS supports the standard service connection events:
Event                 Description
on_connected          Connected to Resemble AI WebSocket
on_disconnected       Disconnected from Resemble AI WebSocket
on_connection_error   WebSocket connection error occurred
@tts.event_handler("on_connected")
async def on_connected(service):
    print("Connected to Resemble AI")