Overview

OpenAITTSService provides high-quality text-to-speech synthesis using OpenAI’s TTS API with multiple voice models including traditional TTS models and advanced GPT-based models. The service outputs 24kHz PCM audio with streaming capabilities for real-time applications.

OpenAI TTS API Reference

Pipecat’s API methods for OpenAI TTS integration

Example Implementation

Complete example with voice customization

OpenAI Documentation

Official OpenAI TTS API documentation

Voice Samples

Listen to available voice options

Installation

To use OpenAI services, install the required dependencies:
pip install "pipecat-ai[openai]"

Prerequisites

OpenAI Account Setup

Before using OpenAI TTS services, you need:
  1. OpenAI Account: Sign up at OpenAI Platform
  2. API Key: Generate an API key from your API keys page
  3. Voice Selection: Choose from available voice options (alloy, ash, ballad, cedar, coral, echo, fable, marin, nova, onyx, sage, shimmer, verse)
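Before constructing the service, it can be useful to check a requested voice against the documented list. The helper below is a hypothetical sketch (`KNOWN_VOICES` and `validate_voice` are illustrative names, not part of Pipecat's API):

```python
# Hypothetical helper: validate a voice name against the documented list.
# KNOWN_VOICES and validate_voice are illustrative, not part of Pipecat.
KNOWN_VOICES = {
    "alloy", "ash", "ballad", "cedar", "coral", "echo", "fable",
    "marin", "nova", "onyx", "sage", "shimmer", "verse",
}

def validate_voice(voice: str) -> str:
    """Return the voice name if it is documented, otherwise raise ValueError."""
    if voice not in KNOWN_VOICES:
        raise ValueError(
            f"Unknown voice {voice!r}; expected one of {sorted(KNOWN_VOICES)}"
        )
    return voice
```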

Required Environment Variables

  • OPENAI_API_KEY: Your OpenAI API key for authentication
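A minimal sketch for reading the key with a fail-fast check, so a missing variable surfaces at startup rather than on the first API call (the `require_api_key` helper is illustrative, not part of Pipecat):

```python
import os

def require_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, failing fast if it is missing."""
    key = os.getenv(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before starting the bot")
    return key
```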

Configuration

OpenAITTSService

  • api_key (str, default: None): OpenAI API key for authentication. If None, uses the OPENAI_API_KEY environment variable.
  • base_url (str, default: None): Custom base URL for the OpenAI API. If None, uses the default OpenAI endpoint.
  • voice (str, default: "alloy", deprecated): Voice ID to use for synthesis. Options: alloy, ash, ballad, cedar, coral, echo, fable, marin, nova, onyx, sage, shimmer, verse. Deprecated in v0.0.105; use settings=OpenAITTSService.Settings(...) instead.
  • model (str, default: "gpt-4o-mini-tts", deprecated): TTS model to use. Deprecated in v0.0.105; use settings=OpenAITTSService.Settings(...) instead.
  • sample_rate (int, default: None): Output audio sample rate in Hz. If None, uses OpenAI's default of 24kHz. OpenAI TTS only supports 24kHz output.
  • params (InputParams, default: None, deprecated): Runtime-configurable voice and generation settings. Deprecated in v0.0.105; use settings=OpenAITTSService.Settings(...) instead.
  • settings (OpenAITTSService.Settings, default: None): Runtime-configurable settings. See Settings below.

Settings

Runtime-configurable settings passed via the settings constructor argument using OpenAITTSService.Settings(...). These can be updated mid-conversation with TTSUpdateSettingsFrame. See Service Settings for details.
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | None | TTS model identifier. (Inherited from base settings.) |
| voice | str | None | Voice identifier. (Inherited from base settings.) |
| language | Language \| str | None | Language for synthesis. (Inherited from base settings.) |
| instructions | str | NOT_GIVEN | Instructions to guide voice synthesis behavior (e.g. affect, tone, pacing). |
| speed | float | NOT_GIVEN | Voice speed control (0.25 to 4.0). |
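Since speed must stay within the 0.25 to 4.0 range, user-supplied values can be clamped before they reach the service. This is a hypothetical convenience helper (`clamp_speed` is not part of Pipecat's API):

```python
# Hypothetical helper: keep a requested speed inside OpenAI TTS's
# supported 0.25-4.0 range. Not part of Pipecat's API.
def clamp_speed(speed: float) -> float:
    """Clamp a requested playback speed into the supported 0.25-4.0 range."""
    return max(0.25, min(4.0, speed))
```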

Usage

Basic Setup

from pipecat.services.openai import OpenAITTSService

tts = OpenAITTSService(
    api_key=os.getenv("OPENAI_API_KEY"),
    settings=OpenAITTSService.Settings(
        voice="nova",
    ),
)

With Voice Customization

from pipecat.services.openai import OpenAITTSService

tts = OpenAITTSService(
    api_key=os.getenv("OPENAI_API_KEY"),
    settings=OpenAITTSService.Settings(
        voice="coral",
        model="gpt-4o-mini-tts",
        instructions="Speak in a warm, friendly tone with moderate pacing.",
        speed=1.1,
    ),
)

Updating Settings at Runtime

Voice settings can be changed mid-conversation using TTSUpdateSettingsFrame:
from pipecat.frames.frames import TTSUpdateSettingsFrame
from pipecat.services.openai.tts import OpenAITTSSettings

await task.queue_frame(
    TTSUpdateSettingsFrame(
        delta=OpenAITTSSettings(
            instructions="Now speak more formally.",
            speed=0.9,
        )
    )
)
The InputParams / params= pattern is deprecated as of v0.0.105. Use Settings / settings= instead. See the Service Settings guide for migration details.

Notes

  • Fixed sample rate: OpenAI TTS always outputs audio at 24kHz. Using a different sample rate may cause issues.
  • Model selection: The gpt-4o-mini-tts model supports the instructions parameter for controlling voice affect and tone, which traditional TTS models do not support.
  • HTTP-based service: OpenAI TTS uses HTTP streaming, so it does not have WebSocket connection events.
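Because the output rate is fixed at 24kHz, buffer sizing is easy to compute up front. The sketch below assumes 16-bit mono PCM, which is an assumption on top of the "24kHz PCM" stated above (the `pcm_bytes_for_duration` name is illustrative):

```python
# Sketch of buffer-size math for OpenAI TTS output.
# Assumes 16-bit mono PCM at the fixed 24kHz rate; the 16-bit/mono part
# is an assumption, and this helper is not part of Pipecat's API.
SAMPLE_RATE = 24_000      # Hz, fixed by OpenAI TTS
BYTES_PER_SAMPLE = 2      # 16-bit PCM (assumed)
CHANNELS = 1              # mono (assumed)

def pcm_bytes_for_duration(seconds: float) -> int:
    """Return the number of PCM bytes covering the given duration."""
    return int(seconds * SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS)
```

Under these assumptions, one second of audio is 48,000 bytes, which can inform jitter-buffer or chunk sizes in a real-time pipeline.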