Text-to-Speech (TTS)

Convert text into natural-sounding speech.

API Details

Endpoint: POST /v1/audio/speech

Description: Generates audio from input text. Supports various models, voices, and output formats.

Authentication: Bearer Token

```http
Authorization: Bearer YOUR_API_TOKEN
```

Request Parameters

Header Parameters

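| Parameter | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| Authorization | string | Yes | Bearer token authentication | Bearer sk-xxx... |
| Content-Type | string | Yes | Content type of the request body | application/json |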
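
To make the two required headers concrete, here is a minimal sketch that builds (but does not send) the request using only the Python standard library. The base URL is the one used in the code examples on this page; the helper name `build_speech_request` is hypothetical and not part of any SDK.

```python
import json
import urllib.request

# Base URL as it appears in the code examples on this page.
BASE_URL = "https://api.ezmodel.cloud/v1"


def build_speech_request(token: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for /audio/speech without sending it.

    Hypothetical helper for illustration only.
    """
    if not token:
        raise ValueError("an API token is required")
    headers = {
        "Authorization": f"Bearer {token}",   # Bearer token authentication
        "Content-Type": "application/json",   # the body is JSON
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/audio/speech", data=body, headers=headers, method="POST"
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the response body is then the raw audio.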

Body Parameters

| Parameter | Type | Required | Default | Description | Example |
| --- | --- | --- | --- | --- | --- |
| model | string | Yes | - | The ID of the model to use | tts-1, tts-1-hd |
| input | string | Yes | - | The text to generate audio for | Hello, welcome to TTS service. |
| voice | string | Yes | - | The voice to use when generating the audio | alloy, echo, fable, onyx, nova, shimmer |
| response_format | string | No | mp3 | The output audio format | mp3, opus, aac, flac, wav, pcm |
| speed | number | No | 1.0 | The speed of the generated audio | 0.25 to 4.0 |
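
The constraints in the table can be checked on the client before a request is sent. The sketch below is one possible way to do that. It assumes the example values listed for `voice` and `response_format` are the complete set of accepted values, which this page does not state outright; the function name `build_speech_payload` is hypothetical.

```python
# Assumption: the example values in the table are the full set of accepted values.
ALLOWED_VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
ALLOWED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def build_speech_payload(
    model: str,
    text: str,
    voice: str,
    response_format: str = "mp3",
    speed: float = 1.0,
) -> dict:
    """Validate the body parameters and return the JSON-ready payload."""
    if not model:
        raise ValueError("model is required")
    if not text:
        raise ValueError("input is required")
    if voice not in ALLOWED_VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unknown response_format: {response_format!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
```

Calling `build_speech_payload("tts-1", "Hello, welcome to TTS service.", "alloy")` returns a payload with the documented defaults (`mp3`, `1.0`) filled in.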

Response Parameters

Response Content: Returns the binary audio file on success.

Content-Type: Determined by response_format, e.g., audio/mpeg for mp3.
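
Because the success response is raw audio rather than JSON, a client may want to check the Content-Type before writing the body to disk. The following is a minimal sketch of that check. How the service reports errors is not described on this page, so treating any non-audio content type as a failure is an assumption, and `save_audio` is a hypothetical helper.

```python
from pathlib import Path


def save_audio(content_type: str, body: bytes, path: str) -> int:
    """Write the response body to `path` if it looks like audio.

    Returns the number of bytes written. Assumption: a response whose
    Content-Type does not start with "audio/" is not a successful result.
    """
    # Drop any parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if not media_type.startswith("audio/"):
        raise ValueError(f"expected an audio response, got {media_type!r}")
    return Path(path).write_bytes(body)
```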


Code Examples

Python (using OpenAI SDK)

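```python
from openai import OpenAI

# Point the OpenAI SDK at this service instead of the default endpoint.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.ezmodel.cloud/v1",
)

# Stream the audio to disk as it arrives. Calling stream_to_file() on a
# non-streaming response is deprecated in recent versions of the SDK.
with client.audio.speech.with_streaming_response.create(
    model="tts-1",
    voice="alloy",
    input="Hello, welcome to TTS service.",
) as response:
    response.stream_to_file("speech.mp3")
```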

Curl Example

```bash
curl https://api.ezmodel.cloud/v1/audio/speech \
  -H "Authorization: Bearer $YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello, welcome to TTS service.",
    "voice": "alloy"
  }' \
  --output speech.mp3
```

OpenAPI Specification

```yaml
openapi: 3.0.1
info:
  title: ''
  description: ''
  version: 1.0.0
paths:
  /v1/audio/speech:
    post:
      summary: Text-to-Speech
      description: Convert text into natural-sounding speech.
      requestBody:
        content:
          application/json:
            schema:
              type: object
              required:
                - model
                - input
                - voice
              properties:
                model:
                  type: string
                input:
                  type: string
                voice:
                  type: string
      responses:
        '200':
          description: Audio generated successfully
          content:
            audio/mpeg:
              schema:
                type: string
                format: binary
```

Business partnership inquiries: service@ezmodel.cloud