Audio API

The Audio API provides Text-to-Speech (TTS): it converts text into natural-sounding speech in multiple languages.

Endpoint

Text-to-Speech (TTS)


POST https://aiapi.services/v1/audio/speech

Authentication

All requests must include your API key in the HTTP header:


Authorization: Bearer YOUR_API_KEY
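
As a minimal sketch, the header can be built in Python from an environment variable so the key stays out of source code. The variable name `AIAPI_API_KEY` and the helper name `build_auth_headers` are assumptions made for illustration; the API itself does not define them.

```python
import os


def build_auth_headers(env_var="AIAPI_API_KEY"):
    """Return request headers carrying the API key as a Bearer token.

    The environment variable name is a hypothetical choice, not part of the API.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the `headers` argument of an HTTP client call.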

Supported Models

Text-to-Speech (TTS)

text-to-speech-multilingual - Multilingual TTS supporting natural voice synthesis in multiple languages
text-to-speech-neural - Neural network TTS with high-quality natural voice synthesis
text-to-speech-001 - Standard TTS model for basic text-to-speech functionality
text-to-speech-standard - Standard TTS version with stable voice synthesis service

See Available Models for the complete model list.

Text-to-Speech

Request Parameters

Required Parameters

Parameter	Type	Description
`model`	string	Model ID, e.g., `text-to-speech-001`
`input`	string	Text content to convert to speech
`voice`	string	Voice type: `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`

Optional Parameters

Parameter	Type	Default	Description
`response_format`	string	`mp3`	Output format: `mp3`, `opus`, `aac`, `flac`, `wav`, `pcm`
`speed`	number	1.0	Speech speed (0.25 - 4.0)
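
The parameter tables above can be checked on the client before a request is sent. The following is a sketch only: the helper name `build_speech_payload` is hypothetical, the allowed values are copied from the tables, and the 4096-character limit is the one listed under Common Errors. The server remains the authority on what it accepts.

```python
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_INPUT_CHARS = 4096  # limit quoted in the Common Errors list


def build_speech_payload(text, voice="alloy", model="text-to-speech-001",
                         response_format="mp3", speed=1.0):
    """Validate the documented parameters and return the JSON request body."""
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")
    if voice not in VOICES:
        raise ValueError(f"unsupported voice: {voice}")
    if response_format not in FORMATS:
        raise ValueError(f"unsupported response_format: {response_format}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
```

Validating locally avoids spending a round trip on a request that the documented limits already rule out.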

Code Examples

cURL


curl https://aiapi.services/v1/audio/speech \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-to-speech-001",
    "input": "The weather is nice today, perfect for a walk.",
    "voice": "alloy",
    "speed": 1.0
  }' \
  --output speech.mp3

Python


import requests

response = requests.post(
    'https://aiapi.services/v1/audio/speech',
    headers={
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json'
    },
    json={
        'model': 'text-to-speech-001',
        'input': 'The weather is nice today, perfect for a walk.',
        'voice': 'alloy',
        'speed': 1.0
    }
)
# Fail early on an error response instead of saving a JSON error body as audio
response.raise_for_status()

# The response body is binary audio, so write it in binary mode
with open('speech.mp3', 'wb') as f:
    f.write(response.content)

print('Speech file saved as speech.mp3')

JavaScript


const response = await fetch('https://aiapi.services/v1/audio/speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'text-to-speech-001',
    input: 'The weather is nice today, perfect for a walk.',
    voice: 'alloy',
    speed: 1.0
  })
});
 
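// Error responses are JSON, not audio, so check the status first
if (!response.ok) {
  throw new Error(`TTS request failed: ${response.status} ${await response.text()}`);
}

const audioBlob = await response.blob();
const url = URL.createObjectURL(audioBlob);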
 
// Play audio
const audio = new Audio(url);
audio.play();

Go


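package main

import (
  "bytes"
  "encoding/json"
  "fmt"
  "io"
  "net/http"
  "os"
)

// run sends the TTS request and writes the audio to speech.mp3.
func run() error {
  url := "https://aiapi.services/v1/audio/speech"

  payload := map[string]interface{}{
    "model": "text-to-speech-001",
    "input": "The weather is nice today, perfect for a walk.",
    "voice": "alloy",
    "speed": 1.0,
  }

  jsonData, err := json.Marshal(payload)
  if err != nil {
    return err
  }
  req, err := http.NewRequest("POST", url, bytes.NewBuffer(jsonData))
  if err != nil {
    return err
  }
  req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
  req.Header.Set("Content-Type", "application/json")

  resp, err := http.DefaultClient.Do(req)
  if err != nil {
    return err
  }
  defer resp.Body.Close()

  // Error responses are JSON, not audio, so do not save them as speech.mp3
  if resp.StatusCode < 200 || resp.StatusCode >= 300 {
    body, _ := io.ReadAll(resp.Body)
    return fmt.Errorf("request failed: %s: %s", resp.Status, body)
  }

  out, err := os.Create("speech.mp3")
  if err != nil {
    return err
  }
  defer out.Close()

  // Stream the audio body straight to the file
  _, err = io.Copy(out, resp.Body)
  return err
}

func main() {
  if err := run(); err != nil {
    fmt.Fprintln(os.Stderr, err)
    os.Exit(1)
  }
  fmt.Println("Speech file saved as speech.mp3")
}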

Rust


use reqwest::header::{HeaderMap, HeaderValue, AUTHORIZATION, CONTENT_TYPE};
use serde_json::json;
use std::fs::File;
use std::io::Write;
 
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::Client::new();
 
    let mut headers = HeaderMap::new();
    headers.insert(AUTHORIZATION, HeaderValue::from_static("Bearer YOUR_API_KEY"));
    headers.insert(CONTENT_TYPE, HeaderValue::from_static("application/json"));
 
    let payload = json!({
        "model": "text-to-speech-001",
        "input": "The weather is nice today, perfect for a walk.",
        "voice": "alloy",
        "speed": 1.0
    });
 
    let response = client
        .post("https://aiapi.services/v1/audio/speech")
        .headers(headers)
        .json(&payload)
        .send()
        .await?
        // Turn 4xx/5xx responses into an error instead of saving the JSON error body
        .error_for_status()?;
 
    let audio_data = response.bytes().await?;
 
    let mut file = File::create("speech.mp3")?;
    file.write_all(&audio_data)?;
 
    println!("Speech file saved as speech.mp3");
 
    Ok(())
}

PHP


<?php
 
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://aiapi.services/v1/audio/speech");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    "Authorization: Bearer YOUR_API_KEY",
    "Content-Type: application/json"
]);
 
$data = [
    "model" => "text-to-speech-001",
    "input" => "The weather is nice today, perfect for a walk.",
    "voice" => "alloy",
    "speed" => 1.0
];
 
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));
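$response = curl_exec($ch);
$status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

// On failure the body is a JSON error (or false), not audio
if ($response === false || $status >= 400) {
    fwrite(STDERR, "Request failed with HTTP status $status\n");
    exit(1);
}

file_put_contents('speech.mp3', $response);
echo "Speech file saved as speech.mp3\n";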
?>

Ruby


require 'net/http'
require 'json'
 
uri = URI('https://aiapi.services/v1/audio/speech')
request = Net::HTTP::Post.new(uri)
request['Authorization'] = 'Bearer YOUR_API_KEY'
request['Content-Type'] = 'application/json'
 
request.body = {
  model: 'text-to-speech-001',
  input: 'The weather is nice today, perfect for a walk.',
  voice: 'alloy',
  speed: 1.0
}.to_json
 
response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
  http.request(request)
end
 
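# Error responses are JSON, not audio
abort("Request failed: #{response.code} #{response.body}") unless response.is_a?(Net::HTTPSuccess)

# binwrite writes raw bytes, so the audio is not altered by newline translation
File.binwrite('speech.mp3', response.body)
puts "Speech file saved as speech.mp3"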

Response Format

Success Response

On success, the response body is binary audio data, not JSON. The `Content-Type` header reflects the requested `response_format`, and `Content-Length` gives the size in bytes:


Content-Type: audio/mpeg              # MP3 format
Content-Type: audio/opus              # Opus format
Content-Type: audio/aac               # AAC format
Content-Type: audio/flac              # FLAC format
Content-Type: audio/wav               # WAV format
Content-Type: audio/pcm               # PCM format

Content-Length: 45678                 # File size (bytes)
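
When the output format is chosen at runtime, the `Content-Type` header can be used to pick a file extension. This is a sketch under the assumption that the server returns exactly the media types listed above; the helper name `extension_for` is hypothetical.

```python
# Media types copied from the header list above
CONTENT_TYPE_TO_EXT = {
    "audio/mpeg": "mp3",
    "audio/opus": "opus",
    "audio/aac": "aac",
    "audio/flac": "flac",
    "audio/wav": "wav",
    "audio/pcm": "pcm",
}


def extension_for(content_type):
    """Map a Content-Type header value to a file extension."""
    # Drop any parameters such as "; charset=..." and normalise case
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in CONTENT_TYPE_TO_EXT:
        raise ValueError(f"not a known audio type: {content_type}")
    return CONTENT_TYPE_TO_EXT[media_type]
```

A non-audio type such as `application/json` raises `ValueError`, which is a convenient signal that the body is an error message rather than audio.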

Usage:

Save to File


# Save as file
with open('output.mp3', 'wb') as f:
    f.write(response.content)

Stream Playback


// Play in browser (the full response is downloaded before playback starts)
const audioBlob = await response.blob();
const audioUrl = URL.createObjectURL(audioBlob);
const audio = new Audio(audioUrl);
audio.play();

In-Memory Processing


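# Process in memory
from io import BytesIO
import pygame

audio_bytes = BytesIO(response.content)
pygame.mixer.init()
pygame.mixer.music.load(audio_bytes)
pygame.mixer.music.play()

# play() returns immediately, so wait until playback finishes before exiting
while pygame.mixer.music.get_busy():
    pygame.time.wait(100)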

Audio Format Comparison

Format	File Size	Quality	Compatibility	Recommended Use
mp3	Medium	Good	Excellent	General purpose, default
opus	Smallest	Excellent	Good	Bandwidth-limited, real-time
aac	Medium	Excellent	Good	iOS/Mac applications
flac	Large	Lossless	Fair	High-quality audio needs
wav	Largest	Lossless	Excellent	Professional audio
pcm	Largest	Lossless	Poor	Low-level audio development

File Size Estimation

Approximate relationship between text length and audio file size (MP3 format):

Text Length	Audio Duration	MP3 File Size
100 chars	~10 seconds	~20KB
500 chars	~50 seconds	~100KB
1000 chars	~100 seconds	~200KB
4096 chars (max)	~400 seconds	~800KB
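
The table implies roughly 10 characters per second of audio and about 1 KB of MP3 per 5 characters. A small sketch of that arithmetic follows; the function name `estimate_mp3` is hypothetical, and the ratios are read off the table rather than guaranteed by the API, so actual results will vary with voice, speed, and language.

```python
CHARS_PER_SECOND = 10  # from the table: 100 chars is about 10 seconds
CHARS_PER_KB = 5       # from the table: 100 chars is about 20 KB


def estimate_mp3(chars):
    """Return (approximate seconds, approximate kilobytes) for a text length."""
    seconds = chars / CHARS_PER_SECOND
    kilobytes = chars / CHARS_PER_KB
    return seconds, kilobytes
```

For example, `estimate_mp3(500)` gives 50 seconds and 100 KB, matching the second row of the table.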

Error Response

When a request fails, a JSON-formatted error is returned instead of audio. See the Error Handling documentation for details.


{
  "code": "invalid_request_error",
  "message": "Invalid parameter: input text too long",
  "data": null
}

Common Errors:

input_too_long - Text exceeds maximum length (4096 characters)
invalid_voice - Unsupported voice type
quota_not_enough - Insufficient quota
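
One way to surface these errors in client code is to parse the JSON body into an exception. The sketch below assumes the `code` and `message` fields shown in the example above; the class name `SpeechAPIError`, the function name `parse_error`, and the `unknown_error` fallback code are illustrative and not part of the API.

```python
import json


class SpeechAPIError(Exception):
    """Error returned by the speech endpoint (hypothetical client-side type)."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_error(body):
    """Build a SpeechAPIError from a raw error response body (bytes)."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None
    if not isinstance(payload, dict):
        # Not the documented JSON shape; keep the raw text for debugging
        return SpeechAPIError("unknown_error", body.decode("utf-8", errors="replace"))
    return SpeechAPIError(payload.get("code", "unknown_error"),
                          payload.get("message", ""))
```

Callers can then branch on `err.code`, for example to shorten the text on `input_too_long` or to stop retrying on `quota_not_enough`.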

Voice Types

Voice Type	Characteristics	Use Cases
alloy	Neutral, clear	General purpose
echo	Male, steady	Business, news
fable	Warm, friendly	Storytelling
onyx	Deep, authoritative	Formal occasions
nova	Female, energetic	Advertising, marketing
shimmer	Soft, elegant	Assistant, customer service

Best Practices

Performance Optimization:

Maximum text length per request: 4096 characters
For longer texts, process in segments
Use appropriate speech speed; default 1.0 is most natural
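
Segmenting longer texts can be sketched as follows. The helper name `split_text` is hypothetical, and splitting on sentence-ending punctuation is one possible strategy rather than a documented requirement; each returned chunk stays within the 4096-character limit and can be sent as a separate request.

```python
import re

MAX_CHARS = 4096  # per-request limit stated above


def split_text(text, max_chars=MAX_CHARS):
    """Split text into chunks of at most max_chars, preferring sentence breaks."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

The resulting audio segments then need to be played or concatenated in order on the client side.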

Important Notes:

Generated audio file size is proportional to text length
Generation time may vary slightly between voice types
Use HTTPS to ensure secure audio data transmission

Related Resources

Authentication - Learn how to get and use API keys
Available Models - View complete model list and pricing
Music API - AI music generation features
