API Reference

Voice API

Text-to-speech (TTS) for natural voice output and speech-to-text (STT) for fast transcription with language detection.

Base URL

https://api.dyva.ai/v1

Text-to-Speech

Convert text to speech. The response is a raw binary audio stream (Content-Type: audio/mpeg).

POST /v1/voice/tts (Auth Required)

Synthesize speech from text. Returns an audio/mpeg binary stream.

Request Body

Name	Type	Required	Description
`text`	string	Required	The text to synthesize into speech. Maximum 5,000 characters per request.
`voice_id`	string	Optional	The voice to use for synthesis. Defaults to "alloy" (Dyva's default voice). See the voice table below for all available options.
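Because each request is capped at 5,000 characters, longer input has to be split before synthesis. The following is a minimal sketch of one way to do that, preferring to break at spaces. The helper name `split_for_tts` is hypothetical and not part of the Dyva API, and the sketch assumes the limit counts Python string characters.

```python
MAX_TTS_CHARS = 5000  # per-request limit stated in the request body table


def split_for_tts(text: str, limit: int = MAX_TTS_CHARS) -> list[str]:
    """Split text into pieces of at most `limit` characters.

    Breaks at the last space inside each window when one exists,
    otherwise falls back to a hard cut at `limit`.
    """
    if limit < 1:
        raise ValueError("limit must be positive")
    pieces = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Search for the last space at or before index `limit`.
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable space: cut mid-word
        pieces.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        pieces.append(remaining)
    return pieces
```

Each piece can then be sent as the `text` field of its own request and the resulting audio played or stored in order.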

Examples

cURL

curl -X POST https://api.dyva.ai/v1/voice/tts \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello from Dyva!", "voice_id": "alloy"}' \
  --output speech.mp3

Note: There is no JSON wrapper. The response body is raw MP3 binary, so stream it directly to a file or audio player.
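For reference, the same call can be sketched in Python. This version uses only the standard library (`urllib.request`) so it has no third-party dependency. The function names `build_tts_request` and `synthesize_to_file` are illustrative, not part of any Dyva SDK, and error handling is left out for brevity.

```python
import json
import urllib.request

API_URL = "https://api.dyva.ai/v1/voice/tts"


def build_tts_request(text: str, api_key: str, voice_id: str = "alloy") -> urllib.request.Request:
    """Build the POST request for the TTS endpoint without sending it."""
    body = json.dumps({"text": text, "voice_id": voice_id}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def synthesize_to_file(text: str, api_key: str, path: str, voice_id: str = "alloy") -> None:
    """Send the request and write the raw MP3 response body to `path`."""
    request = build_tts_request(text, api_key, voice_id)
    # The body is audio bytes, not JSON, so it is written out unchanged.
    with urllib.request.urlopen(request) as response, open(path, "wb") as f:
        f.write(response.read())
```

Calling `synthesize_to_file("Hello from Dyva!", "YOUR_API_KEY", "speech.mp3")` should mirror the cURL command above, assuming the endpoint behaves as documented.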

Speech-to-Text

Transcribe audio to text. Upload as multipart/form-data with the field name audio.

POST /v1/voice/stt (Auth Required)

Transcribe an audio file to text. Accepts multipart/form-data.

Request Body

Name	Type	Required	Description
`audio`	file	Required	The audio file to transcribe. Supported formats: mp3, wav, ogg, webm. Maximum file size: 25 MB.

Response

{
  "text": "Hello from Dyva!",
  "confidence": 0.97,
  "language": "en"
}

Examples

cURL

curl -X POST https://api.dyva.ai/v1/voice/stt \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "audio=@recording.mp3"
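Once the transcription comes back, the JSON body can be loaded into a small typed structure. The sketch below parses the three fields shown in the sample response. The `Transcription` class and the `needs_review` policy are hypothetical, and the 0.8 cutoff is an arbitrary illustration that assumes `confidence` ranges from 0 to 1, which the sample suggests but this page does not state.

```python
import json
from dataclasses import dataclass


@dataclass
class Transcription:
    text: str
    confidence: float
    language: str


def parse_transcription(raw: str) -> Transcription:
    """Parse the STT JSON body into a Transcription."""
    data = json.loads(raw)
    return Transcription(
        text=data["text"],
        confidence=float(data["confidence"]),
        language=data["language"],
    )


def needs_review(result: Transcription, threshold: float = 0.8) -> bool:
    """Hypothetical policy: flag transcripts below an arbitrary confidence cutoff."""
    return result.confidence < threshold
```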

Available Voices

Pass any of these values as the voice_id parameter in the TTS endpoint.

Voice ID	Description
`alloy`	Neutral and balanced — Dyva's default voice
`echo`	Warm and resonant
`fable`	Expressive and animated
`nova`	Bright and energetic
`onyx`	Deep and authoritative
`shimmer`	Soft and clear

Supported Audio Formats

STT accepts the following formats. TTS returns MP3.

.mp3

.wav

.ogg

.webm
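A client can reject unsuitable files before uploading them. The following sketch checks the extension against the four formats listed above and the size against the 25 MB limit from the STT request body table. `check_audio_file` is a hypothetical helper. It reads "25 MB" as 25,000,000 bytes, the more conservative interpretation, and checks only the file name, not the actual audio encoding.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".webm"}
# Assumption: "25 MB" is read as 25,000,000 bytes (the stricter reading).
MAX_UPLOAD_BYTES = 25 * 1000 * 1000


def check_audio_file(path: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

For a file on disk, the size can be obtained with `Path(path).stat().st_size`.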

Rate Limits

Endpoint	Free Tier	Pro Tier
`/v1/voice/tts`	100 requests / minute	1,000 requests / minute
`/v1/voice/stt`	60 requests / minute	600 requests / minute

Rate limit headers (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset) are included in every response. Exceeding the limit returns a 429 Too Many Requests response.
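A client can use these headers to decide how long to back off after a 429. The sketch below is one possible approach. It assumes `X-RateLimit-Reset` holds a Unix timestamp in seconds, which this page does not specify; if the header instead carries a number of seconds until reset, the subtraction should be dropped. The function name and the one-second fallback are illustrative choices.

```python
import time
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how many seconds to wait before retrying (0.0 if no wait is needed)."""
    if status != 429:
        return 0.0
    now = time.time() if now is None else now
    # Note: a plain dict lookup is case-sensitive; HTTP libraries usually
    # provide a case-insensitive headers mapping.
    reset = headers.get("X-RateLimit-Reset")
    if reset is None:
        return 1.0  # arbitrary fallback when the header is missing
    try:
        reset_at = float(reset)  # assumed to be a Unix timestamp in seconds
    except ValueError:
        return 1.0  # arbitrary fallback when the header is not numeric
    return max(0.0, reset_at - now)
```

The caller would sleep for the returned duration and then resend the request.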

Streaming TTS Example

Stream the audio response for low-latency playback instead of waiting for the full download.

Python

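import requests

# stream=True defers the body download so chunks can be handled as they arrive.
response = requests.post(
    "https://api.dyva.ai/v1/voice/tts",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"text": "Stream me in real time.", "voice_id": "shimmer"},
    stream=True,
)
# Fail on 4xx/5xx responses instead of saving an error body as audio.
response.raise_for_status()

# Write each chunk to disk as soon as it is received.
with open("stream.mp3", "wb") as f:
    for chunk in response.iter_content(chunk_size=4096):
        f.write(chunk)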
