Text-to-speech (TTS) for natural voice output and speech-to-text (STT) for fast transcription with language detection.
Base URL: https://api.dyva.ai/v1

Convert text to speech. The response is a raw binary audio stream (Content-Type: audio/mpeg).

POST /v1/voice/tts (Auth Required)
Synthesize speech from text. Returns an audio/mpeg binary stream.
| Name | Type | Required | Description |
|---|---|---|---|
| text | string | Required | The text to synthesize into speech. Maximum 5,000 characters per request. |
| voice_id | string | Optional | The voice to use for synthesis. Defaults to "alloy" (Dyva's default voice). See the voice table below for all available options. |
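Because each request is capped at 5,000 characters, longer text has to be sent as several requests. The following is a minimal sketch of one way to split text on sentence boundaries before calling the endpoint. The helper name `split_text` is hypothetical and not part of the Dyva API, and the source does not say how the resulting audio segments should be joined.

```python
import re

MAX_CHARS = 5000  # per-request limit stated in the parameter table


def split_text(text: str, limit: int = MAX_CHARS) -> list:
    """Split text into chunks of at most `limit` characters.

    Hypothetical helper: prefers sentence boundaries and hard-splits
    any single sentence that is longer than the limit.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A sentence longer than the limit cannot fit in any chunk,
        # so flush what we have and cut it into limit-sized pieces.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as the `text` field of its own TTS request.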
curl -X POST https://api.dyva.ai/v1/voice/tts \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"text": "Hello from Dyva!", "voice_id": "alloy"}' \
--output speech.mp3

Transcribe audio to text. Upload the file as multipart/form-data with the field name audio.

POST /v1/voice/stt (Auth Required)
Transcribe an audio file to text. Accepts multipart/form-data.
| Name | Type | Required | Description |
|---|---|---|---|
| audio | file | Required | The audio file to transcribe. Supported formats: mp3, wav, ogg, webm. Maximum file size: 25 MB. |
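The format and size constraints above can be checked locally before uploading, which avoids a wasted request. This is a rough sketch only: `check_audio` is a hypothetical helper, it judges format by file extension rather than by inspecting the audio, and it assumes "25 MB" means 25,000,000 bytes (the source does not specify).

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".webm"}
# Assumption: 25 MB is read conservatively as 25,000,000 bytes.
MAX_BYTES = 25_000_000


def check_audio(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

For a file on disk, the size can be obtained with `Path(filename).stat().st_size` and passed as `size_bytes`.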
{
  "text": "Hello from Dyva!",
  "confidence": 0.97,
  "language": "en"
}

curl -X POST https://api.dyva.ai/v1/voice/stt \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "audio=@recording.mp3"

Pass any of these values as the voice_id parameter of the TTS endpoint.
| Voice ID | Description |
|---|---|
| alloy | Neutral and balanced (Dyva's default voice) |
| echo | Warm and resonant |
| fable | Expressive and animated |
| nova | Bright and energetic |
| onyx | Deep and authoritative |
| shimmer | Soft and clear |
STT accepts mp3, wav, ogg, and webm files. TTS returns MP3.

Rate limits apply per endpoint and tier:
| Endpoint | Free Tier | Pro Tier |
|---|---|---|
| /v1/voice/tts | 100 requests / minute | 1,000 requests / minute |
| /v1/voice/stt | 60 requests / minute | 600 requests / minute |
Rate limit headers (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset) are included in every response. Exceeding the limit returns a 429 Too Many Requests response.
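A client can use these headers to decide how long to pause after a 429. The sketch below is illustrative only: `seconds_until_reset` is a hypothetical helper, and it assumes X-RateLimit-Reset carries a Unix timestamp in seconds, which the source does not confirm. If the header instead holds a number of seconds remaining, the subtraction should be dropped.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(
    headers: Mapping[str, str],
    now: Optional[float] = None,
    default: float = 1.0,
) -> float:
    """Estimate how long to wait before retrying after a 429 response.

    Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
    Falls back to `default` when the header is missing or unparsable.
    """
    now = time.time() if now is None else now
    try:
        reset_at = float(headers["X-RateLimit-Reset"])
    except (KeyError, TypeError, ValueError):
        return default
    # Never return a negative wait if the reset time has already passed.
    return max(reset_at - now, 0.0)
```

With the `requests` library, `response.headers` can be passed directly when `response.status_code == 429`, followed by `time.sleep(...)` on the returned value before retrying.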
Stream the audio response for low-latency playback instead of waiting for the full download.
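import requests

# stream=True defers the body download so chunks can be written as they arrive.
response = requests.post(
    "https://api.dyva.ai/v1/voice/tts",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"text": "Stream me in real time.", "voice_id": "shimmer"},
    stream=True,
)
response.raise_for_status()  # fail early instead of writing an error body to the file

with open("stream.mp3", "wb") as f:
    for chunk in response.iter_content(chunk_size=4096):
        f.write(chunk)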