Models - Trelis Router

Available Models

Speech-to-text and text-to-speech models with transparent pricing. Use Trelis-hosted models directly, or bring your own API keys (BYOK) for third-party providers.

Sync transcription via POST /api/v1/transcribe. Models with the Batch badge also support async jobs via POST /api/v1/jobs.

Trelis Models

ASR Trelis Batch Trelis

$0.59/hr

Multi-speaker Transcription

Premium multilingual speaker-attributed transcription built for 4+ concurrent speakers from a single audio input. Returns timestamped, speaker-labelled segments. Supports automatic language detection, optional language hints, and long-form audio with consistent speaker labels throughout. Trelis-hosted (no BYOK).

Native 4+ speaker diarization · Automatic language detection · 99+ languages · Word-/segment-level timestamps · Long-form audio support · Consistent speakers across long recordings

Provider	Trelis
Cost	$0.59 / hour of audio
Input Formats	mp3, wav, flac, m4a, ogg, webm, aac, mp4
Output Formats	json, vtt, srt, text
Languages	en, es, fr, de, it, pt, nl, ja, ko, zh + 10 more
Size Limits	Max 400 MB · Max 3 hrs
Rate Limits	Managed by Trelis; contact us for high-volume or dedicated-capacity workloads
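
As a rough worked example of the per-hour pricing above, the sketch below converts an audio duration into an estimated charge. The function name estimate_asr_cost is hypothetical, and the sketch assumes billing is linear in audio duration with no minimum charge or rounding, which this page does not state.

```python
def estimate_asr_cost(duration_seconds: float, rate_per_hour: float = 0.59) -> float:
    """Estimate the charge in dollars, assuming linear per-hour billing (an assumption)."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return duration_seconds / 3600 * rate_per_hour


# A 90-minute recording at the listed $0.59 / hour of audio
print(f"${estimate_asr_cost(90 * 60):.3f}")
```

Passing a different rate_per_hour gives the same estimate for any other model on this page.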

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/transcribe",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "trelis/chorus-pro"},
)
print(resp.json()["text"])

Batch API Call

import httpx, time

# Submit batch job
resp = httpx.post(
    "https://router.trelis.com/api/v1/jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "trelis/chorus-pro"},
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)
print(r.get("text", r.get("error")))
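
The polling loop above runs until the job reaches a terminal status. For unattended scripts it may be worth bounding the wait. The following is a minimal sketch, not part of the Trelis API: poll_job is a hypothetical helper that takes any callable returning the job's JSON as a dict, treats every status other than "completed" or "failed" as still in progress, and raises TimeoutError once the deadline would be exceeded.

```python
import time
from typing import Callable


def poll_job(
    fetch_status: Callable[[], dict],
    interval: float = 3.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_status until the job is completed or failed, or raise on timeout."""
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        if result.get("status") in ("completed", "failed"):
            return result
        # Give up rather than sleep past the deadline
        if clock() + interval > deadline:
            raise TimeoutError(f"job still {result.get('status')!r} after {timeout}s")
        sleep(interval)
```

With httpx, fetch_status could be a lambda that performs the same GET request as the loop above and returns its .json(). The sleep and clock parameters exist only so the helper can be exercised without real waiting.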

ASR Trelis Batch Trelis

$0.25/hr

Whisper Hinglish (Preview)

Whisper-large-v3 fine-tune specialised for Hinglish (Hindi-English code-switched) speech, with strong pure-Hindi and English transcription. Trelis-hosted on Modal. Supports an optional code-switch mode (<|mixedcode|>) that keeps English words in Latin script; on by default.

Hindi-English code-switching · Code-switch token (<|mixedcode|>) keeps English in Latin script · Automatic energy-based chunking for long audio · Hindi & English

Provider	Trelis · API Docs
Cost	$0.25 / hour of audio
Input Formats	mp3, wav, flac, m4a, ogg, webm, aac, mp4
Output Formats	json, txt
Languages	hi, en
Size Limits	Max 1000 MB
Rate Limits	Managed by Trelis; contact us for high-volume workloads.

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/transcribe",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "trelis/whisper-hinglish"},
)
print(resp.json()["text"])

Batch API Call

import httpx, time

# Submit batch job
resp = httpx.post(
    "https://router.trelis.com/api/v1/jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "trelis/whisper-hinglish"},
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)
print(r.get("text", r.get("error")))

ASR Trelis Trelis

$0.19/hr

OmniASR 1B

Trelis-hosted OmniASR LLM Unlimited 1B for long-tail multilingual speech recognition across 1,600+ languages. You can use automatic language detection or provide a language hint for better accuracy. No word-level timestamps.

1,600+ languages · Automatic language detection · Optional language hints · Optimized managed serving

Provider	Trelis · API Docs
Cost	$0.19 / hour of audio
Input Formats	mp3, wav, flac, m4a, ogg, webm, aac, mp4
Output Formats	json, txt
Languages	auto, 1,600+ supported language codes
Size Limits	Max 1000 MB
Rate Limits	Managed by Trelis; contact us for high-volume or dedicated-capacity workloads

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/transcribe",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "trelis/omniasr-1b"},
)
print(resp.json()["text"])

BYOK Models

ASR BYOK Batch AssemblyAI

$0.21/hr

Universal-3 Pro

High-accuracy batch speech-to-text with word-level timestamps and prompting capabilities. Supports 6 languages with automatic language detection.

Word-level timestamps · Auto language detection · 6 languages

Provider	AssemblyAI · API Docs
Cost	$0.21 / hour of audio
Input Formats	mp3, wav, flac, m4a, ogg, webm, aac, mp4
Output Formats	json, srt, vtt
Languages	en, es, pt, fr, de, it
Size Limits	Max 5000 MB · Max 10 hrs
Rate Limits	Free: 5 concurrent; Paid: 200+ concurrent (auto-scaling); Global: 20,000 req/5min
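
Because these limits are expressed as concurrent requests, a client transcribing many files may want to cap how many calls are in flight. The sketch below is one way to do that with the standard library. transcribe_all and the transcribe_one callable are hypothetical names, and the default of 5 mirrors the free-tier figure in the table; whether the router enforces the same ceiling is not stated here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def transcribe_all(
    paths: Iterable[str],
    transcribe_one: Callable[[str], str],
    max_concurrent: int = 5,
) -> list[str]:
    """Run transcribe_one over paths with at most max_concurrent calls in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # pool.map returns results in the same order as the input paths
        return list(pool.map(transcribe_one, paths))
```

Here transcribe_one would wrap the sync call shown in the sample for this model and return the transcript text.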

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/transcribe",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "assemblyai/universal-3-pro"},
)
print(resp.json()["text"])

Batch API Call

import httpx, time

# Submit batch job
resp = httpx.post(
    "https://router.trelis.com/api/v1/jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "assemblyai/universal-3-pro"},
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)
print(r.get("text", r.get("error")))

ASR BYOK Batch ElevenLabs

$0.40/hr

Scribe v2

High-accuracy batch speech-to-text with word-level timestamps. Supports 99 languages with automatic language detection.

Word-level timestamps · Auto language detection · 99 languages

Provider	ElevenLabs · API Docs
Cost	$0.40 / hour of audio
Input Formats	mp3, wav, m4a, flac, ogg, webm, aac, opus
Output Formats	json, srt, txt, html, docx, pdf, segmented_json
Languages	en, es, fr, de, it, pt, nl, ja, ko, zh + 10 more
Size Limits	Max 3000 MB · Max 10 hrs
Rate Limits	Concurrency by tier: Free=8, Starter=12, Creator=20, Pro=40, Scale/Business=60; files >8min chunked into up to 4 segments; multi-channel limited to 1hr/5ch

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/transcribe",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "elevenlabs/scribe-v2"},
)
print(resp.json()["text"])

Batch API Call

import httpx, time

# Submit batch job
resp = httpx.post(
    "https://router.trelis.com/api/v1/jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "elevenlabs/scribe-v2"},
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)
print(r.get("text", r.get("error")))

ASR BYOK Batch Speechmatics

$0.75/hr

Ursa 2 Enhanced

Enterprise-grade batch transcription powered by Ursa 2 in enhanced accuracy mode. Best-in-class accuracy across 70 languages.

Word-level timestamps · 70 languages

Provider	Speechmatics · API Docs
Cost	$0.75 / hour of audio
Input Formats	mp3, wav, m4a, flac, ogg, aac, amr, mp4, mpeg
Output Formats	json-v2, txt, srt
Languages	en, es, fr, de, it, pt, nl, ja, ko, zh + 20 more
Size Limits	Max 1000 MB
Rate Limits	10 new jobs/s, 50 status requests/s, max 20,000 concurrent jobs; monthly quota: Free=2hrs, Paid=6,000hrs
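
The "10 new jobs/s" figure suggests spacing out submissions when queueing a large backlog. A minimal pacing sketch follows. submit_paced and its submit callable are hypothetical, and the helper only enforces a minimum gap between calls on the client side; it does not read any rate-limit headers.

```python
import time
from typing import Callable, Iterable


def submit_paced(
    items: Iterable[str],
    submit: Callable[[str], str],
    max_per_second: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> list[str]:
    """Call submit(item) for each item, keeping calls at least 1/max_per_second apart."""
    min_gap = 1.0 / max_per_second
    results = []
    last = None
    for item in items:
        if last is not None:
            wait = min_gap - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        results.append(submit(item))
    return results
```

In practice submit would post one file to the jobs endpoint and return the job_id.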

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/transcribe",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "speechmatics/ursa-2-enhanced"},
)
print(resp.json()["text"])

Batch API Call

import httpx, time

# Submit batch job
resp = httpx.post(
    "https://router.trelis.com/api/v1/jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "speechmatics/ursa-2-enhanced"},
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)
print(r.get("text", r.get("error")))

ASR BYOK Batch Deepgram

$0.55/hr

Nova-3

Deepgram's latest and most accurate speech-to-text model with word-level timestamps. Supports 50+ languages with automatic language detection.

Word-level timestamps · Auto language detection · 50+ languages

Provider	Deepgram · API Docs
Cost	$0.55 / hour of audio
Input Formats	mp3, wav, flac, m4a, ogg, opus, webm, aac, mp4, mp2, pcm
Output Formats	json
Languages	en, es, fr, de, it, pt, nl, ja, ko, zh + 24 more
Rate Limits	Free: $200 credits; Pay-as-you-go: no documented rate limits; per-second billing

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/transcribe",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "deepgram/nova-3"},
)
print(resp.json()["text"])

Batch API Call

import httpx, time

# Submit batch job
resp = httpx.post(
    "https://router.trelis.com/api/v1/jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "deepgram/nova-3"},
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)
print(r.get("text", r.get("error")))

ASR BYOK Batch Google

$0.24/hr

Gemini 2.5 Pro

Google's most capable multimodal model used for audio transcription via prompted generation. Supports 50+ languages with automatic language detection. No word-level timestamps. Audio duration and cost are estimated from token count (32 tokens/second), not reported by the API. Hard 20 MB file size limit — use AssemblyAI, ElevenLabs, or Deepgram for larger files.

Auto language detection · 50+ languages

Provider	Google · API Docs
Cost	$0.24 / hour of audio
Input Formats	wav, mp3, aac, ogg, flac, m4a
Output Formats	json
Languages	en, es, fr, de, it, pt, nl, ja, ko, zh + 26 more
Size Limits	Max 20 MB · Max 10 hrs
Rate Limits	Free: 250K TPM, varies by model; Paid: higher limits; see ai.google.dev/gemini-api/docs/rate-limits
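
The description above says duration and cost are estimated from token count at 32 tokens per second. The arithmetic can be sketched as below. This is the simplest reading of that sentence, not the router's confirmed formula, and the function names are hypothetical.

```python
TOKENS_PER_SECOND = 32   # from the model description
RATE_PER_HOUR = 0.24     # listed price for this model, dollars per hour of audio


def duration_from_tokens(audio_tokens: int) -> float:
    """Estimated audio duration in seconds."""
    return audio_tokens / TOKENS_PER_SECOND


def cost_from_tokens(audio_tokens: int, rate_per_hour: float = RATE_PER_HOUR) -> float:
    """Estimated cost in dollars for the given number of audio tokens."""
    return duration_from_tokens(audio_tokens) / 3600 * rate_per_hour
```

For example, 115,200 audio tokens correspond to 3,600 seconds, so one hour of audio at the listed rate.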

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/transcribe",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "google/gemini-2.5-pro"},
)
print(resp.json()["text"])

Batch API Call

import httpx, time

# Submit batch job
resp = httpx.post(
    "https://router.trelis.com/api/v1/jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "google/gemini-2.5-pro"},
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)
print(r.get("text", r.get("error")))

ASR BYOK Batch Google

$0.15/hr

Gemini 2.5 Flash

Google's fast, cost-efficient multimodal model used for audio transcription via prompted generation. Supports 50+ languages with automatic language detection. No word-level timestamps. Audio duration and cost are estimated from token count (32 tokens/second), not reported by the API. Hard 20 MB file size limit — use AssemblyAI, ElevenLabs, or Deepgram for larger files.

Auto language detection · 50+ languages

Provider	Google · API Docs
Cost	$0.15 / hour of audio
Input Formats	wav, mp3, aac, ogg, flac, m4a
Output Formats	json
Languages	en, es, fr, de, it, pt, nl, ja, ko, zh + 26 more
Size Limits	Max 20 MB · Max 10 hrs
Rate Limits	Free: 250K TPM, varies by model; Paid: higher limits; see ai.google.dev/gemini-api/docs/rate-limits
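
Given the 20 MB limit and the suggestion to use another provider for larger files, a client could choose the model from the file size before uploading. The sketch below uses only size limits listed on this page (Deepgram is omitted because no size limit is listed for it). pick_model is a hypothetical helper, and treating 1 MB as 1,000,000 bytes is an assumption.

```python
MB = 1_000_000  # assumption: the page does not say whether MB means 10^6 or 2^20 bytes

# Size limits in MB as listed on this page
SIZE_LIMITS_MB = {
    "google/gemini-2.5-flash": 20,
    "elevenlabs/scribe-v2": 3000,
    "assemblyai/universal-3-pro": 5000,
}


def pick_model(size_bytes: int, preferred: str = "google/gemini-2.5-flash") -> str:
    """Return preferred if the file fits, else the first listed model that accepts it."""
    if size_bytes <= SIZE_LIMITS_MB[preferred] * MB:
        return preferred
    for model, limit_mb in SIZE_LIMITS_MB.items():
        if size_bytes <= limit_mb * MB:
            return model
    raise ValueError(f"{size_bytes} bytes exceeds every listed size limit")
```

The size can come from os.path.getsize("audio.mp3"). The fallback order here is simply the dictionary order, not a recommendation based on price or accuracy.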

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/transcribe",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "google/gemini-2.5-flash"},
)
print(resp.json()["text"])

Batch API Call

import httpx, time

# Submit batch job
resp = httpx.post(
    "https://router.trelis.com/api/v1/jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "google/gemini-2.5-flash"},
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)
print(r.get("text", r.get("error")))

ASR BYOK Batch Google

$0.37/hr

Gemini 3.1 Pro (Preview)

Google's most capable Gemini 3 series model used for audio transcription via prompted generation. Supports 50+ languages with automatic language detection. No word-level timestamps. Audio duration and cost are estimated from token count (32 tokens/second), not reported by the API. Hard 20 MB inline file size limit — use AssemblyAI, ElevenLabs, or Deepgram for larger files. Preview model: replaces the discontinued gemini-3-pro-preview.

Auto language detection · 50+ languages

Provider	Google · API Docs
Cost	$0.37 / hour of audio
Input Formats	wav, mp3, aac, ogg, flac, m4a
Output Formats	json
Languages	en, es, fr, de, it, pt, nl, ja, ko, zh + 26 more
Size Limits	Max 20 MB · Max 8 hrs
Rate Limits	Paid only (no free tier); standard input $2/1M (≤200k ctx), $4/1M (>200k); see ai.google.dev/gemini-api/docs/rate-limits

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/transcribe",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "google/gemini-3.1-pro-preview"},
)
print(resp.json()["text"])

Batch API Call

import httpx, time

# Submit batch job
resp = httpx.post(
    "https://router.trelis.com/api/v1/jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "google/gemini-3.1-pro-preview"},
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)
print(r.get("text", r.get("error")))

ASR BYOK Batch Google

$0.15/hr

Gemini 3 Flash (Preview)

Google's fast, cost-efficient Gemini 3 series model used for audio transcription via prompted generation. Supports 50+ languages with automatic language detection. No word-level timestamps. Audio duration and cost are estimated from token count (32 tokens/second), not reported by the API. Hard 20 MB inline file size limit — use AssemblyAI, ElevenLabs, or Deepgram for larger files.

Auto language detection · 50+ languages

Provider	Google · API Docs
Cost	$0.15 / hour of audio
Input Formats	wav, mp3, aac, ogg, flac, m4a
Output Formats	json
Languages	en, es, fr, de, it, pt, nl, ja, ko, zh + 26 more
Size Limits	Max 20 MB · Max 8 hrs
Rate Limits	Audio input $1/1M tokens, text output $3/1M; see ai.google.dev/gemini-api/docs/rate-limits

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/transcribe",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "google/gemini-3-flash-preview"},
)
print(resp.json()["text"])

Batch API Call

import httpx, time

# Submit batch job
resp = httpx.post(
    "https://router.trelis.com/api/v1/jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "google/gemini-3-flash-preview"},
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)
print(r.get("text", r.get("error")))

ASR BYOK Sarvam

$0.36/hr

Saaras v3

Sarvam AI's flagship multilingual ASR optimized for 22 Indian languages plus Indian English. Uses Saaras v3 REST mode by default (`transcribe`) and supports provider option `mode` for `transcribe`, `translate`, `verbatim`, `translit`, and `codemix`. Sync REST API only — single-request limit is 30 seconds of audio. Use AssemblyAI, ElevenLabs, or Speechmatics for longer files.

Auto language detection · 22 Indian languages + English · Saaras modes: transcribe/translate/verbatim/translit/codemix

Provider	Sarvam · API Docs
Cost	$0.36 / hour of audio
Input Formats	mp3, wav, aac, aiff, ogg, opus, flac, mp4, m4a, amr, wma, webm
Output Formats	json
Languages	en, hi, bn, kn, ml, mr, od, pa, ta, te + 13 more
Size Limits	Max 30 s
Rate Limits	Per-plan RPM: Starter=60, Pro=200, Business=1000; ₹1,000 free credits on every plan; 5xx on overload
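
The page recommends other providers for files longer than the 30-second request limit. If staying on this model matters, one alternative is to split the audio client-side and send each piece separately. The sketch below does this for uncompressed WAV input using only the standard library. split_wav is a hypothetical helper, and cutting at fixed offsets can split a word in two, so transcripts may need care at chunk boundaries.

```python
import io
import wave


def split_wav(wav_bytes: bytes, max_seconds: float = 30.0) -> list[bytes]:
    """Split WAV data into consecutive WAV chunks of at most max_seconds each."""
    chunks = []
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(max_seconds * src.getframerate())
        if frames_per_chunk < 1:
            raise ValueError("max_seconds is too small for this sample rate")
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            buf = io.BytesIO()
            with wave.open(buf, "wb") as dst:
                dst.setparams(params)   # the frame count in the header is corrected on close
                dst.writeframes(frames)
            chunks.append(buf.getvalue())
    return chunks
```

Each returned chunk is a complete WAV file that can be posted as the file field of a sync request.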

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/transcribe",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "sarvam/saaras-v3"},
)
print(resp.json()["text"])

ASR BYOK Batch Together

$0.09/hr

Whisper Large V3

OpenAI Whisper Large V3 hosted on Together AI with word-level timestamps. Supports 99+ languages with automatic language detection.

Word-level timestamps · Auto language detection · 99+ languages

Provider	Together · API Docs
Cost	$0.09 / hour of audio
Input Formats	mp3, wav, m4a, webm, flac
Output Formats	json, verbose_json
Languages	en, es, fr, de, it, pt, nl, ja, ko, zh + 10 more
Size Limits	Max 1000 MB
Rate Limits	100 requests per rate-limit window (from x-ratelimit-limit header); no other documented limits

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/transcribe",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "together/whisper-large-v3"},
)
print(resp.json()["text"])

Batch API Call

import httpx, time

# Submit batch job
resp = httpx.post(
    "https://router.trelis.com/api/v1/jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    files={"file": open("audio.mp3", "rb")},
    data={"model": "together/whisper-large-v3"},
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)
print(r.get("text", r.get("error")))

Sync synthesis via POST /api/v1/synthesize. Models with the Batch badge also support async jobs via POST /api/v1/tts-jobs.

TTS BYOK Batch Cartesia

$30/M chars

Sonic 3

Cartesia's flagship TTS model with ultra-low latency and high naturalness. 42 languages. Voices specified by UUID — use the Cartesia playground to find voice IDs.

42 languages · Ultra-low latency · Voice cloning · Emotion control

Provider	Cartesia · API Docs
Cost	$30 / million characters (~$0.03 / 1K chars)
Output Formats	mp3_44100_32, mp3_44100_64, mp3_44100_96, mp3_44100_128, mp3_44100_192, wav_44100, wav_22050, pcm_f32le, pcm_s16le, pcm_mulaw, pcm_alaw
Languages	en, fr, de, es, pt, zh, ja, hi, it, ko + 32 more
Rate Limits	Concurrency by plan: Free=2, Pro=3, Startup=5, Scale=15; 429 on exceed
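
Since exceeding the concurrency limit yields a 429, a client can retry with exponential backoff. The sketch below shows the pattern. call_with_backoff is a hypothetical helper that takes a callable returning a (status_code, body) pair, and it assumes the router passes the provider's 429 through to the caller, which this page does not confirm.

```python
import time
from typing import Callable


def call_with_backoff(
    send: Callable[[], tuple[int, bytes]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Retry send() on HTTP 429 with exponential backoff; return the body on success."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            if status >= 400:
                raise RuntimeError(f"request failed with HTTP {status}")
            return body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1 s, 2 s, 4 s, ...
    raise RuntimeError(f"still rate-limited after {max_attempts} attempts")
```

With httpx, send could perform the synthesize request and return (resp.status_code, resp.content).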

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/synthesize",
    headers={"Authorization": "Bearer trr_your_key"},
    data={
        "text": "Hello, world!",
        "model": "cartesia/sonic-3",
    },
)
with open("output.mp3", "wb") as f:
    f.write(resp.content)
print(f"Audio: {len(resp.content)} bytes")
print(f"Cost: ${resp.headers['X-Cost-Dollars']}")

Batch API Call

import httpx, time

# Submit batch TTS job
resp = httpx.post(
    "https://router.trelis.com/api/v1/tts-jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    data={
        "text": "Your text here...",
        "model": "cartesia/sonic-3",
    },
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/tts-jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)

# Download audio
if r["status"] == "completed":
    audio = httpx.get(
        f"https://router.trelis.com/api/v1/tts-jobs/{job_id}?download=true",
        headers={"Authorization": "Bearer trr_your_key"},
    )
    with open("output.mp3", "wb") as f:
        f.write(audio.content)

TTS BYOK Batch Google

$0.125/M chars

Gemini 2.5 Flash TTS

Google's fast, cost-efficient TTS model via Gemini API. 30 HD voices across 100+ languages. Auto-detects language. Always returns PCM 24kHz mono audio.

100+ languages · Auto language detection · 30 HD voices · Steerable via instructions

Provider	Google · API Docs
Cost	$0.125 / million characters (~$0.000125 / 1K chars)
Output Formats	pcm_24000
Languages	en, es, fr, de, it, pt, nl, ja, ko, zh + 26 more
Rate Limits	150 QPM (free tier); paid tier limits vary — see ai.google.dev/gemini-api/docs/rate-limits
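
Raw PCM has no header, so most players will not open it directly. The sketch below wraps PCM bytes in a WAV container using the 24 kHz mono parameters stated above. The 16-bit sample width is an assumption (the page gives only the rate and channel count), and pcm_to_wav is a hypothetical helper.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000, channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM samples in a WAV header. sample_width=2 (16-bit) is an assumption."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return buf.getvalue()
```

The result of pcm_to_wav(resp.content) can be written to a .wav file and played normally. If playback sounds like noise or runs at the wrong speed, the sample width assumption is the first thing to check.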

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/synthesize",
    headers={"Authorization": "Bearer trr_your_key"},
    data={
        "text": "Hello, world!",
        "model": "google/gemini-2.5-flash-tts",
    },
)
with open("output.pcm", "wb") as f:  # raw PCM (pcm_24000), not MP3
    f.write(resp.content)
print(f"Audio: {len(resp.content)} bytes")
print(f"Cost: ${resp.headers['X-Cost-Dollars']}")

Batch API Call

import httpx, time

# Submit batch TTS job
resp = httpx.post(
    "https://router.trelis.com/api/v1/tts-jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    data={
        "text": "Your text here...",
        "model": "google/gemini-2.5-flash-tts",
    },
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/tts-jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)

# Download audio
if r["status"] == "completed":
    audio = httpx.get(
        f"https://router.trelis.com/api/v1/tts-jobs/{job_id}?download=true",
        headers={"Authorization": "Bearer trr_your_key"},
    )
    with open("output.pcm", "wb") as f:  # raw PCM (pcm_24000), not MP3
        f.write(audio.content)

TTS BYOK Batch Google

$0.25/M chars

Gemini 2.5 Pro TTS

Google's highest-quality TTS model via Gemini API. Studio-quality audio with natural prosody. 30 HD voices across 100+ languages. Always returns PCM 24kHz mono audio.

100+ languages · Auto language detection · 30 HD voices · Studio-quality prosody · Steerable via instructions

Provider	Google · API Docs
Cost	$0.25 / million characters (~$0.00025 / 1K chars)
Output Formats	pcm_24000
Languages	en, es, fr, de, it, pt, nl, ja, ko, zh + 26 more
Rate Limits	125 QPM (free tier); paid tier limits vary — see ai.google.dev/gemini-api/docs/rate-limits

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/synthesize",
    headers={"Authorization": "Bearer trr_your_key"},
    data={
        "text": "Hello, world!",
        "model": "google/gemini-2.5-pro-tts",
    },
)
with open("output.pcm", "wb") as f:  # raw PCM (pcm_24000), not MP3
    f.write(resp.content)
print(f"Audio: {len(resp.content)} bytes")
print(f"Cost: ${resp.headers['X-Cost-Dollars']}")

Batch API Call

import httpx, time

# Submit batch TTS job
resp = httpx.post(
    "https://router.trelis.com/api/v1/tts-jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    data={
        "text": "Your text here...",
        "model": "google/gemini-2.5-pro-tts",
    },
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/tts-jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)

# Download audio
if r["status"] == "completed":
    audio = httpx.get(
        f"https://router.trelis.com/api/v1/tts-jobs/{job_id}?download=true",
        headers={"Authorization": "Bearer trr_your_key"},
    )
    with open("output.pcm", "wb") as f:  # raw PCM (pcm_24000), not MP3
        f.write(audio.content)

TTS BYOK Batch OpenAI

$0.6/M chars

GPT-4o Mini TTS

OpenAI's fast, affordable TTS model. 13 voices, 57+ languages. Supports voice style instructions. Max ~8,000 characters per request.

57+ languages · 13 voices · Style instructions · Multiple output formats

Provider	OpenAI · API Docs
Cost	$0.6 / million characters (~$0.0006 / 1K chars)
Output Formats	mp3, opus, aac, flac, wav, pcm
Languages	en, es, fr, de, it, pt, nl, ja, ko, zh + 25 more
Rate Limits	Tier-dependent; see platform.openai.com/docs/guides/rate-limits
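
TTS pricing is per million characters, so a cost estimate is a single multiplication. The sketch below uses listing prices from this page. estimate_tts_cost is a hypothetical helper, and it assumes every character, including spaces and punctuation, is billed, which the page does not specify.

```python
# Listing prices from this page, in dollars per million characters
PRICE_PER_M_CHARS = {
    "cartesia/sonic-3": 30.0,
    "google/gemini-2.5-flash-tts": 0.125,
    "google/gemini-2.5-pro-tts": 0.25,
    "openai/gpt-4o-mini-tts": 0.6,
}


def estimate_tts_cost(text: str, model: str) -> float:
    """Estimated dollars, assuming every character (spaces included) is billed."""
    return len(text) / 1_000_000 * PRICE_PER_M_CHARS[model]
```

For a sync request, the X-Cost-Dollars response header shown in the samples is the value to compare this estimate against.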

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/synthesize",
    headers={"Authorization": "Bearer trr_your_key"},
    data={
        "text": "Hello, world!",
        "model": "openai/gpt-4o-mini-tts",
    },
)
with open("output.mp3", "wb") as f:
    f.write(resp.content)
print(f"Audio: {len(resp.content)} bytes")
print(f"Cost: ${resp.headers['X-Cost-Dollars']}")

Batch API Call

import httpx, time

# Submit batch TTS job
resp = httpx.post(
    "https://router.trelis.com/api/v1/tts-jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    data={
        "text": "Your text here...",
        "model": "openai/gpt-4o-mini-tts",
    },
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/tts-jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)

# Download audio
if r["status"] == "completed":
    audio = httpx.get(
        f"https://router.trelis.com/api/v1/tts-jobs/{job_id}?download=true",
        headers={"Authorization": "Bearer trr_your_key"},
    )
    with open("output.mp3", "wb") as f:
        f.write(audio.content)

TTS BYOK Batch ElevenLabs

$300/M chars

Eleven Multilingual v2

Most lifelike TTS model with rich emotional expression. 10,000 character limit per request (~10 min audio). Supports 29 languages with consistent voice quality.

29 languages · Multiple voices · Voice settings control · High emotional range

Provider	ElevenLabs · API Docs
Cost	$300 / million characters (~$0.3 / 1K chars)
Output Formats	mp3_44100_128, mp3_44100_96, mp3_44100_64, mp3_44100_32, mp3_22050_32, pcm_16000, pcm_22050, pcm_24000, pcm_44100, pcm_48000, opus_48000_32, opus_48000_64, opus_48000_96, opus_48000_128, opus_48000_192, ulaw_8000, alaw_8000
Languages	en, es, fr, de, it, pt, nl, ja, ko, zh + 19 more
Rate Limits	Concurrency by tier: Free=2, Starter=3, Creator=5, Pro=10, Scale=15, Business=15; character quotas vary by plan
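
Text longer than the 10,000-character request limit has to be split before synthesis. The sketch below packs whole sentences greedily into chunks that stay under a limit and hard-splits any single sentence that is itself too long. chunk_text is a hypothetical helper, and the sentence splitter is a simple regular expression that will not handle every abbreviation or language.

```python
import re


def chunk_text(text: str, max_chars: int = 10_000) -> list[str]:
    """Greedily pack sentences into chunks of at most max_chars characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as the text field of its own request, and the resulting audio files concatenated in order. Passing max_chars=5000 or 8000 adapts the same helper to the other per-request limits mentioned on this page.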

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/synthesize",
    headers={"Authorization": "Bearer trr_your_key"},
    data={
        "text": "Hello, world!",
        "model": "elevenlabs/eleven-multilingual-v2",
    },
)
with open("output.mp3", "wb") as f:
    f.write(resp.content)
print(f"Audio: {len(resp.content)} bytes")
print(f"Cost: ${resp.headers['X-Cost-Dollars']}")

Batch API Call

import httpx, time

# Submit batch TTS job
resp = httpx.post(
    "https://router.trelis.com/api/v1/tts-jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    data={
        "text": "Your text here...",
        "model": "elevenlabs/eleven-multilingual-v2",
    },
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/tts-jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)

# Download audio
if r["status"] == "completed":
    audio = httpx.get(
        f"https://router.trelis.com/api/v1/tts-jobs/{job_id}?download=true",
        headers={"Authorization": "Bearer trr_your_key"},
    )
    with open("output.mp3", "wb") as f:
        f.write(audio.content)

TTS BYOK Batch ElevenLabs

$300/M chars

Eleven v3

Latest ElevenLabs model with 70+ languages, audio tags ([laughs], [whispers], etc.), and rich emotional expressiveness. 5,000 character limit per request (~5 min audio).

70+ languages · Multiple voices · Audio tags · High emotional range · Voice settings control

Provider	ElevenLabs · API Docs
Cost	$300 / million characters (~$0.3 / 1K chars)
Output Formats	mp3_44100_128, mp3_44100_96, mp3_44100_64, mp3_44100_32, mp3_22050_32, pcm_16000, pcm_22050, pcm_24000, pcm_44100, pcm_48000, opus_48000_32, opus_48000_64, opus_48000_96, opus_48000_128, opus_48000_192, ulaw_8000, alaw_8000
Languages	en, es, fr, de, it, pt, nl, ja, ko, zh + 63 more
Rate Limits	Concurrency by tier: Free=2, Starter=3, Creator=5, Pro=10, Scale=15, Business=15; character quotas vary by plan

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/synthesize",
    headers={"Authorization": "Bearer trr_your_key"},
    data={
        "text": "Hello, world!",
        "model": "elevenlabs/eleven_v3",
    },
)
with open("output.mp3", "wb") as f:
    f.write(resp.content)
print(f"Audio: {len(resp.content)} bytes")
print(f"Cost: ${resp.headers['X-Cost-Dollars']}")

Batch API Call

import httpx, time

# Submit batch TTS job
resp = httpx.post(
    "https://router.trelis.com/api/v1/tts-jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    data={
        "text": "Your text here...",
        "model": "elevenlabs/eleven_v3",
    },
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/tts-jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)

# Download audio
if r["status"] == "completed":
    audio = httpx.get(
        f"https://router.trelis.com/api/v1/tts-jobs/{job_id}?download=true",
        headers={"Authorization": "Bearer trr_your_key"},
    )
    with open("output.mp3", "wb") as f:
        f.write(audio.content)

TTS BYOK Batch Mistral

$16/M chars

Voxtral Mini TTS

Mistral's TTS model with zero-shot voice cloning and ~100ms latency. 9 languages. Voice determines language. 10 preset voices available; custom voices via Mistral Voices API.

9 languages · Zero-shot voice cloning · Low latency (~100ms) · 10 preset voices

Provider	Mistral · API Docs
Cost	$16 / million characters (~$0.016 / 1K chars)
Output Formats	mp3, wav, pcm, flac, opus
Languages	en, fr, de, es, nl, pt, it, hi, ar
Rate Limits	Workspace-level limits managed at admin.mistral.ai/plateforme/limits; no public per-model rate limits documented

Sample API Call

import httpx

resp = httpx.post(
    "https://router.trelis.com/api/v1/synthesize",
    headers={"Authorization": "Bearer trr_your_key"},
    data={
        "text": "Hello, world!",
        "model": "mistral/voxtral-mini-tts-2603",
    },
)
with open("output.mp3", "wb") as f:
    f.write(resp.content)
print(f"Audio: {len(resp.content)} bytes")
print(f"Cost: ${resp.headers['X-Cost-Dollars']}")

Batch API Call

import httpx, time

# Submit batch TTS job
resp = httpx.post(
    "https://router.trelis.com/api/v1/tts-jobs",
    headers={"Authorization": "Bearer trr_your_key"},
    data={
        "text": "Your text here...",
        "model": "mistral/voxtral-mini-tts-2603",
    },
)
job_id = resp.json()["job_id"]

# Poll for result
while True:
    r = httpx.get(
        f"https://router.trelis.com/api/v1/tts-jobs/{job_id}",
        headers={"Authorization": "Bearer trr_your_key"},
    ).json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(3)

# Download audio
if r["status"] == "completed":
    audio = httpx.get(
        f"https://router.trelis.com/api/v1/tts-jobs/{job_id}?download=true",
        headers={"Authorization": "Bearer trr_your_key"},
    )
    with open("output.mp3", "wb") as f:
        f.write(audio.content)
