Available Models
Speech-to-text models with transparent pricing. Bring your own API keys.
ASR
BYOK
ElevenLabs
$0.40/hr
Scribe v2
High-accuracy batch speech-to-text with word-level timestamps. Supports 99 languages with automatic language detection.
Word-level timestamps
Character-level timestamps
Auto language detection
99 languages
Speaker diarization
Entity detection
Keyterm prompting
Audio event tagging
Multi-channel transcription
ASR
BYOK
Speechmatics
$0.24/hr
Ursa 2
Enterprise-grade batch transcription with standard and enhanced accuracy modes. Supports 70 languages and custom dictionaries.
Word-level timestamps
Custom dictionaries
Speaker diarization
Channel diarization
70 languages
Entity detection
Language identification
Sentiment analysis
Summarization
Topic detection
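To compare the two listed rates for a given amount of audio, the arithmetic can be sketched as below. This is a rough estimate only: it assumes the listed prices apply per hour of audio and are prorated linearly, which the listing does not state. The dictionary keys and the `estimate_cost` helper are hypothetical names used for illustration, not part of any provider's API.

```python
# Hourly rates copied from the listing (USD per hour).
# Assumption: "per hour" means per hour of audio processed.
RATES_USD_PER_HOUR = {
    "ElevenLabs Scribe v2": 0.40,
    "Speechmatics Ursa 2": 0.24,
}


def estimate_cost(model: str, audio_seconds: float) -> float:
    """Estimate transcription cost in USD.

    Assumes linear proration with no minimum charge or rounding;
    actual billing granularity is not given in the listing.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return RATES_USD_PER_HOUR[model] * audio_seconds / 3600.0


if __name__ == "__main__":
    ninety_minutes = 90 * 60  # duration in seconds
    for model in RATES_USD_PER_HOUR:
        cost = estimate_cost(model, ninety_minutes)
        print(f"{model}: ${cost:.2f} for 90 minutes of audio")
```

Under these assumptions, 90 minutes of audio comes to about $0.60 with Scribe v2 and about $0.36 with Ursa 2.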