Available Models

Speech-to-text models with transparent pricing. Bring your own API keys.

ASR BYOK ElevenLabs
$0.40/hr
Scribe v2

High-accuracy batch speech-to-text with word-level timestamps. Supports 99 languages with automatic language detection.

Word-level timestamps, Character-level timestamps, Auto language detection, 99 languages, Speaker diarization, Entity detection, Keyterm prompting, Audio event tagging, Multi-channel transcription

ASR BYOK Speechmatics
$0.24/hr
Ursa 2

Enterprise-grade batch transcription powered by Ursa 2 with standard and enhanced accuracy modes. Supports 70 languages with custom dictionaries.

Word-level timestamps, Custom dictionaries, Speaker diarization, Channel diarization, 70 languages, Entity detection, Language identification, Sentiment analysis, Summarization, Topic detection
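
To compare the two hourly rates for a given amount of audio, the arithmetic can be sketched as below. This is a rough estimate only: the rates are copied from the listing, while the `estimate_cost` helper, linear per-second proration, and rounding to the nearest cent are assumptions made for illustration. The listing does not state the actual billing granularity or minimum charge.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hourly rates copied from the listing (USD per hour of audio).
RATES_PER_HOUR = {
    "Scribe v2": Decimal("0.40"),
    "Ursa 2": Decimal("0.24"),
}


def estimate_cost(model: str, audio_seconds: int) -> Decimal:
    """Estimate the cost in USD for a given audio duration.

    Assumes cost scales linearly with duration and is rounded to the
    nearest cent; real billing rules may differ.
    """
    hours = Decimal(audio_seconds) / Decimal(3600)
    cost = RATES_PER_HOUR[model] * hours
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    ninety_minutes = 90 * 60  # duration in seconds
    for model in RATES_PER_HOUR:
        print(f"{model}: ${estimate_cost(model, ninety_minutes)}")
```

Under these assumptions, 90 minutes of audio comes to $0.60 with Scribe v2 and $0.36 with Ursa 2.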