Transcribe audio files into text using state-of-the-art speech recognition models. Supports multiple languages, speaker diarization, and various audio formats. Whisper/Wizper models return the transcription synchronously; Elevenlabs-STT returns a job ID for asynchronous processing.
Audio transcription parameters. Use multipart/form-data for file uploads or application/json for URL-based requests.
The body is of type object.
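As a rough illustration of the two encodings mentioned above, the sketch below builds either an application/json body (for a URL) or a multipart/form-data body (for a file upload) using only the Python standard library. The function name `build_transcription_request` and the field names `model`, `url`, and `file` are assumptions made for this example; the actual parameter names are not given in this section and should be taken from the parameter list.

```python
import json
import uuid


def build_transcription_request(model, *, url=None, file_bytes=None, filename="audio.wav"):
    """Return (headers, body_bytes) for a transcription request.

    NOTE: the field names "model", "url" and "file" are hypothetical
    placeholders, not confirmed parameter names of this API.
    """
    if (url is None) == (file_bytes is None):
        raise ValueError("provide exactly one of url or file_bytes")

    if url is not None:
        # URL-based request: send a JSON object.
        body = json.dumps({"model": model, "url": url}).encode("utf-8")
        return {"Content-Type": "application/json"}, body

    # File upload: assemble a multipart/form-data body by hand.
    boundary = uuid.uuid4().hex
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode("utf-8")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")

    body = model_part + file_header + file_bytes + closing
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return headers, body
```

In practice an HTTP client library would normally build the multipart body; it is written out here only to make the difference between the two content types visible.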
Synchronous transcription response (Whisper/Wizper models)
The response is of type object.
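Because Whisper/Wizper models answer synchronously while Elevenlabs-STT returns a job ID, a client may want to branch on the model before reading the response object. The following is a minimal sketch of that branching; the function name, the lowercase model identifiers, and the response keys `text` and `job_id` are all assumptions made for illustration, since the response schema is not spelled out in this section.

```python
def interpret_transcription_response(model, payload):
    """Classify a response object as a finished transcript or a pending job.

    NOTE: the keys "text" and "job_id" and the model identifiers below are
    hypothetical placeholders, not confirmed by the API description.
    """
    synchronous_models = {"whisper", "wizper"}

    if model.lower() in synchronous_models:
        # Synchronous models: the transcript is expected in the response itself.
        return {"status": "done", "text": payload.get("text", "")}

    # Asynchronous models: only a job identifier is expected at this point.
    return {"status": "pending", "job_id": payload.get("job_id")}
```

A caller could then use the returned `status` to decide whether to display the transcript immediately or to look up the job later.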