Whisper V3 Turbo

A low-latency Whisper V3 Turbo deployment optimized for shorter audio clips

Example usage

The model accepts a single URL to an audio file, such as an .mp3 or .wav. The audio file should contain clearly audible speech. This example transcribes a ten-second snippet of a recitation of the Gettysburg Address.

The JSON output includes the auto-detected language, transcription segments with timestamps, and the complete transcribed text.

Input
import requests
import os

# Model ID for production deployment
model_id = ""
# Read secrets from environment variables
baseten_api_key = os.environ["BASETEN_API_KEY"]

# Call model endpoint
resp = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json={
        "url": "https://www2.cs.uic.edu/~i101/SoundFiles/gettysburg10.wav",
    },
)

print(resp.content.decode("utf-8"))
JSON output
{
    "segments": [
        {
            "start": 0,
            "end": 9.8,
            "text": "Four score and seven years ago, our fathers brought forth on this continent a new nation, conceived in liberty and dedicated to the proposition that all men are created equal."
        }
    ],
    "language_code": "en"
}
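
Once the response arrives, the segments and language code can be pulled out of the JSON. The following is a minimal sketch that assumes the response has the same shape as the sample output above (a "segments" list with "start", "end", and "text" keys, plus "language_code"). The helper names format_segments and full_text are hypothetical and not part of any Baseten or Whisper API; in a live call, resp.json() would stand in for the hard-coded sample.

```python
import json

# Sample payload copied from the JSON output above; with a live call,
# `resp.json()` would take the place of this hard-coded string.
sample = json.loads("""
{
    "segments": [
        {
            "start": 0,
            "end": 9.8,
            "text": "Four score and seven years ago, our fathers brought forth on this continent a new nation, conceived in liberty and dedicated to the proposition that all men are created equal."
        }
    ],
    "language_code": "en"
}
""")


def format_segments(result):
    """Return one '[start - end] text' line per segment (hypothetical helper)."""
    lines = []
    for seg in result.get("segments", []):
        start = float(seg["start"])
        end = float(seg["end"])
        lines.append(f"[{start:.1f}s - {end:.1f}s] {seg['text'].strip()}")
    return lines


def full_text(result):
    """Join all segment texts into a single transcript string (hypothetical helper)."""
    return " ".join(seg["text"].strip() for seg in result.get("segments", []))


print("Language:", sample.get("language_code"))
for line in format_segments(sample):
    print(line)
```

Both helpers use .get("segments", []) so that a response without segments yields an empty result rather than raising a KeyError.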

