
WhisperX

Audio diarization model


Example usage

import requests
import os

# Replace the empty string with your model id below
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "audio_file": "https://www2.cs.uic.edu/~i101/SoundFiles/gettysburg10.wav"
}

# Call model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/environments/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data
)

# Print the output of the model
print(res.json())
JSON output
[
    {
        "start": 0,
        "end": 9.8,
        "text": "Four score and seven years ago, our fathers brought forth on this continent a new nation, conceived in liberty and dedicated to the proposition that all men are created equal.",
        "speaker": "SPEAKER_01"
    }
]
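
The output is a list of segments, each with a start time, an end time, the transcribed text, and a speaker label. As a rough sketch of how that list might be turned into a readable, speaker-labeled transcript, the snippet below merges consecutive segments from the same speaker and prints one line per speaker turn. The helper names `format_timestamp` and `format_transcript` are hypothetical and not part of any Baseten or WhisperX API, and the fallback to `"UNKNOWN"` when a segment has no `speaker` key is an assumption rather than documented behavior.

```python
from itertools import groupby

# Sample response copied from the JSON output above (a list of segment dicts).
segments = [
    {
        "start": 0,
        "end": 9.8,
        "text": "Four score and seven years ago, our fathers brought forth on this continent a new nation, conceived in liberty and dedicated to the proposition that all men are created equal.",
        "speaker": "SPEAKER_01",
    }
]


def format_timestamp(seconds):
    """Render a time in seconds as MM:SS.t (tenths of a second)."""
    # Work in whole tenths so rounding can never produce "60.0" seconds.
    total_tenths = round(float(seconds) * 10)
    minutes, tenths = divmod(total_tenths, 600)
    return f"{minutes:02d}:{tenths // 10:02d}.{tenths % 10}"


def format_transcript(segments):
    """Merge consecutive segments by speaker and return one line per turn."""
    lines = []
    # Assumption: a segment without a "speaker" key is labeled "UNKNOWN".
    by_speaker = groupby(segments, key=lambda seg: seg.get("speaker", "UNKNOWN"))
    for speaker, group in by_speaker:
        turn = list(group)
        start = format_timestamp(turn[0]["start"])
        end = format_timestamp(turn[-1]["end"])
        text = " ".join(seg["text"].strip() for seg in turn)
        lines.append(f"[{start} - {end}] {speaker}: {text}")
    return "\n".join(lines)


print(format_transcript(segments))
```

In practice, `segments` would be the parsed response from the request (`res.json()` in the example), assuming the call succeeded and returned a list in this shape.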
