
WhisperX

Audio diarization model

Example usage

import os

import requests

# Replace the empty string with your model ID
model_id = ""
baseten_api_key = os.environ["BASETEN_API_KEY"]

data = {
    "audio_file": "https://www2.cs.uic.edu/~i101/SoundFiles/gettysburg10.wav"
}

# Call the model endpoint
res = requests.post(
    f"https://model-{model_id}.api.baseten.co/environments/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json=data,
)

# Print the output of the model
print(res.json())
JSON output
[
    {
        "start": 0,
        "end": 9.8,
        "text": "Four score and seven years ago, our fathers brought forth on this continent a new nation, conceived in liberty and dedicated to the proposition that all men are created equal.",
        "speaker": "SPEAKER_01"
    }
]
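
The JSON output above is a list of segments, each with start and end times, the transcribed text, and a speaker label. A minimal sketch of how such a response might be post-processed is shown here. The helper names format_segments and speaking_time are hypothetical (not part of any Baseten or WhisperX API), and the sketch assumes times are in seconds and that every segment has the same keys as the sample.

```python
from collections import defaultdict


def format_segments(segments):
    """Render each diarized segment as '[start-end] SPEAKER: text'."""
    lines = []
    for seg in segments:
        # Falling back to "UNKNOWN" for a missing speaker label is an assumption
        speaker = seg.get("speaker", "UNKNOWN")
        lines.append(
            f"[{seg['start']:.1f}s-{seg['end']:.1f}s] {speaker}: {seg['text']}"
        )
    return lines


def speaking_time(segments):
    """Sum the duration (assumed to be seconds) of all segments per speaker."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg.get("speaker", "UNKNOWN")] += seg["end"] - seg["start"]
    return dict(totals)


# Sample shaped like the JSON output above (text shortened for brevity)
sample = [
    {
        "start": 0,
        "end": 9.8,
        "text": "Four score and seven years ago...",
        "speaker": "SPEAKER_01",
    }
]

for line in format_segments(sample):
    print(line)
print(speaking_time(sample))
```

With a live response, the same helpers could be applied to res.json() from the example usage, provided the request succeeded and returned a list in this shape.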

Transcription models

Whisper (best performance): Transcription, V3 - H100 MIG 40GB
WhisperX: Transcription, L4
Whisper V3: Transcription, V3 - H100 MIG 40GB

