Model library

Browse our library of open-source models, each ready to deploy behind an API endpoint in seconds.

Deploy your own model

🔥 Trending models

Model API

LLM

NVIDIA Nemotron 3 Super

Super

Model API

LLM

MiniMax M2.5

M2.5

Model API

LLM

GLM 5

Model API

LLM

Kimi K2.5

2.5

Model API

LLM

GPT OSS 120B

MoE

Transcription

Whisper Large V3

V3 - H100 MIG 40GB

Large language models

LLM

GPT OSS 20B

MoE

LLM

Qwen3.5 9B Latency

V1 - Latency - vLLM - H100

LLM

Qwen3.5 35B-A3B Latency

V1 - Latency - vLLM - H100

LLM

Qwen3.5 122B-A10B Latency

V1 - Latency - vLLM - H100

Model API

LLM

NVIDIA Nemotron 3 Super

Super

Model API

LLM

GPT OSS 120B

MoE

Text to speech models

Text to speech

Orpheus 3B WebSockets

TRT-LLM - H100 MIG 40GB

Text to speech

MARS8-Flash

V8 - MARS-Flash - A10G

Text to speech

Orpheus TTS

TRT-LLM - H100 MIG 40GB

Text to speech

MARS6

V6 - L4

Text to speech

Qwen3 TTS 12Hz Base Streaming 1.7B

TTS - 12Hz Base

Text to speech

Qwen3 TTS 12Hz Base Streaming 0.6B

TTS - 12Hz Base

Transcription models

Transcription

Whisper Large V3 (Streaming)

V3 - H100 MIG 40GB

Transcription

Voxtral Mini 4B Realtime 2602

2602 - Mini - H100 MIG 40GB

Transcription

Whisper Large V3

V3 - H100 MIG 40GB

Transcription

Whisper Large V3 Turbo

V3 - Turbo - H100 MIG 40GB

Transcription

Whisper Large V2

V2 - H100 MIG 40GB

Transcription

Ultravox v0.6 70B

v0.6 - H100

Image generation models

Image generation

Qwen Image

Text-to-Image

Image generation

Stable Diffusion XL

XL 1.0 - L4

Image generation

ZenCtrl

Custom Server - H100

Image generation

SDXL Lightning

1.0 - Lightning - A100

Image generation

Stable Diffusion 3 Medium

3 - A100

Image generation

flux-dev

dev - bfloat16 - H100 MIG 40GB

Embedding models

Embedding

Qwen3 8B Reranker

BEI - H100 MIG 40GB

Embedding

EmbeddingGemma

Embedding

Qwen3 8B Embedding

BEI - H100 MIG 40GB

Embedding

Tulu 3 8B Reward

V3 - Reward - BEI - H100 MIG 40GB

Embedding

BGE Reranker M3

BEI - H100

Embedding

BGE Embedding ICL

BEI - H100

DeepSeek models

LLM

DeepSeek V3.2

V3.2 - B200

Model API

LLM

DeepSeek V3.1

V3.1 - B200

LLM

DeepSeek-R1 Llama 70B

R1 - Llama - TRT-LLM - H100

LLM

DeepSeek-R1 Qwen 32B

R1 - Qwen - TRT-LLM - H100

LLM

DeepSeek-R1 Qwen 7B

R1 - Qwen - TRT-LLM - H100 MIG 40GB

LLM

DeepSeek R1 0528

R1 - 0528 - B200

Qwen models

Image generation

Qwen Image

Text-to-Image

LLM

Qwen3.5 9B Latency

V1 - Latency - vLLM - H100

LLM

Qwen3.5 35B-A3B Latency

V1 - Latency - vLLM - H100

LLM

Qwen3.5 122B-A10B Latency

V1 - Latency - vLLM - H100

Embedding

Qwen3 8B Reranker

BEI - H100 MIG 40GB

LLM

Qwen3 235B 2507

2507

Meta models

LLM

Llama 3.3 70B Instruct

3.3 - TRT-LLM - H100

LLM

Llama 3.1 8B Instruct

3.1 - Instruct - TRT-LLM - H100

LLM

Llama 3.1 405B Instruct

3.1 - Instruct - H100

LLM

Llama 3.2 11B Vision Instruct

3.2 - Vision - A100

LLM

Llama 4 Scout

V4.0 - Instruct - vLLM - H100

LLM

Llama 4 Maverick

V4.0 - Instruct - vLLM - B200
