Model library

Browse our library of open-source models that are ready to deploy behind an API endpoint in seconds.

11 vLLM models

LLM

Qwen3.5 9B Latency

V1 - Latency - vLLM - H100

LLM

Qwen3.5 35B-A3B Latency

V1 - Latency - vLLM - H100

LLM

Qwen3.5 122B-A10B Latency

V1 - Latency - vLLM - H100

LLM

Mistral Small 3.1

3.1 - vLLM - H100

LLM

Gemma 3 27B IT

3 - Instruct - vLLM - H100

LLM

Llama 4 Scout

V4.0 - Instruct - vLLM - H100

LLM

Qwen3.5 4B Latency

V1 - Latency - vLLM - H100

LLM

Llama 4 Maverick

V4.0 - Instruct - vLLM - B200

LLM

Seed OSS 36B Instruct

Seed OSS 36B Instruct - Instruct - vLLM - H100

LLM

Pixtral 12B

Pixtral - vLLM - H100

LLM

Phi 3.5 Mini Instruct

3.5 - 128k - vLLM - A10G

🔥 Trending models

Model API

LLM

NVIDIA Nemotron 3 Super

Super

Model API

LLM

MiniMax M2.5

M2.5

Model API

LLM

GLM 5

Model API

LLM

Kimi K2.5

2.5
