Model library

Browse our library of open-source models that are ready to deploy behind an API endpoint in seconds.

🔥 Trending models

large language models

Qwen Logo
LLM

Qwen3.5 9B Latency

V1 - Latency - vLLM - H100
Qwen Logo
LLM

Qwen3.5 35B-A3B Latency

V1 - Latency - vLLM - H100
Qwen Logo
LLM

Qwen3.5 122B-A10B Latency

V1 - Latency - vLLM - H100
NVIDIA logo
Model API
LLM

NVIDIA Nemotron 3 Super

Super
OpenAI logo
Model API
LLM

GPT OSS 120B

MoE

text to speech models

Canopy Labs Logo
Text to speech

Orpheus 3B WebSockets

TRT-LLM - H100 MIG 40GB
Text to speech

MARS8-Flash

V8 - MARS-Flash - A10G
Canopy Labs Logo
Text to speech

Orpheus TTS

TRT-LLM - H100 MIG 40GB
Text to speech

MARS6

V6 - L4
Qwen Logo
Text to speech

Qwen3 TTS 12Hz Base Streaming 1.7B

TTS - 12Hz Base
Qwen Logo
Text to speech

Qwen3 TTS 12Hz Base Streaming 0.6B

TTS - 12Hz Base

transcription models

OpenAI logo
Transcription

Whisper Large V3 (Streaming)

V3 - H100 MIG 40GB
Mistral AI logo
Transcription

Voxtral Mini 4B Realtime 2602

2602 - Mini - H100 MIG 40GB
OpenAI logo
Transcription

Whisper Large V3

V3 - H100 MIG 40GB
OpenAI logo
Transcription

Whisper Large V3 Turbo

V3 - Turbo - H100 MIG 40GB
OpenAI logo
Transcription

Whisper Large V2

V2 - H100 MIG 40GB
Fixie Logo
Transcription

Ultravox v0.6 70B

v0.6 - H100

image generation models

Qwen Logo
Image generation

Qwen Image

Text-to-Image
Stability AI logo
Image generation

Stable Diffusion XL

XL 1.0 - L4
Fotographer AI
Image generation

ZenCtrl

Custom Server - H100
ByteDance logo
Image generation

SDXL Lightning

1.0 - Lightning - A100
Stability AI logo
Image generation

Stable Diffusion 3 Medium

3 - A100
black forest labs logo
Image generation

flux-dev

dev - bfloat16 - H100 MIG 40GB

embedding models

Qwen Logo
Embedding

Qwen3 8B Reranker

BEI - H100 MIG 40GB
google logo
Embedding

EmbeddingGemma

Embedding
Qwen Logo
Embedding

Qwen3 8B Embedding

BEI - H100 MIG 40GB
Allen AI
Embedding

Tulu 3 8B Reward

V3 - Reward - BEI - H100 MIG 40GB
BAAI
Embedding

BGE Reranker M3

BEI - H100
BAAI
Embedding

BGE Embedding ICL

BEI - H100

DeepSeek models

DeepSeek Logo
LLM

DeepSeek V3.2

V3.2 - B200
DeepSeek Logo
Model API
LLM

DeepSeek V3.1

V3.1 - B200
DeepSeek Logo
LLM

DeepSeek-R1 Llama 70B

R1 - Llama - TRT-LLM - H100
DeepSeek Logo
LLM

DeepSeek-R1 Qwen 32B

R1 - Qwen - TRT-LLM - H100
DeepSeek Logo
LLM

DeepSeek-R1 Qwen 7B

R1 - Qwen - TRT-LLM - H100 MIG 40GB
DeepSeek Logo
LLM

DeepSeek R1 0528

R1 - 0528 - B200

Qwen models

Qwen Logo
Image generation

Qwen Image

Text-to-Image
Qwen Logo
LLM

Qwen3.5 9B Latency

V1 - Latency - vLLM - H100
Qwen Logo
LLM

Qwen3.5 35B-A3B Latency

V1 - Latency - vLLM - H100
Qwen Logo
LLM

Qwen3.5 122B-A10B Latency

V1 - Latency - vLLM - H100
Qwen Logo
Embedding

Qwen3 8B Reranker

BEI - H100 MIG 40GB

Meta models

Meta logo
LLM

Llama 3.3 70B Instruct

3.3 - TRT-LLM - H100
Meta logo
LLM

Llama 3.1 8B Instruct

3.1 - Instruct - TRT-LLM - H100
Meta logo
LLM

Llama 3.1 405B Instruct

3.1 - Instruct - H100
Meta logo
LLM

Llama 3.2 11B Vision Instruct

3.2 - Vision - A100
Meta logo
LLM

Llama 4 Scout

V4.0 - Instruct - vLLM - H100
Meta logo
LLM

Llama 4 Maverick

V4.0 - Instruct - vLLM - B200