"Inference Engineering" is now available. Get your copy here

Model library

Browse our library of open-source models that are ready to deploy behind an API endpoint in seconds.

6 Gemma models

LLM

Gemma 4 E2B IT

4 - Latency - H100
LLM

Gemma 4 E4B IT

4 - Latency - H100
LLM

Gemma 4 26B A4B IT

4 - Latency - H100
LLM

Gemma 4 31B IT

4 - Latency - H100
Embedding

EmbeddingGemma

Embedding
LLM

Gemma 3 27B IT

3 - Instruct - vLLM - H100

🔥 Trending models