Gemma 4

Family of LLMs developed by Google

Gemma is a family of generative artificial intelligence models that you can use for a wide variety of generation tasks, including question answering, summarization, and reasoning. Gemma models are provided with open weights and permit responsible commercial use, allowing you to tune and deploy them in your own projects and applications.
The Gemma 4 model family spans three distinct architectures, each tailored to specific hardware requirements:

  • Small Sizes: 2B and 4B effective parameter models built for ultra-mobile, edge, and browser deployment (e.g., Pixel, Chrome).

  • Dense: A powerful 31B parameter dense model that bridges the gap between server-grade performance and local execution.

  • Mixture-of-Experts: A highly efficient 26B MoE model designed for high-throughput inference and advanced reasoning.