The A10 is an Ampere-series datacenter GPU well-suited to many model inference tasks, such as running seven-billion-parameter LLMs. However, AWS users run those same workloads on the A10G, a variant of the card created specifically for AWS. The A10 and A10G have somewhat different specs, most notably around tensor compute, but they are interchangeable for most model inference tasks: they share the same GPU memory and bandwidth, and most model inference is memory bound.
To get the full performance out of a GPU during LLM inference, you need to know whether the inference workload is compute bound or memory bound. This post explains how to tell the difference and make better use of GPU resources.
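The standard way to make this call is to compare the workload's arithmetic intensity (FLOPs performed per byte of memory traffic) against the GPU's own ops:byte ratio. Here is a minimal sketch; the A10 spec values are approximate figures from public datasheets, and the decode intensity is a rough rule of thumb, not a measured number.

```python
# Sketch: classifying an inference workload as compute bound or memory
# bound by comparing its arithmetic intensity to the GPU's ops:byte ratio.

def ops_to_byte_ratio(peak_flops: float, memory_bandwidth: float) -> float:
    """FLOPs the GPU can perform per byte it can move from memory."""
    return peak_flops / memory_bandwidth

def is_memory_bound(arithmetic_intensity: float, gpu_ratio: float) -> bool:
    """If the workload does fewer FLOPs per byte than the GPU can sustain,
    the GPU stalls waiting on memory: the workload is memory bound."""
    return arithmetic_intensity < gpu_ratio

# A10 (approximate): ~125 TFLOPS dense FP16 tensor compute, ~600 GB/s bandwidth.
a10_ratio = ops_to_byte_ratio(125e12, 600e9)  # ~208 FLOPs per byte

# Batch-1 autoregressive decoding reads every fp16 weight once per token
# and does ~2 FLOPs per weight, so its arithmetic intensity is roughly 2.
decode_intensity = 2.0

print(is_memory_bound(decode_intensity, a10_ratio))  # True: memory bound
```

Batching requests raises arithmetic intensity (more FLOPs per weight read), which is why large batch sizes can push the same workload toward being compute bound.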
This article compares two popular GPUs—the NVIDIA A10 and A100—for model inference and discusses the option of using multi-GPU instances for larger models.
This guide helps you navigate NVIDIA’s datacenter GPU lineup and map it to your model serving needs.
So what are reliable metrics for comparing GPUs across architectures and tiers? We’ll consider core count, FLOPS, VRAM, and TDP.
Which is the best GPU for AI training and AI art? We compare the price and specs of the NVIDIA T4 and the NVIDIA A10 to decide which is better suited to ML workloads.
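A comparison like this boils down to a handful of metrics: core count, FLOPS, VRAM, and TDP. The sketch below encodes approximate values from NVIDIA's public datasheets; treat them as illustrative assumptions, not authoritative figures.

```python
# Sketch: head-to-head GPU comparison on the metrics that matter for ML.
# Spec values are approximate, taken from public datasheets.
SPECS = {
    "T4":  {"cuda_cores": 2560, "fp16_tflops": 65,  "vram_gb": 16, "tdp_w": 70},
    "A10": {"cuda_cores": 9216, "fp16_tflops": 125, "vram_gb": 24, "tdp_w": 150},
}

def winner(metric: str) -> str:
    """Return the GPU with the higher value for a given metric."""
    return max(SPECS, key=lambda gpu: SPECS[gpu][metric])

for metric in ("cuda_cores", "fp16_tflops", "vram_gb", "tdp_w"):
    print(f"{metric}: {winner(metric)}")
```

The A10 wins on raw specs across the board, but note that a higher TDP also means higher power draw; the T4's low wattage and price are why it remains attractive for lighter workloads.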
Horizontal scaling via replicas with load balancing is an important technique for handling high traffic to an ML model.
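At its simplest, load balancing across replicas can be round-robin: each incoming request goes to the next replica in rotation. The replica addresses below are hypothetical; a production setup would use a real load balancer or an orchestrator such as Kubernetes rather than this hand-rolled sketch.

```python
# Sketch: round-robin load balancing across model replicas.
from itertools import cycle

class RoundRobinBalancer:
    """Distribute incoming requests evenly across a fixed set of replicas."""

    def __init__(self, replicas: list[str]):
        self._replicas = cycle(replicas)  # endless iterator over replicas

    def next_replica(self) -> str:
        """Pick the replica that should serve the next request."""
        return next(self._replicas)

# Hypothetical replica endpoints.
balancer = RoundRobinBalancer([
    "http://replica-0:8000",
    "http://replica-1:8000",
    "http://replica-2:8000",
])

# Nine requests spread evenly: each replica serves exactly three.
for _ in range(9):
    print(balancer.next_replica())
```

Round-robin assumes requests cost roughly the same to serve; for ML inference with highly variable request sizes, least-connections or queue-depth-aware policies distribute load more evenly.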
Instance sizing is complicated. In this post, we'll follow a few simple heuristics to select an appropriate instance size that can handle your model while minimizing compute cost.
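One such heuristic can be sketched in a few lines: VRAM needed is roughly parameter count times bytes per parameter, plus a margin for activations and the KV cache. The 1.2x overhead factor below is a common rule of thumb, not an exact figure.

```python
# Sketch: rough VRAM estimate for serving an LLM, as a sizing heuristic.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

def estimated_vram_gb(num_params: float, dtype: str = "fp16",
                      overhead: float = 1.2) -> float:
    """Model weights plus a rough 20% margin for activations and KV cache."""
    weights_bytes = num_params * BYTES_PER_PARAM[dtype]
    return weights_bytes * overhead / 1e9

# A 7B-parameter model in fp16: ~16.8 GB, so a 24 GB A10 fits comfortably,
# while a 16 GB T4 does not.
print(round(estimated_vram_gb(7e9, "fp16"), 1))  # 16.8
```

Quantizing to int8 halves the weight footprint, which is often the difference between fitting a model on one GPU and needing a multi-GPU instance.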