Improve performance and reduce cost with fractional H100 GPUs

Baseten now offers model inference on NVIDIA H100mig GPUs, available to all customers starting at $0.08250/minute.

The H100mig family of instances runs on a fractional share of an H100 GPU using NVIDIA’s Multi-Instance GPU (MIG) virtualization technology. We were the first inference provider to offer H100s back in February 2024, unlocking an 18 to 45 percent improvement in price-to-performance vs. equivalent workloads using two or more A100s. With H100mig GPUs now available on Baseten, customers can take advantage of these performance and cost improvements for smaller workloads, including those currently running on a single A100 instance.

H100 pricing and instance types

Baseten currently offers one H100mig and four H100 instance types. See our instance type reference for more details.
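
For a rough sense of what per-minute billing adds up to, the sketch below converts the $0.08250/minute starting price quoted above into hourly and daily figures. It assumes an instance that runs continuously at that starting rate; the function name and constants are hypothetical and are not part of any Baseten API, and actual rates for each instance type are listed in the instance type reference.

```python
from decimal import Decimal

# Starting H100mig price quoted in this post, in USD per minute.
H100MIG_USD_PER_MINUTE = Decimal("0.08250")


def usage_cost(minutes: int, rate_per_minute: Decimal = H100MIG_USD_PER_MINUTE) -> Decimal:
    """Return the cost in USD of running for `minutes` at a flat per-minute rate.

    Hypothetical helper for illustration only: it assumes continuous usage
    at a single rate and ignores anything else that may affect a real bill.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return rate_per_minute * minutes


hourly = usage_cost(60)        # one hour of continuous usage
daily = usage_cost(60 * 24)    # one day of continuous usage

print(f"Hourly: ${hourly:.2f}")
print(f"Daily:  ${daily:.2f}")
```

Decimal is used instead of float so that the per-minute rate is represented exactly rather than as a binary approximation.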

Run your model on H100 GPUs

We’ve opened up access to H100 and H100mig GPUs for all customers and plan to aggressively scale our capacity to meet our customers’ needs. Get in touch, tell us about your use case, and we’ll help you achieve big performance improvements and cost savings using H100 GPUs for model inference.