Baseten Blog | Page 3

Glossary

How to benchmark image generation models like Stable Diffusion XL

Benchmarking Stable Diffusion XL performance across latency, throughput, and cost depends on factors from hardware to model variant to inference config.

Glossary

Understanding performance benchmarks for LLM inference

This guide helps you interpret LLM performance metrics to make direct comparisons on latency, throughput, and cost.

Product

New in December 2023

Faster Mixtral inference, Playground v2 image generation, and ComfyUI pipelines as API endpoints.

Model performance

Faster Mixtral inference with TensorRT-LLM and quantization

Mixtral 8x7B is structurally faster at inference than the similarly powerful Llama 2 70B, and TensorRT-LLM with int8 quantization makes it faster still.

ML models

Playground v2 vs Stable Diffusion XL 1.0 for text-to-image generation

Playground v2, a new text-to-image model, matches SDXL's speed and quality and has a distinctive AAA game-style aesthetic. The better choice depends on your use case and artistic taste.

Hacks & projects

How to serve your ComfyUI model behind an API endpoint

This guide shows how to deploy ComfyUI image generation pipelines behind an API for app integration, using Truss for packaging and production deployment.

Product

New in November 2023

Switching to open source ML, a guide to model inference math, and Stability AI's new generative AI image-to-video model.

GPU guides

NVIDIA A10 vs A10G for ML model inference

The A10, an Ampere-series GPU, excels at tasks like running 7B-parameter LLMs. AWS's A10G variant has similar GPU memory and bandwidth and is mostly interchangeable with it.

ML models

Stable Video Diffusion now available

Stability AI announced the release of Stable Video Diffusion, marking a huge leap forward for open source novel video synthesis.