Baseten / Blog

Baseten Blog | Page 1

Topics

Latest Model performance Hacks & projects GPU guides ML models Glossary Community Product News

1 2 3…11

Hacks & projects

Jul 25, 2024

Deploying custom ComfyUI workflows as APIs

Easily package your ComfyUI workflow to use any custom node or model checkpoint.

Het Trivedi

1 other

Community

Jul 25, 2024

Ten reasons to join Baseten

Baseten is a Series B startup building infrastructure for AI. We're actively hiring for many roles — here are ten reasons to join the Baseten team.

Dustin Michaels

1 other

Model performance

Jul 23, 2024

How to serve 10,000 fine-tuned LLMs from a single GPU

LoRA swapping with TRT-LLM supports in-flight batching and loads LoRA weights in 1-2 ms, enabling each request to hit a different fine-tune.

Pankaj Gupta

1 other

Prompt: Different-colored friendly robots standing in a field

Product

Jul 11, 2024

Using Asynchronous Inference in Production

Learn how async inference works, protects against common inference failures, is applied in common use cases, and more.

Samiksha Pal

2 others

The overall asynchronous inference workflow.

Product

Jul 2, 2024

Baseten Chains Explained: Building Multi-Component AI Workflows at Scale

A Delightful Developer Experience for Building and Deploying Compound ML Inference Workflows

Marius Killinger

1 other

News

Jun 27, 2024

Introducing Baseten Chains

Learn about Baseten's new Chains framework for deploying complex ML inference workflows across compound AI systems using multiple models and components

Bola Malek

4 others

Glossary

Jun 14, 2024

Comparing few-step image generation models

Few-step image generation models like LCMs, SDXL Turbo, and SDXL Lightning can generate images fast, but there's a tradeoff when it comes to speed vs quality.

Rachel Rapp

An AI-generated image of wooden steps in a futuristic setting surrounded by plants, symbolizing few-step image generation.

Glossary

Jun 4, 2024

How latent consistency models work

Latent Consistency Models (LCMs) improve on generative AI methods to produce high-quality images in just 2-4 steps, taking less than a second for inference.

Rachel Rapp

Two trees slightly different in size and color represent how latent consistency models ensure consistency between images.

Product

Jun 3, 2024

New in May 2024

AI events, multicluster model serving architecture, tokenizer efficiency, and forward-deployed engineering

Baseten

Prompt: A solarpunk pier for a futuristic water taxi

Community

May 31, 2024

What I learned as a forward-deployed engineer working at an AI startup

My first six months at Baseten exposed me to a huge range of exciting engineering challenges as I learned how to make an impact as a forward-deployed engineer.

Het Trivedi

Prompt: a software engineer building a bridge out of glowing code

1 2 3…11