Blog

Expert guides and engineering deep dives to help you ship faster, scale easier, and learn along the way.

Model performance

Kimi K2 Thinking at 140+ TPS on NVIDIA Blackwell

Abu Qader

Tri Dao

Philip Kiely

AI engineering

Tool Calling in Inference

Kenzie Amack

Bryce Dubayah

News

Train AI Models When You Want. Deploy on Ultra Performant Infrastructure. Baseten Training Is GA.

Raymond Cano

1 other

AI engineering

High-performance agents for financial services with NVIDIA Nemotron on Baseten

Philip Kiely

Model performance

How we made the fastest GPT-OSS on NVIDIA GPUs 60% faster

Tri Dao

Abu Qader

Philip Kiely

650+ TPS on GPT OSS 120B

AI engineering

DeepSeek-OCR and the Unreasonable Usefulness of Compression

Alex Ker

1 other

Model performance

How Baseten achieved 2x faster inference with NVIDIA Dynamo

Abu Qader

Michael Feil

1 other

AI engineering

From Sketch to 3D Model: Building a flower card generator with open source AI

Alex Ker

From Sketch to 3D Model using Truss, Netlify, Autodesk

Community

Building AI agents, open code, and open source: A conversation with Dax

Madison Kanna

Dax Raad, creator of OpenCode and Zen, discusses the launch of Zen, the philosophy behind building in open source, and why terminal-based workflows matter.
