Resources

Learn, Build, Deploy

The reason why we came to Baseten in the first place was the latency requirements. With our bursty workloads, we got queued for our requests similar to any other user of AI. And our customers don't care about who's queuing you.

Aabhas Sharma
CTO
Infrastructure
Gregory Kofman and 2 others
How the Baseten Delivery Network (BDN) makes cold starts fast
Model performance
Michael Feil and 1 other
Sub-3 millisecond named entity recognition (NER) inference