Baseten provides all the infrastructure you need to deploy and serve ML models performantly, scalably, and cost-efficiently.
Feb 22, 2024
Using NVIDIA TensorRT to optimize each component of the SDXL pipeline, we improved SDXL inference latency by 40% and throughput by 70% on NVIDIA H100 GPUs.
Oct 18, 2022 · Revised Nov 7, 2023
Deploy OpenAI Whisper for free on Baseten instantly from our model library. Or stick around to learn how to package and deploy Whisper with Truss.