Timur Abishev

Model performance

Faster Mixtral inference with TensorRT-LLM and quantization

Pankaj Gupta

Philip Kiely
