Philip Kiely, Lead Developer Advocate
Model performance: How we built production-ready speculative decoding with TensorRT-LLM (Pankaj Gupta and 2 others)
Model performance: A quick introduction to speculative decoding (Pankaj Gupta and 2 others)
Infrastructure: Evaluating NVIDIA H200 Tensor Core GPUs for LLM inference (Pankaj Gupta and 1 other)
News: Export your model inference metrics to your favorite observability tool (Helen Yang and 2 others)
Community: Building high-performance compound AI applications with MongoDB Atlas and Baseten (Philip Kiely)
Model performance: How to build function calling and JSON mode for open-source and fine-tuned LLMs (Bryce Dubayah and 1 other)
News: Introducing function calling and structured output for open-source and fine-tuned LLMs (Bryce Dubayah and 1 other)
AI engineering: The best open-source image generation model (Philip Kiely)
Model performance: How to double tokens per second for Llama 3 with Medusa (Abu Qader and 1 other)