Hacks & projects

Streaming real-time text to speech with XTTS V2

In this tutorial, we'll build a streaming endpoint for the XTTS V2 text-to-speech model with real-time narration and a 200 ms time to first chunk.

How to serve your ComfyUI model behind an API endpoint

This guide shows how to deploy ComfyUI image generation pipelines behind an API for app integration, using Truss for packaging and production deployment.

GPT vs Mistral: Migrate to open source LLMs seamlessly

Use the ChatCompletions API to test open-source LLMs like Mistral 7B in your AI app with just three minor code modifications.

Build your own open-source ChatGPT with Llama 2 and Chainlit

Llama 2 rivals the quality of GPT-3.5, the model behind ChatGPT. Chainlit helps you build ChatGPT-like interfaces. This guide shows how to build one with Llama 2.

Build a chatbot with Llama 2 and LangChain

Build a ChatGPT-style chatbot with open-source Llama 2 and LangChain in a Python notebook.

Three techniques to adapt LLMs for any use case

Prompt engineering, embeddings with vector databases, and fine-tuning are three ways to adapt large language models (LLMs) to run on your data for your use case.

Serving four million Riffusion requests in two days

Riffusion is a fine-tuned version of Stable Diffusion. Baseten served over four million Riffusion requests in two days, handling top-of-Hacker-News traffic.