NVIDIA A10 vs A100 GPUs for LLM and Stable Diffusion inference
This article compares two popular GPUs—the NVIDIA A10 and A100—for model inference and discusses the option of using multi-GPU instances for larger models.
New in August 2023
The latest version of Truss brings new solutions for the most common pain points in packaging and serving ML models. Plus, learn how to optimize Stable Diffusion XL inference to run in as little as 3 seconds and build your own open-source version of ChatGPT with Llama 2 and Chainlit.
SDXL inference in under 2 seconds: the ultimate guide to Stable Diffusion optimization
Out of the box, Stable Diffusion XL 1.0 (SDXL) takes 8-10 seconds to create a 1024x1024px image from a prompt on an A100 GPU. Here’s everything I did to cut SDXL invocation time to as little as 1.92 seconds on an A100.
Build your own open-source ChatGPT with Llama 2 and Chainlit
Llama 2 is an open-source LLM whose output quality is competitive with GPT-3.5, the model that powers ChatGPT. Chainlit is an open-source tool for creating a ChatGPT-style interface. This tutorial shows you how to build a ChatGPT-style interface for your favorite open-source LLMs like Llama 2.
AudioGen: deploy and build today!
AudioGen, part of the AudioCraft family of models from Meta AI, is now available in the Baseten model library.
New in July 2023
Llama 2 and SDXL shake up foundation model leaderboards (plus: LangChain, autoscaling, and more)
AI infrastructure: build vs. buy
Build a chatbot with Llama 2 and LangChain
Build a ChatGPT-style chatbot with open-source Llama 2 and LangChain in a Python notebook.
Deploying and using Stable Diffusion XL 1.0
Deploy Stable Diffusion XL 1.0 for free to generate images from text prompts and invoke Stable Diffusion with the Baseten Python client.
Models We Love: July 2023
An in-depth look at open-source foundation models, both LLMs and image models: Llama 2 from Meta and Microsoft, FreeWilly1 and FreeWilly2 from Stability AI, SDXL 1.0 (Stable Diffusion XL) also from Stability AI, LayoutLM Document QA from Impira, and NSQL 350M from Numbers Station.