DreamCanvas: a FigJam plugin for fine-tuning Stable Diffusion

We’ve all seen the excitement that Stable Diffusion and its ecosystem have created over the last few months. Give it some text, get back an image, repeat — great.

But we've only started to see how these models will be incorporated into existing and greenfield creative workflows. That will be an exciting step in the evolution of AI-powered tools over the coming months and years.

A few of us recently spent a late evening sketching out a project called DreamCanvas. It took a couple of days to build, but even the proof of concept feels like a powerful way of interacting with these new tools.

Exploring new creative interfaces with AI

We've seen lots of front-ends for training and fine-tuning Stable Diffusion with Dreambooth: often clunky, and generally not very exploratory. Feedback loops tend to be broken because training is arduous, manual, and infra-intensive. And validating data and incorporating outputs back into an existing tool makes for a slow and interruption-prone workflow unsuited for open-ended exploration.

In the past, I lost steam on these types of projects because not only did I need to figure out how the front-end and workflow would work, but I also had to figure out how to host and scale the machine learning part of the equation, a challenge even for demos.

Starting in FigJam

FigJam widgets let you build new powers directly into the user's canvas. I've always liked direct-manipulation, tactile interfaces: let the user collect data onto the canvas, fine-tune a model, and keep that tuned model right on the canvas to generate new images.

This is a rough sketch of the workflow from the user's perspective:

  • Collect training images in a section like a moodboard

  • Add the DreamCanvas widget to the section and hit Train

  • Bring the trained model widget into other sections with prompts to generate images

  • Play!

FigJam's Plugin API is powerful, so building the moodboard-like sections and the upload flow was fairly straightforward (shout out to Replit for making it easy to spin up small bespoke services as needed).
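To make that concrete, here's a minimal sketch of the kind of bespoke upload service I mean: a tiny Flask app that accepts images from the widget and stages them for training. The route name, payload fields, and storage layout are my own illustration, not the exact DreamCanvas service.

```python
# Minimal sketch of a bespoke upload service (illustrative names, not the
# exact DreamCanvas implementation). The widget POSTs each training image
# here; the service stages them on disk until a fine-tuning job is kicked off.
from pathlib import Path
from uuid import uuid4

from flask import Flask, request, jsonify

app = Flask(__name__)
UPLOAD_DIR = Path("training_images")
UPLOAD_DIR.mkdir(exist_ok=True)

@app.route("/upload", methods=["POST"])
def upload():
    # Expect a multipart form with the image file and the FigJam section ID,
    # so images from different moodboard sections stay grouped together.
    image = request.files["image"]
    section_id = request.form.get("section_id", "default")

    section_dir = UPLOAD_DIR / section_id
    section_dir.mkdir(exist_ok=True)
    filename = f"{uuid4().hex}.png"
    image.save(str(section_dir / filename))

    return jsonify({"ok": True, "filename": filename})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```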

Fine-tuning with Blueprint

With the skeleton for gathering and uploading training data in place, I needed an easy way to fine-tune models without taking on a second side project of maintaining a bunch of machine-learning infrastructure. I found a ton of tutorials, but nothing that just worked or properly outsourced the tedium of dealing with AWS/GCP APIs.

Blueprint does just this:

  • Provides an API to fine-tune Stable Diffusion. I can fine-tune using Dreambooth or the full Stable Diffusion model, and they also gave me a bunch of credits to get started.

  • Takes care of deploying these fine-tuned models, also via an API. It took one line of Python to trigger a fine-tuning job and have the result auto-deployed (see the sketch after this list).

  • Gives me a performant API to access the model.
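To give a feel for it, here's roughly the shape of that call. The blueprint_client module and its arguments below are hypothetical placeholders I'm using for illustration, not Blueprint's actual SDK.

```python
# Illustrative sketch only: "blueprint_client" and every argument below are
# placeholder names standing in for Blueprint's fine-tuning API, not its real SDK.
import blueprint_client  # hypothetical client library

# Kick off a Dreambooth fine-tune on the images uploaded from a FigJam
# section, and have the resulting model auto-deployed behind an API.
job = blueprint_client.finetune(
    base_model="stable-diffusion-v1-5",             # assumed base model
    method="dreambooth",                            # Dreambooth or full fine-tune
    training_images="training_images/section-123",  # staged by the upload service
    auto_deploy=True,                               # deploy the tuned model on completion
)

print(job.id, job.status)  # the job ID is what gets polled later
```

The point is less the exact arguments and more that training plus deployment collapses into a single call.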

Blueprint also provides a way to define quick endpoints around the model. I would have done this in Flask or on Replit, but not having to stand up a new service, deal with CORS, and so on made it quick to build my API and saved me from unnecessary rabbit holes.
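In spirit, those endpoints are thin wrappers around the fine-tuning job and the deployed model. The sketch below captures that logic as plain Python handlers, reusing the hypothetical blueprint_client from above; Blueprint's actual endpoint-definition syntax is different.

```python
# Plain-Python sketch of the endpoint logic, reusing the hypothetical
# "blueprint_client" placeholder from the previous snippet. Blueprint's real
# endpoint definitions look different; this only captures the flow.
import blueprint_client  # hypothetical client library


def train(section_id: str) -> dict:
    """Start a fine-tuning job on the images collected in a FigJam section."""
    job = blueprint_client.finetune(
        training_images=f"training_images/{section_id}",
        method="dreambooth",
        auto_deploy=True,
    )
    return {"job_id": job.id}


def status(job_id: str) -> dict:
    """Report whether the fine-tune has finished and the model is deployed."""
    job = blueprint_client.get_job(job_id)
    return {"status": job.status}


def imagine(job_id: str, prompt: str) -> dict:
    """Generate an image from the fine-tuned model for a prompt on the canvas."""
    model = blueprint_client.get_model(job_id)
    return {"image_url": model.generate(prompt)}
```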

The end result

Demo of the DreamCanvas UI showing fine-tuned model results

With a few simple API endpoints (/train, /status, /imagine), I made a multiplayer-enabled (!) canvas with live-trained ML models living in it. Many people can come together and try out a model, you can alt-drag trained models to explore variations without losing your history, and you can mark everything up with pencil drawings, stickies, and anything else you're used to in FigJam and Figma.
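To make the contract concrete, here's how you might exercise those three endpoints from a Python script; the widget does the equivalent with fetch calls from inside FigJam. The base URL, HTTP methods, and payload fields are placeholders.

```python
# Exercising the three DreamCanvas endpoints from a script. The base URL,
# HTTP methods, and payload fields are placeholders, not the deployed service.
import time

import requests

BASE_URL = "https://dreamcanvas.example.com"  # placeholder

# 1. Kick off training on the images collected in a moodboard section.
job_id = requests.post(
    f"{BASE_URL}/train", json={"section_id": "section-123"}
).json()["job_id"]

# 2. Poll until the fine-tuned model is deployed.
while requests.get(
    f"{BASE_URL}/status", params={"job_id": job_id}
).json()["status"] != "deployed":
    time.sleep(30)

# 3. Generate an image from the tuned model for a prompt on the canvas.
image_url = requests.post(
    f"{BASE_URL}/imagine",
    json={"job_id": job_id, "prompt": "a watercolor forest in the style of the moodboard"},
).json()["image_url"]
print(image_url)
```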

Here's the demo!

Want to play with it?

DreamCanvas is not ready for primetime just yet (I need to add auth to make sure I don't end up footing a massive compute bill, and to make some performance improvements), but you can reach out to me on Twitter @msfeldstein if you want to play with it.

