We’ve all seen the excitement that Stable Diffusion and its ecosystem have created over the last few months. Give it some text, get back an image, repeat — great.
What we've only started to see, though, is how these models will be incorporated into existing and greenfield creative workflows. That will be an exciting step in the evolution of AI-powered tools over the coming months and years.
A few of us recently spent a late evening sketching out a project called DreamCanvas. It took a couple of days to build, but even the proof of concept feels like a powerful way of interacting with these new tools.
Exploring new creative interfaces with AI
We've seen lots of front-ends for training and fine-tuning Stable Diffusion with Dreambooth: often clunky, and generally not very exploratory. Feedback loops tend to be broken because training is arduous, manual, and infra-intensive. And validating data and incorporating outputs back into an existing tool makes for a slow and interruption-prone workflow unsuited for open-ended exploration.
In the past, I lost steam on these types of projects because not only did I need to figure out how the front-end and workflow would work, but I also had to figure out how to host and scale the machine-learning part of the equation, which is a challenge even for demos.
Starting in FigJam
FigJam widgets let you build new capabilities directly into the user's canvas, and I've always liked tactile, direct-manipulation interfaces. The idea: let the user collect training data on the canvas, fine-tune a model, and keep that tuned model on the canvas to generate new images.
This is a rough sketch of the workflow from the user's perspective:
- Collect training images in a section like a moodboard
- Add the DreamCanvas widget to the section and hit Train
- Bring the trained model widget into other sections with prompts to generate images
FigJam's Plugin API is powerful, so building the moodboard-like sections and uploading was fairly straightforward (shout out to Replit for making it easy to spin up small bespoke services as needed).
Fine-tuning with Blueprint
With the skeleton for gathering and uploading training data in place, I needed an easy way to fine-tune models without taking on a second side project of maintaining a pile of machine-learning infrastructure. I found a ton of tutorials, but nothing that just worked or properly outsourced the tedium of dealing with the AWS/GCP APIs.
Blueprint does just this:
- Provides an API to fine-tune Stable Diffusion. I can fine-tune using Dreambooth or the full Stable Diffusion model, and they also gave me a bunch of credits to get started.
- Takes care of deploying these fine-tuned models, also via an API. It took one line of Python to trigger a fine-tuning job and have it auto-deployed.
- Gives me a performant API to access the model.
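I don't want to misrepresent Blueprint's actual client, so the URL and field names below are placeholders, but kicking off a fine-tuning job amounts to building a small job description and POSTing it:

```python
import json
import urllib.request

def build_finetune_job(image_urls, base_model="stable-diffusion-v1-5", steps=800):
    """Assemble the request body for a fine-tuning job.

    The field names are illustrative; the real schema may differ.
    """
    return {
        "base_model": base_model,
        "instance_images": list(image_urls),
        "steps": steps,
    }

def start_finetune(job, token, api_url="https://api.example.com/v1/finetune"):
    """POST the job and return its id. The URL is a stand-in, not Blueprint's."""
    req = urllib.request.Request(
        api_url,
        data=json.dumps(job).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["job_id"]
```

The point is the shape of the interaction: one request in, a job id out, and deployment handled for you on the other side.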
Blueprint also provides a way to define quick endpoints around the model. I would have done this with Flask on Replit, but not having to stand up a new service, deal with CORS, and so on made it pretty quick for me to build my API and saved me from unnecessary rabbit holes.
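Conceptually, what those quick endpoints replace is a small dispatcher like the toy below (this is my sketch, not Blueprint's API): a mapping from paths to handler functions, with nothing else to stand up.

```python
def make_api(train_fn, status_fn, imagine_fn):
    """Wire three handlers to paths; a stand-in for quick endpoint definitions."""
    routes = {"/train": train_fn, "/status": status_fn, "/imagine": imagine_fn}

    def handle(path, payload):
        # Unknown paths return an error payload instead of raising, like a 404.
        handler = routes.get(path)
        if handler is None:
            return {"error": f"unknown endpoint {path}"}
        return handler(payload)

    return handle
```

Everything that isn't in this sketch (hosting, TLS, CORS, scaling) is exactly the part I was happy to outsource.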
The end result
With a few simple API endpoints (/train, /status, /imagine), I made a multiplayer-enabled (!) canvas with live-trained ML models living in it. Multiple people can come together and try out the model; you can alt-drag trained models to try out explorations without losing your history; and you can mark it up with pencil drawings and stickies and anything else you've gotten used to in FigJam and Figma.
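From the widget's point of view, the flow over those three endpoints is a train/poll/generate loop. Here is a minimal sketch, where `client` is any hypothetical wrapper exposing the three calls:

```python
import time

def generate_with_fresh_model(client, image_urls, prompt, poll_seconds=5.0):
    """Kick off training, wait for deployment, then ask for an image.

    `client` is any object exposing train/status/imagine methods that
    mirror the /train, /status, and /imagine endpoints.
    """
    job_id = client.train(image_urls)
    # Fine-tuning takes minutes, so the widget polls rather than blocks a request.
    while client.status(job_id) != "deployed":
        time.sleep(poll_seconds)
    return client.imagine(job_id, prompt)
```

Keeping the loop in the widget is what makes the canvas feel live: the moment `/status` flips to deployed, every collaborator can start prompting the model.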
Here's the demo!
Want to play with it?
DreamCanvas is not ready for primetime just yet (I need to add auth to make sure I don't end up footing a massive compute bill, and to make some performance improvements), but you can reach out to me on Twitter @msfeldstein if you want to play with it.