Models We Love: June 2023

TL;DR

The release of new open source foundation models is happening at an incredible pace, and in this new series we want to share some of our recent favorites. Each of the models in this post can be deployed in a couple of clicks from the Baseten Model Library. Once deployed to Baseten, your model is ready to handle production traffic, with features including model logs, health metrics, and robust autoscaling.

Open source models we love 😍

Falcon-7B and Falcon-40B

The Falcon model is a fully open source large language model

Falcon is an open source foundational LLM with 40 billion parameters trained on one trillion tokens, and it was developed with only a fraction of the compute used to train similar models. Falcon was created by the Technology Innovation Institute (TII) and has been a darling of the Hugging Face Open LLM Leaderboard of late, with the fine-tuned Falcon-40B-Instruct variant holding the top spot and outperforming other popular text generation models such as RedPajama and LLaMA.

Falcon uses high quality data for high quality results

While there’s a lot to love about the Falcon-40B model (and its smaller variant, Falcon-7B), we’d be remiss not to mention its creation and use of the RefinedWeb dataset. RefinedWeb consists primarily of publicly available web data that has undergone extensive document preparation, filtering, and deduplication. On zero-shot generalization benchmarks, models trained on RefinedWeb outperform those trained on curated datasets like The Pile. By investing in the thorough cleaning of publicly available data, RefinedWeb represents a breakthrough for high quality data at scale, and an indicator of great things to come!

Table adapted from The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only

Falcon is fantastic for text generation and problem solving

There are a multitude of uses for large language models, and like many LLMs, Falcon is well suited to both problem solving and creative text generation through applications such as chatbots, virtual assistants, and content generators.

We haven’t seen too many Falcon projects in the wild yet, but two worth mentioning are: 

Deploying the Falcon-7B and Falcon-40B models on Baseten

You can deploy the Falcon-7B variant on Baseten in a couple of clicks when you deploy directly from the Baseten Model Library. If you’d like to work with the Falcon-40B variant, we’ve packaged the model for you using Truss, our open source model deployment framework, and have a great tutorial for deploying Falcon-40B that walks you through each step.

Sample Falcon-7B prompt

Write a sitcom plot about two falcons trying to take over the world

import baseten

# Reference the Falcon-7B model you deployed from the model library
model = baseten.deployed_model_id('YOUR_MODEL_ID')

model.predict({
    "prompt": "Write a sitcom plot about two falcons trying to take over the world",
    "do_sample": True,
    "max_new_tokens": 300
})

Sample Falcon-7B response

The plot follows two falcons, Alex and Leo, who, after being rejected by their own kind, band together to take over the world. With their incredible flying skills they start by knocking over the world's top criminals, one by one. Along the way, they learn the true meaning of friendship and teamwork...and that falcons have a lot more in common than they thought.

MusicGen

High quality music samples with MusicGen

MusicGen is an open source text-to-music language model developed by Meta AI that takes text or melodic inputs and generates a high quality music sample, with the text description or melody giving you greater control over the generated output. There aren’t many generative music models out there yet, in part because music is more challenging to work with: audio must be sampled at a higher frequency than speech, and music has inherently complex harmonic patterns. What the team at Meta has created is truly incredible, and you can listen to a variety of samples on their demo site.

Follow the creation of MusicGen projects on Twitter

We’ve become obsessed with the #musicgen hashtag on Twitter, where people are posting their MusicGen creations, many of which combine multiple large models into stunning multimedia compositions. Some of our recent favorites include: 

Deploying MusicGen on Baseten

You can deploy MusicGen on Baseten in a couple of clicks when you deploy directly from the Baseten Model Library.

Sample MusicGen prompt

Artificially intelligent bop
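
Calling a deployed MusicGen model follows the same pattern as the other models in this post. The sketch below assumes the deployment accepts a text prompt plus a clip duration and returns base64-encoded audio; the exact request and response keys are assumptions and may differ depending on how the model is packaged.

import base64

import baseten

# Reference the MusicGen model you deployed from the model library
model = baseten.deployed_model_id('YOUR_MODEL_ID')

# Assumed request format: a text prompt and a clip length in seconds
response = model.predict({
    "prompts": ["Artificially intelligent bop"],
    "duration": 8
})

# Assuming the response contains base64-encoded WAV audio,
# decode it and write it to disk
with open("bop.wav", "wb") as f:
    f.write(base64.b64decode(response["data"]))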

MusicGen output

WizardLM

WizardLM introduces Evol-Instruct

WizardLM is a text generation model from Microsoft and Peking University, built as a foundation model on top of LLaMA. Similar to the Falcon model, WizardLM introduced a new training dataset, Evol-Instruct. Evol-Instruct was created by taking an initial set of human-generated instructions and having an LLM rewrite them into progressively more complex instructions, which were then used to fine-tune LLaMA. Although WizardLM does not yet outperform models such as ChatGPT, it does indicate a viable path for AI-evolved instructions in the creation of LLM datasets.

Image adapted from WizardLM: Empowering Large Language Models to Follow Complex Instructions
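
To make the Evol-Instruct idea concrete, here’s a rough sketch of the evolution loop (not the WizardLM authors’ actual pipeline): each round asks an LLM to rewrite the current instructions into more complex ones, and the accumulated instructions, paired with LLM-generated answers, become fine-tuning data. The complete function and the rewrite prompt are placeholders.

# Rough sketch of the Evol-Instruct loop (not the WizardLM authors' code).
# `complete(prompt)` stands in for whatever LLM completion function you use.

EVOLVE_PROMPT = (
    "Rewrite the following instruction so that it is more complex, for example "
    "by adding constraints or requiring deeper reasoning, while keeping it answerable:\n\n"
    "{instruction}"
)

def evolve_instructions(seed_instructions, complete, rounds=4):
    evolved = list(seed_instructions)
    frontier = list(seed_instructions)
    for _ in range(rounds):
        # Ask the LLM to rewrite each instruction in the current frontier
        frontier = [complete(EVOLVE_PROMPT.format(instruction=i)) for i in frontier]
        evolved.extend(frontier)
    # Pair each evolved instruction with an LLM-generated answer, then
    # fine-tune the base model (LLaMA) on the resulting pairs
    return evolved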

Deploying WizardLM on Baseten

You can deploy WizardLM on Baseten in a couple of clicks when you deploy directly from the Baseten Model Library.

Sample WizardLM prompt

What is the difference between a wizard and a sorcerer?

import baseten

# Reference the WizardLM model you deployed from the model library
model = baseten.deployed_model_id('YOUR_MODEL_ID')

model.predict({"prompt": "What is the difference between a wizard and a sorcerer?"})

Sample WizardLM output

In the context of fantasy and mythology, the terms "wizard" and "sorcerer" are often used interchangeably to describe a magical practitioner who uses magic to cast spells and perform magical feats. However, there are some subtle differences between the two terms:

Origin: The term "wizard" typically comes from European folklore and mythology, while the term "sorcerer" has its roots in Middle Eastern and African mythology.

Appearance: In many depictions, wizards are portrayed as old, bearded men with long, flowing robes, while sorcerers are often depicted as younger, more physically fit individuals with shorter hair and less formal attire.

Magic style: Wizards are often associated with the use of potions, incantations, and complex spells, while sorcerers may focus more on physical manipulation of magic and the elements.

Goals: Wizards may be more focused on knowledge and power, while sorcerers may be more focused on practical applications of magic for personal gain or to help others.

Overall, while there are some differences between the two terms, they are largely interchangeable and can be used to describe a magical practitioner with similar abilities and goals.

MPT-7B

MPT-7B Base goes the distance

Developing large language models is expensive, both in terms of time and cost, with models taking weeks, and in many cases months, to train. The compute to train LLMs can quickly rise into the millions of dollars, which places considerable constraints on those who don’t have access to extensive resources.

Enter MPT-7B Base, a seven billion parameter model created by MosaicML and trained from scratch on one trillion tokens of text and code. MPT-7B Base was trained in under 10 days, cost approximately $200K, and matches the quality of LLaMA-7B. Available for commercial use, MPT-7B can also handle extremely long inputs by replacing positional embeddings with Attention with Linear Biases (ALiBi), sketched below.
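
For intuition, here’s a rough NumPy sketch of the ALiBi bias (not MosaicML’s implementation): instead of adding positional embeddings to the inputs, each attention head adds a static penalty to its attention scores that grows linearly with the distance between query and key, with a head-specific slope.

import numpy as np

def alibi_slopes(n_heads):
    # Head-specific slopes form a geometric sequence (assuming n_heads is a
    # power of two); for 8 heads this gives 1/2, 1/4, ..., 1/256
    start = 2 ** (-8 / n_heads)
    return np.array([start ** (i + 1) for i in range(n_heads)])

def alibi_bias(n_heads, seq_len):
    # Bias added to the raw attention scores before the softmax: zero for the
    # current position, increasingly negative for more distant positions
    slopes = alibi_slopes(n_heads)                        # (n_heads,)
    positions = np.arange(seq_len)
    distance = positions[None, :] - positions[:, None]    # (seq, seq): j - i
    distance = np.minimum(distance, 0)                     # causal: only look backwards
    return slopes[:, None, None] * distance[None, :, :]    # (n_heads, seq, seq)

# Usage: scores = q @ k.T / sqrt(d_head) + alibi_bias(n_heads, seq_len)
# Because the bias is positional rather than learned, the model can be run on
# sequences longer than those seen during training.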

You can access the entire codebase, in addition to model checkpoints, in the MosaicML LLM Foundry on GitHub.

Fine-tuned variants of MPT-7B cover all the bases

  • MPT-7B-Storywriter-65k+ can both read and write stories. It was created by fine-tuning MPT-7B with a context length of 65k tokens, so the model can accommodate far longer inputs and has a much richer context from which to generate responses.

  • MPT-7B-Instruct focuses on short-form instruction following.

  • MPT-7B-Chat generates dialogue for use in chatbots.

Deploying MPT-7B on Baseten

You can deploy MPT-7B on Baseten in a couple of clicks when you deploy directly from the Baseten Model Library.

Sample MPT-7B prompt

Today I inspected the engine mounting equipment. I found a problem in the brackets so

import baseten

# Reference the MPT-7B model you deployed from the model library
model = baseten.deployed_model_id('YOUR_MODEL_ID')

request = {
    "prompt": "Today I inspected the engine mounting equipment. I found a problem in the brackets so",
    "temperature": 0.75,
    "max_tokens": 200
}

model.predict(request)

Sample MPT-7B output

I needed to order more parts. I also had to order more engine mount bolts. The bolts are not easy to find. The holes in the engine are all different sizes so they only make one size bolt for this type of engine. I have to keep an eye out for more of the bolts. I might be able to find them at the local hardware store.

Talk to us!

If there are open source foundation models you’d like to see in the Baseten Model Library, let us know on Twitter or on Threads. And if you have any questions or run into issues while deploying your model on Baseten, please reach out to us at support@baseten.co. We’d love to hear from you!