From datasets to deployed models: How Oxen helps companies train faster

  • 50% lower inference time

  • < 1 min to schedule a training job

  • 5x decrease in inference cost

Company overview

Oxen AI is a platform that helps companies accelerate the path from raw datasets to fine-tuned, production-ready AI models. They combine deep expertise in dataset curation and model fine-tuning with a streamlined management layer that abstracts away infrastructure complexity.

By focusing on the software and dataset layers, Oxen enables customers to quickly unlock real-world AI use cases without having to manage GPUs, environments, or model-serving stacks themselves.

Challenges

Oxen wanted to provide its customers with an end-to-end workflow for training and deploying AI models without taking on the burden of building and maintaining GPU training infrastructure.

To double down on that expertise, the Oxen team looked for an infrastructure partner that could:

  • Enable them to maintain their strategic focus: Oxen’s vision was to own the dataset and software management layer, not to become an infrastructure provider.

  • Close their infrastructure gap: Oxen needed reliable, elastic GPU provisioning and inference capabilities, but didn’t want to split focus and risk “failing at one of the two.”

  • Empower them to deliver fast, low-lift results: Provide fast fine-tuning and flexible AI model deployment options while keeping the complexity of infrastructure invisible to end users.

This led Oxen to partner with Baseten to handle the infrastructure layer while they continued to innovate on the customer-facing side.

"Our goal at Oxen has always been to get customers from raw datasets to production-ready models as fast as possible. But building and managing GPU infrastructure ourselves would have pulled us away from where we add the most value. Whenever I’ve seen a platform try to do both hardware and software, they usually fail at one of them. That’s why partnering with Baseten to handle infrastructure was the obvious choice. It lets us stay focused on delivering the best possible experience for our customers." — Greg Schoeninger, CEO, Oxen AI

Solutions

Oxen built its customer experience on top of Baseten’s infrastructure using the Baseten CLI as its primary interface. Key aspects of the implementation include:

  • Oxen’s training workflows are executed programmatically through the Baseten CLI, enabling automated job orchestration without manual setup. 

  • GPU provisioning and deprovisioning are handled automatically, eliminating the need for users to manage infrastructure or manually stop instances.

  • For high-performance workloads, Oxen accesses multi-GPU clusters, including configurations such as 8xH100, to accelerate training throughput and support larger models.

The abstraction layer fully conceals the Baseten UI, allowing customers to initiate and monitor training runs through Oxen’s interface alone. Automating GPU provisioning also reduces operational overhead and unnecessary compute costs.
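The orchestration flow described above can be pictured as a thin programmatic wrapper around the CLI. The sketch below is illustrative only: the command name, flags, and output format are hypothetical placeholders, not the actual Baseten CLI surface.

```python
import subprocess

# Illustrative sketch only: the command name and flags below are
# hypothetical placeholders, not the real Baseten CLI surface.
def build_training_command(config_path: str, gpu: str = "H100", gpu_count: int = 8) -> list[str]:
    """Assemble the argv for launching a training job from a config file."""
    return [
        "baseten",            # hypothetical CLI entry point
        "train",
        "--config", config_path,
        "--gpu", gpu,
        "--gpu-count", str(gpu_count),
    ]

def submit_training_job(config_path: str) -> str:
    """Run the CLI and return its stdout (e.g., a job id) on success."""
    cmd = build_training_command(config_path)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

Because jobs are submitted this way rather than through a UI, Oxen can trigger and monitor training runs entirely from its own product, and provisioning plus teardown of the GPUs happens on the platform side.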

For models destined for inference, Oxen manages dependencies across diverse model permutations, including Docker container configurations, library versions, and transformer architectures. Model weights are versioned and accessible through a unified interface, enabling flexible deployment across multiple environments and use cases. The system supports both asynchronous and dedicated inference modes, allowing horizontal scaling to efficiently handle workloads ranging from batch processing to millions of concurrent inference requests.
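A client choosing between the two inference modes might construct requests along these lines. The endpoint template and paths are assumptions made for illustration, not Baseten's documented API.

```python
import json

# Hypothetical endpoint template, for illustration only.
BASE_URL = "https://model-{model_id}.example-inference.co/production"

def build_inference_request(model_id: str, payload: dict, mode: str = "sync") -> dict:
    """Return the URL and JSON body for a sync or async inference call."""
    if mode not in ("sync", "async"):
        raise ValueError("mode must be 'sync' or 'async'")
    # Async mode queues the request for batch-style processing;
    # sync mode hits a dedicated deployment directly.
    path = "/predict" if mode == "sync" else "/async_predict"
    return {
        "url": BASE_URL.format(model_id=model_id) + path,
        "body": json.dumps(payload),
    }
```

Splitting traffic this way lets batch workloads queue up on the async path while latency-sensitive requests go to dedicated deployments that scale horizontally.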

This tight integration means Oxen customers can go from dataset to fine-tuned, production-ready models quickly and without touching infrastructure.

"Baseten was a delight to integrate with. We never have to worry about GPU capacity, and can give our customers reliable and fast fine-tuning." — Greg Schoeninger, CEO, Oxen AI


Results

As a result, Oxen customers saw:

  • 50% lower inference latency through a combination of using a fine-tuned model and other model optimizations, such as lightning LoRA and model graph compilation.

  • <1 minute training job scheduling with no infrastructure overhead, enabling faster experimentation and iteration.

Company highlight: AlliumAI’s experience with Oxen + Baseten

AlliumAI was founded to bring order and clarity to messy, inconsistent, and incomplete retail data. They transform retail data into structured, trustworthy information, laying the foundation for better products, smarter recommendations, and more satisfying consumer experiences.

AlliumAI found that combining Oxen’s data management platform with Baseten’s training and inference infrastructure created a uniquely powerful workflow. With Oxen, AlliumAI was able to version-control and manage massive, heterogeneous datasets quickly. And when integrated with Baseten’s training and AI model deployment layer, the process of fine-tuning and serving custom LoRAs became seamless.

For AlliumAI, the partnership between Oxen and Baseten eliminated infrastructure overhead such as manual GPU provisioning, CUDA configuration, and idle server costs, while dramatically accelerating iteration cycles. It also cut their costs from tens of thousands of dollars to just a couple of thousand, compared with their earlier attempts using ChatGPT or even Nano Banana, and without sacrificing performance. As a result, they can now train, fine-tune, and deploy AI models at scale in a fraction of the time previously required, freeing the team to focus on innovation rather than infrastructure management.

Key results:

  • 84% cost savings: Total inference costs dropped from $46,800 with Nano Banana to $7,530 with a fine-tuned Qwen-Image-Edit.

  • Training savings also came from "not having to wake up at 4 am to turn off machines." AlliumAI’s CEO, Daniel, can now go to sleep knowing a job will spin down when it finishes, without having to pinch pennies.
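The 84% headline figure follows directly from the two dollar amounts reported above:

```python
# Reproducing the savings percentage from the case study's own numbers.
nano_banana_cost = 46_800   # inference spend with Nano Banana (USD)
fine_tuned_cost = 7_530     # spend with the fine-tuned Qwen-Image-Edit (USD)

savings_pct = (nano_banana_cost - fine_tuned_cost) / nano_banana_cost * 100
print(round(savings_pct))  # → 84
```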

"Training custom LoRAs has always been one of the most effective ways to leverage open-source models, but it often came with infrastructure headaches. With Oxen and Baseten, that complexity disappears. We can train and deploy models at massive scale without ever worrying about CUDA, which GPU to choose, or shutting down servers after training. It just works. The time saved in development and iteration is incredible." — Daniel Demillard, CEO, AlliumAI

What’s next

Looking ahead, Oxen plans to expand its partnership with Baseten by:

  • Scaling fine-tuning and inference workflows for new customers.

  • Adding new modalities such as image, audio, and video fine-tuning.

  • Continuing to abstract away infrastructure while providing deeper model customization and dataset management tools.

  • Building repeatable playbooks for multiple industries where large-scale generative tasks demand both cost efficiency and speed.

Check out Oxen AI to access easy-to-use tooling for building high-quality datasets and models.