Try Mercury 2 now
Mercury 2 is a diffusion LLM that breaks the latency ceiling on standard GPUs by generating tokens in parallel rather than one at a time. It matches the quality of Haiku and GPT-5 mini, and it is available on Baseten today.
To try it out, complete this form and we will get back to you soon.