Introducing Baseten Self-hosted

TL;DR

Baseten Self-hosted empowers companies and enterprises with unparalleled control over AI model deployments. Gain granular management over data locality, align with organizational or industry compliance standards, excel at meeting specific performance and latency requirements, and leverage any existing cloud commitments.

After working with countless AI builders across different industries, we’ve learned that sometimes, the best solution is for companies to run model inference in their own cloud. We consistently heard the request for a self-hosted solution that lets you:

  • Flexibly meet strict data residency requirements

  • Better align with organizational or industry compliance standards

  • Use credits from different cloud providers, like AWS and GCP

That's why we built a solution that leverages our state-of-the-art AI inference expertise while providing companies with enhanced control over data localization, regulatory adherence, custom hardware utilization, and existing cloud investments.

With Baseten Self-hosted, model inference runs in your VPC.

How Baseten Self-hosted works

Baseten Self-hosted offers complete control over AI infrastructure and data, while guaranteeing the SLAs, reliability, and scale we specialize in.

Our Self-hosted solution uses Truss, a reliable, fast, and convenient solution for packaging and deploying models. Data and compute live on your cloud, and compute usage can be charged against any credits offered by cloud providers. Model inference inputs and outputs go directly to your compute—they never touch our premises. 
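To make the packaging step more concrete, here is a minimal sketch of the kind of model class Truss wraps. It follows the `Model` class layout with `load` and `predict` methods that Truss expects in a packaged model. The uppercase "model" and the `text` and `output` keys are hypothetical placeholders, not part of any Baseten API, and the block runs locally without installing Truss.

```python
class Model:
    """Toy model laid out the way a Truss-packaged model class is structured."""

    def __init__(self, **kwargs):
        # Configuration and other context arrive as keyword arguments.
        # This sketch ignores them and only prepares a slot for the model.
        self._model = None

    def load(self):
        # Runs once when the model server starts. A real model would load
        # weights here; this lambda is a hypothetical stand-in.
        self._model = lambda text: text.upper()

    def predict(self, model_input):
        # Runs once per inference request with the parsed request body.
        # The "text" and "output" keys are assumptions made for this sketch.
        return {"output": self._model(model_input["text"])}


if __name__ == "__main__":
    # Local smoke test: no server, no network, nothing leaves this process.
    model = Model()
    model.load()
    print(model.predict({"text": "hello"}))
```

Because the class is plain Python, the same file can be exercised locally before it is packaged and deployed to compute in your own cloud.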

Baseten Self-hosted enables:

  • Freedom to optimize your GPU usage

  • Enhanced control in meeting specific security and compliance needs

  • Optimal leveraging of existing cloud commitments and resources

With Self-hosted, our engineering team regularly manages application and infrastructure updates, including those to your VPC. We offer zero-downtime deployments and no-disruption updates.

“Inference for custom-built LLMs could be a major headache. Thanks to Baseten, we’re getting cost-effective high-performance model serving without any extra burden on our internal engineering teams.”

Writer

Self-hosted use cases

Four of the main use cases for Baseten Self-hosted are:

  • Meeting specific security needs. Strict data residency requirements, IP protection, customer requirements, or regulations mandate that inference run in your own cloud.

  • Customizing hardware. You want to leverage an existing GPU allocation or buy custom hardware to meet specific performance and latency requirements.

  • Utilizing existing cloud credits. You have existing cloud investments and want to fully consume credits across clouds, such as AWS and GCP.

  • Compound AI systems. Baseten Self-hosted works with any setup, including compound AI.

A compound AI system built with Baseten Chains

Our Self-hosted solution is ideal for enterprises and mid-sized companies with restrictive data residency, compliance, or security requirements, and for all companies looking to fully customize hardware or maximally utilize their existing investments across different cloud platforms.

“The great thing about Baseten is they’re able to get something up and working for you really, really quickly. Seeing a substantial piece of work functioning within a couple of days is such an impressive, promising work rate.”

Patreon

Baseten Self-hosted vs. Baseten Cloud

With Baseten Cloud, you don’t have to figure out all of your hardware requirements ahead of time. We offer single- and multi-tenant solutions that provide our customers with enhanced security, performance, and reliability, and we have teams dedicated to getting companies up and running quickly. Baseten Cloud is an excellent option for companies and AI-native startups that want:

  • Rapid setup times

  • Blazing-fast cold starts 

  • GPU availability and elasticity

  • First-rate security and compliance

  • Multi-cloud, multi-region availability

On the other hand, Baseten Self-hosted offers ultimate control and customization. It’s an excellent choice for companies and enterprises that need:

  • Custom hardware 

  • Specific security measures and isolation

  • Optimal use of existing resource investments

  • Optimized horizontal scaling across clouds and regions

  • To save engineering time by partnering with industry experts

While both Baseten Cloud and Self-hosted offer solutions for security and compliance, utilizing existing cloud commitments, customizability, and support, they differ in where inference runs and in how much control you have over hardware and isolation.

We love to support our customers with highly performant, reliable, and secure infrastructure for AI models. Baseten Self-hosted enables you to do inference in your own cloud, manage stringent security requirements, and fulfill your cloud commitments. If Baseten Self-hosted can help you meet your security and compliance needs, provide necessary control over hardware, or help you leverage your existing resources, get in touch!