Introducing Baseten Self-hosted
TL;DR
Baseten Self-hosted empowers companies and enterprises with unparalleled control over AI model deployments. Gain granular management over data locality, align with organizational or industry compliance standards, excel at meeting specific performance and latency requirements, and leverage any existing cloud commitments.
After working with countless AI builders across different industries, we’ve learned that sometimes, the best solution is for companies to run model inference in their own cloud. We consistently heard the request for a self-hosted solution that lets you:
Flexibly meet strict data residency requirements
Better align with organizational or industry compliance standards
Use credits from different cloud providers, like AWS and GCP
That's why we built a solution that leverages our state-of-the-art AI inference expertise while providing companies with enhanced control over data localization, regulatory adherence, custom hardware utilization, and existing cloud investments.
How Baseten Self-hosted works
Baseten Self-hosted offers complete control over AI infrastructure and data, while guaranteeing the SLAs, reliability, and scale we specialize in.
Our Self-hosted solution uses Truss, a reliable, fast, and convenient solution for packaging and deploying models. Data and compute live on your cloud, and compute usage can be charged against any credits offered by cloud providers. Model inference inputs and outputs go directly to your compute—they never touch our premises.
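To make the packaging step more concrete, here is a minimal sketch of the kind of model class Truss works with: a plain Python class with a load method that runs once at startup and a predict method that runs per request. The uppercase "model" and the "text" input key are hypothetical stand-ins chosen for illustration, not part of any real deployment.

```python
class Model:
    """Toy model using the load/predict layout that Truss packages."""

    def __init__(self, **kwargs):
        # Configuration arrives as keyword arguments; this sketch ignores it.
        self._model = None

    def load(self):
        # Runs once at startup. A real model would load its weights here.
        # Assumption: a trivial uppercase function stands in for the model.
        self._model = lambda text: text.upper()

    def predict(self, model_input):
        # Runs once per request. The "text" key is a hypothetical input field.
        return {"output": self._model(model_input["text"])}


if __name__ == "__main__":
    model = Model()
    model.load()
    print(model.predict({"text": "hello"}))
```

Because the class is ordinary Python, it can be exercised locally before it is packaged and deployed to your own cloud.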
Baseten Self-hosted enables:
Freedom to optimize your GPU usage
Enhanced control in meeting specific security and compliance needs
Optimal leveraging of existing cloud commitments and resources
With Self-hosted, our engineering team regularly manages application and infrastructure updates, including those to your VPC. We offer zero-downtime deployments and no-disruption updates.
“Inference for custom-built LLMs could be a major headache. Thanks to Baseten, we’re getting cost-effective high-performance model serving without any extra burden on our internal engineering teams.”
Self-hosted use cases
Four of the main use cases for Baseten Self-hosted are:
Meeting specific security needs. Strict data residency requirements, IP protection, requirements from customers, or regulations mandate that inference be run on your cloud.
Customizing hardware. You want to leverage your existing GPU allocation or buy custom hardware to meet specific performance and latency requirements.
Utilizing existing cloud credits. You have existing cloud investments and want to fully consume credits across clouds, such as AWS and GCP.
Compound AI systems. Baseten Self-hosted works with any setup, including compound AI.
Our Self-hosted solution is ideal for enterprises and mid-sized companies with restrictive data residency, compliance, or security requirements, and for all companies looking to fully customize hardware or maximally utilize their existing investments across different cloud platforms.
“The great thing about Baseten is they’re able to get something up and working for you really, really quickly. Seeing a substantial piece of work functioning within a couple of days is such an impressive, promising work rate.”
Baseten Self-hosted vs. Baseten Cloud
With Baseten Cloud, you don’t have to figure out all of your hardware requirements ahead of time. We offer single- and multi-tenant solutions that provide our customers with enhanced security, performance, and reliability, and we have teams dedicated to getting companies up and running quickly. Baseten Cloud is an excellent option for companies and AI-native startups that want:
Rapid setup times
Blazing-fast cold starts
GPU availability and elasticity
First-rate security and compliance
Multi-cloud, multi-region availability
On the other hand, Baseten Self-hosted offers ultimate control and customization. It’s an excellent choice for companies and enterprises that need:
Custom hardware
Specific security measures and isolation
Optimal use of existing resource investments
Optimized horizontal scaling across clouds and regions
Engineering time saved by partnering with industry experts
While both Baseten Cloud and Self-hosted offer solutions for security and compliance, utilizing existing cloud commitments, customizability, and support, they differ in some aspects.
We love to support our customers with highly performant, reliable, and secure infrastructure for AI models. Baseten Self-hosted enables you to do inference in your own cloud, manage stringent security requirements, and fulfill your cloud commitments. If Baseten Self-hosted can help you meet your security and compliance needs, provide necessary control over hardware, or help you leverage your existing resources, get in touch!