Pricing
The fastest, most scalable model inference in our cloud or yours
Basic
Included in Basic:
Best-in-class autoscaling
Fast cold starts
Up to 5 replicas per model
Model observability & logging
SOC 2 Type II & HIPAA compliant
Email & in-app chat support
Pro
Everything in Basic plus:
Unlimited number of replicas per model
Volume discounts on compute
Bring your whole team
Priority access to high-demand infrastructure, including A100s and H100s
Dedicated support on Slack and Zoom
Enterprise
Everything in Pro plus:
Dedicated SLAs
Advanced security and compliance
Custom model management
Early roadmap previews
Custom regions
The best hardware on the market
Only pay for the compute you use, down to the minute
Best-in-class model performance, effortless autoscaling, and blazing-fast cold starts mean you get the most out of each GPU, saving money along the way. A worked cost example follows the instance table below.
Select an instance type
| Instance type | Specs | Price per minute |
| --- | --- | --- |
| T4x4x16 | 16 GiB VRAM, 4 vCPUs, 16 GiB RAM | $0.01052 |
| L4x4x16 | 24 GiB VRAM, 4 vCPUs, 16 GiB RAM | $0.01414 |
| A10Gx4x16 | 24 GiB VRAM, 4 vCPUs, 16 GiB RAM | $0.02012 |
| A100x12x144 | 80 GiB VRAM, 12 vCPUs, 144 GiB RAM | $0.10240 |
| H100x26x234 | 80 GiB VRAM, 26 vCPUs, 234 GiB RAM | $0.16640 |
| H100MIG:3x13x117 | 40 GiB VRAM, 13 vCPUs, 117 GiB RAM | $0.08250 |
| 1x2 | 1 vCPU, 2 GiB RAM | $0.00058 |
| 1x4 | 1 vCPU, 4 GiB RAM | $0.00086 |
| 2x8 | 2 vCPUs, 8 GiB RAM | $0.00173 |
| 4x16 | 4 vCPUs, 16 GiB RAM | $0.00346 |
| 8x32 | 8 vCPUs, 32 GiB RAM | $0.00691 |
| 16x64 | 16 vCPUs, 64 GiB RAM | $0.01382 |
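As an illustration of per-minute billing using the rates above (a rough sketch with made-up traffic numbers, not a quote), here is how a monthly bill might be estimated:

```python
# Rough cost estimate for an A10Gx4x16 instance at the per-minute rate listed above.
# The traffic profile is a hypothetical example; actual usage and autoscaling will vary.

RATE_PER_MINUTE = 0.02012          # A10Gx4x16, USD per minute (from the table above)

active_minutes_per_day = 6 * 60    # replica active ~6 hours/day; scaled to zero the rest
days_per_month = 30

monthly_cost = RATE_PER_MINUTE * active_minutes_per_day * days_per_month
print(f"~${monthly_cost:,.2f} per month")   # ~$217.30
```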
Commonly asked questions
- You can deploy open source and custom models on Baseten. Start with an off-the-shelf model from our model library, or deploy any model using Truss, our open source standard for packaging and serving models built in any framework (see the sketch after this list).
- You have control over what GPUs your models use. See our instance type reference for a full list of the GPUs currently available on Baseten. Reach out to us to request additional GPU types.
- Yes, new Baseten accounts come with $30 of free credit so that you can start running models for free.
- Yes, Baseten is SOC 2 Type II certified and HIPAA compliant. You can read more about our SOC 2 Type II certification here. And you can read more about our HIPAA compliance here.
- No, you do not pay for idle time – you only pay for the time your model is using compute on Baseten. This includes the time your model is actively deploying, scaling up or down, or making predictions. And you have full control over how your model scales up or down.
- Customer support levels vary by plan. We offer email, in-app chat, Slack, and Zoom support. We also offer dedicated forward-deployed engineering support. Talk to our Sales team to figure out a customer support level that works for your needs.
- Yes, discounts on compute can be negotiated as part of our Pro plan. Talk to our Sales team to learn more.
- Yes, you can self-host Baseten in order to manage security and use your own cloud commitments. Talk to our Sales team to learn more.
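To make the Truss answer above concrete, here is a minimal sketch of what packaging a model looks like. It is based on the open source Truss project; the example model, its dependency, and the config keys shown in comments are illustrative assumptions, so check the Truss docs for the exact format your version expects.

```python
# model/model.py in a Truss scaffold (created with `truss init`, deployed with `truss push`).
# The instance type is selected in config.yaml, for example:
#   resources:
#     accelerator: A10G   # matches one of the GPU instance types in the pricing table
#     use_gpu: true
# (Keys and accepted values are assumptions here; consult the Truss documentation.)

from transformers import pipeline  # example dependency, declared in config.yaml requirements


class Model:
    def __init__(self, **kwargs):
        self._pipeline = None

    def load(self):
        # Runs once when a replica starts up (the cold start); load weights here.
        self._pipeline = pipeline("text-classification")

    def predict(self, model_input: dict) -> dict:
        # Called for each request routed to the deployed model.
        result = self._pipeline(model_input["text"])
        return {"predictions": result}
```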
Explore Baseten today
We love partnering with companies building innovative AI products, offering the most customizable model deployment with the lowest latency.