Changelog

See our latest feature releases, product improvements and bug fixes

Oct 31, 2024

Changes to instance type management

As part of ongoing improvements to Baseten’s infrastructure platform, we’re working on giving you more flexibility in how resources are provisioned for each model deployment. In the interim, we’re...

Oct 15, 2024

Introducing canary deployments for seamless promotions

We're excited to introduce canary deployments on Baseten, designed to phase in new deployments with minimal impact on production latency and uptime. When enabled for a model, Baseten gradually shifts...

Oct 11, 2024

Create custom environments for model release management

Today we’re excited to introduce custom environments to help manage your model’s release cycles. Environments provide a way to ensure quality, stability, and scalability before your model reaches end...

Oct 9, 2024

New request metrics

We've introduced three new request metrics to enhance model monitoring. You can now view percentiles and averages for the following:
- Request size: Tracks the distribution of request sizes, serving...
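As a rough illustration of what "percentiles and averages" means for a metric such as request size, the sketch below summarizes a list of samples using only Python's standard library. The function name `summarize`, the choice of p50/p90/p99, and the sample values are assumptions made for this example; it is not Baseten's implementation, and the percentiles shown in the dashboard may be computed differently.

```python
import statistics


def summarize(values):
    """Return the mean and the p50/p90/p99 of a list of numeric samples.

    Hypothetical helper for illustration only.
    """
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    # method="inclusive" treats the samples as the full population (min..max).
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return {
        "avg": statistics.fmean(values),
        "p50": cuts[49],
        "p90": cuts[89],
        "p99": cuts[98],
    }


# Made-up request sizes in bytes, purely for demonstration.
request_sizes_bytes = [512, 640, 700, 820, 1024, 2048, 4096, 16384]
summary = summarize(request_sizes_bytes)
print(summary)
```

In this sample a single large request pulls the average well above the median, which is why it can help to read percentiles alongside averages.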

Oct 2, 2024

Export model inference metrics to your favorite observability tools

You can now easily export model inference metrics to your favorite observability platforms, including Prometheus, Datadog, Grafana Cloud, and New Relic! Seamlessly sync inference request counts,...

Sep 26, 2024

Introducing Baseten Hybrid

Today we introduced early access to Baseten Hybrid, a multi-cloud solution that enables you to self-host inference with seamless flex capacity on Baseten Cloud. Check out our announcement blog to...

Sep 16, 2024

Deploy vLLM models with our OpenAI Bridge

Our OpenAI Bridge is now compatible with vLLM models out of the box! Deploy your vLLM model with Truss and let the docs guide you to an easy integration using the OpenAI completions SDK.

Sep 15, 2024

Promote Chains to production

As of Truss version 0.9.34, you can promote Chains to a production environment, bringing the same deployment workflow used for Truss Models to Chains. To promote a Chain, use the --promote...

Sep 13, 2024

Seamless remote development with Truss watch

We improved truss watch for more reliable live reloads, so you can test changes on production hardware in seconds without manually pushing via truss push or creating new deployments each time. Adding...

Sep 12, 2024

Structured output and function calling support

Models deployed with the TensorRT-LLM Engine Builder now support function calling (aka tool use) and structured output (aka JSON mode). Learn more: Launch announcement blog post, Engineering deep dive...