A checklist for switching to open source ML models

Prompt: Luggage on a trolley in a historic train station

Switching from a closed source ecosystem where you consume ML models from API endpoints to the world of open source ML models can seem intimidating. But this checklist will give you all of the resources you need to make the leap.

Checklist for switching to open source ML models

Pick an open source model

The biggest advantage of the open source ecosystem in ML is the sheer number and variety of models to choose from. But that amount of choice can be overwhelming. Here are some alternatives to closed-source models to get you started:

Large language models (LLMs):
- Closed source: GPT, Claude
- Open source: Llama, DeepSeek
Text embedding models:
- Closed source: OpenAI text-embedding-3
- Open source: BAAI text embedding models
Speech to text (audio transcription) models:
- Closed source: Whisper from the Audio API
- Open source: Whisper on your own infra
Text to speech (audio generation) models:
- Closed source: Audio API text to speech endpoint
- Open source: Orpheus TTS

Choose a GPU for model inference

Inference for most generative models like LLMs requires GPUs. Picking the right GPU is essential: you want the least expensive GPU powerful enough to run the model with acceptable performance.

For a 70 billion parameter LLM like Llama 3.3 70B, you need 2-4 H100 GPUs, but for the largest LLMs like DeepSeek-R1, you'll need H200 GPUs or multi-node inference. Partial H100 GPUs via multi-instance GPU (MIG) or smaller, cheaper L4 GPUs give great performance for smaller models like Whisper and embedding models.

Here are some buyer’s guides to GPUs:

Find optimizations relevant to your use case

If you’re just experimenting with open source models or you need to get something in production yesterday, you can skip this step. But one of the most powerful things that switching to open source models unlocks is the ability to optimize a balance of latency, throughput, quality, and cost to align with your use case.

Get started with:

Deploy your model

Once you have your model and hardware configuration, it’s time to deploy. You can deploy a curated selection of models from our model library in just a couple of clicks or use Truss, our open source model packaging framework, to get any model up and running behind an API endpoint.

Dive into deployment with:

Open source models in the Baseten model library.
A quickstart guide for Truss, an open source model packaging framework.

Integrate your new model endpoint

Once you’ve deployed your model, you’ll need to use the model endpoint to integrate your model into your application.

Baseten has guides for:

If you want to dive deeper, check out our guide to open source alternatives for ML models. Wherever you are in your journey from evaluation to adoption for open source ML models, we’re here to help at support@baseten.co.

Subscribe to our newsletter

Stay up to date on model performance, GPUs, and more.

‌

A checklist for switching to open source ML models

Pick an open source model

Choose a GPU for model inference

Find optimizations relevant to your use case

Deploy your model

Integrate your new model endpoint

Subscribe to our newsletter

Related ML models posts

The best open-source embedding models

Private, secure DeepSeek-R1 in production in US & EU data centers

The best open-source image generation model