Lead Developer Advocate

Philip Kiely

About

Philip Kiely is a software developer and author based out of Chicago. Originally from Clive, Iowa, he graduated from Grinnell College with honors in Computer Science. Philip joined Baseten in January 2022 and works across documentation, technical content, and developer experience. Outside of work, he's a lifelong martial artist, voracious reader, and, unfortunately, a Bears fan.

Model performance

How we built high-throughput embedding, reranker, and classifier inference with TensorRT-LLM

Discover how we optimized embedding, reranker, and classifier inference using TensorRT-LLM, doubling throughput and achieving ultra-low latency at scale.

Model performance

How multi-node inference works for massive LLMs like DeepSeek-R1

Running DeepSeek-R1 on H100 GPUs requires multi-node inference to connect the 16 H100s needed to hold the model weights.

GPU guides

Testing Llama 3.3 70B inference performance on NVIDIA GH200 in Lambda Cloud

The NVIDIA GH200 Superchip combines an NVIDIA Hopper GPU with an ARM CPU via a high-bandwidth interconnect.

ML models

Private, secure DeepSeek-R1 in production in US & EU data centers

Dedicated deployments of DeepSeek-R1 and DeepSeek-V3 offer private, secure, high-performance inference that's cheaper than OpenAI.

Model performance

How we built production-ready speculative decoding with TensorRT-LLM

Our TensorRT-LLM Engine Builder now supports speculative decoding, which can improve LLM inference speeds.

Glossary

A quick introduction to speculative decoding

Speculative decoding improves LLM inference latency by using a smaller model to generate draft tokens that the larger target model can accept during inference.

GPU guides

Evaluating NVIDIA H200 Tensor Core GPUs for LLM inference

Are NVIDIA H200 GPUs cost-effective for model inference? We tested an 8xH200 cluster provided by Lambda to discover suitable inference workload profiles.

News

Export your model inference metrics to your favorite observability tool

Export model inference metrics like response time and hardware utilization to observability platforms like Grafana, New Relic, Datadog, and Prometheus.


Machine learning infrastructure that just works

Baseten provides all the infrastructure you need to deploy and serve ML models performantly, scalably, and cost-efficiently.