New request metrics
We've introduced three new request metrics to enhance model monitoring. You can now view percentiles and averages for the following:
Request size: Tracks the distribution of request sizes, serving as a proxy for input tokens.
Response size: Monitors the distribution of response sizes, acting as a proxy for generated tokens.
Time to first byte: Measures the time to first byte (TTFB), including any queuing and routing time, which gives insight into overall latency.
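
To illustrate why percentiles are shown alongside averages, here is a minimal sketch that summarizes a handful of samples using Python's standard library. The sample values and the `summarize` helper are hypothetical and chosen only for illustration; this is not how the platform computes its metrics.

```python
import statistics


def summarize(samples):
    """Return the average and the p50/p90/p99 of a list of numeric samples."""
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {
        "avg": statistics.fmean(samples),
        "p50": cuts[49],
        "p90": cuts[89],
        "p99": cuts[98],
    }


# Hypothetical time-to-first-byte samples in milliseconds,
# with two slow outliers (for example, requests that waited in a queue).
ttfb_ms = [120, 135, 128, 410, 131, 126, 980, 129, 133, 127]
summary = summarize(ttfb_ms)
print(summary)
```

In this made-up data set the two slow requests pull the average well above the median, which is why a tail percentile such as p99 is often a more useful latency signal than the average alone. The same helper would apply to request or response sizes.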