Track both active and starting up replicas
The replica count chart on the model metrics page now shows “active” and “starting up” replicas separately.
An active replica has loaded the model for inference and is actively responding to traffic.
A replica is starting up if it’s been created by the autoscaler to handle additional traffic, but isn’t yet ready to respond to requests.
Once a replica finishes starting up, it becomes active. When an active replica is no longer needed, it deactivates.
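The lifecycle described above can be pictured as a small state machine. The sketch below is only an illustration, not the platform's API: `ReplicaState`, `TRANSITIONS`, `advance`, and `chart_counts` are hypothetical names, and it assumes a replica only moves from starting up to active to inactive.

```python
from collections import Counter
from enum import Enum


class ReplicaState(Enum):
    """Hypothetical states; the names are assumptions for illustration."""
    STARTING_UP = "starting up"  # created by the autoscaler, not yet serving requests
    ACTIVE = "active"            # model loaded, responding to traffic
    INACTIVE = "inactive"        # deactivated because it is no longer needed


# Assumed transitions: starting up -> active -> inactive.
TRANSITIONS = {
    ReplicaState.STARTING_UP: {ReplicaState.ACTIVE},
    ReplicaState.ACTIVE: {ReplicaState.INACTIVE},
    ReplicaState.INACTIVE: set(),
}


def advance(current, target):
    """Return target if the assumed lifecycle permits the move, else raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target


def chart_counts(states):
    """Count the two series shown on the chart: active and starting up."""
    tally = Counter(states)
    return {
        "active": tally[ReplicaState.ACTIVE],
        "starting up": tally[ReplicaState.STARTING_UP],
    }


# Two replicas are serving traffic while a third is still starting up.
replicas = [ReplicaState.ACTIVE, ReplicaState.ACTIVE, ReplicaState.STARTING_UP]
before = chart_counts(replicas)

# The third replica finishes starting up and becomes active.
replicas[2] = advance(replicas[2], ReplicaState.ACTIVE)
after = chart_counts(replicas)

print(before)
print(after)
```

In this sketch, a replica that finishes starting up moves from one series to the other, so the total across both series stays the same while the split changes.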