Scaling behavior is configured in the replica_policy section of your service configuration YAML file. Below is a detailed explanation of each parameter:
- min_replicas (required). min_replicas can be set to 0 if you want your service to scale to zero. Example: min_replicas: 1
- max_replicas (required). Example: max_replicas: 3
- target_qps_per_replica (optional). Example: target_qps_per_replica: 5
- upscale_delay_seconds (optional). Example: upscale_delay_seconds: 300
- downscale_delay_seconds (optional). Example: downscale_delay_seconds: 1200
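Put together, the per-parameter examples above could be combined into a single fragment like the following. This is a minimal sketch: it assumes the five parameters sit directly under a replica_policy key, and it does not show where replica_policy itself nests in the service file, since that depends on your platform's schema.

```yaml
# Sketch only: assumes these keys sit directly under replica_policy.
replica_policy:
  min_replicas: 1                # required; 0 lets the service scale to zero
  max_replicas: 3                # required
  target_qps_per_replica: 5      # optional
  upscale_delay_seconds: 300     # optional
  downscale_delay_seconds: 1200  # optional
```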
Example settings:
- min_replicas: 1
- max_replicas: 10
- target_qps_per_replica: 5
- upscale_delay_seconds: 300
- downscale_delay_seconds: 1200
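Written as a YAML fragment, the example settings above might look like this, again assuming the keys sit directly under replica_policy. If the platform sizes the service as roughly the observed QPS divided by target_qps_per_replica (an assumption; check your platform's autoscaler documentation), these values would let it absorb about 10 × 5 = 50 QPS before reaching the max_replicas ceiling.

```yaml
# Sketch only: key placement and the sizing rule in the comments are assumptions.
replica_policy:
  min_replicas: 1                # baseline capacity that stays up
  max_replicas: 10               # upper bound on scale-up
  target_qps_per_replica: 5      # at 10 replicas: roughly 10 * 5 = 50 QPS
  upscale_delay_seconds: 300
  downscale_delay_seconds: 1200
```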
Scaling up follows the target_qps_per_replica and max_replicas settings. After the downscale_delay_seconds period, the platform will gradually reduce the number of replicas down to zero, if appropriate.

Make sure min_replicas is set to a value that covers your base workload. Use max_replicas to prevent overprovisioning. Tune target_qps_per_replica, upscale_delay_seconds, and downscale_delay_seconds based on actual usage patterns.

By configuring replica_policy correctly, you can ensure your services are both responsive to demand and cost-effective.
Feel free to reach out if you have any questions or need further assistance!