Scaling


Applications experience variable load. You can handle increased load by upgrading your service to have more resources or by scaling it to have more instances.

Render supports the following scaling methods:

  • Manual scaling: Scale your service to a specific number of instances.
  • Autoscaling: Automatically scale your service based on a target CPU utilization.

Manual scaling

Manual scaling is the simplest way to scale out your service. Your service will consistently have the desired number of instances. Open the Scaling tab in your service dashboard, enter the desired number of instances in the Manual Scaling section, and click Save Changes. Your service will immediately start to scale.

Manual scaling settings

Autoscaling

With autoscaling, Render scales your service up and down based on average CPU utilization so that you don’t have to estimate and overprovision resources in anticipation of peak traffic.

By default, autoscaling is disabled. You can enable autoscaling by opening the Scaling tab in your service dashboard and toggling the autoscaling switch on. After autoscaling is enabled, you can change the target CPU utilization and set the minimum and maximum number of instances for your service.

Autoscaling settings

How Autoscaling Works

Once autoscaling is enabled, Render periodically compares the current CPU utilization against the target CPU utilization. We calculate the desired number of instances by multiplying the current number of instances by the ratio of current to target CPU utilization, then rounding up.

For example, for a service with 2 instances, if the current CPU utilization is 80% and the target CPU utilization is 60%, Render will scale this service up to 3 instances, since ceil(2 × 80% / 60%) = ceil(2.67) = 3.
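The calculation above can be sketched in a few lines of Python. This is only an illustration of the formula, not Render's implementation: the function name `desired_instances` and its parameters are hypothetical, and clamping the result to the configured minimum and maximum is an assumption about how those bounds are applied.

```python
import math


def desired_instances(current_instances, current_cpu, target_cpu,
                      min_instances=1, max_instances=None):
    """Illustrative sketch of the autoscaling formula (hypothetical helper).

    current_cpu and target_cpu are percentages, e.g. 80 and 60.
    """
    if target_cpu <= 0:
        raise ValueError("target_cpu must be positive")

    # Desired count = ceil(current instances * current CPU / target CPU).
    desired = math.ceil(current_instances * current_cpu / target_cpu)

    # Assumption: the result is kept within the configured bounds.
    desired = max(desired, min_instances)
    if max_instances is not None:
        desired = min(desired, max_instances)
    return desired


# The example from the text: 2 instances at 80% CPU with a 60% target.
print(desired_instances(2, 80, 60))  # 3
```

With the same helper, a service running 3 instances at 30% CPU with a 60% target would come out at 2 instances, since ceil(3 × 30 / 60) = ceil(1.5) = 2.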

We have different time windows for scaling up and down. The scale up window is short, so that services are quickly able to respond to traffic bursts. The scale down window, on the other hand, is much longer to prevent the number of instances from fluctuating up and down too much in a short amount of time.

Monitoring Autoscaling

Render records an autoscaling event whenever the autoscaling configuration changes or the service scales up or down. You can find these events in the Events tab in your service dashboard.

Autoscaling events

You can view the number of instances and the average CPU utilization of those instances over time in the Scaling tab in your service dashboard.

Autoscaling metrics

Billing

Your service is billed per second for the number of instances actually running, multiplied by your plan’s rate. There is no extra cost to enable autoscaling. Here are some examples to clarify.

If you have 2 instances running at all times (manual scaling), you’ll be billed 2x your plan’s rate. If your service is on the Starter plan for the whole month, you’ll be billed $7 × 2 = $14.

Suppose you have a Starter service with autoscaling enabled and, every day, your service scales to 2 instances for 6 hours and back to 1 instance for the remaining 18 hours. The average number of instances is (2 × 6 + 1 × 18) / 24 = 1.25, so you’ll be billed $7 × 1.25 = $8.75 for the month.
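The billing arithmetic in these examples can be reproduced with a short Python sketch. The helper names `average_instances` and `estimate_monthly_bill` are hypothetical, and the sketch assumes the same daily pattern repeats for the whole month. Actual billing is metered per second, so treat this as a rough estimate rather than an invoice calculation.

```python
def average_instances(daily_schedule):
    """Time-weighted average instance count.

    daily_schedule is a list of (instances, hours_per_day) pairs.
    """
    total_hours = sum(hours for _, hours in daily_schedule)
    if total_hours <= 0:
        raise ValueError("schedule must cover a positive number of hours")
    instance_hours = sum(n * hours for n, hours in daily_schedule)
    return instance_hours / total_hours


def estimate_monthly_bill(plan_rate, daily_schedule):
    """Estimated monthly cost: plan rate times the average instance count."""
    return plan_rate * average_instances(daily_schedule)


# Manual scaling: 2 instances all day on a $7/month plan.
print(estimate_monthly_bill(7.0, [(2, 24)]))          # 14.0

# Autoscaling: 2 instances for 6 hours, 1 instance for 18 hours.
print(estimate_monthly_bill(7.0, [(2, 6), (1, 18)]))  # 8.75
```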

You can find the exact number of instance hours for your service for the month on the billing page and in the monthly invoices.

Application Considerations

  • Services with disks can only have a single instance and cannot be manually or automatically scaled. Consider moving persistent state out of services that you want to scale.
  • For scaled web services, a load balancer sits in front of the service instances and distributes incoming request traffic evenly across them.