Zero Downtime Deploys

Render keeps your applications up during deploys, even when your build breaks.

We also restart your apps automatically if they become unresponsive or start returning errors. We do this using user-defined health checks, described below.

Health Checks

The core primitive behind health checks is a path in your app (say /healthz) that returns a successful HTTP response if your app is healthy, and a failure response if it isn’t.

What you return on your health check path is up to you, but we recommend running quick sanity checks (like a simple database query) and returning an “OK” 200 response or an empty 204 response if the app is healthy.
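
As an illustration, a health check endpoint along these lines could look like the following minimal sketch. It uses only the Python standard library; the SQLite database, the `DB_PATH` value, the helper names, the 503 failure code, and the `PORT` environment variable with its default are all assumptions made for the example, not something Render requires.

```python
import os
import sqlite3
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical database location; ":memory:" stands in for a real database.
DB_PATH = ":memory:"


def database_is_healthy(db_path: str = DB_PATH) -> bool:
    """Run a trivial query; any database error counts as unhealthy."""
    try:
        conn = sqlite3.connect(db_path, timeout=1)
        try:
            conn.execute("SELECT 1").fetchone()
        finally:
            conn.close()
        return True
    except sqlite3.Error:
        return False


def health_status(db_path: str = DB_PATH) -> int:
    """Map the sanity check to an HTTP status: 200 if healthy, 503 otherwise."""
    return 200 if database_is_healthy(db_path) else 503


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path == "/healthz":
            status = health_status()
            body = b"OK" if status == 200 else b"unhealthy"
        else:
            status, body = 404, b"not found"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve() -> None:
    """Start the server; the PORT variable and its default are assumptions."""
    port = int(os.environ.get("PORT", "8000"))
    ThreadingHTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling `serve()` starts the app. Keeping the check to a single cheap query means the endpoint answers quickly, which matters because a slow response can be treated as a timeout.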

A health check is considered successful when the health check path returns a response code between 200 and 399. Any other code (or a timeout) causes it to fail.
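
The success rule above can be sketched as a small probe you might run against your own app before deploying. This is not Render's implementation: the function names and the 5-second timeout are assumptions for illustration, since the timeout Render applies is not stated here.

```python
import http.client


def is_healthy_status(code: int) -> bool:
    """A response code from 200 through 399 counts as a passing check."""
    return 200 <= code <= 399


def probe(host: str, port: int, path: str = "/healthz", timeout: float = 5.0) -> bool:
    """Send one GET request and classify the result.

    http.client does not follow redirects, so a 3xx response is judged
    by its own status code. Timeouts and connection errors count as failures.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return is_healthy_status(conn.getresponse().status)
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

For example, `probe("localhost", 8000)` returns `True` only if something is listening on that port and answers `/healthz` with a code in the passing range.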

How Render Uses Health Check Paths

Defining health check paths is optional, but if you define a health check path for your app (and you really should), here’s how we use it for zero-downtime deploys:

  • When your app is first deployed, we issue a GET request to your health check path after running your start command.

    Your deploy is marked live if the health check passes, which means it returns an HTTP response code between 200 and 399. Any other response code (or a timeout) results in a deploy failure.

  • When a new version of your app is deployed, we keep the existing version up and continue to send user requests to it. At the same time, we bring up a new instance of your app with the new version. We then issue health check requests on the new instance, and depending on the response, do one of two things:

    • If the response is unsuccessful (say a 403 or a 500 error code), we consider the new version unhealthy, and mark the deploy failed. Nothing changes as far as your users are concerned, because your existing version is still around and serving requests. You can now start figuring out why the health check failed by looking at the logs.
    • If the response is successful (say a 200 response code), we mark the deploy live and start directing user requests to the shiny new version of your app. We then terminate the old version by sending it a SIGTERM signal. Most web servers automatically intercept SIGTERM and shut down gracefully. The old version has a grace period of 30 seconds to shut everything down. If it is still up after 30 seconds, it is shut down via a SIGKILL signal.

    This is how you get zero-downtime deploys without needing to maintain two versions of your app at all times.

  • After your app is live, we continue to monitor its health by running a health check every few seconds. If your app starts failing the health check, we restart it automatically. As with most software, this tends to fix the issue, but if it doesn’t, we mark the deploy as failed and send you a notification so you can fix things.
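
If your server does not already handle SIGTERM, the shutdown step described above can be handled along the lines of the sketch below. It is one possible approach using the Python standard library; the `ShutdownSignal` class, the `run` helper, and the 5 seconds of headroom are hypothetical choices for this example. Only the 30-second grace period comes from the text above.

```python
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# From the text above: SIGKILL follows if the app is still up after 30 seconds.
GRACE_PERIOD_SECONDS = 30


class ShutdownSignal:
    """Records that SIGTERM arrived so the main thread can stop cleanly."""

    def __init__(self) -> None:
        self.requested = threading.Event()

    def install(self) -> None:
        # Python only allows signal handlers to be set from the main thread.
        signal.signal(signal.SIGTERM, self._handle)

    def _handle(self, signum, frame) -> None:
        # Keep the handler tiny: just set a flag for the main thread.
        self.requested.set()


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        body = b"OK"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(server: ThreadingHTTPServer, stop: ShutdownSignal) -> None:
    """Serve in a background thread until SIGTERM is recorded, then stop."""
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    stop.requested.wait()   # block until the SIGTERM flag is set
    server.shutdown()       # make serve_forever() return
    server.server_close()   # release the listening socket
    # Hypothetical headroom: finish well before the grace period ends.
    worker.join(timeout=GRACE_PERIOD_SECONDS - 5)
```

A typical entry point would create a `ShutdownSignal`, call `install()`, and then pass it to `run(...)` together with a `ThreadingHTTPServer`. Any cleanup your app needs (closing database connections, flushing queues) belongs after `server.shutdown()` and should fit comfortably inside the grace period.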

This is how we achieve maximum uptime for your app, with minimal effort on your part. Right now, health checks are available for all web services (including Docker deployments). If you need to use health checks for other service types, please reach out to us in our community.