Two patterns get people in trouble most often: persistent disks (because they look like normal storage but aren’t) and scaling (because the wrong setting can be invisibly expensive). Worth knowing both - and worth knowing they don’t combine.
Disks: the basics
```yaml
services:
  - type: web
    name: cms
    runtime: node
    plan: starter
    startCommand: npm start
    disk:
      name: uploads
      mountPath: /var/data/uploads
      sizeGB: 10
```

Three fields:
| Field | Notes |
|---|---|
| `name` | Logical disk identifier (Render Dashboard label). Stable across redeploys. |
| `mountPath` | Where the disk appears on the instance filesystem. Use a path your app already writes to. |
| `sizeGB` | Disk size in GB. Render's larger plans support bigger disks; check the pricing page for current limits. |
Render mounts an SSD volume at mountPath. Whatever your app writes there persists across deploys, restarts, and instance moves. Whatever it writes outside mountPath is on the instance’s ephemeral filesystem and gets blown away on the next deploy.
```mermaid
flowchart LR
    app["Your app"]
    mount["/var/data/uploads<br/>(disk)"]
    ephemeral["Everything else<br/>(ephemeral)"]
    app -->|"persists across deploys"| mount
    app -->|"gone on next deploy"| ephemeral
```
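The persistent/ephemeral split is easy to get wrong when paths are built from user input or relative segments. A minimal sketch of a guard, written in Python for brevity (the same check applies in any runtime): the helper name `is_persistent` is hypothetical, and the mount path is assumed to match the `mountPath` from the Blueprint above.

```python
import posixpath
from pathlib import PurePosixPath

# Assumption: same value as `mountPath` in the Blueprint above.
DISK_MOUNT = "/var/data/uploads"


def is_persistent(path, mount=DISK_MOUNT):
    """Return True if `path` lands inside the disk mount after normalization.

    Pure string handling, no filesystem access. Relative paths are treated
    as non-persistent because they depend on the working directory.
    """
    normalized = PurePosixPath(posixpath.normpath(path))  # collapses ".." segments
    mount_path = PurePosixPath(mount)
    return normalized == mount_path or mount_path in normalized.parents


print(is_persistent("/var/data/uploads/img/a.png"))       # inside the mount
print(is_persistent("/var/data/uploads/../cache/a.png"))  # escapes the mount
```

Calling a check like this before every write turns "the file vanished after a deploy" into an error you see immediately.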
When you actually want a disk
| Scenario | Disk? |
|---|---|
| Storing CMS-uploaded images on a single-instance app | Yes |
| SQLite as a primary store for a low-traffic app | Yes |
| Caching CDN downloads to skip re-fetches across deploys | Yes |
| Storing user-uploaded files on a multi-instance API | No - use object storage (S3/R2) |
| Holding session data | No - use Key Value or your DB |
| Backing your primary Postgres | No - use Render Postgres |
The split is about scaling. Disks are excellent when one instance owns the data; awful when multiple instances need to share it. The next section is why.
The big constraint: disks pin you to one instance
This trips up teams every time. The mental model:
```mermaid
flowchart LR
    subgraph withDisk [With disk attached]
        direction TB
        d1[Instance #1]
        d2["disk (SSD)"]
        d1 --> d2
        note1["1 instance only<br/>downtime on deploy"]
    end
    subgraph withoutDisk [No disk]
        direction TB
        n1[Instance #1]
        n2[Instance #2]
        n3[Instance #3]
        note2["N instances<br/>zero-downtime deploys"]
    end
```
A disk is a lease on a single attached volume. You can’t autoscale a service with a disk, and you can’t run two replicas. If you need both - disk-backed and horizontally scaled - the right answer is to move the data to a service that’s designed for it: Postgres for relational data, object storage for blobs, Key Value for caches.
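This constraint is simple enough to check before you push. The sketch below is a hypothetical lint, not something Render ships: it assumes each service has already been parsed from `render.yaml` into a plain dict, and the function name `disk_scaling_conflicts` is made up for illustration.

```python
def disk_scaling_conflicts(service):
    """Return a list of problems for a service dict that combines a disk with scaling."""
    problems = []
    if "disk" not in service:
        return problems  # no disk, so any instance count is fine

    name = service.get("name", "<unnamed>")
    if "scaling" in service:
        problems.append(f"{name}: a disk cannot be combined with autoscaling")
    if service.get("numInstances", 1) > 1:
        problems.append(f"{name}: a disk cannot be combined with numInstances > 1")
    return problems


cms = {
    "name": "cms",
    "disk": {"name": "uploads", "mountPath": "/var/data/uploads", "sizeGB": 10},
    "scaling": {"minInstances": 2, "maxInstances": 10},
}
for problem in disk_scaling_conflicts(cms):
    print(problem)
```

Running a check like this in CI catches the conflict at review time rather than at deploy time.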
Scaling: two knobs, one decision
Render gives you two ways to set instance count:
```yaml
# Option 1: fixed instance count
services:
  - type: web
    name: api
    runtime: node
    plan: starter
    numInstances: 3
```

```yaml
# Option 2: autoscaling
services:
  - type: web
    name: api
    runtime: node
    plan: starter
    scaling:
      minInstances: 2
      maxInstances: 10
      targetCPUPercent: 70
      targetMemoryPercent: 75
```

Pick one or the other.
| Field | Lives under | Use when |
|---|---|---|
| `numInstances` | service root | You want a predictable cost and steady traffic |
| `scaling.minInstances` | `scaling:` | Floor for autoscale; what previews and idle traffic run at |
| `scaling.maxInstances` | `scaling:` | Ceiling for autoscale; cost cap |
| `scaling.targetCPUPercent` | `scaling:` | Add an instance when CPU exceeds this for a few minutes |
| `scaling.targetMemoryPercent` | `scaling:` | Add an instance when memory exceeds this for a few minutes |
Either or both target percentages work - Render scales when any target is breached. A common starting point: minInstances: 2, maxInstances: 10, targetCPUPercent: 70. Two instances handle a single-instance failure, and the CPU target leaves headroom for spikes.
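The "any target breached" rule can be modelled in a few lines. This is a toy model, not Render's actual autoscaler: the function name `desired_instances` is hypothetical, it only ever adds one instance, and it ignores both the sustained-load window and scale-down.

```python
def desired_instances(current, cpu_percent, memory_percent, scaling):
    """Toy model: add one instance if any configured target is exceeded,
    then clamp the result to the [minInstances, maxInstances] range."""
    targets = [
        (cpu_percent, scaling.get("targetCPUPercent")),
        (memory_percent, scaling.get("targetMemoryPercent")),
    ]
    # A target that is not configured (None) can never be breached.
    breached = any(
        target is not None and usage > target for usage, target in targets
    )
    wanted = current + 1 if breached else current
    return max(scaling["minInstances"], min(wanted, scaling["maxInstances"]))


scaling = {"minInstances": 2, "maxInstances": 10, "targetCPUPercent": 70}
print(desired_instances(2, cpu_percent=85, memory_percent=40, scaling=scaling))
print(desired_instances(2, cpu_percent=50, memory_percent=95, scaling=scaling))
```

With only `targetCPUPercent` set, the second call does not scale even at 95% memory, which is worth remembering if your service is memory-bound.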
When numInstances makes sense
Autoscaling sounds like the right answer for everything. It isn’t. Reach for numInstances when:
- Traffic is steady. A backoffice tool serving 50 employees doesn’t need to scale.
- Cold-start times are painful. Adding an instance during a spike helps only if the new instance is ready before the spike ends. For a 30-second startup, autoscaling lags real traffic.
- Cost predictability matters more than peak performance. A flat 3 instances bills the same every month, no matter the load.
- The service has a disk (covered above - autoscaling isn’t an option here).
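The cold-start point is easier to judge with numbers. A back-of-the-envelope sketch, assuming a new instance becomes useful only after a detection window plus the app's startup time; the 120-second detection window is an illustrative stand-in for "a few minutes", not a documented Render value, and `useful_seconds` is a hypothetical helper.

```python
def useful_seconds(spike_seconds, detection_seconds, startup_seconds):
    """Seconds of a traffic spike that a newly added instance actually serves."""
    ready_at = detection_seconds + startup_seconds
    return max(0, spike_seconds - ready_at)


# Assumed values: 120 s detection window, 30 s startup.
print(useful_seconds(spike_seconds=120, detection_seconds=120, startup_seconds=30))
print(useful_seconds(spike_seconds=600, detection_seconds=120, startup_seconds=30))
```

If most of your spikes are shorter than detection plus startup, the extra instance arrives after the load has passed, and a fixed `numInstances` sized for the peak is the more honest configuration.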
A worked decision tree
```mermaid
flowchart TB
    start["What does this service do?"]
    uploads{"Stores files<br/>between requests?"}
    obj["Move blobs to S3/R2<br/>autoscale freely"]
    disk{"Multiple instances<br/>need the same data?"}
    diskYes["Pick a different store<br/>(Postgres, KV, object storage)"]
    diskNo["Disk: yes<br/>numInstances: 1<br/>brief deploy downtime"]
    steady{"Traffic is<br/>steady?"}
    manual["numInstances: N"]
    scale["scaling.<br/>min/max + targets"]
    start --> uploads
    uploads -->|"No"| steady
    uploads -->|"Yes"| disk
    disk -->|"Yes"| diskYes
    disk -->|"No"| diskNo
    steady -->|"Yes"| manual
    steady -->|"No"| scale
    diskYes --> steady
    obj --> steady
```
The most common branch for a green-field web service: no disk, traffic varies, autoscale with sensible min/max. The most common branch for a CMS or a SQLite-backed tool: disk attached, single instance, accept the deploy gap.
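The same tree can be written as a small function, which is handy if you review many services at once. This is a sketch of the decision logic only: the function name `recommend` and its return strings are hypothetical labels, not Blueprint syntax.

```python
def recommend(stores_files, shared_across_instances, steady_traffic):
    """Walk the decision tree and return (storage, instance_setting)."""
    if stores_files and not shared_across_instances:
        # One instance owns the data: attach a disk, accept brief deploy downtime.
        return ("disk", "numInstances: 1")

    # Either nothing is stored, or the data moves to Postgres, Key Value,
    # or object storage. In both cases the instance count is free to vary.
    storage = "external store" if stores_files else "none"
    instances = "numInstances: N" if steady_traffic else "scaling: min/max + targets"
    return (storage, instances)


# Green-field web service: no files, variable traffic.
print(recommend(stores_files=False, shared_across_instances=False, steady_traffic=False))
# CMS or SQLite-backed tool: files owned by a single instance.
print(recommend(stores_files=True, shared_across_instances=False, steady_traffic=True))
```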
Snapshots and restore
Render takes daily snapshots of attached disks. If something goes wrong - a bad migration, a corrupted file - you can restore from the Render Dashboard. You don’t write this in render.yaml; it’s a Render Dashboard-only operation.
The Blueprint surface for disks is just the three fields above. Snapshots, restore points, and disk resizing are all operational concerns handled outside the YAML. Plan the disk size carefully up front because resizing is offline (a brief restart) and shrinking isn’t supported.
What you learned
- `disk` attaches a persistent SSD volume at `mountPath`; everything else is ephemeral
- A disk pins the service to a single instance and forces brief deploy downtime - no autoscaling, no zero-downtime deploys
- Use disks for single-instance ownership (CMS uploads, SQLite, download caches). For shared writable state, use Postgres, Key Value, or object storage
- Pick `numInstances` for steady traffic and predictable cost; pick `scaling.min/max + targetCPUPercent/targetMemoryPercent` when traffic varies
- Autoscaling is disabled in preview environments - previews run at `scaling.minInstances`