In this step you’ll turn the dial on two knobs (shard count and instance plan) and write down the wall-clock impact yourself. By the end you’ll have a defensible answer to “how do I make this faster?” for your own ETL.
Regenerate the data at 1M rows
    $ cd scripts
    $ python generate_data.py --rows 1000000
    Wrote sample_data/crm.csv (1000000 rows)
    Wrote sample_data/billing.csv (1000000 rows)
    Wrote sample_data/product.csv (1000000 rows)
    Wrote sample_data/support.csv (1000000 rows)
Your deployed Workflow service only sees files that are in the repo or mounted into the service. For this tutorial, commit the regenerated sample_data/ directory and push it before the benchmark. For a real ETL, put the dataset in object storage or a database and change load_csv(filename) to read from that durable source. Do not assume a local sample_data/ directory on your laptop exists on Render.
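As a rough illustration of that change, here is a minimal sketch of a load_csv that reads from a durable HTTP(S) location when one is configured and falls back to the local directory otherwise. The DATA_BASE_URL environment variable, the parse_csv helper, and the list-of-dicts return type are assumptions made for this example; the tutorial's real load_csv may look different.

```python
import csv
import io
import os
import urllib.request

# Assumption: DATA_BASE_URL is a hypothetical environment variable holding the
# base URL of the durable source (for example, an object-storage bucket).
DATA_BASE_URL = os.environ.get("DATA_BASE_URL", "")


def parse_csv(text: str) -> list[dict]:
    """Parse CSV text with a header row into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def load_csv(filename: str, base_url: str = DATA_BASE_URL) -> list[dict]:
    """Load one source file from the durable location, or from disk if none is set."""
    if base_url:
        url = f"{base_url.rstrip('/')}/{filename}"
        with urllib.request.urlopen(url, timeout=30) as response:
            return parse_csv(response.read().decode("utf-8"))
    # Local fallback for laptop runs; assumes the script runs next to sample_data/.
    local_path = os.path.join("sample_data", filename)
    with open(local_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

This version reads each file fully into memory, which is fine for a sketch. For larger datasets you would likely stream rows or have each shard fetch only its own slice.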
Baseline run
Run the default config first: 10 shards on the standard plan. Trigger the same script, record the wall-clock time, and keep the run ID so you can compare logs after the scale-up.
    $ python trigger.py --report-elapsed
    Run started: <run-id>
    Merged 4000000 source rows into 1000000 enriched profiles in 32.6s
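If you would rather capture the elapsed time programmatically than copy it by hand, a small parser along these lines could work. It assumes the summary line keeps exactly the format shown in the output above; parse_summary and SUMMARY_RE are hypothetical names for this sketch and are not part of trigger.py.

```python
import re

# Matches the summary line printed at the end of a run, as shown in this tutorial.
SUMMARY_RE = re.compile(
    r"Merged (\d+) source rows into (\d+) enriched profiles in ([\d.]+)s"
)


def parse_summary(line: str) -> dict:
    """Extract row counts and elapsed seconds from a run summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"no run summary found in: {line!r}")
    return {
        "source_rows": int(match.group(1)),
        "profiles": int(match.group(2)),
        "elapsed_s": float(match.group(3)),
    }
```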
Bump shard count and instance plan
    - NUM_SHARDS = 10
    - @app.task
      async def merge_customer_data() -> dict:
          ...

    + NUM_SHARDS = 25
    + @app.task(plan="pro")
      async def merge_customer_data() -> dict:
    +     ...
    +
    + @app.task(
    +     retry=Retry(max_retries=3, wait_duration_ms=2000),
    +     plan="pro",
    + )
    + def process_shard(shard_id: int) -> dict:
          ...
    - const NUM_SHARDS = 10;
      const mergeCustomerData = task(
    -   { name: "merge_customer_data" },
        async function mergeCustomerData() {
          // ...
        });

    + const NUM_SHARDS = 25;
      const mergeCustomerData = task(
    +   { name: "merge_customer_data", plan: "pro" },
        async function mergeCustomerData() {
          // ...
        }
    + );
    +
    + const processShard = task(
    +   {
    +     name: "process_shard",
    +     retry: { maxRetries: 3, waitDurationMs: 2000 },
    +     plan: "pro",
    +   },
    +   function processShard(shardId: number) { /* ... */ });
    $ git add -A && git commit -m 'scale up' && git push
    $ # wait for the deploy, then:
    $ python trigger.py --report-elapsed
    Run started: <run-id>
    Merged 4000000 source rows into 1000000 enriched profiles in 11.4s
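To see why a higher shard count helps, it can be useful to look at how much work each shard receives. The sketch below assumes shards are contiguous row ranges of near-equal size. That is an assumption for illustration only, since the tutorial's process_shard may partition the data differently (for example, by hashing a customer id). shard_bounds is a hypothetical helper.

```python
def shard_bounds(total_rows: int, num_shards: int) -> list[tuple[int, int]]:
    """Split total_rows into num_shards contiguous [start, end) ranges.

    Sizes differ by at most one row; any remainder goes to the earliest shards.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    base, extra = divmod(total_rows, num_shards)
    bounds = []
    start = 0
    for shard_id in range(num_shards):
        size = base + (1 if shard_id < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds
```

Under this assumption, 1,000,000 rows over 10 shards gives 100,000 rows per shard, while 25 shards gives 40,000 rows per shard, so each task finishes sooner as long as enough instances run in parallel.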
Fill in your own numbers
Record both runs in the table below so you have something to point at the next time someone asks “is this worth scaling?”
| Config | Records | Wall-clock | Cost/run (rough) |
|---|---|---|---|
| 10 shards, standard (baseline) | 1M | your time | your number |
| 25 shards, pro (scaled) | 1M | your time | your number |
| Speedup | n/a | ratio | n/a |
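The Speedup and Cost/run cells can be filled in with a couple of one-line calculations. The sketch below is a rough aid, not an official pricing formula: rough_cost assumes cost is roughly proportional to instance-seconds and that every shard runs for the full wall-clock duration, which overestimates real usage. The price_per_instance_second argument is a placeholder to fill in from the pricing of the plan you chose.

```python
def speedup(baseline_s: float, scaled_s: float) -> float:
    """Ratio of baseline wall-clock time to scaled wall-clock time."""
    if scaled_s <= 0:
        raise ValueError("scaled_s must be positive")
    return baseline_s / scaled_s


def rough_cost(num_shards: int, wall_clock_s: float,
               price_per_instance_second: float) -> float:
    """Upper-bound cost estimate: every shard billed for the whole run."""
    return num_shards * wall_clock_s * price_per_instance_second
```

With the sample timings shown earlier in this step, speedup(32.6, 11.4) comes out to roughly 2.86. Substitute your own measurements before drawing conclusions.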
What’s next
If you want another production-shaped Workflow, try Extend SF Pulse with Render Workflows. It adds an LLM pipeline and a real Render Postgres database. If you want to go back to the SDK basics, use the Render Workflows quickstart. Keep Workflows limits nearby as you push shard count, payload size, and run duration further.
What you learned
- Shard count scales parallelism; instance plan scales per-shard throughput
- Past a point you stop being shard-bound. Profile to find the next bottleneck
- Always pair a scale-up with the chaos drill from step 6. Bigger fan-out means more chances to hit a flake
- Your own before/after numbers are the most credible benchmark you'll have
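One way to reason about the point where you stop being shard-bound is Amdahl's law: if a fraction of the run is serial (loading, merging, writing results), adding shards only speeds up the rest. The sketch below is a back-of-the-envelope model, not a measurement of this pipeline. The serial fraction is something you would estimate from your own profiling, and both function names are hypothetical.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Theoretical speedup over a single worker under Amdahl's law."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


def relative_gain(serial_fraction: float, old_workers: int, new_workers: int) -> float:
    """Expected speedup when moving from old_workers to new_workers shards."""
    return (amdahl_speedup(serial_fraction, new_workers)
            / amdahl_speedup(serial_fraction, old_workers))
```

With no serial work, going from 10 to 25 shards would give the full 2.5x gain. As the serial fraction grows, relative_gain moves toward 1, which is the signal to profile the non-sharded parts of the run instead of adding more shards.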