In this step you’ll deploy the pipeline to Render and run it remotely from the same trigger script. The script you wrote in step 7 keeps working. You change two env vars and that’s it.
Push to GitHub
The 1K-row sample_data/ directory is small (~250 KB) and the deployed Workflow needs to read it, so commit it with the rest of the project:
$ git init && git add -A && git commit -m 'Sharded customer-merge ETL'
$ gh repo create customer-merge --public --source=. --remote=origin --push
Created repository <your-user>/customer-merge
Create the Workflow service
- In the Render Dashboard, click New then Workflow.
- Connect your customer-merge repo.
- Set the Root Directory to workflows and the build/start commands per the table below.
- Set DATA_DIR=../sample_data as a service environment variable. That’s where the deployed code reads the four CSVs from, relative to the workflows root.
- Click Deploy Workflow and wait for the first build to finish.
- Copy the workflow slug from the service page. You’ll need it for the trigger script.
Python:

| Field | Value |
|---|---|
| Name | customer-merge-py |
| Language | Python 3 |
| Root Directory | workflows |
| Build Command | pip install -r requirements.txt |
| Start Command | python main.py |

TypeScript:

| Field | Value |
|---|---|
| Name | customer-merge-ts |
| Language | Node |
| Root Directory | workflows |
| Build Command | npm install && npm run build |
| Start Command | npm start |
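
To make the DATA_DIR setting concrete, here is a minimal sketch of how deployed code might resolve it against the workflows root. The helper names resolve_data_dir and list_csvs are hypothetical and not part of the tutorial's code. Failing when the variable is unset is also an assumption; the project's own loader may fall back to a default instead.

```python
import os
from pathlib import Path


def resolve_data_dir(env=None, cwd=None):
    """Resolve DATA_DIR relative to the service's working directory.

    Hypothetical helper: the tutorial's real loader may differ.
    """
    env = os.environ if env is None else env
    base = Path.cwd() if cwd is None else Path(cwd)
    raw = env.get("DATA_DIR")
    if not raw:
        # Assumption: treat a missing DATA_DIR as a configuration error.
        raise RuntimeError("DATA_DIR is not set")
    # With Root Directory = workflows, "../sample_data" points at the
    # sample_data/ directory that sits next to workflows/ in the repo.
    return (base / raw).resolve()


def list_csvs(data_dir):
    """Return the sorted names of the CSV files found in data_dir."""
    return sorted(p.name for p in Path(data_dir).glob("*.csv"))


if __name__ == "__main__":
    data_dir = resolve_data_dir()
    print(f"Reading CSVs from {data_dir}: {list_csvs(data_dir)}")
```

Running this from the workflows directory with DATA_DIR=../sample_data should list the four CSVs. An empty list usually means the directory was not committed or the relative path is off by one level.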
Trigger the deployed workflow
Flip RENDER_USE_LOCAL_DEV off, set RENDER_API_KEY and the workflow slug, and run the same trigger script:
Python:

$ export RENDER_API_KEY=<your-render-api-key>
$ export WORKFLOW_SLUG=customer-merge-py/merge_customer_data
$ unset RENDER_USE_LOCAL_DEV && python trigger.py
Generated 1000 profiles across 10 shards
Avg health score: 52.7
Churn distribution: {'LOW': 412, 'MEDIUM': 487, 'HIGH': 101}
Sample profile keys: [...]
OK

TypeScript:

$ export RENDER_API_KEY=<your-render-api-key>
$ export WORKFLOW_SLUG=customer-merge-ts/merge_customer_data
$ unset RENDER_USE_LOCAL_DEV && npx tsx trigger.ts
Generated 1000 profiles across 10 shards
Avg health score: 52.7
Churn distribution: {"LOW":412,"MEDIUM":487,"HIGH":101}
Sample profile keys: ...
OK
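
If the remote run fails immediately, a small preflight check on the environment can narrow things down. The sketch below is an illustration only and not part of the trigger script from step 7. The function name load_trigger_config is hypothetical, and it assumes that any non-empty RENDER_USE_LOCAL_DEV value means local mode and that the slug has the service/task shape shown above. The SDK may interpret these values differently.

```python
import os


def load_trigger_config(env=None):
    """Validate the env vars the trigger script depends on.

    Hypothetical preflight helper; it makes no Render API calls.
    """
    env = os.environ if env is None else env

    # Assumption: any non-empty value switches the script to local dev.
    use_local = bool(env.get("RENDER_USE_LOCAL_DEV", "").strip())

    slug = env.get("WORKFLOW_SLUG", "").strip()
    if "/" not in slug:
        raise ValueError(
            "WORKFLOW_SLUG should look like <service>/<task>, "
            "for example customer-merge-py/merge_customer_data"
        )

    if use_local:
        return {"mode": "local", "slug": slug}

    # Remote runs need an API key; local runs (by assumption) do not.
    if not env.get("RENDER_API_KEY", "").strip():
        raise ValueError("RENDER_API_KEY must be set for remote runs")
    return {"mode": "remote", "slug": slug}


if __name__ == "__main__":
    cfg = load_trigger_config()
    print(f"Triggering {cfg['slug']} in {cfg['mode']} mode")
```

Calling this at the top of the trigger script turns a missing key or a malformed slug into an immediate, readable error rather than a failed remote call.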
Open the Runs tab on your Workflow service in the Render Dashboard. A healthy run shows one parent row for merge_customer_data and ten child rows for process_shard (one per shard), all green. Click into any subtask to see its stdout: the Shard X: Loading CSV files... and Shard X: Generated NNN enriched profiles lines from the task’s print() calls.
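
The parent-and-children shape in the Runs tab mirrors a plain fan-out. The sketch below imitates it locally with a thread pool so you can see why one parent yields ten subtask rows. It is a stand-in, not the Render SDK and not how Render schedules subtasks: only the task names and the two log lines come from this tutorial, while the round-robin split and the placeholder profile fields are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 10


def process_shard(shard_id, customer_ids):
    """Child task stand-in: one call per shard, one row in the Runs tab."""
    print(f"Shard {shard_id}: Loading CSV files...")
    # Placeholder enrichment; the real task merges the four CSVs.
    profiles = [{"customer_id": cid, "shard": shard_id} for cid in customer_ids]
    print(f"Shard {shard_id}: Generated {len(profiles)} enriched profiles")
    return profiles


def merge_customer_data(customer_ids, num_shards=NUM_SHARDS):
    """Parent task stand-in: split the input, fan out, then merge."""
    # Assumption: round-robin split; the tutorial's split may differ.
    shards = [customer_ids[i::num_shards] for i in range(num_shards)]
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        results = list(pool.map(process_shard, range(num_shards), shards))
    return [profile for shard in results for profile in shard]


if __name__ == "__main__":
    merged = merge_customer_data(list(range(1000)))
    print(f"Generated {len(merged)} profiles across {NUM_SHARDS} shards")
```

Each process_shard call corresponds to one child row in the Runs tab, and its two print lines match the stdout you see when you click into a subtask.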
What’s next
You have a working sharded ETL on Render. The next tutorial picks up exactly where this one leaves off and turns it into something you’d run in production.
Part 2, Productionize an ETL pipeline with Render Workflows, adds retries with exponential backoff, idempotency keys so re-runs are safe, structured per-shard logs, a chaos drill that proves recovery, and a benchmarked scale-up to 1M+ rows.
What you learned
- Workflow services are created in the Render Dashboard. Blueprints don’t cover them yet.
- The trigger script you wrote in step 7 works locally and remotely. Only env vars change.
- The Runs tab shows one parent task plus one row per shard subtask. All green is the success signal.
- You now have a working pipeline. Part 2 makes it production-safe.