What you'll build — Batched image generation with Render Workflows

By the end of this tutorial you’ll have a Render Workflow that takes a batch.json file of prompts, generates an image for each prompt with multiple AI image models in parallel, and uploads every finished image to object storage. You’ll trigger it from your laptop and watch the run tree fan out in the Render Dashboard.

The reference repo, render-examples/blog-thumbnails-workflows, ships a workflow that takes one prompt at a time and generates a thumbnail for it with each of M models. That’s the right starting point. The problem for real work is the input shape: one prompt in, several images out. If you want thumbnails for 50 blog posts across 3 models, you’d have to call the workflow 50 times and stitch the results together yourself. This tutorial adds a second dimension, the list of prompts, so a single SDK call runs the whole batch as one fan-out tree.

Before you start

You’ll need:

A Render account with Workflows enabled.
The Render CLI 2.12.0+ for the local dev server and manual task starts.
Python 3.11+ or Node.js 20+, depending on the language tab you pick.
An OpenAI API key, a Google AI API key, or both. Setting both lets you fan out across all three supported models.
Docker for the local MinIO container, or any S3-compatible bucket you already have credentials for.
Familiarity with Render Workflows, or a completed run through the quickstart.

Image generation costs real money. The local runs in this tutorial use 1 to 6 images at a time, which is cents, not dollars.

Why fan-out matters for image generation

Image APIs have wildly different latencies. A Gemini Flash call comes back in 3 to 5 seconds. DALL-E 3 takes 15 to 30 seconds. GPT-Image-1 is somewhere in between. Run them sequentially for one prompt and you wait for the sum of all three calls, not just the slowest one. Run them sequentially for a batch of 50 prompts and you’re tens of minutes in before the last result lands.

Fan-out flips that. Each (prompt, model) pair runs in its own task, on its own instance. The whole batch finishes in roughly the time of the slowest single call, regardless of batch size. That’s the punchline you’ll prove in step 7.
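To make the difference concrete, here is a back-of-the-envelope sketch in Python. The latency values are assumptions picked from within the ranges quoted above (the GPT-Image-1 figure is a guess at "somewhere in between"), and the model ignores task startup overhead and any concurrency limits, so treat the numbers as illustrative rather than measured.

```python
# Assumed per-call latencies in seconds, chosen from the ranges quoted above.
# These are illustrative values, not measurements.
LATENCY_S = {"gemini-flash": 4, "gpt-image-1": 10, "dall-e-3": 25}


def sequential_seconds(num_prompts: int) -> int:
    """Wall-clock time if every (prompt, model) call runs one after another."""
    return num_prompts * sum(LATENCY_S.values())


def fan_out_seconds(num_prompts: int) -> int:
    """Wall-clock time if every pair runs at once: the slowest call sets the pace.

    Ignores scheduling overhead and assumes unlimited concurrency.
    """
    return max(LATENCY_S.values()) if num_prompts > 0 else 0


for n in (1, 50):
    print(f"{n} prompt(s): sequential {sequential_seconds(n)}s, fan-out {fan_out_seconds(n)}s")
```

Under these assumptions, the sequential time grows linearly with the number of prompts while the fan-out time stays pinned to the slowest single call.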

The system at a glance

A single call to generateBatch spawns one subtask per (prompt, model) pair. Each subtask calls its provider’s image API, downloads the bytes, and uploads the finished JPEG to MinIO. The parent task aggregates the storage URLs and returns them to the SDK client. The reference repo’s frontend and API service are out of scope. You’ll work with the workflow service and a small SDK trigger script.
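The shape of that fan-out can be sketched in plain Python. This is not the Render Workflows SDK: asyncio.gather stands in for subtask spawning, and generate_batch, generate_and_upload, and the s3://thumbnails/... key layout are hypothetical names used only for illustration. The real task definitions come later in the tutorial.

```python
import asyncio
from itertools import product


async def generate_and_upload(prompt: str, model: str) -> str:
    """Hypothetical stand-in for one subtask.

    A real subtask would call the provider's image API and upload the result;
    here a short sleep simulates that work and a made-up storage URL is returned.
    """
    await asyncio.sleep(0.01)
    slug = prompt.lower().replace(" ", "-")
    return f"s3://thumbnails/{slug}/{model}.jpg"  # hypothetical bucket and key layout


async def generate_batch(prompts: list[str], models: list[str]) -> list[str]:
    """Parent task: one child per (prompt, model) pair, then aggregate the URLs."""
    pairs = list(product(prompts, models))  # N prompts x M models
    results = await asyncio.gather(*(generate_and_upload(p, m) for p, m in pairs))
    return list(results)


urls = asyncio.run(
    generate_batch(["Intro to queues", "Caching 101"], ["gemini-flash", "dall-e-3"])
)
print(len(urls), "images")
```

Two prompts and two models yield four concurrent children, and the parent returns their URLs in the same order as the pairs.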

What you’ll ignore from the reference repo

The reference ships a React frontend, two API servers (Express and FastAPI), and the workflow service. You only need the workflow folder (workflow-ts/ or workflow-python/) and the shared/ config. Skip the rest.

You're running a batch of 20 prompts across 3 image models. Gemini takes 4s, GPT-Image-1 takes 12s, DALL-E 3 takes 25s. Roughly how long should the whole batch take when each (prompt, model) pair runs as its own task in parallel?

About 4 seconds (the fastest model wins)
About 25 to 30 seconds (the slowest task wall-clock, plus a little overhead)
About 60 seconds (3 models in sequence times the slowest)
About 820 seconds (every task in sequence)

Roadmap

What you’ll build. This page.
Tour the repo and isolate the workflow. Clone, install only the workflow folder, set up API keys.
Run one generation locally. Start the local workflow server and produce a single image to confirm the loop works.
Read the multi-model fan-out. Walk through the existing task that fans one prompt across M models.
Add the per-prompt fan-out. Wrap the existing task so a batch of N prompts also fans out, giving N x M parallel subtasks.
Deploy the workflow to Render. Push the workflow service, set env vars, trigger the single-prompt task remotely.
Run a real batch. Submit batch.json from your laptop, watch the run tree fill out in the Render Dashboard, and browse the gallery in MinIO.

What you learned

Image APIs have wildly different latencies, so serial generation wastes time.
Fan-out runs each (prompt, model) pair on its own instance in parallel.
The reference repo fans one prompt across M models. You'll extend it to N prompts x M models.
You'll trigger the workflow from a small SDK script. No frontend or API service in this tutorial.