Author a task — Localhost Part 2: Run AI agents as Render Workflows

This is the payoff. You’ve seen the same agent run three ways. Now you edit the thing that makes Pattern 3 smaller than Pattern 2: a task-wrapped agent.

The goal is not to rebuild the code-review workflow from scratch. The workshop repo now ships a working your-review sandbox. You run it, change what its custom reviewer cares about, then watch retries and fan-out come from task config and composition instead of queue code.

Start the local runtime

You need the local workflow runtime running in one terminal (Terminal A) before any task commands work. If it is still running from the previous step, skip ahead. Otherwise, start it with the command for your track (TypeScript first, then Python):

cd packages/workflow-agents
npm run dev:workflows

cd packages/workflow_agents
render workflows dev -- uv run python -m workflow_agents.workflow

Leave Terminal A running for the rest of this page. Every --local command in Terminal B talks to this runtime. After every code edit, restart it (Ctrl-C, then re-run the same command) because the runtime loads task code once at startup.

1. Run the starter task

Open the review-task sandbox for your track. The file already defines a custom reviewer, wraps it in a task, and calls that task from the root workflow.

Open packages/workflow-agents/src/workflows/your-review/index.ts.

Open packages/workflow_agents/src/workflow_agents/workflows/your_review.py.

In Terminal B, run it against the LlamaIndex baseline PR. The task name is your-review in TypeScript and your_review in Python:

render workflows tasks list --local
render workflows tasks start your-review --local   --input='[{"url":"https://github.com/run-llama/LlamaIndexTS/pull/2234"}]'

render workflows tasks list --local
render workflows tasks start your_review --local   --input='[{"url":"https://github.com/run-llama/LlamaIndexTS/pull/2234"}]'

The starter fetches the PR diff, filters noisy files, and runs a custom clarity reviewer. The important pieces are visible in the sandbox, shown here for TypeScript and then Python:

const myReviewer = defineAgent({
  name: "my-reviewer",
  model: resolveModelSpec("medium"),
  tools: ["diff_stats"],
  systemPrompt: `# Code clarity reviewer

You review a pull request's per-file patches for clarity and maintainability.

Focus on:
- Confusing variable or function names
- Missing or misleading comments on non-obvious logic
- Functions doing too many things (suggest splits)
- Dead code or unreachable branches
- Inconsistent patterns across the changed files`,
});

const myReviewerTask = task(
  {
    name: "my-reviewer",
    timeoutSeconds: 120,
    retry: { maxRetries: 2, waitDurationMs: 1000, backoffScaling: 2 },
  },
  async (input: { patches: Patches }, runId?: string) => {
    return myReviewer.run(input, {
      tracer: storeTracer(),
      ...(runId ? { runId } : {}),
    });
  },
);

my_reviewer = define_agent(AgentDefinition(
    name="my-reviewer",
    model=resolve_model_spec("medium"),
    tools=["diff_stats"],
    system_prompt="""# Code clarity reviewer

You review a pull request's per-file patches for clarity and maintainability.

Focus on:
- Confusing control flow or deeply nested logic
- Missing or misleading names (variables, functions, types)
- Dead code or unreachable branches
- Overly broad exception handling that hides bugs""",
))

@app.task(name="my_reviewer", timeout_seconds=120)
async def my_reviewer_task(
    patches: list[dict[str, str]], run_id: str | None = None,
) -> dict[str, Any]:
    ctx = RunContext(tracer=store_tracer(), run_id=run_id)
    result = await my_reviewer.run({"patches": patches}, ctx)
    return {
        "text": result.text,
        "usage": {
            "input_tokens": result.usage.input_tokens,
            "output_tokens": result.usage.output_tokens,
        },
    }

The task wrapper is the bridge between a shared agent and a Render task. The agent .run() call is the same kind of call the naive and queue patterns make. Wrapping it as a task buys isolation, retries, timeouts, and traces. The optional run id links the agent span to the review row when the gateway dashboard triggers the workflow. Local CLI runs omit it, which is fine.
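To make that concrete, here is a minimal sketch of what a task wrapper adds around an agent call. It is a toy model, not Render's implementation: fake_agent_run and with_timeout are hypothetical names, the "path" and "patch" keys are assumed, and only the timeout is modelled (isolation, retries, and traces are left out).

```python
import asyncio
from typing import Any, Awaitable, Callable

AgentCall = Callable[[dict[str, Any]], Awaitable[dict[str, Any]]]


# Hypothetical stand-in for an agent's run() call; not part of the workshop repo.
async def fake_agent_run(payload: dict[str, Any]) -> dict[str, Any]:
    await asyncio.sleep(0.01)  # pretend to call a model
    return {"text": f"reviewed {len(payload['patches'])} patch(es)"}


def with_timeout(fn: AgentCall, timeout_seconds: float) -> AgentCall:
    """Toy task wrapper: the same call as before, plus a deadline."""

    async def wrapped(payload: dict[str, Any]) -> dict[str, Any]:
        # Raises a timeout error if fn does not finish in time.
        return await asyncio.wait_for(fn(payload), timeout=timeout_seconds)

    return wrapped


reviewer_task = with_timeout(fake_agent_run, timeout_seconds=120)
result = asyncio.run(reviewer_task({"patches": [{"path": "a.py", "patch": "+x"}]}))
print(result["text"])
```

The point of the sketch is the shape: the agent call itself does not change, and the operational behavior lives in the wrapper's arguments.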

2. Make the reviewer yours

Change one thing about my-reviewer, then run your-review again. Pick a small edit you can see in the output.

For example, in the TypeScript track, change the focus from clarity to error handling:

systemPrompt: `# Error-handling reviewer

You review a pull request's per-file patches for error handling.

Focus on:
- Exceptions that hide important failures
- Missing retries around network calls
- Error messages that do not tell an operator what failed
- Cleanup work that should run after failure`,

You can also switch model tier or tools:

model: resolveModelSpec("small"),
tools: ["diff_stats", "scan_for_secrets"],

In the Python track, the same change of focus looks like this:

system_prompt="""# Error-handling reviewer

You review a pull request's per-file patches for error handling.

Focus on:
- Exceptions that hide important failures
- Missing retries around network calls
- Error messages that do not tell an operator what failed
- Cleanup work that should run after failure"""

The Python equivalent for switching model tier or tools:

model=resolve_model_spec("small"),
tools=["diff_stats", "scan_for_secrets"],

Re-run the task with the same PR URL. The run trace still shows the same task tree, but the reviewer output follows your new focus. You changed the agent’s behavior without writing queue code, retry code, or registration code.

flowchart LR
  yr["your-review<br/>task"]
  prep["prepare diff<br/>plain fn"]
  filter["filter diff<br/>plain fn"]
  custom["my-reviewer<br/>task"]

  yr --> prep --> filter --> custom

3. Force a retry

Open the root your-review task at the bottom of the sandbox file. Find the function that runs when you start that task (yourReview in TypeScript, workflow_task in Python). Add a temporary throw as the first line inside that function, before anything else runs.

In TypeScript, scroll to the export default task(...) near the bottom of the file. Inside yourReview, add the throw before const runId:

async function yourReview(input: YourReviewInput) {
  if (Math.random() < 0.5) throw new Error("flaky!");

  const runId = input._runId;
  // ... rest of the function unchanged
}

Do not put this at the top of the file. If it runs during import, task registration fails and tasks list --local shows nothing.

In Python, inside workflow_task, add the raise before the diff-fetch steps:

import random

async def workflow_task(input: dict[str, Any]) -> dict[str, Any]:
    if random.random() < 0.5:
        raise RuntimeError("flaky!")

    run_id = input.get("_runId")
    # ... rest of the function unchanged

Do not put this at module top level (outside the function). If it runs during import, task registration fails and tasks list --local shows nothing.

Restart Terminal A so the runtime picks up the edit, then re-run the task a few times. When the throw fires, Render retries the task in a fresh instance according to your retry config (maxRetries: 2, with backoff). You wrote no try/catch, no queue, and no dead-letter logic: a config object, not a retry loop. Remove the throw when you’re done.
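If you want a feel for the waits that config implies, the sketch below computes a schedule from the three values in the TypeScript retry object. It assumes the wait grows geometrically (waitDurationMs multiplied by backoffScaling on each further retry); Render's exact formula may differ, and retry_delays_ms is a hypothetical helper, not part of any SDK.

```python
def retry_delays_ms(
    max_retries: int, wait_duration_ms: int, backoff_scaling: float
) -> list[float]:
    """Wait before each retry, ASSUMING simple geometric backoff."""
    return [wait_duration_ms * backoff_scaling**attempt for attempt in range(max_retries)]


# Values from the starter config: maxRetries 2, waitDurationMs 1000, backoffScaling 2.
delays = retry_delays_ms(max_retries=2, wait_duration_ms=1000, backoff_scaling=2)
print(delays)  # [1000, 2000]
```

Under that assumption, a task that keeps failing is attempted three times in total: the first run plus two retries.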

4. Bonus: fan out a second reviewer

Add a second custom reviewer and run both in parallel. This mirrors the idea in the comments at the bottom of the sandbox.

In TypeScript, define a second agent and task next to myReviewer and myReviewerTask:

const namingReviewer = defineAgent({
  name: "naming-reviewer",
  model: resolveModelSpec("small"),
  tools: ["diff_stats"],
  systemPrompt: `# Naming reviewer

Review changed files for confusing names. Return only naming findings.`,
});

const namingReviewerTask = task(
  { name: "naming-reviewer", timeoutSeconds: 120 },
  async (input: { patches: Patches }, runId?: string) => {
    return namingReviewer.run(input, {
      tracer: storeTracer(),
      ...(runId ? { runId } : {}),
    });
  },
);

Then replace the single reviewer call with a fan-out:

const [clarity, naming] = await Promise.all([
  myReviewerTask({ patches }, runId),
  namingReviewerTask({ patches }, runId),
]);

return {
  verdict: "approve",
  reason: [clarity.text, naming.text].join("\n\n"),
  reviews: [
    { agent: myReviewer.name, note: clarity.text },
    { agent: namingReviewer.name, note: naming.text },
  ],
  usage: clarity.usage,
};

In Python, define a second agent and task next to my_reviewer and my_reviewer_task:

naming_reviewer = define_agent(AgentDefinition(
    name="naming-reviewer",
    model=resolve_model_spec("small"),
    tools=["diff_stats"],
    system_prompt="""# Naming reviewer

Review changed files for confusing names. Return only naming findings.""",
))

@app.task(name="naming_reviewer", timeout_seconds=120)
async def naming_reviewer_task(
    patches: list[dict[str, str]], run_id: str | None = None,
) -> dict[str, Any]:
    ctx = RunContext(tracer=store_tracer(), run_id=run_id)
    result = await naming_reviewer.run({"patches": patches}, ctx)
    return {
        "text": result.text,
        "usage": {
            "input_tokens": result.usage.input_tokens,
            "output_tokens": result.usage.output_tokens,
        },
    }

Then replace the single reviewer call with a fan-out:

import asyncio

clarity, naming = await asyncio.gather(
    step(my_reviewer_task)(patches_dicts, run_id),
    step(naming_reviewer_task)(patches_dicts, run_id),
)

return {
    "verdict": "approve",
    "reason": "\n\n".join([clarity["text"], naming["text"]]),
    "reviews": [
        {"agent": my_reviewer.name, "note": clarity["text"]},
        {"agent": naming_reviewer.name, "note": naming["text"]},
    ],
    "usage": clarity["usage"],
}

Re-run the task. The trace now has two reviewer branches under your-review. That is the same fan-out shape as the built-in code-review workflow.
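Note that both fan-out snippets report only the clarity reviewer's usage. If you want the total across reviewers, one option is to sum the token counts. The sketch below assumes each task result has the shape returned by my_reviewer_task in the Python sandbox (a "usage" dict with input_tokens and output_tokens); sum_usage is a hypothetical helper, and the token counts are made-up sample values.

```python
from typing import Any


def sum_usage(results: list[dict[str, Any]]) -> dict[str, int]:
    """Add up token counts across several reviewer results."""
    total = {"input_tokens": 0, "output_tokens": 0}
    for result in results:
        for key in total:
            total[key] += result["usage"][key]
    return total


# Sample values for illustration only, not real run output.
clarity = {"text": "...", "usage": {"input_tokens": 1200, "output_tokens": 300}}
naming = {"text": "...", "usage": {"input_tokens": 800, "output_tokens": 150}}

print(sum_usage([clarity, naming]))  # {'input_tokens': 2000, 'output_tokens': 450}
```

In the Python fan-out you could then return sum_usage([clarity, naming]) under "usage" instead of clarity["usage"].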

Compare your file to the real code-review workflow

packages/workflow-agents/src/workflows/code-review/index.ts defines securityTask, performanceTask, uxTask, and judgeTask, then fans out reviewer tasks, adds ux only for frontend changes, and runs judge:

const reviewerTasks = [
  { name: securityReviewer.name, run: securityTask },
  { name: performanceReviewer.name, run: performanceTask },
];
if (hasFrontendFiles(patches)) {
  reviewerTasks.push({ name: uxReviewer.name, run: uxTask });
}

const reviewerResults = await Promise.all(
  reviewerTasks.map(async ({ name, run }) => {
    const result = await run({ patches }, runId);
    return { agent: name, note: result.text, usage: result.usage };
  }),
);

const decision = await judgeTask({ findings: reviewerResults.map(({ agent, note }) => ({ agent, note })) }, runId);

return toReviewSummary(reviewerResults, decision);

packages/workflow_agents/src/workflow_agents/workflows/code_review.py defines security_task, performance_task, ux_task, and judge_task, then fans out reviewer tasks, adds ux only for frontend changes, and runs judge:

reviewer_names = ["security", "performance"]
if has_frontend_files(patches):
    reviewer_names.append("ux")

raw_results = await asyncio.gather(
    *[step(REVIEWER_TASKS[name])(patches, run_id) for name in reviewer_names]
)
reviewer_results = []
for name, raw in zip(reviewer_names, raw_results, strict=True):
    reviewer_results.append({
        "agent": name,
        "note": raw["text"],
        "usage": raw["usage"],
    })

findings = [{"agent": r["agent"], "note": r["note"]} for r in reviewer_results]
decision_raw = await step(judge_task)(findings, run_id)

Same shape as yours. The production workflow is a few more tasks composed the same way.

You added retries with backoff in task config and nothing else. Where does the retry actually happen?

In a try/catch you add around the task body
Render runs the task again in a fresh instance per the config, with no retry code from you
The local CLI loops the function until it stops throwing
Postgres replays the failed job from a dead-letter table

Troubleshooting

Find the symptom that matches what you’re seeing, then apply the fix.

You edited the file but the re-run shows the old behavior. The local runtime loads task code once at startup and doesn’t hot-reload. After every edit, restart Terminal A (Ctrl-C, then re-run the dev command) before re-running in Terminal B. This is the number-one source of “my change isn’t working” on this page.

--input='[...]' won’t parse on Windows. cmd.exe and PowerShell handle single quotes differently and mangle the inner double quotes. Put the JSON array in a file (input.json) and pass --input-file=input.json instead of the inline --input.
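If you would rather not hand-write the file, a short script can generate it with the quoting handled for you. This sketch writes input.json to the current directory using the same PR URL as the commands above; it assumes you run it from the directory where you will run the render command.

```python
import json
from pathlib import Path

# Same payload as the inline --input value: a JSON array with one object.
payload = [{"url": "https://github.com/run-llama/LlamaIndexTS/pull/2234"}]

input_path = Path("input.json")
input_path.write_text(json.dumps(payload), encoding="utf-8")

# Read it back to confirm the file holds valid JSON.
print(json.loads(input_path.read_text(encoding="utf-8")))
```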

task not found when you run it. The task name differs by track: TypeScript registers your-review (hyphen), Python registers your_review (underscore). Same for code-review / code_review. Run render workflows tasks list --local and copy the exact name. Note the agent’s display name is hyphenated in both tracks (my-reviewer); only the task name differs.

After the retry edit, tasks list suddenly shows nothing. You put the throw at module top level, so it fires during import and aborts task discovery. Move it inside the task function body (yourReview / workflow_task), as the step intends. Remove it when you’re done.

Your prompt, model, or tool edit produces no visible change. You’re on the mock model, which ignores the system prompt and returns canned output. To see edits take effect, set a real key (OPENAI_API_KEY or ANTHROPIC_API_KEY) in the package .env and restart Terminal A. Watch for a falling back to mock model line in Terminal A.

A new tool name is ignored or errors. Tools auto-discover from the shared tools directory and must match the registered name exactly: diff_stats, scan_for_secrets, contrast_ratio, current_time. Hyphenated guesses won’t match.
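A quick check before you restart the runtime can catch a mistyped tool name. The set below copies the four names listed above; unknown_tools is a hypothetical helper for illustration and is not part of the workshop repo.

```python
# Tool names as listed in this troubleshooting entry.
REGISTERED_TOOLS = {"diff_stats", "scan_for_secrets", "contrast_ratio", "current_time"}


def unknown_tools(requested: list[str]) -> list[str]:
    """Return the requested names that do not exactly match a registered tool."""
    return [name for name in requested if name not in REGISTERED_TOOLS]


print(unknown_tools(["diff_stats", "scan-for-secrets"]))  # ['scan-for-secrets']
```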

Local CLI runs don’t create a dashboard review row. Expected: the gateway only creates and links a row when it dispatches the workflow. A direct render workflows tasks start --local prints its result and trace to the terminal instead. The dashboard linkage comes back in Part 2 via the gateway.

The fan-out paste throws a parse error on the reason line (TypeScript track). The .join("\n\n") separator must be on one line. If you copied it with the newlines split across physical lines, rewrite it as [clarity.text, naming.text].join("\n\n"). The bonus also imports the agent judge (which you wrap in a task() yourself); there is no exported judgeTask.

No runtime at all on this page (Python track). If your Terminal A from the previous page failed, it was likely missing uv run. Restart it with render workflows dev -- uv run python -m workflow_agents.workflow. For the bonus, use the relative import from .code_review import judge_task (with the leading dot). The fan-out "\n\n".join(...) must also be on one line. Note that the shipped your_review returns {url, overview, extensions, dropped, review, usage}; if you swap in the {verdict, reason, reviews} shape, the persisted output changes shape too.

What you learned

Ran the auto-discovered your-review task with the shipped my-reviewer agent
Changed the reviewer prompt, model tier, or tools and saw the output follow that focus
Forced a failure and watched Render retry per the retry config, with no retry code of your own
Fanned out a second custom reviewer, matching the built-in code-review workflow shape