
Guest
April 01, 2026

Sharp opinions, clean infrastructure: How Cynical Sally serves nine clients from one backend on Render

Thomas Geelens

Cynical Sally is a reviewer who roasts your work with abandon. She reads your code, resume, or questionable Airbnb listing, scores it 0–10, and delivers verdicts like this:

A utility function with no type hints, a comment that just restates the function name, and variable names that read like someone mashed a keyboard. I've seen more documentation on a sticky note.

Sally started as a website. Render let me turn her into an ecosystem. One backend now serves a website, a Chrome extension, a Safari extension, a CLI tool, an MCP server, an open-source Lite version, and soon an alarm app (Sally wakes you up with a roast) and a companion app (Sally as your snarky alter ego). All of them hit the same API, the same queues, and the same Frankfurt database. That's the Sallyverse.

At some point while building, I stopped having website problems and started having infrastructure problems. One client is easy, but nine clients sharing the same backend is a different challenge entirely, and most platforms aren't built for it. That's where Render comes in.

Why I left Vercel

I started on Vercel. It's a great platform for websites, but Sally stopped being one the moment she needed a background job.

Full Truth reviews are minutes-long AI jobs, and Vercel's serverless functions time out after 10 to 60 seconds. This fundamental mismatch pointed me toward Render's Background Worker service, which runs Sally's three BullMQ queues as persistent processes with no timeouts or workarounds.

Then there was Redis. Persistent connections for sessions, quotas, rate limiting, and caching don't mix well with a serverless runtime that spins up a fresh connection per instance and kills it afterward. Pairing serverless with ioredis connection pooling is a fight you're always losing. Render's managed Postgres and Key Value sit co-located in Frankfurt, so connections persist and latency stays low.

I also needed a real worker process, not a function. Something that stays alive, drains a queue, and doesn't need a web request to trigger it. Vercel doesn't have that, which, in practice, means standing up a second platform and another failure point. A single render.yaml deploys the web tier, background worker, and cron jobs together in one push.
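A minimal sketch of what that single-file deploy can look like. The service names, commands, and cron schedule here are illustrative, not Sally's actual blueprint:

```yaml
services:
  - type: web            # the Next.js web tier
    name: sally-api
    runtime: node
    region: frankfurt
    buildCommand: npm ci && npm run build
    startCommand: npm run start
  - type: worker         # long-lived queue consumer, no web request needed
    name: sally-worker
    runtime: node
    region: frankfurt
    buildCommand: npm ci && npm run build
    startCommand: npm run worker
  - type: cron           # daily maintenance
    name: sally-maintenance
    runtime: node
    region: frankfurt
    schedule: "0 4 * * *"
    buildCommand: npm ci
    startCommand: npm run cron
```

One push deploys all three; there is no second platform to keep in sync.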

Docker was the dealbreaker. Sally has MP4 video generation ready to ship via ffmpeg for burn card animations, and Vercel just has no answer for that. Render runs Docker natively, so when I'm ready to ship, it's a one-line change to the service definition.

The last thing Vercel never solved cleanly was EU data residency. Serverless functions can execute anywhere unless you carefully configure around it. On Render, I pin the region in render.yaml and every service deploys to Frankfurt, including every fork of Sally Lite. Anyone who deploys their own instance gets the right region without touching a thing.

I moved everything to Render and haven't looked back.

How the backend serves nine clients

Here's what that actually looks like in practice. The backend is deployed on Render as three interconnected services. The web service runs the Next.js App Router and handles all inbound requests: authentication, billing, rate limiting, and the review engine. The worker service is a BullMQ job consumer that processes async review jobs. The cron service handles daily maintenance, with scheduled tasks for the alarm app coming soon.

The split minimizes latency. Sally has two review tiers: Quick Roast uses Claude Haiku and responds synchronously in seconds, while Full Truth uses Claude Sonnet and can run for several minutes on a large codebase. Holding an HTTP connection open for that long would make the service appear dead to both users and Render's health checks, so Full Truth jobs get enqueued instead. The web service hands back a job ID immediately and stays free to handle everything else while the worker does the slow work in the background.

That three-service split makes the thin-client model possible. Every client in the Sallyverse (the website, Chrome extension, Safari extension, CLI, and MCP server) is essentially a thin layer on top of the backend. Each one collects input, forwards it to the backend, and renders whatever comes back. Prompts, authentication, scoring logic, and billing code all live in the backend.

The architecture scales without multiplying complexity. Adding a new client means building a frontend, not replicating backend logic. The alarm app and companion app currently in development will follow the same pattern. Nine clients follow one set of rules: they hit the API, receive a roast, and get out of the way. When I update a prompt or tweak the scoring rubric, every surface in the Sallyverse gets it instantly.

The data layer follows the same principle. Every client reads from the same Postgres instance and Key Value store in Frankfurt, and Cloudflare R2 handles media assets. That's a single source of truth with no duplicated state.

The CLI shows how thin these clients can get. You point it at a directory, and it ships the code to the backend, which returns a burn card you can share:
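A sketch of roughly what that thin client amounts to: walk the directory, POST the code, print what comes back. The `/review` endpoint, payload shape, and function names are assumptions for illustration, not Sally's actual API:

```typescript
import { readdirSync, readFileSync, statSync } from "node:fs";
import { join } from "node:path";

// Recursively gather every file under a directory into one payload,
// tagging each file with its path so the roast can reference it.
function collectCode(dir: string): string {
  let out = "";
  for (const name of readdirSync(dir)) {
    const path = join(dir, name);
    if (statSync(path).isDirectory()) {
      out += collectCode(path);
    } else {
      out += `// file: ${path}\n${readFileSync(path, "utf8")}\n`;
    }
  }
  return out;
}

// Ship the collected code to the backend and return the burn card.
// Endpoint and response shape are hypothetical.
async function roast(dir: string, apiUrl: string): Promise<string> {
  const res = await fetch(`${apiUrl}/review`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ code: collectCode(dir) }),
  });
  const { burnCard } = (await res.json()) as { burnCard: string };
  return burnCard;
}
```

No prompts, no scoring, no billing: the CLI's whole job is moving bytes to the API and rendering the verdict.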

That's the pattern across every client in the Sallyverse, and it's also why the frontend is clean enough to open source.

Deploy your own corner of the Sallyverse

Because the backend handles everything opinionated, the open-source Sally Lite frontend layer has almost nothing in it. Sally Lite is a dumb proxy: it collects code, forwards it to the backend, and returns the result.

You can deploy your own instance on Render. The source is on GitHub under the MIT license, and the entire deployment is 15 lines:
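The actual blueprint lives in the repo; a sketch with the same shape, reconstructed from the properties called out below, looks like this (the API URL is a placeholder, not the real endpoint):

```yaml
services:
  - type: web
    name: sally-lite
    runtime: node
    region: frankfurt
    plan: free
    buildCommand: npm ci && npm run build
    startCommand: npm run start
    healthCheckPath: /health
    envVars:
      - key: SALLY_API_URL
        value: https://sally.example.com/api  # placeholder endpoint
```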

A few noteworthy things keep this blueprint dead simple:

type: web makes Sally Lite a long-running Render web service. Port binding, TLS termination, and request routing are all handled automatically.

region: frankfurt propagates to every fork, so anyone who deploys their own instance gets the right region without any manual configuration.

healthCheckPath: /health means Render polls the service continuously and restarts it automatically if it goes unhealthy, which is a meaningful safety net on a free-tier app at no extra cost.

SALLY_API_URL is the one pre-configured environment variable. Because it's not a secret entered manually after deploy, the Deploy to Render button is genuinely one-click with zero configuration required.

Your turn

Render is the reason the Sallyverse is one thing instead of five half-finished things. It gave me the full stack, and the full Sallyverse, without any unnecessary complexity.

You can deploy your own instance of Sally Lite on Render here. Happy roasting!