Harden and iterate — Build and host a full-featured, secure MCP server on Render

You have a deployed, authenticated, observable MCP server. The remaining work is the gap between “works for me” and “works for users over months.” Every item here is a small, well-scoped change you can ship one PR at a time.

1. Move OAuth state to Postgres

The provider from step 5 keeps clients, pending authorizations, and authorization codes in Maps. They reset on every deploy and don’t cross instances. That’s fine for one box; the moment you scale to two it breaks.

Two new tables, one new module, then point the provider at it.

CREATE TABLE IF NOT EXISTS oauth_clients (
  client_id           TEXT         PRIMARY KEY,
  client_secret_hash  TEXT,
  metadata            JSONB        NOT NULL,
  registered_at       TIMESTAMPTZ  NOT NULL DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS oauth_codes (
  code            TEXT         PRIMARY KEY,
  client_id       TEXT         NOT NULL REFERENCES oauth_clients(client_id) ON DELETE CASCADE,
  redirect_uri    TEXT         NOT NULL,
  code_challenge  TEXT         NOT NULL,
  scopes          TEXT[]       NOT NULL DEFAULT '{}',
  state           TEXT,
  github_user_id  BIGINT,
  github_login    TEXT,
  expires_at      TIMESTAMPTZ  NOT NULL
);

CREATE INDEX IF NOT EXISTS oauth_codes_expires_idx ON oauth_codes (expires_at);

Then refactor provider.ts so clientsStore, the pending map, and the codes map all read/write Postgres via the pg.Pool. The function bodies are tiny - one INSERT and one SELECT apiece. The interface doesn’t move; mcpAuthRouter keeps working as-is.
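As a rough sketch of what the codes half of that module might look like: the `Queryable` interface, the `StoredCode` shape, and the function names below are illustrative assumptions, not the template's actual code. `pg.Pool`'s `query(text, values)` is intended to fit the `Queryable` shape. The sketch also swaps the plain SELECT for `DELETE ... RETURNING` at exchange time, which is one way to keep a code single-use across instances.

```typescript
// Minimal query surface (assumption); pg.Pool's query(text, values) should fit it.
interface Queryable {
  query(
    text: string,
    values?: unknown[],
  ): Promise<{ rows: Record<string, unknown>[]; rowCount: number | null }>;
}

// Hypothetical in-code shape; fields mirror the oauth_codes columns.
interface StoredCode {
  code: string;
  clientId: string;
  redirectUri: string;
  codeChallenge: string;
  scopes: string[];
  expiresAt: Date;
}

// One INSERT when the authorization code is minted.
async function saveCode(db: Queryable, c: StoredCode): Promise<void> {
  await db.query(
    `INSERT INTO oauth_codes
       (code, client_id, redirect_uri, code_challenge, scopes, expires_at)
     VALUES ($1, $2, $3, $4, $5, $6)`,
    [c.code, c.clientId, c.redirectUri, c.codeChallenge, c.scopes, c.expiresAt],
  );
}

// DELETE ... RETURNING reads and removes the row in one statement, so two
// instances racing on the same code cannot both get a row back.
async function consumeCode(
  db: Queryable,
  code: string,
): Promise<StoredCode | undefined> {
  const { rows } = await db.query(
    `DELETE FROM oauth_codes
      WHERE code = $1 AND expires_at > NOW()
      RETURNING code, client_id, redirect_uri, code_challenge, scopes, expires_at`,
    [code],
  );
  const r = rows[0];
  if (!r) return undefined; // unknown, expired, or already used
  return {
    code: r["code"] as string,
    clientId: r["client_id"] as string,
    redirectUri: r["redirect_uri"] as string,
    codeChallenge: r["code_challenge"] as string,
    scopes: r["scopes"] as string[],
    expiresAt: r["expires_at"] as Date,
  };
}
```

If the provider needs to read the code challenge before the exchange, that lookup would stay a plain SELECT; only the final exchange should consume the row.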

A periodic DELETE FROM oauth_codes WHERE expires_at < NOW() is worth scheduling - a Render cron job that runs every 10 minutes does the trick:

- type: cron
  name: notes-mcp-oauth-gc
  runtime: node
  schedule: "*/10 * * * *"
  buildCommand: corepack enable && pnpm install --frozen-lockfile && pnpm build
  startCommand: node dist/scripts/gc-oauth.js
  envVars:
    - key: DATABASE_URL
      fromDatabase:
        name: notes-mcp-db
        property: connectionString

Same fromDatabase wiring the web service already uses. Each consumer declares its own block - covered in detail in the Postgres wiring step.
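The script behind `dist/scripts/gc-oauth.js` can stay very small. A minimal sketch, assuming the source lives at `scripts/gc-oauth.ts` and uses the `pg` package; `GcDb` and `gcExpiredCodes` are hypothetical names:

```typescript
// Minimal surface the GC needs (assumption); pg.Pool's query(text) should fit it.
interface GcDb {
  query(text: string): Promise<{ rowCount: number | null }>;
}

// Deletes expired authorization codes and reports how many rows were removed.
async function gcExpiredCodes(db: GcDb): Promise<number> {
  const result = await db.query(
    "DELETE FROM oauth_codes WHERE expires_at < NOW()",
  );
  return result.rowCount ?? 0;
}

// Wiring sketch for the cron entry point (assumes `import { Pool } from "pg"`):
//   const pool = new Pool({ connectionString: process.env.DATABASE_URL });
//   console.log(`deleted ${await gcExpiredCodes(pool)} expired codes`);
//   await pool.end(); // close the pool so the cron run can exit
```

Closing the pool matters for a cron job: open connections keep the Node process alive, and the run would not finish on its own.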

2. Scope data to the authenticated user

Right now every authenticated caller sees every other caller’s notes. The fix is one column and one WHERE clause.

ALTER TABLE notes ADD COLUMN IF NOT EXISTS owner_sub TEXT;
CREATE INDEX IF NOT EXISTS notes_owner_sub_idx ON notes (owner_sub, created_at DESC);

Tool handlers receive the auth context via the extra argument the SDK threads through. Read sub off it and filter every query:

server.registerTool(
  "notes.create",
  { /*...unchanged... */ },
  async ({ title, body }, { authInfo }) => {
    const sub = authInfo?.extra?.sub as string | undefined;
    if (!sub) throw new Error("Missing identity");
    const note = await store.create({ title, body, ownerSub: sub });
    return { /*...unchanged... */ };
  },
);

recent(limit) becomes recent({ ownerSub, limit }) and the SQL gains WHERE owner_sub = $1. Same for getById. Five-minute change end-to-end; the auth context was already there waiting to be used.
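A sketch of the two store methods after that change. The column list (`id`, `title`, `body`, `created_at`) and the `NotesDb` interface are assumptions based on the snippets above, not the template's exact schema:

```typescript
// Assumed row shape for the notes table.
interface NoteRow {
  id: string;
  title: string;
  body: string;
  created_at: Date;
}

// Minimal query surface (assumption); pg.Pool's query(text, values) should fit it.
interface NotesDb {
  query(text: string, values: unknown[]): Promise<{ rows: NoteRow[] }>;
}

// Most recent notes for one owner; the owner filter is always the first parameter.
async function recent(
  db: NotesDb,
  opts: { ownerSub: string; limit: number },
): Promise<NoteRow[]> {
  const { rows } = await db.query(
    `SELECT id, title, body, created_at
       FROM notes
      WHERE owner_sub = $1
      ORDER BY created_at DESC
      LIMIT $2`,
    [opts.ownerSub, opts.limit],
  );
  return rows;
}

// Lookup by id, still scoped to the owner so one user cannot read
// another user's note by guessing its id.
async function getById(
  db: NotesDb,
  opts: { ownerSub: string; id: string },
): Promise<NoteRow | undefined> {
  const { rows } = await db.query(
    `SELECT id, title, body, created_at
       FROM notes
      WHERE owner_sub = $1 AND id = $2`,
    [opts.ownerSub, opts.id],
  );
  return rows[0];
}
```

Putting `owner_sub` in the WHERE clause of every query, rather than filtering in application code, lets the `notes_owner_sub_idx` index do the work.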

3. Rotate the JWT signing secret

JWT_SIGNING_SECRET was set with generateValue: true, so Render has the only copy. Rotating it is a Render Dashboard click:

Open the Environment tab for the web service. Find JWT_SIGNING_SECRET and hit the regenerate icon.
Confirm. Render generates a new value and rolls the service.
Existing tokens are invalidated. Anyone with a token issued under the old secret hits verifyAccessToken and gets a 401. The MCP client transparently re-runs the OAuth flow.

The trade-off is the blip: in-flight requests using the old token fail once. For real production you’d run two signing keys in parallel during a rotation window: accept either for verification and sign only with the new one (with asymmetric keys, you would also publish both in a JWKS; an HS256 secret is symmetric and must never be published). For most internal MCP services the blip is acceptable.
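For the symmetric HS256 secret used here, a rotation window could look roughly like the sketch below: sign with the newest secret, accept a signature from any secret still in the window. The function names and the idea of a second env var (for example a hypothetical `JWT_SIGNING_SECRET_PREVIOUS`) are assumptions, and the hand-rolled HS256 is only there to keep the sketch dependency-free; in practice a JWT library would replace it.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

type Claims = Record<string, unknown>;

const b64url = (buf: Buffer): string => buf.toString("base64url");

// HMAC-SHA256 over the JWT signing input ("header.payload").
function hmac(signingInput: string, secret: string): Buffer {
  return createHmac("sha256", secret).update(signingInput).digest();
}

// Always sign with the newest secret.
function signToken(claims: Claims, secret: string): string {
  const header = b64url(Buffer.from(JSON.stringify({ alg: "HS256", typ: "JWT" })));
  const payload = b64url(Buffer.from(JSON.stringify(claims)));
  const signature = b64url(hmac(`${header}.${payload}`, secret));
  return `${header}.${payload}.${signature}`;
}

// Accept a token if ANY secret in the rotation window produced its signature.
// Only HMAC-SHA256 is ever computed, so a token claiming another alg cannot pass.
function verifyWithAny(token: string, secrets: string[]): Claims | undefined {
  const parts = token.split(".");
  if (parts.length !== 3) return undefined;
  const [header, payload, signature] = parts as [string, string, string];
  const given = Buffer.from(signature, "base64url");
  const matched = secrets.some((secret) => {
    const expected = hmac(`${header}.${payload}`, secret);
    // timingSafeEqual throws on length mismatch, so compare lengths first.
    return given.length === expected.length && timingSafeEqual(given, expected);
  });
  if (!matched) return undefined;
  const claims = JSON.parse(
    Buffer.from(payload, "base64url").toString("utf8"),
  ) as Claims;
  const exp = claims["exp"];
  // Reject expired tokens (exp is in seconds since the epoch).
  if (typeof exp === "number" && exp * 1000 <= Date.now()) return undefined;
  return claims;
}
```

During the window you would call `verifyWithAny(token, [newSecret, oldSecret])`; once the longest-lived old token has expired, drop the old secret from the list.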

4. Restrict who can sign in

Right now anyone with a GitHub account can authenticate. There are two clean ways to scope that down; pick the one that matches the workload.

Allow-list of GitHub logins: the simplest possible scoping. Check the GitHub user against an env-var list in the callback handler, before minting the authorization code:

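// Parse the comma-separated allow-list. GitHub logins are case-insensitive, so normalize.
const allow = (process.env.ALLOWED_GITHUB_LOGINS ?? "")
  .split(",").map((s) => s.trim().toLowerCase()).filter(Boolean);
// An empty list means "no restriction"; otherwise reject anyone not on it.
if (allow.length && !allow.includes(user.login.toLowerCase())) {
  return res.status(403).send(`Sign-in for @${user.login} is not authorized.`);
}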

Add ALLOWED_GITHUB_LOGINS to the Blueprint with sync: false and fill it in the Render Dashboard. Comma-separated, no spaces. Best for solo deployments or tiny teams where you can hand-curate the list.

Org membership: what most teams end up with. Replace the allow-list check with a call to GET /user/memberships/orgs/{org} using the GitHub token. Members get through; non-members get a 403.

const ALLOWED_ORG = process.env.ALLOWED_GITHUB_ORG;
if (ALLOWED_ORG) {
  const membershipResp = await fetch(
    `https://api.github.com/user/memberships/orgs/${ALLOWED_ORG}`,
    {
      headers: {
        Authorization: `Bearer ${ghToken}`,
        Accept: "application/vnd.github+json",
      },
    },
  );
  if (membershipResp.status !== 200) {
    return res
     .status(403)
     .send(`@${user.login} is not a member of ${ALLOWED_ORG}.`);
  }
  const membership = (await membershipResp.json()) as { state: string };
  if (membership.state !== "active") {
    return res
     .status(403)
     .send(`@${user.login}'s membership in ${ALLOWED_ORG} is pending.`);
  }
}

The read:user scope on the GitHub OAuth App already gives you read access to org memberships when the user’s membership in the org is public; for private memberships you’d add read:org to the scope list in provider.ts.

Add ALLOWED_GITHUB_ORG to the Blueprint with sync: false. One env var, scales to every member of the org without you touching the Render Dashboard again.

5. Scaling and capacity

Render’s web service plans scale up and out independently. The dimensions that matter for this server:

| Dimension | Where it bites | What to do |
| --- | --- | --- |
| Concurrent OAuth flows | `pending` and `oauth_codes` writes | Move to Postgres (§1). Postgres handles thousands of concurrent inserts; in-memory maps don’t survive horizontal scaling. |
| Postgres connections | `pg.Pool({ max: 10 })` per instance | Stay under the database’s `max_connections`. The pooling step covers PgBouncer if you scale to many web instances. |
| Outbound calls to GitHub | The callback handler hits `api.github.com` twice per login | GitHub allows 5,000 authenticated requests/hour per user token, and each login spends its two calls against that user’s own quota, so logins alone are very unlikely to hit the limit. |
| CPU on the web service | JWT signing is fast (HS256); the bottleneck is usually JSON parsing of MCP messages | Scale the web service horizontally (autoscaling on CPU >70%). |
| MCP sessions | The template uses stateless mode (`sessionIdGenerator: undefined`), so every request is independent of every other | Nothing to do: stateless is already horizontally trivial. If you switch to stateful (because a tool needs progress notifications or sampling), session affinity becomes a problem: Render’s load balancer doesn’t sticky-route, so you’d back the session map with Redis (Key Value) or accept session loss on instance rotation. |

The Render scaling docs cover the configuration; the architectural call (pin sessions vs. go stateless) depends on whether your tools need server-initiated messages.
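The connection budget in the table is simple arithmetic, sketched below. The numbers in the usage note (a `max_connections` of 100 and 10 connections held back for the cron job, migrations, and ad hoc psql sessions) are illustrative assumptions; check your own database plan for the real limit.

```typescript
// Largest per-instance pool size that keeps the whole fleet under the
// database's connection limit. `reserved` covers non-web consumers
// (cron jobs, migrations, manual sessions).
function maxPoolPerInstance(
  maxConnections: number,
  webInstances: number,
  reserved: number,
): number {
  if (webInstances < 1) throw new RangeError("webInstances must be at least 1");
  const available = maxConnections - reserved;
  return Math.max(0, Math.floor(available / webInstances));
}
```

With those assumed numbers, `maxPoolPerInstance(100, 3, 10)` returns 30, so `max: 10` per instance is comfortable at three instances. At twelve instances the same call returns 7, which is below the configured `max: 10`; that is the point where lowering `max` or adding PgBouncer becomes necessary.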

6. The week-one watchlist

The metrics that catch real problems before users tell you. Set Render alerts on each.

| Metric | Threshold | What it means |
| --- | --- | --- |
| HTTP 5xx rate | >1% over 5 min | Real errors. Logs query `level:error` finds the cause. |
| `/healthz` 503 rate | >0 for 2 min | Postgres unreachable. Could be a transient blip or a real outage; check the database’s Logs tab. |
| Postgres CPU | >80% sustained | A query is hot. Add an index or move work to a background job. |
| Postgres connections | >70% of `max_connections` | The combined pools are too large for the database, or something is leaking connections. Drop `max` per-instance or add PgBouncer. |
| p95 latency on `/mcp` | >500 ms | Tool handlers are doing too much synchronously. Profile with the request ID. |
| OAuth `/token` error rate | >1% | Client-side problem (wrong code verifier, expired codes); the response body tells you which. |

Render Metrics gives you the first five out of the box; the last one you’ll get from your own structured logs.
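Computing that last metric from logs can be as simple as the sketch below. The log shape (one JSON object per line with `path` and `status` fields) is a hypothetical format; adapt the field names to whatever your logger actually emits.

```typescript
// Hypothetical structured-log shape: one JSON object per line.
interface RequestLog {
  path: string;
  status: number;
}

// Returns the parsed entry, or undefined for lines that are not request logs.
function parseLine(line: string): RequestLog | undefined {
  try {
    const v = JSON.parse(line) as Partial<RequestLog>;
    if (typeof v.path === "string" && typeof v.status === "number") {
      return { path: v.path, status: v.status };
    }
  } catch {
    // Not JSON (or not an object): ignore the line.
  }
  return undefined;
}

// Fraction of /token requests that returned a 4xx or 5xx status.
function tokenErrorRate(lines: string[]): number {
  let total = 0;
  let errors = 0;
  for (const line of lines) {
    const entry = parseLine(line);
    if (!entry || entry.path !== "/token") continue;
    total += 1;
    if (entry.status >= 400) errors += 1;
  }
  return total === 0 ? 0 : errors / total;
}
```

Run it over a five-minute window of log lines and alert when the result exceeds 0.01, matching the 1% threshold in the table.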

7. Add tests (the template gave you a starting point)

The template includes a tests/ folder and a vitest.config.ts. Re-run them to confirm the existing tests still pass:

npm test

A few high-value additions for the surface you added:

A test that sends a GET to /.well-known/oauth-protected-resource and asserts the JSON shape (catches a broken PUBLIC_URL early).
A test that posts to /mcp without an Authorization header and asserts 401 + the WWW-Authenticate response header.
A test that mints a JWT directly with provider.verifyAccessToken-compatible claims, hits /mcp with it, and asserts a successful tool call.

Each one is ~15 lines and runs in under 1s - wire them into the npm test script and have CI gate merges to main before flipping autoDeploy: true in the Blueprint.
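The first two checks can be factored into small pure helpers that a vitest test then calls. This is a sketch under assumptions: the helper names are hypothetical, `resource` and `authorization_servers` are the RFC 9728 metadata field names, and the exact values your server returns depend on how PUBLIC_URL is wired.

```typescript
// Returns a list of problems; an empty list means the metadata looks sane.
function checkProtectedResourceMetadata(
  body: unknown,
  publicUrl: string,
): string[] {
  if (typeof body !== "object" || body === null) {
    return ["body is not a JSON object"];
  }
  const meta = body as { resource?: unknown; authorization_servers?: unknown };
  const problems: string[] = [];
  // A wrong PUBLIC_URL usually shows up as a resource on the wrong origin.
  if (typeof meta.resource !== "string" || !meta.resource.startsWith(publicUrl)) {
    problems.push(`resource does not start with ${publicUrl}`);
  }
  if (
    !Array.isArray(meta.authorization_servers) ||
    meta.authorization_servers.length === 0
  ) {
    problems.push("authorization_servers is missing or empty");
  }
  return problems;
}

// Checks the unauthenticated /mcp response: 401 plus a Bearer challenge.
function checkUnauthorized(
  status: number,
  wwwAuthenticate: string | null,
): string[] {
  const problems: string[] = [];
  if (status !== 401) problems.push(`expected 401, got ${status}`);
  if (!wwwAuthenticate || !/^Bearer\b/i.test(wwwAuthenticate)) {
    problems.push("WWW-Authenticate header is missing a Bearer challenge");
  }
  return problems;
}

// In a vitest test (BASE_URL is an assumed constant pointing at the server):
//   const res = await fetch(`${BASE_URL}/.well-known/oauth-protected-resource`);
//   expect(checkProtectedResourceMetadata(await res.json(), BASE_URL)).toEqual([]);
//
//   const mcp = await fetch(`${BASE_URL}/mcp`, { method: "POST" });
//   expect(checkUnauthorized(mcp.status, mcp.headers.get("www-authenticate"))).toEqual([]);
```

Returning a list of problems instead of a boolean keeps the failure message useful when CI blocks a merge.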

8. Where to go next

The skills the agent and the docs both reach for if you want to keep building:

Render web services - port binding, custom domains, autoscaling.
Postgres on Render - the full HA, replicas, and pooling story.
Background workers - if your tools need to do long async work (LLM batch jobs, data sync), move it off the request path.
Render Workflows - when “background worker” stops being enough and you need retries, fan-out, and durable state.
MCP authorization spec - the source of truth for what you implemented; useful when an MCP client behaves unexpectedly.
render-examples/mcp-server-typescript - the template’s repo. Watch it for upstream changes (SDK bumps, transport updates) you might want to cherry-pick.

You scale the web service from 1 to 3 instances. OAuth sign-ins now fail with 'Unknown authorization code' about 2/3 of the time. What's the most likely cause, given the changes you made (and didn't make) in this tutorial?

GitHub is rate-limiting your callback handler
Render's load balancer is corrupting the redirect URL
The provider's pending/codes maps are in-memory per instance, so the `/token` request often lands on a different instance than `/authorize` did
The JWT signing secret rotated automatically when you scaled

What you learned

OAuth state must live in Postgres before you scale horizontally - in-memory maps don't survive multiple instances
Scope tool data by the authenticated user's `sub` (`authInfo.extra.sub` in tool handlers) so each user only sees their own notes
`generateValue: true` makes JWT secret rotation a one-click Render Dashboard action
Allow-list GitHub logins or require org membership in the callback handler - the cheapest possible auth scoping
Set Render alerts on 5xx rate, `/healthz` 503s, Postgres CPU + connections, and `/mcp` p95 latency before anything else