Rate limit, structured logs, and a real health check — Build and host a full-featured, secure MCP server on Render

The auth gate from step 5 ensures every request has a real identity. This step uses that identity to keep one bad actor from ruining the day for everyone else, and makes the service observable enough to debug problems without an SSH session.

1. Per-identity rate limiting

express-rate-limit is the standard pick. Default behavior keys by IP, which is wrong for an MCP server (every client behind the same NAT shares a limit). Override the keyGenerator to use the OAuth sub claim instead.

npm install express-rate-limit

pnpm add express-rate-limit

yarn add express-rate-limit

import rateLimit from "express-rate-limit";

export const mcpRateLimit = rateLimit({
  windowMs: 60_000,
  limit: 120,
  standardHeaders: "draft-8",
  legacyHeaders: false,
  keyGenerator: (req) => {
    const sub = (req as { auth?: { extra?: { sub?: string } } }).auth?.extra?.sub;
    return sub ?? req.ip ?? "anonymous";
  },
  message: { jsonrpc: "2.0", error: { code: -32000, message: "Rate limit exceeded" }, id: null },
});

120 requests per minute per identity is generous for interactive agent use and tight enough that a runaway loop trips before it costs you real money. Tune to your workload - the Render web services docs cover Starter-plan capacity guidance.
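To make the difference between IP keying and identity keying concrete, here is a minimal sketch of a fixed-window counter keyed the same way as the keyGenerator above. It is an illustration only, not how express-rate-limit is implemented; KeyedRequest, rateLimitKey, and FixedWindowCounter are hypothetical names, and the clock is passed in so the behavior is easy to follow.

```typescript
// Hypothetical request shape: only the fields this sketch reads.
interface KeyedRequest {
  ip?: string;
  auth?: { extra?: { sub?: string } };
}

// Same fallback chain as the keyGenerator above: identity first, then IP.
function rateLimitKey(req: KeyedRequest): string {
  return req.auth?.extra?.sub ?? req.ip ?? "anonymous";
}

// Minimal fixed-window counter, for illustration only.
class FixedWindowCounter {
  private readonly hits = new Map<string, { count: number; windowStart: number }>();
  private readonly windowMs: number;
  private readonly limit: number;

  constructor(windowMs: number, limit: number) {
    this.windowMs = windowMs;
    this.limit = limit;
  }

  // Returns true while the key is within its budget for the current window.
  allow(key: string, now: number): boolean {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // First hit for this key, or the previous window has expired.
      this.hits.set(key, { count: 1, windowStart: now });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }
}

// Two users behind one NAT address get independent budgets.
const counter = new FixedWindowCounter(60_000, 2);
const alice: KeyedRequest = { ip: "203.0.113.7", auth: { extra: { sub: "github:1" } } };
const bob: KeyedRequest = { ip: "203.0.113.7", auth: { extra: { sub: "github:2" } } };

const results = [
  counter.allow(rateLimitKey(alice), 0),
  counter.allow(rateLimitKey(alice), 1),
  counter.allow(rateLimitKey(alice), 2), // third hit in the window: rejected
  counter.allow(rateLimitKey(bob), 3),   // same IP, different sub: still allowed
];
console.log(results); // [ true, true, false, true ]
```

With IP keying, the fourth call would have been rejected too, because both users share 203.0.113.7.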

Order matters when wiring it in. Rate limiting goes after auth so the limiter knows which identity to key on:

app.post("/mcp", bearerAuth, mcpRateLimit, async (req, res) => { /* unchanged */ });

2. Structured logs with pino

console.log is fine for npm start on your laptop. It is not fine for digging through three days of production logs trying to find why one user’s tool call failed. Pino gives you JSON logs, request IDs, and a child-logger pattern that keeps context attached as a request flows through middleware.

npm install pino pino-http
npm install -D pino-pretty

pnpm add pino pino-http
pnpm add -D pino-pretty

yarn add pino pino-http
yarn add -D pino-pretty

import pino from "pino";
import pinoHttp from "pino-http";
import { randomUUID } from "node:crypto";

const isDev = process.env.NODE_ENV !== "production";

export const logger = pino({
  level: process.env.LOG_LEVEL ?? "info",
  transport: isDev ? { target: "pino-pretty", options: { colorize: true } } : undefined,
  redact: ["req.headers.authorization", "req.headers.cookie"],
});

export const httpLogger = pinoHttp({
  logger,
  genReqId: (req) => (req.headers["x-request-id"] as string) ?? randomUUID(),
  customLogLevel: (_req, res, err) => {
    if (err || res.statusCode >= 500) return "error";
    if (res.statusCode >= 400) return "warn";
    return "info";
  },
  customProps: (req) => {
    const sub = (req as { auth?: { extra?: { sub?: string } } }).auth?.extra?.sub;
    return sub ? { user: sub } : {};
  },
});

Piece	Why
`redact: ["req.headers.authorization",...]`	The OAuth Bearer token must never end up in a log line. Pino redacts before the line is serialized.
`genReqId` reading `x-request-id`	Render’s load balancer forwards a request ID header. Reusing it lets you trace a request across edge -> app -> database.
`customProps` pulling `req.auth.extra.sub`	Every log line in an authenticated request gets a `user` field. Filtering by user in the Render Logs view becomes trivial.
`customLogLevel` mapping status -> level	5xx becomes `error`, 4xx becomes `warn`. Alerting on `level >= error` then has signal.
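The level mapping and request-ID rules in the table are easy to unit test if you pull them out as pure functions. The sketch below is one way to do that; levelForStatus and resolveRequestId are hypothetical helper names, not part of pino or pino-http. Treating an empty header as missing is a choice made here, slightly stricter than the `??` in the config above.

```typescript
type LogLevel = "info" | "warn" | "error";

// Mirrors the customLogLevel rule: errors and 5xx are "error", other 4xx are "warn".
function levelForStatus(statusCode: number, err?: Error): LogLevel {
  if (err || statusCode >= 500) return "error";
  if (statusCode >= 400) return "warn";
  return "info";
}

// Reuse an inbound x-request-id when present; otherwise mint a new one.
// Node's header typings allow string[], so take the first value in that case.
function resolveRequestId(
  header: string | string[] | undefined,
  mint: () => string,
): string {
  const value = Array.isArray(header) ? header[0] : header;
  return value && value.trim() !== "" ? value : mint();
}

console.log(levelForStatus(200)); // info
console.log(levelForStatus(429)); // warn
console.log(levelForStatus(503)); // error
console.log(resolveRequestId("abc-123", () => "generated-id")); // abc-123
console.log(resolveRequestId(undefined, () => "generated-id")); // generated-id
```

In the pino-http config, genReqId could then call resolveRequestId(req.headers["x-request-id"], randomUUID), which also drops the `as string` cast.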

Mount it as the first middleware in src/app.ts so the request ID and timer are set before anything else runs:

import { httpLogger, logger } from "./logger.js";

//...createMcpExpressApp call returns `app`...
app.use(httpLogger);
// then mcpAuthRouter, callback, /mcp, etc.

Update src/server.ts to log through pino at startup instead of console.log:

import { app } from "./app.js";
import { logger } from "./logger.js";

const PORT = parseInt(process.env.PORT || "10000", 10);

app.listen(PORT, "0.0.0.0", () => {
  logger.info({ port: PORT }, "MCP server listening");
});

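// Render sends SIGTERM when it stops an instance; SIGINT covers Ctrl+C locally.
for (const signal of ["SIGINT", "SIGTERM"] as const) {
  process.on(signal, () => {
    logger.info({ signal }, "Shutting down");
    process.exit(0);
  });
}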

Restart and you’ll see logs like this in dev:

[10:42:11.103] INFO (12345): MCP server listening
    port: 10000
[10:42:18.882] INFO (12345): request completed
    req: { "id": "8c1f...", "method": "POST", "url": "/mcp" }
    res: { "statusCode": 200 }
    responseTime: 42
    user: "github:1234567"

In production those same lines are one-line JSON, which Render’s log search indexes natively.

3. A real health check

The /health route the template ships returns 200 {status: "ok"} regardless of whether the server can actually do anything. That’s worse than no health check: the probe never fails, so Render keeps a broken instance in rotation.

A useful health check:

Confirms the database is reachable (one cheap query).
Returns fast (Render’s default timeout is 30s; it should be well under 1s).
Doesn’t trigger downstream cost (no expensive joins, no external API calls).
Is unauthenticated (the load balancer doesn’t have a token).

import type { Pool } from "pg";
import type { Request, Response } from "express";

export function makeHealthHandler(pool: Pool) {
  return async (_req: Request, res: Response) => {
    try {
      await pool.query("SELECT 1");
      res.json({ status: "ok" });
    } catch (err) {
      res.status(503).json({ status: "degraded", error: (err as Error).message });
    }
  };
}
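The handler above returns quickly when Postgres answers or refuses the connection, but a stalled connection can leave the probe hanging. One way to enforce the "returns fast" requirement is to race the query against a timer. This is a sketch under assumptions: withTimeout is a hypothetical helper, and the stand-in probes below take the place of pool.query("SELECT 1").

```typescript
// Hypothetical helper: reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(
      () => reject(new Error(`health probe timed out after ${ms}ms`)),
      ms,
    );
  });
  // Whichever settles first wins; always clear the timer so it cannot
  // keep the process alive after the probe has finished.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Stand-in probes for the demo.
const fastProbe = Promise.resolve("ok");
const hungProbe = new Promise<string>(() => {}); // never settles, like a stalled connection

withTimeout(fastProbe, 1_000).then((v) => console.log("fast probe:", v));
withTimeout(hungProbe, 50).catch((err: Error) => console.log("hung probe:", err.message));
```

Inside makeHealthHandler, the probe line would become something like await withTimeout(pool.query("SELECT 1"), 1_000), and the existing catch block would turn the timeout into a 503. Note that the race only stops waiting; it does not cancel the underlying query.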

You already have a handle to the pg.Pool from step 4 - createPgStore returned { store, pool }. Swap the template’s static /health route for one that exercises the database:

Template's /health

- app.get("/health", (_req, res) => res.json({ status: "ok" }));

DB-backed /health

 
+ import { makeHealthHandler } from "./health.js";
+ 
+ app.get("/health", makeHealthHandler(pool));

Now verify locally that the new probe reflects database state. Stop Postgres, watch /health flip to 503, start it back up, watch it recover.

Take Postgres down and watch /health degrade

$ curl -s http://localhost:10000/health | jq
{
  "status": "ok"
}
$ docker compose stop postgres
[+] Stopping 1/1
  Container notes-mcp-postgres-1  Stopped
$ curl -s -w '\n%{http_code}\n' http://localhost:10000/health
{"status":"degraded","error":"Connection terminated unexpectedly"}
503
$ docker compose start postgres
[+] Running 1/1
  Container notes-mcp-postgres-1  Started
$ curl -s http://localhost:10000/health | jq
{
  "status": "ok"
}

That’s the contract Render’s zero-downtime deploys rely on - when the database is down, the instance comes out of rotation; when it’s back, the instance returns.

4. Put it together - the final middleware stack

For reference, here’s the order everything ends up in inside src/app.ts:

app.use(httpLogger);                     // request id, timing, structured logs
app.use(mcpAuthRouter({... }));         // OAuth endpoints (public)
app.get("/oauth/github/callback",...);  // GitHub redirect target (public)
app.get("/health", makeHealthHandler(pool)); // health (public)
app.post("/mcp", bearerAuth, mcpRateLimit, mcpHandler); // gated MCP transport

Public routes (/health, /.well-known/..., /authorize, /token, /register, the GitHub callback) are all reachable without a token. The only protected surface is /mcp itself, which is the right blast radius - the OAuth endpoints have to be public to bootstrap the auth flow.

flowchart TD
  req[Incoming HTTP request]
  log[httpLogger<br/>+ request ID]
  router{Route}
  health["/health: DB ping"]
  oauth[OAuth endpoints<br/>+ GitHub callback]
  mcp["/mcp"]
  auth[requireBearerAuth]
  ratelimit[mcpRateLimit per sub]
  handler[per-request McpServer<br/>+ StreamableHTTPServerTransport]

  req --> log --> router
  router --> health
  router --> oauth
  router --> mcp --> auth --> ratelimit --> handler

Why does the rate limiter key on the OAuth `sub` claim instead of the client IP?

It's faster to look up a JWT claim than to read `req.ip`
MCP clients sit behind shared NATs (corporate proxies, residential CGNAT). IP-keyed limits punish whole networks for one client's behavior; identity-keyed limits punish only the offender. (correct)
`express-rate-limit` doesn't support IP keying anymore
The MCP spec requires sub-keyed rate limiting

What you learned

Rate-limit on OAuth `sub`, not IP - NAT-shared clients would otherwise throttle each other
Mount `httpLogger` first so request IDs and timing are set before anything else
Redact `Authorization` and `Cookie` headers from logs; a token in a log line is a leak
Replace the template's static `/health` with one that exercises Postgres - a static 200 lies to Render's load balancer
The public routes are `/health` and the OAuth bootstrap surface; only `/mcp` sits behind `requireBearerAuth`