Building and hosting MCP servers: a complete guide

Why MCP matters and why your server needs a URL

The Model Context Protocol (MCP) is an open standard that defines how large language models discover and invoke external tools, read structured data, and access reusable prompt templates. MCP is a protocol pattern, not a product. It standardizes the interface between AI clients (like Claude Desktop or custom agents) and the capabilities you expose through a server.

A locally running MCP server works well during development: it communicates over stdio, piping JSON-RPC messages through stdin and stdout within a single process. But real-world use cases (shared teams, cloud-hosted AI agents, production integrations) require a remotely accessible server that communicates over HTTP. That local-to-remote boundary is what this guide addresses.

This guide teaches the concepts behind building an MCP server, illustrates them with simplified examples in both Python and Node.js, and walks through deploying to Render. It targets MCP specification 2025-11-25, the latest stable revision as of April 2026. The examples are intentionally simplified to teach core patterns. You'll adapt them to your own tools and use cases.

Core MCP concepts: primitives and transports

MCP defines three stable primitives that a server can expose to an LLM client:

  • Tools are functions the LLM can invoke. Each tool has a name and a JSON Schema describing its inputs, and it returns structured output. Tools are analogous to POST endpoints: they perform actions and produce results. Example: a convert_currency tool that accepts an amount and target currency.
  • Resources are read-only data the LLM can query for context. Resources are analogous to GET endpoints: they return information without side effects. Example: a config://app-settings resource that returns the current application configuration.
  • Prompts are reusable prompt templates the server exposes. They let the server define parameterized instructions that clients can retrieve and fill, ensuring consistent LLM interactions across consumers.

The 2025-11-25 spec also introduced an experimental Tasks primitive for long-running or asynchronous operations that exceed normal HTTP request lifetimes. If a tool's work takes longer than a few seconds, Tasks let clients poll for completion instead of holding the connection open. The primitive is still evolving, so treat it as forward-looking rather than production-default.

MCP defines two transport mechanisms:

  • stdio pipes JSON-RPC messages through stdin and stdout. It's local and process-bound: the client spawns the server as a subprocess. This is the default for local development and tools like Claude Desktop.
  • Streamable HTTP exposes the server over a network via an HTTP endpoint. The client sends JSON-RPC requests to a single URL. This is the transport required for remote hosting and the focus of this guide.

An earlier HTTP+SSE transport was deprecated in the 2025-03-26 spec and is being sunset across major providers in 2026. New servers should use Streamable HTTP exclusively.

Transport choice shapes your architecture: if your MCP server must be reachable over the internet, use Streamable HTTP.

Building an MCP server

MCP servers are lightweight by design. The protocol handles capability negotiation, message framing, and schema advertisement. Your job is to define tool schemas and connect them to meaningful logic.

The following simplified examples register a single tool, get_weather, that accepts a city string parameter and returns a simulated weather response. Both examples configure the server for streamable HTTP transport.

Python (using the standalone fastmcp package):

The standalone fastmcp package (version 3.x as of April 2026) is the actively maintained successor to the legacy mcp.server.fastmcp module bundled inside the official mcp SDK. Import from fastmcp directly to avoid version conflicts.

Node.js (using the @modelcontextprotocol/sdk package):

Both examples read the port from the PORT environment variable and fall back to 8080 for local development. This is the idiomatic pattern on Render and most other platforms: the host injects PORT, and your code adapts.

FastMCP infers the tool's input schema from the function's type-annotated signature. The Node.js SDK uses Zod for schema declaration. In both cases, the server advertises the get_weather tool (its name, description, and input schema) to any MCP client that connects. The LLM uses this advertisement to decide when and how to call the tool.

To add a resource, register a read-only handler with a URI pattern (e.g., weather://forecast/{city}) that returns data without side effects. Prompts follow a similar registration pattern with parameterized templates.

Deploying to Render

Render's native runtime support for Python and Node.js, combined with automatic builds from Git, takes you from a working server to a hosted endpoint in a few minutes.

Fastest path: one-click templates

If you want a working server before adding your own tools, start from one of Render's official MCP templates. Both include a render.yaml Blueprint, Streamable HTTP transport, a health check endpoint, an auto-generated bearer token, and an AGENTS.md file so AI coding assistants can scaffold new tools that match project conventions.

Deploy the template, fork the repo, replace the example tool with your own, and push. The templates use bearer-token auth for simple prototyping; layer on OAuth 2.1 (covered below) before you put the server in front of production clients.

Deploying manually from your own repository

Prerequisites: Push your MCP server code to a GitHub, GitLab, or Bitbucket repository. Include a dependency file: requirements.txt for Python (with the fastmcp package) or package.json for Node.js (with @modelcontextprotocol/sdk, express, and zod).
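For the Python example above, the dependency file can be as small as this (pin versions in real projects; the Node.js equivalent lists @modelcontextprotocol/sdk, express, and zod under dependencies in package.json):

```text
# requirements.txt
fastmcp
```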

Deploy as a web service:

  1. Go to the Render Dashboard and click Add new > Web Service.
  2. Connect your Git repository.
  3. Configure runtime settings:
    • Language: Python 3 or Node
    • Build command: pip install -r requirements.txt (Python) or npm install (Node.js)
    • Start command: python server.py (Python) or node server.js (Node.js)
  4. Deploy. Render builds from your repository, injects a PORT environment variable, and assigns a .onrender.com URL.

Your MCP server is now reachable at https://your-service.onrender.com/mcp. Any MCP client configured with this URL as a Streamable HTTP endpoint can discover and invoke your tools. The auto-deploy feature rebuilds on every push to your linked branch, so iteration stays fast.
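The exact configuration shape varies by client, but several MCP clients accept a JSON entry along these lines (the server key "weather" is an arbitrary label you choose):

```json
{
  "mcpServers": {
    "weather": {
      "type": "http",
      "url": "https://your-service.onrender.com/mcp"
    }
  }
}
```

Consult your client's documentation for the precise file location and field names.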

Streamable HTTP is stateless by default, which makes MCP servers a natural fit for horizontal scaling. On Render, you can enable autoscaling to handle variable client load without sticky sessions. If you need to avoid cold starts entirely, any paid instance type stays running. Only free instances spin down on idle.

If you prefer infrastructure as code, define the service in a render.yaml file at the repo root:
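A minimal Blueprint for the Python server might look like this (the service name, plan, and WEATHER_API_KEY variable are placeholders for your own values):

```yaml
services:
  - type: web
    name: mcp-server
    runtime: python
    plan: starter
    buildCommand: pip install -r requirements.txt
    startCommand: python server.py
    envVars:
      - key: WEATHER_API_KEY
        sync: false   # set the value manually in the Dashboard
```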

Security considerations for remote MCP servers

Once your MCP server is reachable over the internet, security is your responsibility. The MCP spec is opinionated about authentication for HTTP transports: ad-hoc bearer tokens are fine for internal prototypes, but they're not sufficient for production clients that expect standards-compliant discovery.

  • Authentication: MCP's HTTP transports require OAuth 2.1 with mandatory PKCE. Your server acts as an OAuth 2.1 Resource Server and must expose protected resource metadata at /.well-known/oauth-protected-resource (RFC 9728) so clients can discover your authorization server. Validate the aud claim on every token (RFC 8707) to prevent token replay and confused-deputy attacks. Never pass client-supplied tokens through to downstream APIs. Obtain separately scoped tokens instead. For stdio transports (local only), skip OAuth and use environment-based credentials. In nearly all cases, delegate authentication to a managed identity provider rather than implementing an authorization server yourself.
  • Input validation: Every tool input arrives as client-supplied data. Validate and sanitize all parameters before passing them to backend logic. MCP schema definitions provide structural validation, but you need to enforce semantic constraints (allowlists, length limits, injection prevention) in your tool handlers.
  • Least privilege for tools: Each tool your server exposes is an attack surface. Only register tools that are necessary. If a tool performs writes, mutations, or accesses sensitive systems, gate it behind additional authorization checks. An MCP server that reads weather data shouldn't also expose a tool that deletes database records.
  • Transport security: Always deploy behind HTTPS. Render provisions TLS certificates automatically for all .onrender.com domains and custom domains, so encrypted transport requires no additional configuration.
  • Secrets management: Store tokens, client secrets, and database credentials as environment variables in Render. Never commit them to source code. Use sync: false in render.yaml for values you want to set manually in the Dashboard.
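As an illustration of semantic validation inside a tool handler, a helper like the following enforces constraints that a JSON Schema `string` type alone cannot (the allowlist and length limit are hypothetical):

```python
ALLOWED_CITIES = {"london", "paris", "tokyo"}  # hypothetical allowlist
MAX_LEN = 64

def validate_city(city: str) -> str:
    """Reject inputs that pass schema validation but violate semantic rules."""
    city = city.strip()
    if not city or len(city) > MAX_LEN:
        raise ValueError("city must be between 1 and 64 characters")
    if city.lower() not in ALLOWED_CITIES:
        raise ValueError(f"unsupported city: {city!r}")
    return city
```

Call the helper at the top of the tool handler, before the input touches any backend logic.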

One common pitfall to avoid: if you implement Dynamic Client Registration (DCR), strictly validate redirect_uri patterns. Loose validation has led to one-click account takeover vulnerabilities in several shipping MCP servers.
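For reference, the protected resource metadata document served at /.well-known/oauth-protected-resource is a small JSON object defined by RFC 9728; the URLs and scope here are placeholders:

```json
{
  "resource": "https://your-service.onrender.com/mcp",
  "authorization_servers": ["https://auth.example.com"],
  "scopes_supported": ["weather:read"],
  "bearer_methods_supported": ["header"]
}
```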

From protocol to production

MCP gives LLMs a standardized contract of tools, resources, and prompts (with Tasks emerging for async work) that any compliant client can consume. The protocol is intentionally minimal: define what your server can do, describe it with schemas, secure the endpoint with OAuth 2.1, and let the transport layer handle the rest.

For a production reference, see Render's own hosted MCP server at https://mcp.render.com/mcp and the Render MCP server docs. The 2026 MCP roadmap focuses on stateless horizontal scaling, standardized .well-known discovery, and enterprise features like audit trails and SSO, all of which align well with Render's platform model. Servers you ship today should continue to fit as the spec matures.

Further reading: the MCP specification, the Python MCP SDK, the TypeScript MCP SDK, and Render's documentation.

Frequently asked questions