# Building and hosting MCP servers: a complete guide

- Date: 2026-04-17T00:00:00.000Z
- Tags: Deployment, mcp, AI
- URL: https://render.com/articles/building-and-hosting-mcp-servers-a-complete-guide


## Why MCP matters and why your server needs a URL

The Model Context Protocol (MCP) is an open standard that defines how large language models discover and invoke external tools, read structured data, and access reusable prompt templates. MCP is a protocol pattern, not a product. It standardizes the interface between AI clients (like Claude Desktop or custom agents) and the capabilities you expose through a server.

A locally running MCP server works well during development: it communicates over stdio, exchanging JSON-RPC messages with a client on the same machine through stdin and stdout. But real-world use cases (shared teams, cloud-hosted AI agents, production integrations) require a remotely accessible server that communicates over HTTP. That local-to-remote boundary is what this guide addresses.

This guide teaches the concepts behind building an MCP server, illustrates them with simplified examples in both Python and Node.js, and walks through deploying to [Render](https://render.com). It targets MCP specification [2025-11-25](https://modelcontextprotocol.io/specification/2025-11-25/changelog), the latest stable revision as of April 2026. The examples are intentionally simplified to teach core patterns. You'll adapt them to your own tools and use cases.

## Core MCP concepts: primitives and transports

MCP defines three stable primitives that a server can expose to an LLM client:

- *Tools* are functions the LLM can invoke. Each tool has a name, a JSON Schema describing its inputs, and returns structured output. Tools are analogous to POST endpoints: they perform actions and produce results. Example: a `convert_currency` tool that accepts an amount and target currency.
- *Resources* are read-only data the LLM can query for context. Resources are analogous to GET endpoints: they return information without side effects. Example: a `config://app-settings` resource that returns the current application configuration.
- *Prompts* are reusable prompt templates the server exposes. They let the server define parameterized instructions that clients can retrieve and fill, ensuring consistent LLM interactions across consumers.
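
On the wire, these primitives are plain JSON-RPC 2.0 messages. The sketch below shows the approximate shape of a `tools/list` result and a `tools/call` request for the `convert_currency` example above. The parameter names and description are illustrative assumptions, not output from a real server.

```python
import json

# Illustrative tool definition as a server might advertise it in a
# `tools/list` result. The parameter names here are assumptions.
convert_currency_tool = {
    "name": "convert_currency",
    "description": "Convert an amount into a target currency.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "amount": {"type": "number"},
            "target_currency": {"type": "string"},
        },
        "required": ["amount", "target_currency"],
    },
}

# Response to a client's `tools/list` request (JSON-RPC id 1).
tools_list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"tools": [convert_currency_tool]},
}

# Request the client sends when the LLM decides to invoke the tool.
tools_call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "convert_currency",
        "arguments": {"amount": 100, "target_currency": "EUR"},
    },
}

print(json.dumps(tools_call_request, indent=2))
```

The `inputSchema` is ordinary JSON Schema, which is why SDKs can generate it from type hints or Zod definitions.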

The 2025-11-25 spec also introduced an experimental *Tasks* primitive for long-running or asynchronous operations that exceed normal HTTP request lifetimes. If a tool's work takes longer than a few seconds, Tasks let clients poll for completion instead of holding the connection open. The primitive is still evolving, so treat it as forward-looking rather than production-default.

MCP defines two transport mechanisms:

- *stdio* pipes JSON-RPC messages through stdin and stdout. It's local and process-bound: the client spawns the server as a subprocess. This is the default for local development and tools like Claude Desktop.
- *Streamable HTTP* exposes the server over a network via an HTTP endpoint. The client sends JSON-RPC requests to a single URL. This is the transport required for remote hosting and the focus of this guide.
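
As a rough illustration of stdio framing, the sketch below reads newline-delimited JSON-RPC messages from an input stream and writes one response per line. The `handle_message` and `serve` functions are simplified, hypothetical names, and in-memory streams stand in for real pipes. In practice an SDK handles this loop for you.

```python
import io
import json
from typing import Optional, TextIO


def handle_message(message: dict) -> Optional[dict]:
    """Return a JSON-RPC response, or None for notifications (no `id`)."""
    if "id" not in message:
        return None  # notifications never get a response
    if message.get("method") == "ping":
        return {"jsonrpc": "2.0", "id": message["id"], "result": {}}
    return {
        "jsonrpc": "2.0",
        "id": message["id"],
        "error": {"code": -32601, "message": "Method not found"},
    }


def serve(stdin: TextIO, stdout: TextIO) -> None:
    """Process one JSON-RPC message per line until the input stream closes."""
    for line in stdin:
        if not line.strip():
            continue
        response = handle_message(json.loads(line))
        if response is not None:
            # json.dumps emits no raw newlines, so each message stays on one line.
            stdout.write(json.dumps(response) + "\n")
            stdout.flush()


# Demo with in-memory streams standing in for the real stdin/stdout pipes.
fake_stdin = io.StringIO('{"jsonrpc": "2.0", "id": 1, "method": "ping"}\n')
fake_stdout = io.StringIO()
serve(fake_stdin, fake_stdout)
print(fake_stdout.getvalue(), end="")
```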

An earlier HTTP+SSE transport was deprecated in the 2025-03-26 spec and is being sunset across major providers in 2026. New servers should use Streamable HTTP exclusively.

Transport choice shapes your architecture: if your MCP server must be reachable over the internet, use Streamable HTTP.
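
To make the Streamable HTTP exchange concrete, the sketch below builds (but does not send) the kind of POST a client might issue. The endpoint URL is a placeholder and `build_mcp_request` is a hypothetical helper. A real client would first complete the `initialize` handshake, which is omitted here.

```python
import json
import urllib.request


def build_mcp_request(url: str, method: str, params: dict, request_id: int) -> urllib.request.Request:
    """Build a Streamable HTTP POST carrying one JSON-RPC request."""
    body = json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # The server may answer with plain JSON or an SSE stream,
            # so the client advertises that it accepts both.
            "Accept": "application/json, text/event-stream",
        },
    )


# Placeholder URL; substitute your own server's endpoint.
request = build_mcp_request("https://example.com/mcp", "tools/list", {}, request_id=1)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```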

## Building an MCP server

MCP servers are lightweight by design. The protocol handles capability negotiation, message framing, and schema advertisement. Your job is to define tool schemas and connect them to meaningful logic.

The following simplified examples register a single tool, `get_weather`, that accepts a `city` string parameter and returns a simulated weather response. Both examples configure the server for Streamable HTTP transport.

*Python (using the standalone `fastmcp` package):*

```python runnable
import os
from fastmcp import FastMCP

mcp = FastMCP("weather-server")

@mcp.tool
def get_weather(city: str) -> str:
    """Get current weather for a given city."""
    return f"Weather in {city}: 72°F, partly cloudy"

if __name__ == "__main__":
    port = int(os.environ.get("PORT", 8080))
    mcp.run(transport="streamable-http", host="0.0.0.0", port=port)
```

The standalone `fastmcp` package (version 3.x as of April 2026) is the actively maintained successor to the legacy `mcp.server.fastmcp` module bundled inside the official `mcp` SDK. Import from `fastmcp` directly to avoid version conflicts.

*Node.js (using the `@modelcontextprotocol/sdk` package):*

```javascript runnable
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import express from "express";
import { z } from "zod";

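// Build a fresh server per request. In stateless mode, sharing one connected
// instance across concurrent requests risks mixing up their responses.
function createServer() {
  const server = new McpServer({ name: "weather-server", version: "1.0.0" });

  server.tool(
    "get_weather",
    "Get current weather for a given city.",
    { city: z.string() },
    async ({ city }) => ({
      content: [{ type: "text", text: `Weather in ${city}: 72°F, partly cloudy` }],
    })
  );

  return server;
}

const app = express();
app.use(express.json());

app.post("/mcp", async (req, res) => {
  const server = createServer();
  // sessionIdGenerator: undefined turns off session tracking (stateless mode).
  const transport = new StreamableHTTPServerTransport({ sessionIdGenerator: undefined });

  // Clean up once the client disconnects or the response finishes.
  res.on("close", () => {
    transport.close();
    server.close();
  });

  await server.connect(transport);
  await transport.handleRequest(req, res, req.body);
});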

const port = parseInt(process.env.PORT || "8080", 10);
app.listen(port, "0.0.0.0", () => {
  console.log(`MCP server running on port ${port}`);
});
```

Both examples read the port from the `PORT` environment variable and fall back to `8080` for local development. This is the idiomatic pattern on Render and most other platforms: the host injects `PORT`, and your code adapts.

FastMCP infers the tool's input schema from the function's type hints and its description from the docstring. The Node.js SDK uses [Zod](https://zod.dev/) for schema declaration. In both cases, the server advertises the `get_weather` tool (its name, its input schema, and any description you provide) to any MCP client that connects. The LLM uses this advertisement to decide when and how to call the tool.

To add a resource, register a read-only handler with a URI pattern (e.g., `weather://forecast/{city}`) that returns data without side effects. Prompts follow a similar registration pattern with parameterized templates.
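
To illustrate how a URI pattern routes a read to a handler, here is a minimal matcher in plain Python. It is not how FastMCP or the TypeScript SDK implement resources. The names `resource`, `template_to_regex`, `read_resource`, and `forecast` are hypothetical, and the SDKs provide their own registration APIs for this.

```python
import re
from typing import Callable, Dict, Optional

# Registered resource handlers, keyed by URI template.
RESOURCES: Dict[str, Callable[..., str]] = {}


def template_to_regex(template: str) -> re.Pattern:
    """Turn `weather://forecast/{city}` into a regex with one named group per parameter."""
    parts = re.split(r"\{(\w+)\}", template)
    # Even indexes are literal text, odd indexes are parameter names.
    pattern = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/]+)"
        for i, part in enumerate(parts)
    )
    return re.compile(f"^{pattern}$")


def resource(template: str):
    """Decorator that registers a read-only handler for a URI template."""
    def register(func):
        RESOURCES[template] = func
        return func
    return register


@resource("weather://forecast/{city}")
def forecast(city: str) -> str:
    # Simulated data, like the get_weather tool above.
    return f"Forecast for {city}: sunny, high of 75°F"


def read_resource(uri: str) -> Optional[str]:
    """Find the first template matching `uri` and call its handler."""
    for template, handler in RESOURCES.items():
        match = template_to_regex(template).match(uri)
        if match:
            return handler(**match.groupdict())
    return None


print(read_resource("weather://forecast/Lisbon"))
```

The handler only reads and returns data, which is the property that distinguishes a resource from a tool.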

## Deploying to Render

Render's native runtime support for [Python](https://render.com/docs/python-version) and [Node.js](https://render.com/docs/node-version), combined with automatic builds from Git, takes you from a working server to a hosted endpoint in a few minutes.

### Fastest path: one-click templates

If you want a working server before adding your own tools, start from one of Render's official MCP templates. Both include a `render.yaml` Blueprint, Streamable HTTP transport, a health check endpoint, an auto-generated bearer token, and an `AGENTS.md` file so AI coding assistants can scaffold new tools that match project conventions.

- [MCP server template - Python](https://render.com/templates/mcp-server-python) (FastMCP)
- [MCP server template - TypeScript](https://render.com/templates/mcp-server-typescript) (official TypeScript SDK)

Deploy the template, fork the repo, replace the example tool with your own, and push. The templates use bearer-token auth for simple prototyping; layer on OAuth 2.1 (covered below) before you put the server in front of production clients.
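
For a sense of what a prototype-grade bearer check involves, here is a minimal sketch. It is not the templates' actual code: `is_authorized` is a hypothetical helper, and the token is hardcoded only for the demo. In a deployed service you would load it from an environment variable.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Check an `Authorization: Bearer <token>` header against the expected token."""
    if not authorization_header or not expected_token:
        return False  # fail closed when either side is missing
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())


# Demo value only; never hardcode a real token.
demo_token = "example-token"
print(is_authorized(f"Bearer {demo_token}", demo_token))
print(is_authorized("Bearer wrong-token", demo_token))
```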

### Deploying manually from your own repository

*Prerequisites:* Push your MCP server code to a GitHub, GitLab, or Bitbucket repository. Include a dependency file: `requirements.txt` for Python (with the `fastmcp` package) or `package.json` for Node.js (with `@modelcontextprotocol/sdk`, `express`, and `zod`). Because the Node.js example uses ES module `import` syntax, also set `"type": "module"` in `package.json`.

*Deploy as a web service:*

1. Go to the [Render Dashboard](https://dashboard.render.com/) and click *Add new > Web Service*.
2. Connect your Git repository.
3. Configure runtime settings:
   - *Language:* Python 3 or Node
   - *Build command:* `pip install -r requirements.txt` (Python) or `npm install` (Node.js)
   - *Start command:* `python server.py` (Python) or `node server.js` (Node.js)
4. Deploy. Render builds from your repository, injects a `PORT` environment variable, and assigns a `.onrender.com` URL.

Your MCP server is now reachable at `https://your-service.onrender.com/mcp`. Any MCP client configured with this URL as a Streamable HTTP endpoint can discover and invoke your tools. The [auto-deploy](https://render.com/docs/deploys#automatic-git-deploys) feature rebuilds on every push to your linked branch, so iteration stays fast.
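
A quick smoke test after deploying is to POST a request to the endpoint and inspect the reply. A Streamable HTTP server may answer with either an `application/json` body or a `text/event-stream` body, so the hypothetical `parse_mcp_response` helper below handles both. The sample bodies are made up for illustration, and the SSE handling is deliberately simplified.

```python
import json
from typing import List


def parse_mcp_response(content_type: str, body: str) -> List[dict]:
    """Return the JSON-RPC messages carried by a Streamable HTTP response body."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return [json.loads(body)]
    if media_type != "text/event-stream":
        raise ValueError(f"Unexpected content type: {content_type}")

    messages: List[dict] = []
    data_lines: List[str] = []
    # A trailing blank line is appended so the last event is always flushed.
    for line in body.splitlines() + [""]:
        if line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips one optional space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_lines:
            # A blank line ends the event; multi-line data is joined with "\n".
            payload = "\n".join(data_lines)
            if payload.strip():  # skip events that carry no data
                messages.append(json.loads(payload))
            data_lines = []
    return messages


sse_body = 'event: message\ndata: {"jsonrpc": "2.0", "id": 1, "result": {"tools": []}}\n\n'
json_body = '{"jsonrpc": "2.0", "id": 1, "result": {"tools": []}}'
print(parse_mcp_response("text/event-stream", sse_body))
print(parse_mcp_response("application/json; charset=utf-8", json_body))
```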

Streamable HTTP servers can run statelessly, because session IDs are optional in the spec, and that makes MCP servers a natural fit for horizontal scaling. Some SDKs enable sessions unless you configure otherwise, so confirm your server's mode first. With a stateless server, you can enable [autoscaling](https://render.com/docs/scaling) on Render to handle variable client load without sticky sessions. If you need to avoid cold starts entirely, any paid [instance type](https://render.com/docs/pricing#compute) stays running. Only free instances spin down on idle.

If you prefer infrastructure as code, define the service in a `render.yaml` file at the repo root:

```yaml
services:
  - type: web
    name: mcp-server
    runtime: python
    buildCommand: pip install -r requirements.txt
    startCommand: python server.py
```

## Security considerations for remote MCP servers

Once your MCP server is reachable over the internet, security is your responsibility. The MCP spec is opinionated about authentication for HTTP transports, so ad-hoc bearer tokens are fine for internal prototypes but not sufficient for production clients that expect standards-compliant discovery.

- *Authentication:* Authorization is technically optional in the MCP spec, but when an HTTP-based server implements it, the spec's authorization framework is built on [OAuth 2.1](https://modelcontextprotocol.io/specification/2025-11-25/basic/authorization) with mandatory PKCE. Your server acts as an OAuth 2.1 resource server and must expose protected resource metadata at `/.well-known/oauth-protected-resource` (RFC 9728) so clients can discover your authorization server. Validate the `aud` claim on every token (RFC 8707) to prevent token replay and confused-deputy attacks. Never pass client-supplied tokens through to downstream APIs. Obtain separately scoped tokens instead. For `stdio` transports (local only), skip OAuth and use environment-based credentials. In nearly all cases, delegate authentication to a managed identity provider rather than implementing an authorization server yourself.
- *Input validation:* Every tool input arrives as client-supplied data. Validate and sanitize all parameters before passing them to backend logic. MCP schema definitions provide structural validation, but you need to enforce semantic constraints (allowlists, length limits, injection prevention) in your tool handlers.
- *Least privilege for tools:* Each tool your server exposes is an attack surface. Only register tools that are necessary. If a tool performs writes, mutations, or accesses sensitive systems, gate it behind additional authorization checks. An MCP server that reads weather data shouldn't also expose a tool that deletes database records.
- *Transport security:* Always deploy behind HTTPS. Render provisions [TLS certificates automatically](https://render.com/docs/tls-certificates) for all `.onrender.com` domains and custom domains, so encrypted transport requires no additional configuration.
- *Secrets management:* Store tokens, client secrets, and database credentials as [environment variables in Render](https://render.com/docs/configure-environment-variables). Never commit them to source code. Use `sync: false` in `render.yaml` for values you want to set manually in the Dashboard.
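
As a sketch of the semantic checks described under input validation, the function below tightens the `city` parameter from the earlier `get_weather` example. The length limit and allowed character set are arbitrary assumptions. Choose constraints that fit your own backend.

```python
import re

MAX_CITY_LENGTH = 64  # assumed limit for illustration
# Words made of letters (including accented ones), separated by
# spaces, apostrophes, periods, or hyphens.
CITY_PATTERN = re.compile(r"[^\W\d_]+(?:[ '.\-]+[^\W\d_]+)*")


def validate_city(raw: str) -> str:
    """Return a cleaned city name or raise ValueError."""
    if not isinstance(raw, str):
        raise ValueError("city must be a string")
    city = raw.strip()
    if not city:
        raise ValueError("city must not be empty")
    if len(city) > MAX_CITY_LENGTH:
        raise ValueError(f"city must be at most {MAX_CITY_LENGTH} characters")
    if not CITY_PATTERN.fullmatch(city):
        raise ValueError("city contains unsupported characters")
    return city


print(validate_city("  São Paulo "))
try:
    validate_city("Lisbon'; DROP TABLE users;--")
except ValueError as error:
    print(f"rejected: {error}")
```

Calling a validator like this at the top of the tool handler keeps the schema responsible for structure and the handler responsible for meaning.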

One common pitfall to avoid: if you implement Dynamic Client Registration (DCR), strictly validate `redirect_uri` patterns. Loose validation has led to one-click account takeover vulnerabilities in several shipping MCP servers.
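
One conservative approach is to compare the requested URI against the registered ones by exact string match, rather than by prefix or wildcard. The sketch below uses hypothetical helper names. Its HTTPS-only rule and loopback exception reflect common OAuth guidance, but treat them as assumptions to check against your identity provider's policy.

```python
from typing import Set
from urllib.parse import urlsplit

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_acceptable_redirect_uri(uri: str) -> bool:
    """Registration-time check: absolute HTTPS URI (or HTTP on loopback), no fragment."""
    try:
        parts = urlsplit(uri)
    except ValueError:
        return False
    if parts.fragment or not parts.hostname:
        return False
    if parts.username or parts.password:
        return False  # userinfo can disguise the real host
    if parts.scheme == "https":
        return True
    return parts.scheme == "http" and parts.hostname in LOOPBACK_HOSTS


def redirect_uri_matches(requested: str, registered: Set[str]) -> bool:
    """Authorization-time check: exact string match, never prefix or wildcard."""
    return requested in registered


registered = {"https://app.example.com/oauth/callback"}
print(redirect_uri_matches("https://app.example.com/oauth/callback", registered))
print(redirect_uri_matches("https://app.example.com.evil.test/oauth/callback", registered))
```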

## From protocol to production

MCP gives LLMs a standardized contract of tools, resources, and prompts (with Tasks emerging for async work) that any compliant client can consume. The protocol is intentionally minimal: define what your server can do, describe it with schemas, secure the endpoint with OAuth 2.1, and let the transport layer handle the rest.

For a production reference, see Render's own hosted MCP server at `https://mcp.render.com/mcp` and the [Render MCP server docs](https://render.com/docs/mcp-server). The [2026 MCP roadmap](https://blog.modelcontextprotocol.io/posts/2026-mcp-roadmap/) focuses on stateless horizontal scaling, standardized `.well-known` discovery, and enterprise features like audit trails and SSO, all of which align well with Render's platform model. Servers you ship today should continue to fit as the spec matures.

Further reading: the [MCP specification](https://modelcontextprotocol.io/), the [Python MCP SDK](https://github.com/modelcontextprotocol/python-sdk), the [TypeScript MCP SDK](https://github.com/modelcontextprotocol/typescript-sdk), and [Render's documentation](https://render.com/docs).

## Frequently asked questions

###### When should I use stdio vs. Streamable HTTP?

Use stdio for local-only tools that ship with a specific client machine (single-user CLI utilities or Claude Desktop extensions installed per-user). Use Streamable HTTP whenever multiple clients connect to the same server, when the server needs to persist between client sessions, or when you want centralized updates, logging, and scaling. Anything deployed to a cloud platform is a Streamable HTTP server.

###### Is OAuth 2.1 strictly required, or can I ship bearer tokens?

Bearer tokens are fine for internal prototypes, CI integrations, and anything where you control both the server and the client. Render's [MCP server templates](https://render.com/templates/mcp-server-python) use bearer tokens for exactly this reason. But MCP clients shipped by major vendors (Claude Desktop, ChatGPT, Cursor) expect OAuth 2.1 discovery and will refuse to connect to servers that don't expose `/.well-known/oauth-protected-resource`. If your server is meant for third-party use, OAuth 2.1 is effectively required.
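
For orientation, the sketch below shows roughly what a protected resource metadata document (RFC 9728) and the matching `WWW-Authenticate` challenge might contain, plus an audience check on token claims. All URLs are placeholders, and the hypothetical `audience_matches` helper assumes signature and expiry validation already happened elsewhere, for example in your identity provider's library.

```python
from typing import Any, Dict

RESOURCE = "https://mcp.example.com"  # placeholder resource identifier
METADATA_URL = f"{RESOURCE}/.well-known/oauth-protected-resource"

# Document served at METADATA_URL; field names follow RFC 9728.
protected_resource_metadata = {
    "resource": RESOURCE,
    "authorization_servers": ["https://auth.example.com"],  # placeholder
    "bearer_methods_supported": ["header"],
}

# Sent with a 401 response so clients can locate the metadata document.
www_authenticate = f'Bearer resource_metadata="{METADATA_URL}"'


def audience_matches(claims: Dict[str, Any], expected_audience: str) -> bool:
    """Check the `aud` claim, which may be a single string or a list of strings."""
    audience = claims.get("aud")
    if isinstance(audience, str):
        return audience == expected_audience
    if isinstance(audience, list):
        return expected_audience in audience
    return False  # missing or malformed audience: reject


print(www_authenticate)
print(audience_matches({"aud": RESOURCE}, RESOURCE))
print(audience_matches({"aud": ["https://other.example.com"]}, RESOURCE))
```

The same identifier appears in both places: the `resource` value the server publishes is the audience it should insist on when tokens come back.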

###### What happens when a tool takes longer than the HTTP request timeout?

Render web services enforce a request timeout, so any long-running tool should return a job ID immediately and let the client poll for status. The experimental Tasks primitive in the 2025-11-25 spec formalizes this pattern. You can implement the same behavior today by pairing your web service with a [background worker](https://render.com/docs/background-workers) that processes the actual job and writes results back to a shared store like [Render Key Value](https://render.com/docs/key-value) or [Postgres](https://render.com/docs/postgresql-creating-connecting).
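
The job-ID pattern can be sketched in a few lines. Here an in-process dictionary and thread pool stand in for the shared store and background worker. The names `submit_job`, `get_job`, and `slow_report` are hypothetical, and a real deployment would keep job state outside the web service so any instance can answer a poll.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict

# In-memory stand-ins for a shared store and a background worker.
JOBS: Dict[str, Dict[str, Any]] = {}
EXECUTOR = ThreadPoolExecutor(max_workers=2)


def submit_job(work: Callable[[], Any]) -> str:
    """Start `work` in the background and return a job ID immediately."""
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {"status": "working", "result": None}

    def run() -> None:
        try:
            JOBS[job_id] = {"status": "completed", "result": work()}
        except Exception as error:  # record failures so pollers see them
            JOBS[job_id] = {"status": "failed", "result": str(error)}

    EXECUTOR.submit(run)
    return job_id


def get_job(job_id: str) -> Dict[str, Any]:
    """Return the current state of a job; a status-polling tool would call this."""
    return JOBS.get(job_id, {"status": "unknown", "result": None})


def slow_report() -> str:
    time.sleep(0.2)  # simulate work that would outlive a request
    return "report ready"


job_id = submit_job(slow_report)
while get_job(job_id)["status"] == "working":
    time.sleep(0.05)  # a real client would poll over MCP, with backoff
print(get_job(job_id))
EXECUTOR.shutdown()
```

In this shape, one tool would wrap `submit_job` and a second tool would wrap `get_job`, so neither call holds a connection open for the duration of the work.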

###### Can I horizontally scale an MCP server on Render?

Yes, as long as your server is stateless. Streamable HTTP makes sessions optional, so a server that doesn't issue session IDs can sit behind a load balancer as-is. Enable [autoscaling](https://render.com/docs/scaling) on your web service and Render will route traffic across instances. If you maintain session state (for example, long-lived conversation context or cached tool results), move that state into Render Key Value or Postgres so any instance can serve any client. Avoid pinning sessions to specific instances.

###### Should I run my MCP server as a private service instead of a web service?

Use a [private service](https://render.com/docs/private-services) when the MCP server is consumed only by other services inside your Render workspace (internal agents, sidecars, or tool providers for your own backend apps). Use a web service when external AI clients like Cursor or Claude Desktop connect from the public internet. A private service isn't reachable from outside your private network, which is usually a security win but makes it incompatible with hosted third-party AI clients.

###### Which language should I use to build my MCP server?

Pick the language your tool implementations are easiest to write in. Python and TypeScript have the most mature first-party SDKs, both supporting Streamable HTTP, OAuth 2.1 integration, and the full primitive set. FastMCP (Python) offers a concise decorator-based API and is a good fit for data and ML tools. The TypeScript SDK integrates cleanly with existing Node services and Express middleware. Render provides one-click templates for both ([Python](https://render.com/templates/mcp-server-python), [TypeScript](https://render.com/templates/mcp-server-typescript)).

Other languages are supported too. The MCP community maintains SDKs in Go, Rust, Java, Kotlin, C#, Ruby, and Swift. Render natively runs [Python, Node.js, Ruby, Go, Rust, and Elixir](https://render.com/docs/native-runtimes), so servers written in any of those deploy the same way. For anything else (or when you want full control of the build), ship your server as a [Docker image](https://render.com/docs/docker) and deploy it as a web service without changing your platform workflow.


