
Building an agent with LangChain and Claude/OpenAI

Why agents represent a paradigm shift

An agent is an LLM that can reason in a loop, decide to use tools, and synthesize results across multiple steps. This is fundamentally different from a single chat completion call, where you send a prompt and receive one response. With an agent, the model evaluates a query, determines that it needs external information, invokes tools to get that information, and then reasons over the combined results before responding.

This article teaches the pattern of building a tool-using agent with LangChain, demonstrating the architecture with both Claude and OpenAI as interchangeable LLM providers. All code is illustrative. Treat each example as a starting point—production concerns like error handling, authentication, and observability are called out but intentionally omitted for clarity.

The agent mental model

Imagine you ask an assistant to find today's weather in San Francisco and suggest what to wear. A standard LLM generates a plausible-sounding answer from training data. An agent recognizes it needs current weather data, calls a weather API tool, receives the actual temperature and conditions, then reasons over that real data to produce clothing recommendations.

This is the ReAct (Reasoning + Acting) loop:

Input → LLM Reasoning → Tool Selection → Tool Execution → LLM Synthesis → Output

The cycle may repeat multiple times. An agent answering a complex question might call three different tools across five reasoning steps before producing a final response.

Key vocabulary:

  • Agent: An LLM configured with tools and a reasoning strategy that enables multi-step problem solving
  • Tool: A Python function with a description that the LLM reads to determine when and how to invoke it
  • Agent Executor: The LangChain runtime that manages the reasoning loop, routing tool calls and collecting results

One critical distinction: the LLM does NOT execute code. It requests tool calls by generating structured output that names a tool and provides arguments. The framework executes the corresponding Python function and feeds the result back to the LLM for further reasoning.
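This division of labor can be sketched in plain Python. The shape of the tool-call object loosely follows LangChain's normalized tool-call format, but treat the field names and the dispatch table as illustrative:

```python
# The model emits a structured request; the framework executes the matching
# Python function. Nothing the model generates is executed directly.

def get_current_temperature(city: str) -> str:
    return f"It is currently 18°C in {city}."  # placeholder implementation

# The framework's registry of available tools, keyed by name.
TOOLS = {"get_current_temperature": get_current_temperature}

# What the LLM generates: a tool name plus arguments, not executable code.
tool_call = {"name": "get_current_temperature", "args": {"city": "San Francisco"}}

# What the framework does with it: look up the function and call it.
result = TOOLS[tool_call["name"]](**tool_call["args"])
print(result)  # this string is fed back to the LLM as a tool message
```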

Defining tools: the agent's capabilities

Tools are Python functions with metadata that the LLM reads to decide when to invoke them. The description string is critical, as it's the LLM's only understanding of what the tool does and what arguments it expects. A poorly described tool gets misused or ignored.

LangChain's tool definition pattern is to write a Python function, apply the @tool decorator, and let the framework extract the schema from the function signature and docstring.

The tool returns a hardcoded string. In a real implementation, this function would call an external weather API. The agent only sees the docstring and type signature. This separation is what makes the agent pattern composable: you can swap tool implementations without changing agent logic.

Assembling the agent: LangChain with Claude or OpenAI

LangChain provides a provider-agnostic abstraction layer where switching between Claude and OpenAI is a model configuration concern, not an architectural redesign. Your agent construction code, tool bindings, and invocation logic stay identical regardless of provider.

This requires the langchain-openai, langchain-anthropic, and langgraph packages. Set the appropriate API key as an environment variable: OPENAI_API_KEY or ANTHROPIC_API_KEY. Refer to the LangChain chat model integration docs for provider-specific options.

The create_react_agent function from LangGraph constructs the full reasoning loop. When you call agent.invoke(), the framework sends the user message to the LLM, the LLM decides to call get_current_temperature with "Tokyo", the framework executes the function, returns the result, and the LLM synthesizes a final response.

Exposing the agent as a web service

To make your agent accessible over HTTP, wrap it in a lightweight web framework:

For a minimal deployment, put the tool definition, model configuration, create_react_agent(...), and the Flask route in the same app.py file. That way, gunicorn app:app can import the Flask app object directly.

Production deployments require request validation, authentication middleware, timeout handling, and structured error responses.

Deploying to Render

Deploy your agent as a web service from a Git repository with Render's managed infrastructure. Create a requirements.txt:
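A minimal requirements.txt for this stack might look like the following (unpinned for brevity; pin versions in practice for reproducible builds):

```
flask
gunicorn
langchain-openai
langchain-anthropic
langgraph
```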

In your Render Dashboard, create a new Web Service connected to your Git repository. If your Flask app lives in app.py, set the Start Command to gunicorn app:app. If you use a different module name, update the command accordingly. Then configure OPENAI_API_KEY or ANTHROPIC_API_KEY as environment variables.

For long-running agent interactions, use Background Workers or Render Workflows for asynchronous processing. Agent services tend to have bursty traffic, since each request may take several seconds across multiple reasoning steps, so configure autoscaling to handle the variable load. Note that autoscaling requires a Professional Render workspace type or higher. Review Render's pricing to select an appropriate instance type.

From tutorial pattern to production system

Bridging from these patterns to production requires addressing several concerns:

  • Error handling: Tool functions must catch API failures, timeouts, and malformed inputs. Set a maximum iteration limit to prevent infinite reasoning loops.
  • Rate limiting: Both LLM provider APIs and external tool APIs have rate limits. Implement backoff strategies and request queuing. If using Render Workflows, this functionality is already built in and configurable.
  • Memory and state: These examples are stateless. Production agents typically need conversation memory and persistence layers.
  • Observability: Integrate LangSmith to inspect reasoning traces, tool call sequences, and latency breakdowns.
  • Security: Tools accessing external services represent an attack surface. Validate all inputs and scope tool permissions narrowly.
  • Cost management: Each reasoning step consumes tokens. A single query might trigger five or more LLM calls. Monitor usage and set budget thresholds.
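The backoff strategy mentioned above can be sketched as a small retry helper (the function name and parameters are illustrative):

```python
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn, retrying on exception with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the last error
            # Delay doubles each attempt; jitter spreads out retry storms.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Wrap each LLM or tool API call in with_backoff, and distinguish retryable errors (rate limits, timeouts) from permanent ones (bad credentials) in real code.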

The agent pattern — observe, think, act, observe — is the foundation. LangChain provides the abstraction layer, tools define capabilities, and Render provides your deployment infrastructure. Start with these simplified patterns, then layer in production concerns incrementally.
