
Building an agent with LangChain and Claude/OpenAI

Why agents represent a paradigm shift

An agent is an LLM that can reason in a loop, decide to use tools, and synthesize results across multiple steps. This is fundamentally different from a single chat completion call, where you send a prompt and receive one response. With an agent, the model evaluates a query, determines that it needs external information, invokes tools to get that information, and then reasons over the combined results before responding.

This article teaches the pattern of building a tool-using agent with LangChain, demonstrating the architecture with both Claude and OpenAI as interchangeable LLM providers. All code is illustrative. Treat each example as a starting point—production concerns like error handling, authentication, and observability are called out but intentionally omitted for clarity.

The agent mental model

Imagine you ask an assistant to find today's weather in San Francisco and suggest what to wear. A standard LLM generates a plausible-sounding answer from training data. An agent recognizes it needs current weather data, calls a weather API tool, receives the actual temperature and conditions, then reasons over that real data to produce clothing recommendations.

This is the ReAct (Reasoning + Acting) loop:

Input → LLM Reasoning → Tool Selection → Tool Execution → LLM Synthesis → Output

The cycle may repeat multiple times. An agent answering a complex question might call three different tools across five reasoning steps before producing a final response.

Key vocabulary:

  • Agent: An LLM configured with tools and a reasoning strategy that enables multi-step problem solving
  • Tool: A Python function with a description that the LLM reads to determine when and how to invoke it
  • Agent Executor: The LangChain runtime that manages the reasoning loop, routing tool calls and collecting results

One critical distinction: the LLM does NOT execute code. It requests tool calls by generating structured output that names a tool and provides arguments. The framework executes the corresponding Python function and feeds the result back to the LLM for further reasoning.
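This division of labor can be sketched in plain Python. The shape of the tool-call object loosely follows LangChain's normalized tool-call format, but treat the field names and the dispatch table as illustrative:

```python
# The model emits a structured request; the framework executes the matching
# Python function. Nothing the model generates is executed directly.

def get_current_temperature(city: str) -> str:
    return f"It is currently 18°C in {city}."  # placeholder implementation

# The framework's registry of available tools, keyed by name.
TOOLS = {"get_current_temperature": get_current_temperature}

# What the LLM generates: a tool name plus arguments, not executable code.
tool_call = {"name": "get_current_temperature", "args": {"city": "San Francisco"}}

# What the framework does with it: look up the function and call it.
result = TOOLS[tool_call["name"]](**tool_call["args"])
print(result)  # this string is fed back to the LLM as a tool message
```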

Defining tools: the agent's capabilities

Tools are Python functions with metadata that the LLM reads to decide when to invoke them. The description string is critical, as it's the LLM's only understanding of what the tool does and what arguments it expects. A poorly described tool gets misused or ignored.

LangChain's tool definition pattern is to write a Python function, apply the @tool decorator, and let the framework extract the schema from the function signature and docstring.

The tool returns a hardcoded string. In a real implementation, this function would call an external weather API. The agent only sees the docstring and type signature. This separation is what makes the agent pattern composable: you can swap tool implementations without changing agent logic.

Assembling the agent: LangChain with Claude or OpenAI

LangChain provides a provider-agnostic abstraction layer where switching between Claude and OpenAI is a model configuration concern, not an architectural redesign. Your agent construction code, tool bindings, and invocation logic stay identical regardless of provider.

This requires the langchain-openai, langchain-anthropic, and langgraph packages. Set the appropriate API key as an environment variable: OPENAI_API_KEY or ANTHROPIC_API_KEY. Refer to the LangChain chat model integration docs for provider-specific options.

The create_react_agent function from LangGraph constructs the full reasoning loop. When you call agent.invoke(), the framework sends the user message to the LLM, the LLM decides to call get_current_temperature with "Tokyo", the framework executes the function, returns the result, and the LLM synthesizes a final response.

Exposing the agent as a web service

To make your agent accessible over HTTP, wrap it in a lightweight web framework:

For a minimal deployment, put the tool definition, model configuration, create_react_agent(...), and the Flask route in the same app.py file. That way, gunicorn app:app can import the Flask app object directly.

Production deployments require request validation, authentication middleware, timeout handling, and structured error responses.

Deploying to Render

Deploy your agent as a web service from a Git repository with Render's managed infrastructure. Create a requirements.txt:
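A minimal requirements.txt for this stack might look like the following (unpinned for brevity; pin versions in practice for reproducible builds):

```
flask
gunicorn
langchain-openai
langchain-anthropic
langgraph
```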

In your Render Dashboard, create a new Web Service connected to your Git repository. If your Flask app lives in app.py, set the Start Command to gunicorn app:app. If you use a different module name, update the command accordingly. Then configure OPENAI_API_KEY or ANTHROPIC_API_KEY as environment variables.

For long-running agent interactions, use Background Workers or Render Workflows for asynchronous processing. Agent services tend to have bursty traffic, since each request may take several seconds across multiple reasoning steps, so configure autoscaling to handle the variable load. Note that autoscaling requires a Professional Render workspace type or higher. Review Render's pricing to select an appropriate instance type.

From tutorial pattern to production system

Bridging from these patterns to production requires addressing several concerns:

  • Error handling: Tool functions must catch API failures, timeouts, and malformed inputs. Set a maximum iteration limit to prevent infinite reasoning loops.
  • Rate limiting: Both LLM provider APIs and external tool APIs have rate limits. Implement backoff strategies and request queuing. If using Render Workflows, this functionality is already built in and configurable.
  • Memory and state: These examples are stateless. Production agents typically need conversation memory and persistence layers.
  • Observability: Integrate LangSmith to inspect reasoning traces, tool call sequences, and latency breakdowns.
  • Security: Tools accessing external services represent an attack surface. Validate all inputs and scope tool permissions narrowly.
  • Cost management: Each reasoning step consumes tokens. A single query might trigger five or more LLM calls. Monitor usage and set budget thresholds.
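The backoff strategy mentioned above can be sketched as a small retry helper (the function name and parameters are illustrative):

```python
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn, retrying on exception with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the last error
            # Delay doubles each attempt; jitter spreads out retry storms.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Wrap each LLM or tool API call in with_backoff, and distinguish retryable errors (rate limits, timeouts) from permanent ones (bad credentials) in real code.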

The agent pattern — observe, think, act, observe — is the foundation. LangChain provides the abstraction layer, tools define capabilities, and Render provides your deployment infrastructure. Start with these simplified patterns, then layer in production concerns incrementally.
