Custom Tools

The built-in tools cover filesystem operations, but real agents need domain-specific capabilities. In this module, you’ll learn how to create custom tools that extend your agent’s abilities with specialized functions tailored to your application’s needs.

Exercise 1: Simple Tool

The @tool decorator from langchain_core makes it easy to turn any Python function into a tool the agent can use. Let’s start with a basic example.

  1. Create a simple word-counting tool:

    cat > simple_tool.py << 'EOF'
    import os
    from langchain_core.tools import tool
    from deepagents import create_deep_agent
    from utils import agent_response
    
    MODEL = os.environ.get("DEEPAGENTS_MODEL", "anthropic:claude-sonnet-4-6")
    
    @tool
    def word_count(text: str) -> int:
        """Count the number of words in the given text."""
        return len(text.split())
    
    agent = create_deep_agent(
        model=MODEL,
        tools=[word_count],
    )
    
    result = agent.invoke({"messages": [("user",
        "How many words are in this sentence: "
        "'The quick brown fox jumps over the lazy dog'"
    )]})
    
    print(agent_response(result))
    EOF
    import os
    from langchain_core.tools import tool
    from deepagents import create_deep_agent
    from utils import agent_response
    
    MODEL = os.environ.get("DEEPAGENTS_MODEL", "anthropic:claude-sonnet-4-6")
    
    @tool
    def word_count(text: str) -> int:
        """Count the number of words in the given text."""
        return len(text.split())
    
    agent = create_deep_agent(
        model=MODEL,
        tools=[word_count],
    )
    
    result = agent.invoke({"messages": [("user",
        "How many words are in this sentence: "
        "'The quick brown fox jumps over the lazy dog'"
    )]})
    
    print(agent_response(result))
  2. Run it to see the agent use your custom tool:

    uv run simple_tool.py
    Sample output (your results may vary)
    There are 9 words in that sentence.

Notice how custom tools are passed via the tools=[...] parameter and are added alongside (not replacing) the built-in tools.

Your docstring IS your schema. The @tool decorator uses Pydantic under the hood to convert your function’s type hints and docstring into a JSON schema that the LLM reads to understand how to call your tool. A clear, specific docstring means better tool routing. A vague or missing docstring means the LLM will guess — and guess wrong. Write docstrings as if you’re explaining the tool to a colleague who’s never seen it before.

Exercise 2: Tool with Structured Input

For more complex tools, you can use Pydantic V2 models to define structured input schemas with rich field descriptions.

  1. Create a code analysis tool with multiple parameters:

    cat > structured_tool.py << 'EOF'
    import os
    from pydantic import BaseModel, Field
    from langchain_core.tools import tool
    from deepagents import create_deep_agent
    from utils import agent_response
    
    MODEL = os.environ.get("DEEPAGENTS_MODEL", "anthropic:claude-sonnet-4-6")
    
    class CodeAnalysisInput(BaseModel):
        code: str = Field(description="The Python code to analyze")
        check_types: bool = Field(default=True, description="Whether to check for type hints")
        check_docstrings: bool = Field(default=True, description="Whether to check for docstrings")
    
    @tool(args_schema=CodeAnalysisInput)
    def analyze_code(code: str, check_types: bool = True, check_docstrings: bool = True) -> str:
        """Analyze Python code for quality issues like missing type hints and docstrings."""
        issues = []
        if check_types and "->" not in code:  # naive heuristic: no return-type annotation
            issues.append("No type hints found")
        if check_docstrings and '"""' not in code and "'''" not in code:
            issues.append("No docstrings found")
        if not issues:
            return "Code looks good!"
        return "Issues found: " + ", ".join(issues)
    
    agent = create_deep_agent(
        model=MODEL,
        tools=[analyze_code],
    )
    
    result = agent.invoke({"messages": [("user",
        "Analyze this code: def add(a, b): return a + b"
    )]})
    
    print(agent_response(result))
    EOF
  2. Run it to see structured input in action:

    uv run structured_tool.py
    Sample output (your results may vary)
    The code has a couple of quality issues:
    
    1. No type hints found - the function parameters and return value lack type annotations
    2. No docstrings found - there's no documentation explaining what the function does
    
    Consider adding type hints like `def add(a: int, b: int) -> int:` and a docstring describing the function's purpose.

Pydantic V2 models give the LLM a rich schema with field descriptions, making it easier for the model to understand how to call your tool correctly with all the right parameters.
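To see what "rich schema" means concretely, you can dump the JSON schema Pydantic generates from the input model — this is the structure the LLM reads. A minimal sketch reusing two fields from the exercise's model:

```python
from pydantic import BaseModel, Field

class CodeAnalysisInput(BaseModel):
    code: str = Field(description="The Python code to analyze")
    check_types: bool = Field(default=True, description="Whether to check for type hints")

# Pydantic V2 emits a JSON schema; field descriptions and defaults become
# part of what the LLM sees when deciding how to fill in the arguments
schema = CodeAnalysisInput.model_json_schema()
print(schema["properties"]["code"]["description"])     # "The Python code to analyze"
print(schema["properties"]["check_types"]["default"])  # True
print(schema["required"])                              # ['code']
```

Note that fields with defaults are not listed as required, so the model knows it can omit them.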

Exercise 3: Structured Output with Pydantic

You’ve seen structured input (telling the LLM what to pass to a tool). Now let’s get structured output — forcing the agent to return data in a specific Pydantic schema. This is essential when agent responses feed into downstream systems that need predictable formats.

  1. Create a script that gets a structured code review from the agent:

    cat > structured_output.py << 'EOF'
    import os
    from langchain.chat_models import init_chat_model
    from pydantic import BaseModel, Field
    
    MODEL = os.environ.get("DEEPAGENTS_MODEL", "anthropic:claude-sonnet-4-6")
    
    
    class CodeIssue(BaseModel):
        severity: str = Field(description="CRITICAL, WARNING, or INFO")
        line: str = Field(description="Approximate location in the code")
        issue: str = Field(description="One-sentence description of the problem")
        fix: str = Field(description="One-sentence suggested fix")
    
    
    class CodeReview(BaseModel):
        summary: str = Field(description="One-sentence overall assessment")
        issues: list[CodeIssue] = Field(description="List of issues found")
        score: int = Field(description="Quality score from 1-10")
    
    
    # with_structured_output is a model-level feature: bind the schema
    # directly to the chat model, bypassing the agent's tool loop
    model = init_chat_model(MODEL)
    structured_model = model.with_structured_output(CodeReview)
    
    code_to_review = """
    def get_user(id):
        data = eval(open(f"users/{id}.json").read())
        return data
    """
    
    review = structured_model.invoke(
        f"Review this Python code and identify all issues:\n\n{code_to_review}"
    )
    
    # review is now a CodeReview object, not a string
    print(f"Summary: {review.summary}")
    print(f"Score: {review.score}/10")
    print(f"\nIssues ({len(review.issues)}):")
    for issue in review.issues:
        print(f"  [{issue.severity}] {issue.line}: {issue.issue}")
        print(f"    Fix: {issue.fix}")
    EOF
  2. Run it:

    uv run structured_output.py
    Sample output (your results may vary)
    Summary: Critical security vulnerability with multiple code quality issues
    Score: 2/10
    
    Issues (3):
      [CRITICAL] eval() call: Using eval() on file contents allows arbitrary code execution
        Fix: Replace eval() with json.load() for safe JSON parsing
      [WARNING] open() without context manager: File handle may not be closed properly
        Fix: Use 'with open(...) as f:' context manager
      [WARNING] No input validation: The id parameter is used directly in a file path
        Fix: Validate and sanitize the id parameter before constructing the path

The response is a CodeReview Pydantic object, not a string. You can access review.score, review.issues[0].severity, etc. programmatically. This is how you build agent pipelines where one agent’s output feeds into another’s input with guaranteed structure.

with_structured_output() is a LangChain model feature, not a Deep Agents feature. It works on the model directly (init_chat_model(MODEL).with_structured_output(Schema)), bypassing the agent’s tool loop. Use it when you need guaranteed structured responses — for example, a subagent that returns findings in a specific format for the orchestrator to parse.
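Because the result is a plain Pydantic object, handing it to a downstream system is straightforward. A sketch with hand-built data (no model call; schema trimmed to two fields per issue):

```python
from pydantic import BaseModel, Field

class CodeIssue(BaseModel):
    severity: str = Field(description="CRITICAL, WARNING, or INFO")
    issue: str = Field(description="One-sentence description of the problem")

class CodeReview(BaseModel):
    summary: str = Field(description="One-sentence overall assessment")
    issues: list[CodeIssue] = Field(description="List of issues found")
    score: int = Field(description="Quality score from 1-10")

review = CodeReview(
    summary="Unsafe deserialization",
    issues=[CodeIssue(severity="CRITICAL", issue="eval() on untrusted input")],
    score=2,
)

# Serialize for an API call or queue message...
payload = review.model_dump_json()

# ...and validate on the receiving side, round-tripping back to a typed object
received = CodeReview.model_validate_json(payload)
print(received.issues[0].severity)  # CRITICAL
```

The round trip through model_validate_json is what makes agent pipelines robust: malformed data fails loudly at the boundary instead of propagating.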

Exercise 4: Tool Design Patterns

Before writing more tools, consider these design patterns that make tools more effective:

Tool descriptions are the primary routing signal

The LLM decides which tool to use based primarily on the description in your docstring. Make it count.

Keep descriptions specific and action-oriented

Good: "Calculate the sum of two numbers and return the result"

Bad: "A tool for math"

When to use a tool vs. system prompt instruction

Use tools for:

  • Actions with side effects (API calls, file writes)

  • Deterministic computations

  • External data access

Use system prompts for:

  • Formatting preferences

  • Tone and style

  • General behavioral guidelines

One tool per concern

Don’t create a single "do_everything" tool. Instead, create focused tools that each do one thing well. This makes it easier for the LLM to choose the right tool and for you to maintain the code.

Exercise 5: Multiple Tools

Let’s add several tools and observe how the agent routes between them based on user questions.

  1. Create an agent with multiple specialized tools:

    cat > multiple_tools.py << 'EOF'
    import os
    from langchain_core.tools import tool
    from deepagents import create_deep_agent
    from utils import agent_response
    
    MODEL = os.environ.get("DEEPAGENTS_MODEL", "anthropic:claude-sonnet-4-6")
    
    @tool
    def word_count(text: str) -> int:
        """Count the number of words in the given text."""
        return len(text.split())
    
    @tool
    def get_timestamp() -> str:
        """Get the current date and time in ISO format."""
        from datetime import datetime
        return datetime.now().isoformat()
    
    @tool
    def calculate(expression: str) -> str:
        """Evaluate a mathematical expression safely. Only supports basic arithmetic."""
        allowed = set("0123456789+-*/.() ")
        if not all(c in allowed for c in expression):
            return "Error: only basic arithmetic is supported"
        return str(eval(expression))
    
    agent = create_deep_agent(
        model=MODEL,
        tools=[word_count, get_timestamp, calculate],
    )
    
    # Try questions that route to different tools
    questions = [
        "What time is it?",
        "How many words: 'Hello world from Python'",
        "What is 15 * 23 + 7?",
    ]
    
    for question in questions:
        print(f"\nQuestion: {question}")
        result = agent.invoke({"messages": [("user", question)]})
        print(f"Answer: {agent_response(result)}")
    EOF
  2. Run it and watch the agent select the appropriate tool for each question:

    uv run multiple_tools.py
    Sample output (your results may vary)
    Question: What time is it?
    Answer: The current date and time is 2026-03-30T14:32:15.891234.
    
    Question: How many words: 'Hello world from Python'
    Answer: There are 4 words in that text.
    
    Question: What is 15 * 23 + 7?
    Answer: The result is 352.

The agent automatically chooses the right tool based on the question and the tool descriptions you provided.
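When debugging routing, it helps to list which tools were actually called during a run. A small helper sketch — it assumes messages in the result expose a tool_calls attribute, as LangChain AI messages do:

```python
def tools_called(result: dict) -> list[str]:
    """Return tool names in the order the agent invoked them."""
    names = []
    for msg in result["messages"]:
        # Only AI messages carry tool_calls; getattr handles the rest
        for call in getattr(msg, "tool_calls", None) or []:
            names.append(call["name"])
    return names

# Usage: after result = agent.invoke(...), print(tools_called(result))
```

If a question routes to the wrong tool, the usual fix is sharpening the docstring of the tool that should have been chosen.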

Exercise 6: Interrupt on Tool Call

For tools with side effects or high costs, you can add human-in-the-loop approval using interrupts.

  1. Create an agent that pauses before executing certain tools:

    • Run

    • Code Preview

    cat > interrupt_tool.py << 'EOF'
    import os
    from langchain_core.tools import tool
    from langgraph.checkpoint.memory import MemorySaver
    from langgraph.types import Command
    from deepagents import create_deep_agent
    from utils import agent_response
    
    MODEL = os.environ.get("DEEPAGENTS_MODEL", "anthropic:claude-sonnet-4-6")
    
    @tool
    def word_count(text: str) -> int:
        """Count the number of words in the given text."""
        return len(text.split())
    
    @tool
    def calculate(expression: str) -> str:
        """Evaluate a mathematical expression safely. Only supports basic arithmetic."""
        allowed = set("0123456789+-*/.() ")
        if not all(c in allowed for c in expression):
            return "Error: only basic arithmetic is supported"
        return str(eval(expression))
    
    # Interrupts require a checkpointer to save state between pause and resume
    checkpointer = MemorySaver()
    
    agent = create_deep_agent(
        model=MODEL,
        tools=[word_count, calculate],
        interrupt_on={"calculate": True},
        checkpointer=checkpointer,
    )
    
    # The agent will pause before executing the calculate tool
    config = {"configurable": {"thread_id": "interrupt-demo"}}
    print("Sending: 'What is 100 * 50?'")
    result = agent.invoke(
        {"messages": [("user", "What is 100 * 50?")]},
        config=config,
    )
    
    # Check if we hit an interrupt
    state = agent.get_state(config)
    if state.next:
        print(f"\n*** INTERRUPT: Agent paused at {state.next} ***")
        # Show what the agent wants to do
        last_msg = state.values["messages"][-1]
        if hasattr(last_msg, 'tool_calls'):
            for tc in last_msg.tool_calls:
                print(f"    Tool: {tc['name']}")
                print(f"    Args: {tc['args']}")
    
        # Auto-approve for demo purposes
        print("\n[Auto-approving for demo...]")
        # Resume by providing approval decisions
        decisions = {"decisions": [{"type": "approve"}]}
        result = agent.invoke(Command(resume=decisions), config=config)
        print(f"\nResult: {agent_response(result)}")
    else:
        print(f"Result: {agent_response(result)}")
    EOF
  2. Run it to see the interrupt mechanism:

    uv run interrupt_tool.py
    Sample output (your results may vary)
    Sending: 'What is 100 * 50?'
    
    *** INTERRUPT: Agent paused at ('HumanInTheLoopMiddleware.after_model',) ***
        Tool: calculate
        Args: {'expression': '100 * 50'}
    
    [Auto-approving for demo...]
    
    Result: The result of 100 * 50 is 5000.

The agent pauses before executing calculate, shows you exactly what it wants to do, and waits for approval. Notice three requirements:

  • interrupt_on={"calculate": True} — tells the agent which tools need approval

  • checkpointer=MemorySaver() — saves the agent’s state when it pauses so it can resume. Without a checkpointer, there’s nowhere to store the paused state.

  • Command(resume=decisions) — resumes execution by passing the approval decision back to the agent

This pattern is critical for tools with side effects like sending emails, making purchases, or deleting files.

Beyond custom tools: MCP integration

Deep Agents can also consume MCP (Model Context Protocol) servers as tools. MCP is an open standard that lets you connect to community-built integrations — databases, APIs, file systems, and more — without writing custom tool code. If an MCP server exists for a service you want to integrate, you can plug it into your agent directly.

See Bonus: MCP — Connecting Agents to External Services for a hands-on walkthrough where you build a status checker MCP server and wire it into a Deep Agent with subagents.

Module Summary

You’ve learned how to extend agents with custom tools and structured data:

  • Basic tools using the @tool decorator — docstrings power the schema

  • Structured input with Pydantic V2 — rich field descriptions for complex tool parameters

  • Structured output with with_structured_output() — typed Pydantic responses for agent pipelines

  • Tool design patterns — action-oriented descriptions, tool vs. prompt decisions

  • Multiple tools — the agent routes between tools based on descriptions

  • Human-in-the-loop — interrupt_on with MemorySaver and Command(resume=) for approval gates

  • MCP integration — connect to external tool servers instead of writing everything in Python

Custom tools are where agents become truly useful in real applications. The combination of structured input (telling the LLM how to call tools), structured output (getting typed data back), and interrupt gates (human approval for side effects) gives you full control over how agents interact with your systems.