LangChain Agent: GraphRecursionError With PII Middleware

by SLV Team 57 views
GraphRecursionError When Adding Multiple Custom PIIMiddleware

Hey guys! I'm running into a pesky issue with LangChain, and I could really use your help. It seems like when I add multiple custom PIIMiddleware instances to my LangChain agent, I hit a GraphRecursionError. Let's dive into the details, shall we?

The Bug and My Setup

So, the main problem is that I'm getting a GraphRecursionError: Recursion limit of 25 reached without hitting a stop condition when I try to run my LangChain agent. This error pops up when I've got a bunch of PIIMiddleware instances defined. These middleware are designed to handle Personally Identifiable Information (PII) like emails, credit cards, names, and more. I'm using both built-in types (like email and credit_card) and custom detectors for things like phone numbers and addresses. It looks like the issue is related to how LangChain processes these multiple layers of middleware, possibly getting stuck in a loop.

Code Snippet

Here's a snippet of my code to give you a clearer picture:

from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.runnables import Runnable,
from langchain.agents import create_agent, AgentExecutor
from langchain.chains.base import Chain


class PIIMiddleware(BaseModel):
    """Middleware for handling PII data."""
    name: str
    strategy: str = "redact"
    detector: str | None = None
    apply_to_input: bool = True


pii_middleware = [
    # Built-in PII types
    PIIMiddleware("email", strategy="redact", apply_to_input=True),
    PIIMiddleware("credit_card", strategy="mask", apply_to_input=True),
    PIIMiddleware("url", strategy="mask", apply_to_input=True),
    PIIMiddleware("ip", strategy="redact", apply_to_input=True),
    PIIMiddleware("mac_address", strategy="redact", apply_to_input=True),

    # # Custom PII entities with detectors
    PIIMiddleware("name", strategy="redact", detector=r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", apply_to_input=True),
    PIIMiddleware("phone", strategy="mask", detector=r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b", apply_to_input=True),
    PIIMiddleware("address", strategy="redact", detector=r"\b\d+\s+[A-Za-z\s]+(?:Street|St|Avenue|Ave|Road|Rd|Boulevard|Blvd|Lane|Ln|Drive|Dr)\b", apply_to_input=True),
    PIIMiddleware("account_number", strategy="mask", detector=r"\b\d{10,12}\b", apply_to_input=True),
    PIIMiddleware("username", strategy="redact", detector=r"\b@\w+\b", apply_to_input=True),
    PIIMiddleware("password", strategy="redact", detector=r"\bpassword[:\s]+\w+\b", apply_to_input=True),
    # PIIMiddleware("age", strategy="redact", detector=r"\b\d{1,2}\s+years?\s+old\b", apply_to_input=True),

    PIIMiddleware("insurance_number", strategy="redact", detector=r"\bINS\d+\b", apply_to_input=True)

]

agent = create_agent(
    model="openai:gpt-4o",
    tools=[],
    middleware=pii_middleware,
    debug=True,
    system_prompt="You are a helpful assistant. return the exact user input without any modifications.",
    cache=True

)

# Example usage:
# agent.invoke("My email is test@example.com and my phone number is 123-456-7890.")

As you can see, I'm defining a list of PIIMiddleware instances, each specifying a PII type, a strategy (like redact or mask), and a regular expression detector for custom types. Then, I pass this list to my agent during creation. The agent is set up to use gpt-4o and has a simple system prompt. When I run the agent with an input containing PII, it's supposed to process it according to the middleware rules. But, boom, the GraphRecursionError appears.

Error Details

The full error message I'm getting is:

langgraph.errors.GraphRecursionError: Recursion limit of 25 reached without hitting a stop condition.

This tells me that something within the LangChain graph execution is getting stuck in a recursive loop. Given that the error happens when applying multiple middleware layers, my suspicion is that the way these layers interact or the way they're applied is causing this recursion.

What I've Checked

I've done a bit of digging to try and figure out what's going on:

  • Checked Similar Issues: I've searched through GitHub issues and Stack Overflow, but I haven't found a direct solution for this specific problem. There are similar issues related to recursion in LangChain, but not precisely with PIIMiddleware and this error message.
  • Reproducible Example: The code I've provided is a self-contained, minimal, reproducible example. You should be able to copy and paste it and run it as is to reproduce the error. I've made sure to include all necessary imports and configurations.
  • Updated Packages: I've confirmed that I'm using the latest stable versions of the relevant LangChain packages. Here's my system information:
System Information
------------------
> OS: Windows
> OS Version: 10.0.26100
> Python Version: 3.11.9 (tags/v3.11.9:de54cf5, Apr  2 2024, 10:12:12) [MSC v.1938 64 bit (AMD64)]

Package Information
-------------------
> langchain_core: 1.0.2
> langchain: 1.0.3
> langsmith: 0.4.38
> langchain_openai: 1.0.1
> langgraph_sdk: 0.2.9

Optional packages not installed
-------------------------------
> langserve

Other Dependencies
------------------
> claude-agent-sdk: Installed. No version info available.
> httpx: 0.28.1
> jsonpatch: 1.33
> langchain-anthropic: Installed. No version info available.
> langchain-aws: Installed. No version info available.
> langchain-community: Installed. No version info available.
> langchain-deepseek: Installed. No version info available.
> langchain-fireworks: Installed. No version info available.
> langchain-google-genai: Installed. No version info available.
> langchain-google-vertexai: Installed. No version info available.
> langchain-groq: Installed. No version info available.
> langchain-huggingface: Installed. No version info available.
> langchain-mistralai: Installed. No version info available.
> langchain-ollama: Installed. No version info available.
> langchain-perplexity: Installed. No version info available.
> langchain-together: Installed. No version info available.
> langchain-xai: Installed. No version info available.
> langgraph: 1.0.2
> langsmith-pyo3: Installed. No version info available.
> openai: 2.6.1
> openai-agents: Installed. No version info available.
> opentelemetry-api: Installed. No version info available.
> opentelemetry-exporter-otlp-proto-http: Installed. No version info available.
> opentelemetry-sdk: Installed. No version info available.
> orjson: 3.11.4
> packaging: 25.0
> pydantic: 2.12.3
> pytest: Installed. No version info available.
> pyyaml: 6.0.3
> requests: 2.32.5
> requests-toolbelt: 1.0.0
> rich: Installed. No version info available.
> tenacity: 9.1.2
> tiktoken: 0.12.0
> typing-extensions: 4.15.0
> vcrpy: Installed. No version info available.
> zstandard: 0.25.0
  • Minimal Example: I've created the smallest possible example that still reproduces the issue to make it easier for anyone to test and debug.

The Big Question

My primary question is: When using multiple PIIMiddleware layers, what is the recommended or safe recursion depth to avoid the GraphRecursionError?

Is there a configuration setting or a different way I should be structuring my middleware to prevent this error? Perhaps there's a limit to how many middleware instances can be chained together. Or maybe there's an issue with the regular expressions I'm using in the detectors that's causing the recursion. Any insights or suggestions on how to approach this would be greatly appreciated. Any help would be fantastic, guys!

Potential Areas to Investigate

I'm thinking these might be areas to explore:

  • Middleware Ordering: Does the order in which the middleware is applied matter? Could a specific ordering help avoid the recursion?
  • Regular Expression Complexity: Could the complexity of the regular expressions in the custom detectors be a factor? Are some expressions more prone to causing recursion?
  • LangChain Internals: Is there a configuration within LangChain, or possibly within the langgraph package that controls the recursion limit, and can it be adjusted?
  • Alternative Middleware Approaches: Are there alternative ways to achieve PII detection and redaction that are less likely to trigger this error?

I'm looking forward to any guidance you can provide! Thanks in advance for your help!