Iterative Document Supervision with LangGraph and Extended Thinking
Long-form academic documents often have uneven theoretical depth. Some sections are well grounded, with citations to foundational work, while others remain superficial. Single-pass LLM generation rarely catches these gaps, and manual review is time-consuming and inconsistent.
This pattern implements an iterative supervision loop that uses Opus with extended thinking to analyze documents for theoretical gaps, triggers targeted research expansion on identified issues, and integrates findings until quality thresholds are met.
The Core Insight
The key innovation is using extended thinking (an 8000-token reasoning budget) for gap analysis. This allows the supervisor to deeply reason about theoretical grounding before making a decision. Combined with tracking explored issues to prevent re-exploration, the pattern achieves focused, efficient document improvement.
How It Works
    graph TD
        A[START] --> B[analyze_review]
        B --> C{Decision?}
        C -->|research_needed| D[expand_topic]
        C -->|pass_through| G[finalize]
        D --> E[integrate_content]
        E --> F{Continue?}
        F -->|continue| B
        F -->|complete| G
        G --> H[END]
The supervision loop:
- Analyze: Opus examines the document for theoretical gaps, identifying one issue per iteration.
- Decide: Either flag a gap to address, or pass through (approve the document).
- Expand: If a gap is found, run targeted research to find relevant sources.
- Integrate: Merge new findings into the document with full restructuring allowed.
- Loop: Continue until pass-through or max iterations reached.
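The five steps above can be sketched as a plain Python loop, independent of LangGraph. This is an illustration only: `run_supervision` and the toy `analyze`, `expand`, and `integrate` callables are hypothetical stand-ins for the Opus analysis, research, and integration steps, not the pattern's actual nodes.

```python
from typing import Callable, Optional


def run_supervision(
    document: str,
    analyze: Callable[[str, list[str]], Optional[str]],
    expand: Callable[[str], str],
    integrate: Callable[[str, str], str],
    max_iterations: int = 3,
) -> tuple[str, list[str]]:
    """Run analyze -> expand -> integrate until approval or the iteration cap."""
    explored: list[str] = []
    for _ in range(max_iterations):
        # Analyze + Decide: a topic means "research needed", None means pass-through.
        topic = analyze(document, explored)
        if topic is None:
            break
        explored.append(topic)  # track the topic so it is not flagged again
        findings = expand(topic)  # Expand: targeted research on the one gap
        document = integrate(document, findings)  # Integrate: merge the findings
    return document, explored


# Toy stand-ins: flag each required topic once if the document lacks it.
REQUIRED = ["theory", "methods"]


def toy_analyze(doc: str, explored: list[str]) -> Optional[str]:
    for topic in REQUIRED:
        if topic not in doc and topic not in explored:
            return topic
    return None


final_doc, explored = run_supervision(
    "intro",
    analyze=toy_analyze,
    expand=lambda topic: f"[{topic}]",
    integrate=lambda doc, findings: f"{doc} {findings}",
)
print(final_doc)
print(explored)
```

The toy run flags one gap per pass and stops on the third pass, when the analyzer finds nothing left to flag.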
Implementation
Structured Decision Output
The supervisor returns a structured decision that either approves the document or identifies a specific gap:
    from pydantic import BaseModel, Field, ConfigDict
    from typing import Literal


    class IdentifiedIssue(BaseModel):
        """A theoretical gap identified by the supervisor."""

        model_config = ConfigDict(extra="forbid")

        topic: str = Field(description="The specific topic lacking depth")
        issue_type: Literal[
            "underlying_theory",
            "methodological_foundation",
            "unifying_threads",
            "foundational_concepts",
        ] = Field(description="Category of theoretical gap")
        rationale: str = Field(description="Why this gap matters")
        research_query: str = Field(description="Query to find relevant papers")
        integration_guidance: str = Field(
            description="How to integrate findings into the document"
        )


    class SupervisorDecision(BaseModel):
        """Supervisor's decision after analysis."""

        model_config = ConfigDict(extra="forbid")

        action: Literal["research_needed", "pass_through"] = Field(
            description="Whether more research is needed"
        )
        reasoning: str = Field(description="Explanation for the decision")
        issue: IdentifiedIssue | None = Field(
            default=None,
            description="Identified issue if action is research_needed",
        )

Analysis Node with Extended Thinking
The analysis node uses Opus with extended thinking for deep gap detection:
    from typing import Any

    # get_llm, ModelTier, SUPERVISOR_SYSTEM, SUPERVISOR_USER, and format_explored
    # are project helpers that are not shown here.


    async def analyze_review_node(state: dict[str, Any]) -> dict[str, Any]:
        """Analyze the document for theoretical gaps."""
        current_review = state.get("current_review", "")
        issues_explored = state.get("issues_explored", [])
        iteration = state.get("iteration", 0)

        # Use Opus with extended thinking for deep analysis
        llm = get_llm(
            tier=ModelTier.OPUS,
            thinking_budget=8000,  # 8K tokens for reasoning
            max_tokens=4096,
        )
        structured_llm = llm.with_structured_output(SupervisorDecision)

        messages = [
            {"role": "system", "content": SUPERVISOR_SYSTEM},
            {"role": "user", "content": SUPERVISOR_USER.format(
                final_review=current_review,
                issues_explored=format_explored(issues_explored),
                iteration=iteration + 1,
            )},
        ]
        decision = await structured_llm.ainvoke(messages)

        updates = {"decision": decision.model_dump(), "iteration": iteration + 1}
        if decision.action == "pass_through":
            updates["is_complete"] = True
        elif decision.issue:
            # Track to prevent re-exploration; the `add` reducer on
            # issues_explored appends this one-item list to the existing list.
            updates["issues_explored"] = [decision.issue.topic]
        return updates

State with Proper Reducers
For LangGraph workflows with accumulating state, use reducers to ensure correct list/dict merging:
    from operator import add
    from typing import Annotated, Any, TypedDict


    def merge_dicts(a: dict, b: dict) -> dict:
        """Shallow merge; keys in b override keys in a."""
        return {**a, **b}


    class SupervisionState(TypedDict, total=False):
        current_review: str
        iteration: int
        max_iterations: int
        is_complete: bool

        # Accumulating fields need reducers
        issues_explored: Annotated[list[str], add]
        supervision_expansions: Annotated[list[dict], add]

        # Dict fields that merge
        paper_corpus: Annotated[dict[str, Any], merge_dicts]

Graph Construction
    from langgraph.graph import END, START, StateGraph
    from langgraph.graph.state import CompiledStateGraph

    # The node functions and the two routing functions are defined elsewhere.


    def create_supervision_graph(state_class) -> CompiledStateGraph:
        builder = StateGraph(state_class)

        builder.add_node("analyze_review", analyze_review_node)
        builder.add_node("expand_topic", expand_topic_node)
        builder.add_node("integrate_content", integrate_content_node)
        builder.add_node("finalize", finalize_node)

        builder.add_edge(START, "analyze_review")
        builder.add_conditional_edges(
            "analyze_review",
            route_after_analysis,
            {"expand": "expand_topic", "finalize": "finalize"},
        )
        builder.add_edge("expand_topic", "integrate_content")
        builder.add_conditional_edges(
            "integrate_content",
            should_continue_supervision,
            {"continue": "analyze_review", "complete": "finalize"},
        )
        builder.add_edge("finalize", END)

        # compile() returns a runnable compiled graph, not the builder itself
        return builder.compile()

Quality Tier Integration
The pattern supports quality tiers that control iteration bounds:
| Quality Tier | Max Iterations | Use Case |
|---|---|---|
| quick | 1 | Fast feedback, minor improvements |
| standard | 2 | Balanced quality and speed |
| comprehensive | 3 | Thorough review |
| high_quality | 5 | Maximum depth |
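One way to wire the table into the workflow is a lookup that seeds `max_iterations` in the initial state. The sketch below assumes the state field names from `SupervisionState`; `MAX_ITERATIONS_BY_TIER` and `initial_state` are illustrative names, not part of the original implementation.

```python
# Iteration caps copied from the quality tier table.
MAX_ITERATIONS_BY_TIER: dict[str, int] = {
    "quick": 1,
    "standard": 2,
    "comprehensive": 3,
    "high_quality": 5,
}


def initial_state(review: str, quality: str = "standard") -> dict:
    """Build the starting state for a supervision run at the given tier."""
    try:
        max_iterations = MAX_ITERATIONS_BY_TIER[quality]
    except KeyError:
        # Fail loudly rather than silently falling back to some default cap.
        raise ValueError(f"Unknown quality tier: {quality!r}") from None
    return {
        "current_review": review,
        "iteration": 0,
        "max_iterations": max_iterations,
        "is_complete": False,
        "issues_explored": [],
    }


state = initial_state("Draft literature review...", quality="comprehensive")
print(state["max_iterations"])
```

Rejecting unknown tier names is a design choice made here for the sketch; a production version might prefer a default tier instead.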
Key Design Decisions
One issue per iteration: Rather than identifying all gaps at once, the supervisor identifies one issue per iteration. This avoids spending research effort on gaps that an earlier expansion may already have closed.
Issue tracking: Previously explored topics are passed to the supervisor prompt to prevent re-exploration of the same gaps.
Full restructuring: The integration node is allowed to restructure the document, not just append. This produces more coherent results when new content changes the document’s narrative flow.
Conservative supervision: The prompt emphasizes being conservative. The goal is quality assurance, not endless expansion.
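Issue tracking rests on two small pieces: the `add` reducer on `issues_explored`, and a helper that renders the accumulated topics for the supervisor prompt. The analysis node calls `format_explored` without showing its body, so the version below is a hypothetical stand-in, and the sample topics are made up for illustration.

```python
from operator import add


def format_explored(issues_explored: list[str]) -> str:
    """Render explored topics as a bullet list for the supervisor prompt.

    Hypothetical implementation; the original helper is not shown.
    """
    if not issues_explored:
        return "None yet."
    return "\n".join(f"- {topic}" for topic in issues_explored)


# What the `add` reducer does when a node returns a one-item list:
# the update is concatenated onto the existing state value.
state_value = ["attention mechanisms"]
node_update = ["scaling laws"]
merged = add(state_value, node_update)

prompt_block = format_explored(merged)
print(prompt_block)
```

Because the reducer concatenates, each analysis pass only needs to return the newly flagged topic rather than the full history.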
Termination Conditions
The loop terminates when:
- Pass-through: Supervisor approves current quality.
- Max iterations: Iteration limit reached (configurable by quality tier).
- Circuit breaker: Two or more consecutive failures stop the loop and finalize the current document (graceful degradation).
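The graph refers to `route_after_analysis` and `should_continue_supervision` without showing them. A minimal sketch of both, covering the three termination conditions, might look like the following. The `consecutive_failures` state field, the `MAX_CONSECUTIVE_FAILURES` constant, and the fallback cap of 3 are assumptions made for this example.

```python
from typing import Any

MAX_CONSECUTIVE_FAILURES = 2  # assumed threshold for the circuit breaker


def route_after_analysis(state: dict[str, Any]) -> str:
    """Route to research only when the supervisor flagged a concrete issue."""
    decision = state.get("decision") or {}
    if state.get("is_complete") or decision.get("action") != "research_needed":
        return "finalize"  # pass-through: supervisor approved the document
    if decision.get("issue") is None:
        return "finalize"  # nothing concrete to research, so stop here
    return "expand"


def should_continue_supervision(state: dict[str, Any]) -> str:
    """Decide after integration whether to run another analysis pass."""
    if state.get("is_complete"):
        return "complete"
    if state.get("iteration", 0) >= state.get("max_iterations", 3):
        return "complete"  # iteration cap from the quality tier
    if state.get("consecutive_failures", 0) >= MAX_CONSECUTIVE_FAILURES:
        return "complete"  # circuit breaker: degrade gracefully
    return "continue"


flagged = {
    "decision": {"action": "research_needed", "issue": {"topic": "x"}},
    "iteration": 1,
    "max_iterations": 2,
}
print(route_after_analysis(flagged), should_continue_supervision(flagged))
```

The return values match the keys of the conditional-edge mappings in the graph construction code (`expand`/`finalize` and `continue`/`complete`).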
Trade-offs
Benefits:
- Targeted improvement (only researches specific gaps).
- Quality assurance (Opus-level analysis catches subtle issues).
- Bounded iteration (quality settings control effort).
Costs:
- Multiple Opus calls (analysis and integration per iteration).
- Latency (each iteration adds research and integration time).
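To make the cost side concrete, a rough ceiling can be computed from the iteration cap. The sketch assumes exactly one Opus analysis call and one Opus integration call per iteration, no Opus calls elsewhere, and extended thinking only on the analysis call; `supervision_cost_ceiling` is an illustrative name. Real usage will usually be lower, since the thinking budget is a maximum and the loop can exit early on pass-through.

```python
THINKING_BUDGET = 8000  # reasoning tokens per analysis call, as in the analysis node


def supervision_cost_ceiling(max_iterations: int) -> dict[str, int]:
    """Upper bounds on Opus calls and thinking tokens for one supervision run."""
    if max_iterations < 0:
        raise ValueError("max_iterations must be non-negative")
    return {
        # One analysis call plus one integration call per iteration (assumed).
        "opus_calls": 2 * max_iterations,
        # Thinking is assumed to apply to the analysis call only.
        "thinking_tokens": THINKING_BUDGET * max_iterations,
    }


ceilings = {
    tier: supervision_cost_ceiling(cap)
    for tier, cap in {"quick": 1, "high_quality": 5}.items()
}
print(ceilings)
```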