Fact-check Workflow Extraction for LangGraph
When verification logic grows complex—multiple phases, parallel workers, sophisticated result aggregation—embedding it in a larger editing workflow creates problems. You can’t run verification independently, can’t skip it without code changes, and can’t test it in isolation.
This pattern extracts verification into a standalone workflow with parallel section workers, automatic result aggregation, and early termination for documents without citations.
The problem
Embedded verification creates tight coupling:
    editing/
    ├── nodes/
    │   ├── structure.py (500 lines)
    │   ├── enhance.py (400 lines)
    │   ├── verify.py (800+ lines) ← Growing complexity
    │   │   ├── screen_sections()
    │   │   ├── fact_check_worker()
    │   │   ├── reference_check_worker()
    │   │   └── aggregate_results()
    │   └── polish.py (200 lines)
    └── graph/
        └── construction.py (Complex routing embedded)
Issues:
- Can’t run verification independently
- Can’t skip verification without code changes
- Can’t test verification in isolation
- Quality settings mixed with editing settings
- Routing logic for parallel workers clutters the main graph
The solution
Extract verification into a standalone workflow:
    enhance/
    ├── editing/                 # Clean editing workflow
    ├── fact_check/              # Standalone verification
    │   ├── state.py             # Independent state schema
    │   ├── quality_presets.py   # Verification-specific presets
    │   ├── nodes/               # Specialized nodes
    │   └── graph/               # Self-contained routing
    └── __init__.py              # Three-phase orchestration
The extracted workflow uses three key patterns:
- Accumulator reducers for parallel result aggregation
- Send pattern for section-level parallel workers
- Citation gating for early termination
Accumulator reducers
Parallel workers need to aggregate results. LangGraph’s Annotated[list, add] pattern handles this automatically:
    from operator import add
    from typing import Annotated, TypedDict


    class FactCheckState(TypedDict, total=False):
        # Parallel workers accumulate results
        fact_check_results: Annotated[list[dict], add]
        reference_check_results: Annotated[list[dict], add]
        pending_edits: Annotated[list[dict], add]
        errors: Annotated[list[dict], add]

When worker one returns {"fact_check_results": [result1]} and worker two returns {"fact_check_results": [result2]}, LangGraph produces {"fact_check_results": [result1, result2]}.
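Critical: workers must return lists, because the add reducer concatenates each update onto the accumulated list and a bare item cannot be concatenated. Return {"results": [item]}, not {"results": item}.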
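To make the merge semantics concrete, the sketch below imitates the reducer step in plain Python. It is not LangGraph's implementation: `DemoState` and `merge_updates` are hypothetical names, and the only assumption is that a field annotated with a reducer is combined as `reducer(existing, update)` while unannotated fields are simply overwritten.

```python
from operator import add
from typing import Annotated, TypedDict, get_type_hints


class DemoState(TypedDict, total=False):
    fact_check_results: Annotated[list[dict], add]  # accumulated across workers
    has_citations: bool                             # plain field, last write wins


def merge_updates(state: dict, updates: list[dict]) -> dict:
    """Apply worker updates one by one, using the annotated reducer if there is one."""
    hints = get_type_hints(DemoState, include_extras=True)
    merged = dict(state)
    for update in updates:
        for key, value in update.items():
            # Annotated[...] exposes its extra arguments as __metadata__.
            metadata = getattr(hints.get(key), "__metadata__", ())
            if metadata and key in merged:
                merged[key] = metadata[0](merged[key], value)
            else:
                merged[key] = value
    return merged


merged = merge_updates(
    {"fact_check_results": []},
    [
        {"fact_check_results": [{"section_id": "intro", "verdict": "supported"}]},
        {"fact_check_results": [{"section_id": "methods", "verdict": "unclear"}]},
    ],
)
print(len(merged["fact_check_results"]))  # prints 2

# A bare dict instead of a one-item list fails at the reducer step.
try:
    merge_updates({"fact_check_results": []}, [{"fact_check_results": {"section_id": "intro"}}])
except TypeError as exc:
    print("reducer rejected non-list update:", exc)
```

The second call shows why the list wrapper matters: `add` on a list and a dict raises `TypeError` rather than appending the item.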
Parallel section workers
The Send pattern dispatches parallel workers for each section:
    from langgraph.types import Send


    def route_to_fact_check_sections(state: dict) -> list[Send] | str:
        """Route to parallel workers or skip to next phase."""
        screened_sections = state.get("screened_sections", [])
        if not screened_sections:
            return "pre_validate_citations"  # Skip if nothing to check

        # Dispatch parallel workers
        sends = []
        for section_id in screened_sections:
            worker_state = {
                "section_id": section_id,
                "section_content": get_section_content(section_id),
                "confidence_threshold": state["quality_settings"].get(
                    "verify_confidence_threshold", 0.75
                ),
            }
            sends.append(Send("fact_check_section", worker_state))
        return sends

Each Send creates an independent worker with isolated state. Workers run in parallel; their results aggregate via the add reducer.
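The worker node itself is not shown here. A minimal sketch of what `fact_check_section` might look like follows, assuming it receives only the `Send` payload built above and returns a one-item list under an accumulator key. `verify_claims` is a hypothetical stand-in for the real verification call, and the result fields are illustrative.

```python
def verify_claims(section_content: str) -> float:
    """Hypothetical stand-in for the real verifier: returns a confidence in [0, 1]."""
    if not section_content.strip():
        raise ValueError("empty section")
    # Placeholder heuristic so the sketch runs; a real worker would call a model or search tool.
    return 0.9 if "[1]" in section_content else 0.4


def fact_check_worker(state: dict) -> dict:
    """Check one section; `state` is the isolated Send payload, not the full graph state."""
    section_id = state["section_id"]
    try:
        confidence = verify_claims(state["section_content"])
    except Exception as exc:
        # Report the failure through the `errors` accumulator so it does not raise out of the node.
        return {"errors": [{"section_id": section_id, "error": str(exc)}]}

    result = {
        "section_id": section_id,
        "confidence": confidence,
        "passed": confidence >= state["confidence_threshold"],
    }
    # Always wrap the result in a list so the add reducer can concatenate it.
    return {"fact_check_results": [result]}


ok = fact_check_worker(
    {"section_id": "s1", "section_content": "Claim [1].", "confidence_threshold": 0.75}
)
failed = fact_check_worker(
    {"section_id": "s2", "section_content": "   ", "confidence_threshold": 0.75}
)
print(ok)
print(failed)
```

Because the worker reads only keys from its payload, it can be exercised with a plain dict, without building the graph.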
Citation gating
Skip expensive verification for documents without citations:
    def route_citations_or_finalize(state: dict) -> str:
        """Gate: Skip workflow if no citations detected."""
        if state.get("has_citations", False):
            return "screen_sections"
        return "finalize"  # Early exit

Wire the gate in graph construction:
    builder.add_conditional_edges(
        "detect_citations",
        route_citations_or_finalize,
        {
            "screen_sections": "screen_sections",
            "finalize": "finalize",
        },
    )

Documents without citations skip directly to finalization, saving all verification compute.
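The gate depends on an upstream node setting `has_citations`. One way to populate it is sketched below, assuming the document arrives as a plain string under a `document` key and that citations look like `[12]`, `(Smith, 2020)`, or a URL. Both the regex and the `document` key are illustrative assumptions, not the workflow's actual detector.

```python
import re

# Illustrative patterns only: numeric brackets, author-year parentheses, bare URLs.
CITATION_PATTERN = re.compile(
    r"\[\d+\]"                                      # [12]
    r"|\([A-Z][A-Za-z-]+(?: et al\.)?,? \d{4}\)"    # (Smith, 2020) or (Smith et al. 2020)
    r"|https?://\S+"                                # https://example.org/paper
)


def detect_citations_node(state: dict) -> dict:
    """Set has_citations so the router can decide whether to verify."""
    document = state.get("document", "")
    return {"has_citations": bool(CITATION_PATTERN.search(document))}


def route_citations_or_finalize(state: dict) -> str:
    """Same gate as above, repeated so this sketch runs on its own."""
    return "screen_sections" if state.get("has_citations", False) else "finalize"


for text in ("Rates rose 3% (Smith, 2020).", "No sources here."):
    state = {"document": text}
    state.update(detect_citations_node(state))
    print(route_citations_or_finalize(state))
```

A cheap detector like this keeps the early exit inexpensive; a stricter detector would trade some of that saving for fewer false negatives.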
Graph construction
The complete graph wires parallel dispatch with convergence:
    from langgraph.graph import END, START, StateGraph
    from langgraph.graph.state import CompiledStateGraph


    def build_fact_check_graph() -> CompiledStateGraph:
        builder = StateGraph(FactCheckState)

        # Nodes
        builder.add_node("detect_citations", detect_citations_node)
        builder.add_node("screen_sections", screen_sections_node)
        builder.add_node("fact_check_section", fact_check_worker)
        builder.add_node("assemble_fact_checks", assemble_node)
        builder.add_node("finalize", finalize_node)

        # Sequential start
        builder.add_edge(START, "detect_citations")

        # Citation gate
        builder.add_conditional_edges(
            "detect_citations",
            route_citations_or_finalize,
            {"screen_sections": "screen_sections", "finalize": "finalize"},
        )

        # Parallel dispatch. The list names every node the router may target.
        # Note: the router shown earlier returns "pre_validate_citations" when
        # nothing is screened; that node is not registered in this simplified
        # graph, so the skip branch must return "finalize" here.
        builder.add_conditional_edges(
            "screen_sections",
            route_to_fact_check_sections,
            ["fact_check_section", "finalize"],
        )

        # Convergence
        builder.add_edge("fact_check_section", "assemble_fact_checks")
        builder.add_edge("assemble_fact_checks", "finalize")
        builder.add_edge("finalize", END)

        # compile() returns a runnable compiled graph, not the builder itself
        return builder.compile()

Three-phase orchestration
Integrate the standalone workflow as a toggleable phase:
    async def enhance_report(
        report: str,
        topic: str,
        quality: str = "standard",
        run_editing: bool = True,
        run_fact_check: bool = True,  # Toggle
    ) -> dict:
        """Three-phase enhancement: supervision -> editing -> fact_check."""
        current_report = report

        # Phase 1: Supervision
        if quality != "test":
            result = await supervision_enhance(...)
            current_report = result["final_report"]

        # Phase 2: Editing
        if run_editing:
            result = await editing(...)
            current_report = result["final_report"]

        # Phase 3: Fact-check (standalone, toggleable)
        if run_fact_check:
            result = await fact_check(
                document=current_report,
                topic=topic,
                quality=quality,
            )
            current_report = result["final_report"]

        return {"final_report": current_report}

Quality tier configuration
Each quality tier controls verification depth:
    QUALITY_PRESETS = {
        "quick": {
            "confidence_threshold": 0.5,
            "max_tool_calls": 5,
            "perplexity_enabled": False,
        },
        "standard": {
            "confidence_threshold": 0.75,
            "max_tool_calls": 15,
            "perplexity_enabled": True,
        },
        "comprehensive": {
            "confidence_threshold": 0.85,
            "max_tool_calls": 25,
            "perplexity_enabled": True,
        },
    }

When to use this pattern
Use when:
- Verification functionality has grown to 500+ lines
- Multiple parallel workers needed for section-level processing
- Need to run verification independently of editing
- Quality tiers should control verification depth separately
- Phase should be toggleable without modifying core workflow
Don’t use when:
- Verification is simple (single pass, no parallelism)
- Tight coupling with editing is intentional
- Overhead of separate workflow isn’t justified (less than 200 lines)
Trade-offs
Benefits:
- Independent execution for testing and standalone use
- Phase toggling via run_fact_check=False
- Isolated testing without editing dependencies
- Verification-specific quality presets
- Parallel efficiency with section-level workers
- Clean aggregation via add reducers
Costs:
- Additional coordination between phases
- Potential re-parsing if editing doesn’t expose document model
- Workflow overhead for simple use cases
- Configuration in multiple packages