Pipeline simplification with bridge compatibility for LangGraph
Complex LangGraph pipelines often accrete iteration loops that rarely execute. When your workflow averages 1.2 iterations per document, the iteration machinery (state tracking, regression detection, retry logic) costs more than it delivers.
This pattern replaces iteration with single-pass extended thinking while preserving downstream phase compatibility through a bridge node.
The problem
A five-phase editing pipeline with iteration:
```
parse -> analyze -> plan -> execute -> verify -> (loop if issues)
```
State fields accumulate for iteration control:
- structure_iteration, max_structure_iterations
- needs_more_structure_work
- baseline_coherence_score
- coherence_regression_detected, coherence_regression_warning
- coherence_regression_retry_used
When iteration rarely executes (less than 30 percent of documents), you’re maintaining complexity for minimal benefit.
The catch: downstream phases (enhancement, polish) depend on the pipeline’s output format. Refactoring the structure phase means breaking consumers unless you add a bridge.
The solution
Simplify the core pipeline and add a bridge node that converts between formats:
```
V2 Structure: analyze -> router -> [parallel rewrite] -> reassemble
                  |
                  v
               bridge
                  |
                  v
V1 Enhancement: (unchanged, receives DocumentModel format)
                  |
                  v
V1 Polish: (unchanged)
```
The bridge isolates V2 changes from V1 consumers. When you’re ready to refactor Enhancement to V3, add another bridge. It enables incremental migration.
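In plain stdlib Python (no LangGraph), the isolation idea can be sketched as three chained node functions where only the bridge knows both formats. All node names, state keys, and formats below are hypothetical, invented for the sketch:

```python
# Minimal sketch of the bridge pattern: the bridge is the only place
# that understands both the V2 output format (markdown) and the V1
# input format (a DocumentModel-style dict).

def v2_structure(state: dict) -> dict:
    # V2 emits plain markdown.
    return {"final_document": "# Title\n\nRewritten body."}

def v2_to_v1_bridge(state: dict) -> dict:
    # Pure format conversion: markdown -> V1-style model dict.
    lines = state["final_document"].splitlines()
    title = lines[0].lstrip("# ") if lines else ""
    body = "\n".join(lines[1:]).strip()
    return {"updated_document_model": {"title": title, "body": body}}

def v1_enhancement(state: dict) -> dict:
    # V1 consumer is unchanged; it only ever sees the model format.
    model = state["updated_document_model"]
    return {"enhanced": model["title"].upper()}

state: dict = {}
for node in (v2_structure, v2_to_v1_bridge, v1_enhancement):
    state.update(node(state))

print(state["enhanced"])  # -> TITLE
```

Because V1 Enhancement never touches `final_document`, swapping it for a V3 consumer later only requires replacing the bridge, not the V2 phase.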
Extended thinking replaces iteration
The key insight: iteration loops often compensate for model limitations. Extended thinking addresses this directly.
```python
@traceable(run_type="chain", name="EditingV2.Analyze")
async def v2_analyze_node(state: dict) -> dict:
    """Single-pass analysis using extended thinking.

    Replaces V1's parse -> analyze -> plan -> verify cycle.
    """
    document = state["input"]["document"]
    topic = state["input"]["topic"]
    sections = parse_sections(document)

    # Extended thinking enables deep reasoning in one call
    analysis = await get_structured_output(
        output_schema=GlobalAnalysisResult,
        user_prompt=format_prompt(topic, document, sections),
        system_prompt=V2_GLOBAL_ANALYSIS_SYSTEM,
        tier=ModelTier.OPUS,
        thinking_budget=6000,  # Quality without iteration
        use_json_schema_method=True,
    )

    return {
        "sections": [s.model_dump() for s in sections],
        "edit_instructions": [i.model_dump() for i in analysis.instructions],
    }
```

Production metrics:
- V1 averaged 1.2 iterations per document
- V2 single-pass: quality improved from 0.82 to 0.85
- State fields: more than 25 reduced to 18 (seven removed)
The bridge node
A bridge should be thin (pure format conversion):
```python
@traceable(run_type="chain", name="V2ToV1Bridge")
async def v2_to_v1_bridge_node(state: dict) -> dict:
    """Convert V2 markdown output to V1 DocumentModel format.

    This bridge enables V2 to feed into existing V1 Enhancement
    and Polish phases without modification.
    """
    final_document = state.get("final_document", "")
    if not final_document:
        final_document = state.get("input", {}).get("document", "")

    # Parse markdown to V1 format
    document_model = parse_markdown_to_model(final_document)

    return {
        "updated_document_model": document_model.to_dict(),
    }
```

If your bridge is doing more than format conversion (citation extraction, state initialization, validation), that logic belongs elsewhere.
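One way to keep the bridge thin is to move those extra responsibilities into their own node that runs before the bridge. The sketch below is an assumption-laden illustration, not the article's code; the node names, state keys, and checks are all invented:

```python
# Hypothetical split: validation lives in its own node, so the bridge
# stays a pure format converter that is trivial to test and replace.

def validate_document(state: dict) -> dict:
    doc = state.get("final_document", "")
    issues = []
    if not doc.strip():
        issues.append("empty document")
    if "#" not in doc:
        issues.append("no headings")
    return {"validation_issues": issues}

def bridge(state: dict) -> dict:
    # Conversion only: no validation, no citation extraction.
    return {"updated_document_model": {"raw": state.get("final_document", "")}}

state = {"final_document": "# Intro\n\nBody"}
state.update(validate_document(state))
state.update(bridge(state))
print(state["validation_issues"])  # -> []
```

Each node now has one reason to change: validation rules evolve without touching the conversion, and a new output format only touches the bridge.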
Parallel processing preserved
Section rewriting runs concurrently using LangGraph’s Send pattern:
```python
def v2_route_to_rewriters(state: dict) -> list[Send] | str:
    """Route to parallel section rewriters or skip."""
    instructions = state.get("edit_instructions", [])
    if not instructions:
        return "reassemble"

    sends = []
    for instr_data in instructions:
        worker_state = {
            "sections": state.get("sections", []),
            "instruction": instr_data,
            "topic": state["input"]["topic"],
        }
        sends.append(Send("rewrite_section", worker_state))
    return sends
```

Each worker returns results in a list for the add reducer to accumulate:
```python
return {
    "rewritten_sections": [result.model_dump()]  # List for accumulator
}
```

State simplification
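A minimal worker matching that contract might look like the following sketch. The section and instruction keys (`index`, `section_index`, `edit`) are assumptions for illustration, not the article's actual schema:

```python
def rewrite_section(worker_state: dict) -> dict:
    """Hypothetical worker: rewrite one section per its instruction."""
    instr = worker_state["instruction"]
    by_index = {s["index"]: s for s in worker_state["sections"]}
    target = by_index[instr["section_index"]]
    rewritten = {
        "section_index": instr["section_index"],
        "text": f"{target['text']} (rewritten per: {instr['edit']})",
    }
    return {"rewritten_sections": [rewritten]}  # one-element list for the reducer

# Simulate two parallel Send payloads and the reducer's accumulation.
sections = [{"index": 0, "text": "Intro"}, {"index": 1, "text": "Body"}]
merged: list[dict] = []
for instr in ({"section_index": 0, "edit": "tighten"},
              {"section_index": 1, "edit": "expand"}):
    out = rewrite_section({"sections": sections, "instruction": instr, "topic": "t"})
    merged = merged + out["rewritten_sections"]  # what the add reducer does

print(len(merged))  # -> 2
```

Returning a one-element list rather than a bare dict is what lets concurrent workers write to the same state key without clobbering each other.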
V1 state with iteration:
```python
class EditingState(TypedDict, total=False):
    # ... more than 25 fields including:
    structure_iteration: int
    max_structure_iterations: int
    needs_more_structure_work: bool
    baseline_coherence_score: float
    coherence_regression_detected: bool
    coherence_regression_warning: str
    coherence_regression_retry_used: bool
```

V2 state simplified:
```python
class EditingState(TypedDict, total=False):
    # V2 Structure Phase
    sections: list[dict]
    edit_instructions: list[dict]
    rewritten_sections: Annotated[list[dict], add]
    final_document: str

    # Bridge output
    updated_document_model: dict  # V1 compatible

    # V1 Enhancement Phase (unchanged)
    enhance_iteration: int
    section_enhancements: Annotated[list[dict], add]

    # V1 Polish Phase (unchanged)
    polish_results: list[dict]
```

Seven iteration fields removed, four V2 fields added.
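The `Annotated[list[dict], add]` channels work because the reducer here is plain list concatenation: `operator.add` on lists. A stdlib-only illustration of why each worker returns a one-element list:

```python
from operator import add

# Each parallel worker returns a one-element list; the reducer folds
# the partial updates into the channel value in arrival order.
channel: list[dict] = []
worker_updates = [
    [{"section_index": 0, "text": "A"}],
    [{"section_index": 1, "text": "B"}],
]
for update in worker_updates:
    channel = add(channel, update)  # list concatenation

print([u["section_index"] for u in channel])  # -> [0, 1]
```

If a worker returned a bare dict instead of a list, concatenation would fail, which is why the worker's return shape matters.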
When to use this pattern
Use when:
- Pipeline has iteration loops that rarely execute (less than 30 percent of documents)
- State complexity exceeds the value of iteration
- Higher-quality model with extended thinking could replace iteration
- Downstream phases must be preserved to avoid cascading changes
Don’t use when:
- Iteration genuinely improves quality (human feedback loops)
- Downstream phases are also being refactored
- Output format conversion is lossy or expensive
- Simplified pipeline can’t match iteration quality
Trade-offs
Benefits:
- Reduced complexity: five phases with iteration to three linear phases
- Cleaner state: seven fewer iteration tracking fields
- Better debuggability: linear flow is easier to trace
- Preserved downstream: enhancement and polish phases unchanged
- Parallel processing retained: section rewriting runs concurrently
- Higher quality: extended thinking outperforms shallow iteration
Costs:
- Lost iteration capability for edge cases
- Bridge overhead (minimal; parsing is fast)
- Format coupling between bridge and downstream