Workflow Modularization with LangGraph

Monolithic workflows that combine core logic with optional enhancement phases cause three problems: the core functionality cannot be used on its own, testing requires running the full pipeline, and tight coupling makes changes risky.

This pattern restructures workflows into composable units: core workflows that complete useful work independently, wrapper workflows that add optional enhancement, and guidelines for promoting subgraphs to top-level workflows.

The Problem

A literature review workflow that embeds supervision loops has issues:

workflows/research/subgraphs/academic_lit_review/
├── graph/
│   └── phases/
│       └── supervision.py  # Tightly coupled
└── supervision/            # Cannot test independently

  • You cannot run a quick literature review without supervision overhead.
  • Testing synthesis quality requires running all five supervision loops.
  • It is unclear whether book_finding is a true subgraph or an independent workflow.

The Solution

Restructure into composable workflows:

workflows/
├── academic_lit_review/      # Core (ends at synthesis)
├── supervised_lit_review/    # Wrapper (core + supervision)
├── book_finding/             # Promoted to top-level
└── research/subgraphs/
    └── web_researcher/       # Remains as true subgraph

Core Workflow Design

Core workflows end at a useful output, not an intermediate state:

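from langgraph.graph import END, START, StateGraph
from langgraph.graph.state import CompiledStateGraph

# The *_phase_node functions are assumed to be imported from elsewhere in the package.

def create_core_workflow(state_class) -> CompiledStateGraph: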
    """Create literature review workflow that ends at synthesis.
 
    This produces a complete, useful output without requiring
    any optional enhancement phases.
    """
    builder = StateGraph(state_class)
 
    builder.add_node("discovery", discovery_phase_node)
    builder.add_node("diffusion", diffusion_phase_node)
    builder.add_node("processing", processing_phase_node)
    builder.add_node("clustering", clustering_phase_node)
    builder.add_node("synthesis", synthesis_phase_node)
 
    builder.add_edge(START, "discovery")
    builder.add_edge("discovery", "diffusion")
    builder.add_edge("diffusion", "processing")
    builder.add_edge("processing", "clustering")
    builder.add_edge("clustering", "synthesis")
    builder.add_edge("synthesis", END)  # Ends at useful output
 
    return builder.compile()

Core workflow principles:

  • Ends at a useful output, not an intermediate state.
  • Has no dependencies on optional phases.
  • Can be tested independently.
  • Exposes a clean public API.
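
One way to read the last principle is a thin async entry point that accepts plain arguments and hides the graph and its state keys. The sketch below uses only the standard library: `_StubGraph` is a hypothetical stand-in for the compiled graph (only its `ainvoke` coroutine is modeled), and the state keys are assumptions chosen to mirror the arguments used elsewhere in this document.

```python
import asyncio
from typing import Any


class _StubGraph:
    """Hypothetical stand-in for a compiled graph; only `ainvoke` is modeled."""

    async def ainvoke(self, state: dict[str, Any]) -> dict[str, Any]:
        # A real graph would run discovery -> ... -> synthesis here.
        return {**state, "final_review": f"Review of {state['topic']}"}


_graph = _StubGraph()


async def academic_lit_review(
    topic: str,
    research_questions: list[str],
    quality: str = "standard",
) -> dict[str, Any]:
    """Public entry point: callers pass plain arguments, never graph state."""
    initial_state = {
        "topic": topic,
        "research_questions": research_questions,
        "quality": quality,
    }
    return await _graph.ainvoke(initial_state)


result = asyncio.run(
    academic_lit_review("graph neural networks", ["What are the open problems?"])
)
```

Keeping state construction inside the entry point means callers do not break when internal state keys change.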

Wrapper Workflow Design

Wrapper workflows compose core workflows with optional enhancement:

async def supervised_lit_review(
    topic: str,
    research_questions: list[str],
    quality: str = "standard",
) -> dict:
    """Run literature review with full supervision loops.
 
    Composes:
    1. Core academic_lit_review workflow
    2. Supervision loops 1-5 for quality enhancement
    """
    # Run core workflow
    lit_review_result = await academic_lit_review(
        topic=topic,
        research_questions=research_questions,
        quality=quality,
    )
 
    # Skip enhancement if quick mode
    quality_settings = lit_review_result.get("quality_settings", {})
    if quality_settings.get("supervision_loops") == "none":
        return lit_review_result
 
    # Run enhancement phases
    supervised_result = await run_supervision_loops(
        final_review=lit_review_result["final_review"],
        paper_corpus=lit_review_result["paper_corpus"],
        quality_settings=quality_settings,
    )
 
    # Return superset of core result
    return {
        **lit_review_result,
        "final_review": supervised_result["final_review_v2"],
        "supervision_state": supervised_result,
    }

Wrapper workflow principles:

  • Calls core workflow and does not duplicate it.
  • Adds optional enhancement phases.
  • Can skip enhancement based on configuration.
  • Returns a superset of the core result.
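
The skip and superset principles can be checked mechanically. The following is a minimal sketch with stubbed dependencies, not the real workflows: `core_stub`, `supervision_stub`, and the `"quick"` quality value are hypothetical names introduced for illustration.

```python
import asyncio
from typing import Any


async def core_stub(quality: str) -> dict[str, Any]:
    # Hypothetical core result; the assumed "quick" mode disables supervision.
    loops = "none" if quality == "quick" else "all"
    return {
        "final_review": "draft",
        "paper_corpus": {},
        "quality_settings": {"supervision_loops": loops},
    }


async def supervision_stub(final_review: str) -> dict[str, Any]:
    # Stands in for the enhancement phase.
    return {"final_review_v2": final_review + " (revised)"}


async def wrapper(quality: str) -> dict[str, Any]:
    core = await core_stub(quality)
    if core["quality_settings"]["supervision_loops"] == "none":
        return core  # Skip path: the core result is returned unchanged
    supervised = await supervision_stub(core["final_review"])
    return {
        **core,
        "final_review": supervised["final_review_v2"],
        "supervision_state": supervised,
    }


quick = asyncio.run(wrapper("quick"))
full = asyncio.run(wrapper("standard"))

# Every key the core produces must survive in the wrapper's result.
core_keys = set(asyncio.run(core_stub("standard")))
is_superset = core_keys <= set(full)
```

Because the wrapper only adds or overrides keys, code written against the core result keeps working when a caller switches to the supervised version.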

Subgraph Promotion

When a subgraph becomes independent, promote it to top-level:

Signal                          Action
Subgraph used without parent    Promote to top-level
Optional phases added to core   Extract to wrapper
Testing requires full parent    Core workflow too coupled

Subgraph vs Top-Level

Aspect         Subgraph        Top-Level Workflow
Used by        Single parent   Multiple callers
Dependencies   On parent       Independent
Public API     Internal        Exported
Testing        With parent     Standalone

Promotion Checklist

  1. Move to top-level: research/subgraphs/X/ to workflows/X/.
  2. Update all import references across the codebase.
  3. Expose a clean public API via __init__.py.
  4. Remove any parent dependencies.

# workflows/book_finding/__init__.py
 
"""Standalone book finding workflow.
 
Public API:
    book_finding(theme, quality, language) -> BookFindingResult
"""
 
from workflows.book_finding.graph.api import book_finding
from workflows.book_finding.state import BookResult, BookFindingState
 
__all__ = [
    "book_finding",
    "BookResult",
    "BookFindingState",
]
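
Step 2 of the checklist is the tedious part, and a small script can handle most of it. The sketch below rewrites import statements in a source string. It assumes the old dotted path was `workflows.research.subgraphs.book_finding`, which is inferred from the directory layout above and not stated outright, and `rewrite_imports` is a hypothetical helper.

```python
import re

OLD = "workflows.research.subgraphs.book_finding"  # assumed old location
NEW = "workflows.book_finding"

# Match the old dotted path only as a whole module prefix, not inside a longer name.
_PATTERN = re.compile(rf"(?<![\w.]){re.escape(OLD)}(?![\w])")


def rewrite_imports(source: str) -> str:
    """Rewrite `import` / `from ... import` lines that mention the old path."""
    out = []
    for line in source.splitlines(keepends=True):
        stripped = line.lstrip()
        # Only touch import statements; strings and comments are left alone.
        if stripped.startswith(("import ", "from ")):
            line = _PATTERN.sub(NEW, line)
        out.append(line)
    return "".join(out)


sample = (
    "from workflows.research.subgraphs.book_finding.state import BookResult\n"
    "import workflows.research.subgraphs.book_finding as bf\n"
    "note = 'workflows.research.subgraphs.book_finding stays in strings'\n"
)
rewritten = rewrite_imports(sample)
```

A line-based rewrite like this misses multi-line and dynamic imports, so a search for the old path after running it is still worthwhile.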

Testability Benefits

The modular structure lets each piece be tested on its own:

Before (monolithic):

  • Testing synthesis quality requires running all supervision loops.
  • Integration tests are slow with many LLM calls.
  • You cannot isolate failures to specific phases.

After (modular):

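# Test core workflow in isolation (async tests assume a runner such as pytest-asyncio)
async def test_core_workflow():
    result = await academic_lit_review(
        topic="test",
        research_questions=["test question"],
        quality="test",
    )
    assert result["final_review"] is not None

# Test enhancement separately; mock_review and mock_corpus are assumed fixtures
async def test_single_supervision_loop():
    result = await run_supervision_loops(
        final_review=mock_review,
        paper_corpus=mock_corpus,
        quality_settings={"supervision_loops": "one"},  # Test single loop
    )
    assert result["final_review_v2"] is not None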
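
Isolation also works at the level of a single phase. A graph node is a function that takes the current state and returns the keys it updates, so it can be called directly in a test without building the graph. The sketch below is a hypothetical, LLM-free stand-in for the synthesis node: the `clusters` key and the joining logic are assumptions made for illustration.

```python
from typing import TypedDict


class ReviewState(TypedDict, total=False):
    clusters: list[list[str]]
    final_review: str


def synthesis_phase_node(state: ReviewState) -> dict:
    """Hypothetical synthesis node: returns only the keys it updates."""
    sections = [", ".join(cluster) for cluster in state["clusters"]]
    return {"final_review": "\n".join(sections)}


# Call the node as a plain function, with no graph and no other phases involved.
update = synthesis_phase_node({"clusters": [["paper A", "paper B"], ["paper C"]]})
```

A failure in a test like this points at one phase, which the monolithic pipeline could not offer.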

Trade-offs

Benefits:

  • Independent testing of the core workflow.
  • Flexible composition: use the core or the enhanced version.
  • Clear dependencies with no hidden coupling.
  • Faster iteration, since quick tests can skip enhancement.

Costs:

  • Import churn: restructuring means updating imports across many files.
  • More files to maintain.
  • Larger public API surface.