Multi-Language Workflow Orchestration with Haiku Filtering in LangGraph

Research in a single language misses regional perspectives, non-English academic literature, and cultural variations. But running research workflows across 10 languages is expensive. This pattern shows how to use cheap pre-filtering to cut costs by 60% or more while capturing cross-cultural insights.

The Problem

When researching topics with global relevance:

  • Coverage gaps: English-only research misses non-English expertise.
  • Cost explosion: Running full research in 10 languages is expensive.
  • Synthesis challenges: Combining findings from multiple languages requires structure.
  • Relevance uncertainty: Not every language has meaningful content for every topic.

The Solution: Haiku Pre-Filtering with Two-Document Output

The pattern has three phases:

  1. Haiku relevance filtering: Use a cheap model to determine which languages have meaningful content.
  2. Parallel research: Run workflows only for relevant languages using Send().
  3. Two-document synthesis: Produce both a comparative analysis and an integrated synthesis.

flowchart LR
    START --> S[select_languages]
    S --> D[dispatch_relevance]
    D --> |Send × N| C[check_relevance]
    C --> F[filter_relevant]
    F --> R[dispatch_research]
    R --> |Send × M| L[research_language]
    L --> A[comparative_analysis]
    A --> SY[synthesize]
    SY --> END
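
The three phases can be sketched without any framework; this is a library-agnostic illustration using asyncio, where `check_relevance` and `research_language` are stub assumptions standing in for the real Haiku and research calls.

```python
import asyncio

LANGUAGES = ["en", "de", "ja", "sw", "fi"]

async def check_relevance(lang: str, topic: str) -> dict:
    # Stub for the Haiku call; pretend only some languages are relevant.
    return {"language": lang, "relevant": lang in {"en", "de", "ja"}}

async def research_language(lang: str, topic: str) -> dict:
    # Stub for the expensive per-language research workflow.
    return {"language": lang, "findings": f"{topic} findings in {lang}"}

async def run_pipeline(topic: str) -> list[dict]:
    # Phase 1: cheap relevance checks, fanned out in parallel.
    checks = await asyncio.gather(*(check_relevance(l, topic) for l in LANGUAGES))
    # Phase 2: expensive research only for languages that passed the filter.
    relevant = [c["language"] for c in checks if c["relevant"]]
    results = await asyncio.gather(*(research_language(l, topic) for l in relevant))
    # Phase 3 (synthesis) would consume `results` here.
    return results

results = asyncio.run(run_pipeline("renewable energy policy"))
print([r["language"] for r in results])  # → ['en', 'de', 'ja']
```

In the real graph, LangGraph's `Send()` performs the fan-out that `asyncio.gather` performs here.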

Implementation

Pydantic Models for Structured Output

Using with_structured_output() instead of manual JSON parsing ensures reliable LLM responses:

from pydantic import BaseModel, Field

# SearchDepth is an Enum of search-depth tiers (e.g. quick/standard/deep)
# defined elsewhere in the workflow.


class RelevanceAssessment(BaseModel):
    """Haiku-powered relevance decision for a language."""

    has_meaningful_discussion: bool = Field(
        description="True if this language likely has valuable unique content"
    )
    confidence: float = Field(
        description="Confidence score: 1.0=certain, 0.5=uncertain",
        ge=0.0,
        le=1.0,
    )
    suggested_depth: SearchDepth = Field(
        description="Recommended search depth based on expected value"
    )
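
The `ge`/`le` constraints are enforced at parse time, so a malformed model response fails loudly instead of slipping through. A self-contained sketch with a simplified model and an assumed two-tier `SearchDepth` enum:

```python
from enum import Enum

from pydantic import BaseModel, Field, ValidationError

class SearchDepth(str, Enum):
    # Illustrative tiers; match these to your workflow's own enum.
    QUICK = "quick"
    DEEP = "deep"

class RelevanceAssessment(BaseModel):
    has_meaningful_discussion: bool
    confidence: float = Field(ge=0.0, le=1.0)
    suggested_depth: SearchDepth

# In-range payloads parse cleanly, with string-to-enum coercion...
ok = RelevanceAssessment(
    has_meaningful_discussion=True, confidence=0.8, suggested_depth="deep"
)
assert ok.suggested_depth is SearchDepth.DEEP

# ...while an out-of-range confidence is rejected at parse time.
try:
    RelevanceAssessment(
        has_meaningful_discussion=True, confidence=1.5, suggested_depth="deep"
    )
    rejected = False
except ValidationError:
    rejected = True
print(rejected)  # → True
```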

State with Reducers for Parallel Writes

LangGraph requires reducers for any state field written by parallel workers:

from operator import add
from typing import Annotated, TypedDict


class MultiLangState(TypedDict):
    input: MultiLangInput

    # Reducer for parallel relevance checks
    relevance_results: Annotated[list[dict], add]

    # Reducer for parallel research workers
    language_results: Annotated[list[LanguageResult], add]

    # Reducer for error accumulation
    errors: Annotated[list[dict], add]
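
The reducer is what makes concurrent writes safe: each worker returns a one-element list, and LangGraph folds it into the existing value with the annotated function. With `operator.add` on lists that is plain concatenation, which can be illustrated without the framework:

```python
from operator import add

# Each parallel worker returns a partial state update with a one-element list.
worker_updates = [
    {"relevance_results": [{"language": "de", "relevant": True}]},
    {"relevance_results": [{"language": "ja", "relevant": False}]},
]

# LangGraph applies the field's reducer to merge each update into state;
# operator.add on two lists concatenates them.
state = {"relevance_results": []}
for update in worker_updates:
    state["relevance_results"] = add(
        state["relevance_results"], update["relevance_results"]
    )

print(len(state["relevance_results"]))  # → 2
```

Without the reducer annotation, two workers writing the same key concurrently would be a conflict rather than an accumulation.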

Haiku Relevance Filtering

Use a cheap model to filter before expensive operations:

RELEVANCE_CHECK_PROMPT = """<instructions>
Determine if the given language has meaningful unique content for the topic.
 
Consider:
1. Is this language spoken in regions with expertise on this topic?
2. Are there likely academic/professional discussions in this language?
3. Would this language add unique perspectives not covered in English?
</instructions>
 
<language>{language_name}</language>
<topic>{topic}</topic>
"""
 
async def check_language_relevance(state: dict) -> dict[str, Any]:
    language = state["language"]
    prompt = RELEVANCE_CHECK_PROMPT.format(
        language_name=language, topic=state["topic"]
    )

    llm = get_llm("haiku", max_tokens=200)
    llm_with_schema = llm.with_structured_output(RelevanceAssessment)

    result = await llm_with_schema.ainvoke([{"role": "user", "content": prompt}])

    return {
        "relevance_results": [{
            "language": language,
            "relevant": result.has_meaningful_discussion,
            "depth": result.suggested_depth.value,
        }]
    }
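
Downstream, a `filter_relevant` node only has to scan the accumulated results. A plausible pure-Python sketch, using the field names from the relevance check above (the `relevant_languages` output key is a hypothetical name):

```python
def filter_relevant(state: dict) -> dict:
    """Keep only languages Haiku judged relevant, carrying the suggested depth."""
    selected = [
        {"language": r["language"], "depth": r["depth"]}
        for r in state["relevance_results"]
        if r["relevant"]
    ]
    return {"relevant_languages": selected}

state = {"relevance_results": [
    {"language": "de", "relevant": True, "depth": "deep"},
    {"language": "sw", "relevant": False, "depth": "quick"},
    {"language": "ja", "relevant": True, "depth": "standard"},
]}
print([s["language"] for s in filter_relevant(state)["relevant_languages"]])
# → ['de', 'ja']
```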

Parallel Dispatch with Send()

Use LangGraph’s Send() for parallel fan-out instead of manual index iteration:

async def dispatch_relevance_checks(state: MultiLangState) -> list[Send]:
    """Fan out to parallel relevance checks."""
    topic = state["input"]["topic"]
    return [
        Send("check_language_relevance", {"language": lang, "topic": topic})
        for lang in state["input"]["target_languages"]
    ]

This is faster and cleaner than looping with an index counter.

Two-Document Output

The pattern produces two complementary documents:

  1. Comparative analysis (Sonnet 1M): Identifies cross-language patterns, consensus, and differences.
  2. Integrated synthesis: Combines findings into a unified report.

class ComparativeAnalysis(BaseModel):
    commonalities: str = Field(description="Universal themes across languages")
    differences: str = Field(description="Regional variations by theme")
    unique_contributions: dict[str, str] = Field(
        description="What each language uniquely adds"
    )
    coverage_gaps: str = Field(description="What English-only research misses")
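
Before the comparative step runs, the per-language findings have to be laid out side by side in a single prompt. A minimal stdlib sketch of that assembly; the `build_comparative_input` helper and the tag layout are illustrative assumptions, not the pattern's fixed format:

```python
def build_comparative_input(language_results: list[dict]) -> str:
    """Format per-language findings as tagged sections for the comparison prompt."""
    sections = [
        f'<findings language="{r["language"]}">\n{r["summary"]}\n</findings>'
        for r in language_results
    ]
    return "\n\n".join(sections)

body = build_comparative_input([
    {"language": "de", "summary": "Strong focus on regulation."},
    {"language": "ja", "summary": "Emphasis on industry standards."},
])
print(body.count("<findings"))  # → 2
```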

Why This Approach Works

  1. Cost optimization: Haiku filtering costs pennies. Skipping irrelevant languages saves dollars.
  2. Parallel execution: Send() runs relevance checks and research concurrently.
  3. Structured synthesis: Two-document output captures both patterns and integration.
  4. Graceful degradation: Errors in one language don’t block others.
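
One way to get that isolation, shown here without LangGraph as a stdlib sketch: wrap each worker so a per-language failure becomes an error record instead of an exception that cancels its siblings.

```python
import asyncio

async def research_language(lang: str) -> dict:
    # Stub worker; "xx" simulates a language whose research fails.
    if lang == "xx":
        raise RuntimeError(f"no sources found for {lang}")
    return {"language": lang, "ok": True}

async def safe_research(lang: str) -> dict:
    # Convert a failure into data so the other languages still complete.
    try:
        return await research_language(lang)
    except Exception as exc:
        return {"language": lang, "ok": False, "error": str(exc)}

async def main() -> list[dict]:
    return await asyncio.gather(*(safe_research(l) for l in ["de", "xx", "ja"]))

results = asyncio.run(main())
print([r["ok"] for r in results])  # → [True, False, True]
```

In the graph itself, the failed record would flow into the `errors` field via its reducer.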

Cost Comparison

For a 10-language topic where only four languages have meaningful content:

| Approach | Languages Researched | Estimated Cost |
| --- | --- | --- |
| Research all | 10 | $15-30 |
| Haiku filter first | 4 | $6-12 |

The filtering step costs less than $0.10 and saves 60% or more.
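
The savings arithmetic as a tiny function, using the table's figures; the ~$2.25 per-language cost is an assumption (the $15-30 row's midpoint spread over 10 languages), and the $0.10 filter cost is from the text:

```python
def estimated_savings(total_langs: int, relevant_langs: int,
                      cost_per_lang: float, filter_cost: float = 0.10):
    """Compare researching every language vs. filtering first."""
    research_all = total_langs * cost_per_lang
    filtered = filter_cost + relevant_langs * cost_per_lang
    return research_all, filtered, 1 - filtered / research_all

all_cost, filtered_cost, saved = estimated_savings(10, 4, 2.25)
print(round(saved, 2))  # → 0.6
```

The savings fraction grows with the number of languages filtered out, so the pattern pays off most on broad topic sweeps.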

Trade-offs

  • Duration: Sequential checkpointing adds overhead for long runs.
  • Complexity: More state management than single-language workflows.
  • Filter accuracy: Haiku may occasionally miss relevant languages.