Multi-Language Workflow Orchestration with Haiku Filtering in LangGraph
Research in a single language misses regional perspectives, non-English academic literature, and cultural variations. But running research workflows across 10 languages is expensive. This pattern shows how to use cheap pre-filtering to cut costs by 60% or more while capturing cross-cultural insights.
The Problem
When researching topics with global relevance:
- Coverage gaps: English-only research misses non-English expertise.
- Cost explosion: Running full research in 10 languages is expensive.
- Synthesis challenges: Combining findings from multiple languages requires structure.
- Relevance uncertainty: Not every language has meaningful content for every topic.
The Solution: Haiku Pre-Filtering with Two-Document Output
The pattern has three phases:
- Haiku relevance filtering: Use a cheap model to determine which languages have meaningful content.
- Parallel research: Run workflows only for relevant languages using Send().
- Two-document synthesis: Produce both a comparative analysis and an integrated synthesis.
```mermaid
flowchart LR
    START --> S[select_languages]
    S --> D[dispatch_relevance]
    D -->|Send × N| C[check_relevance]
    C --> F[filter_relevant]
    F --> R[dispatch_research]
    R -->|Send × M| L[research_language]
    L --> A[comparative_analysis]
    A --> SY[synthesize]
    SY --> END
```
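The filter_relevant step in the diagram reduces to a simple predicate over the Haiku verdicts. A minimal sketch, assuming the relevance records also carry the model's confidence score and using a hypothetical cutoff of 0.6:

```python
from typing import Any

# Hypothetical threshold: discard low-confidence "relevant" verdicts.
MIN_CONFIDENCE = 0.6


def filter_relevant(state: dict[str, Any]) -> dict[str, Any]:
    """Keep only languages Haiku judged worth a full research pass."""
    relevant = [
        r
        for r in state["relevance_results"]
        if r["relevant"] and r.get("confidence", 1.0) >= MIN_CONFIDENCE
    ]
    return {"relevant_languages": relevant}


results = [
    {"language": "de", "relevant": True, "confidence": 0.9},
    {"language": "fi", "relevant": False, "confidence": 0.8},
    {"language": "ja", "relevant": True, "confidence": 0.4},
]
kept = filter_relevant({"relevance_results": results})["relevant_languages"]
# Only the German entry survives: Finnish failed the relevance check,
# Japanese was relevant but below the confidence cutoff.
```

The threshold and the `confidence` field in the records are assumptions for illustration; tune the cutoff against how aggressively you want to prune.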
Implementation
Pydantic Models for Structured Output
Using with_structured_output() instead of manual JSON parsing guarantees the LLM response is parsed into a validated, typed Pydantic model:
```python
from enum import Enum

from pydantic import BaseModel, Field


class SearchDepth(str, Enum):
    # Assumed depth levels; adjust to match your research subgraph's tiers.
    QUICK = "quick"
    STANDARD = "standard"
    DEEP = "deep"


class RelevanceAssessment(BaseModel):
    """Haiku-powered relevance decision for a language."""

    has_meaningful_discussion: bool = Field(
        description="True if this language likely has valuable unique content"
    )
    confidence: float = Field(
        description="Confidence score: 1.0=certain, 0.5=uncertain",
        ge=0.0,
        le=1.0,
    )
    suggested_depth: SearchDepth = Field(
        description="Recommended search depth based on expected value"
    )
```

State with Reducers for Parallel Writes
LangGraph requires reducers for any state field written by parallel workers:
```python
from operator import add
from typing import Annotated, TypedDict


class MultiLangState(TypedDict):
    input: MultiLangInput
    # Reducer for parallel relevance checks
    relevance_results: Annotated[list[dict], add]
    # Reducer for parallel research workers
    language_results: Annotated[list[LanguageResult], add]
    # Reducer for error accumulation
    errors: Annotated[list[dict], add]
```

Haiku Relevance Filtering
Use a cheap model to filter before expensive operations:
```python
RELEVANCE_CHECK_PROMPT = """<instructions>
Determine if the given language has meaningful unique content for the topic.
Consider:
1. Is this language spoken in regions with expertise on this topic?
2. Are there likely academic/professional discussions in this language?
3. Would this language add unique perspectives not covered in English?
</instructions>
<language>{language_name}</language>
<topic>{topic}</topic>
"""
```
```python
async def check_language_relevance(state: dict) -> dict[str, Any]:
    """Cheap Haiku call deciding whether a language merits full research."""
    language = state["language"]
    prompt = RELEVANCE_CHECK_PROMPT.format(
        language_name=language, topic=state["topic"]
    )
    llm = get_llm("haiku", max_tokens=200)
    llm_with_schema = llm.with_structured_output(RelevanceAssessment)
    result = await llm_with_schema.ainvoke([{"role": "user", "content": prompt}])
    return {
        "relevance_results": [{
            "language": language,
            "relevant": result.has_meaningful_discussion,
            "depth": result.suggested_depth.value,
        }]
    }
```

Parallel Dispatch with Send()
Use LangGraph’s Send() for parallel fan-out instead of manual index iteration:
```python
from langgraph.types import Send


async def dispatch_relevance_checks(state: MultiLangState) -> list[Send]:
    """Fan-out to parallel relevance checks."""
    topic = state["input"]["topic"]
    return [
        Send("check_language_relevance", {"language": lang, "topic": topic})
        for lang in state["target_languages"]
    ]
```

This is faster and cleaner than looping with an index counter.
Two-Document Output
The pattern produces two complementary documents:
- Comparative analysis (Sonnet 1M): Identifies cross-language patterns, consensus, and differences.
- Integrated synthesis: Combines findings into a unified report.
```python
class ComparativeAnalysis(BaseModel):
    commonalities: str = Field(description="Universal themes across languages")
    differences: str = Field(description="Regional variations by theme")
    unique_contributions: dict[str, str] = Field(
        description="What each language uniquely adds"
    )
    coverage_gaps: str = Field(description="What English-only research misses")
```

Why This Approach Works
- Cost optimization: Haiku filtering costs pennies. Skipping irrelevant languages saves dollars.
- Parallel execution: Send() runs relevance checks and research concurrently.
- Structured synthesis: Two-document output captures both patterns and integration.
- Graceful degradation: Errors in one language don’t block others.
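Graceful degradation falls out of the errors reducer: a per-language worker catches its own exceptions and returns them as a state update instead of crashing the graph. A sketch, where run_research is a hypothetical stand-in for the real per-language research subgraph:

```python
import asyncio
from typing import Any


async def run_research(language: str, topic: str) -> dict[str, Any]:
    """Hypothetical stand-in for the real per-language research subgraph."""
    if language == "fi":  # simulate a provider outage for one language
        raise RuntimeError("search API timeout")
    return {"language": language, "topic": topic, "summary": "..."}


async def research_language(state: dict[str, Any]) -> dict[str, Any]:
    """Per-language worker: failures become state updates, not crashes."""
    language = state["language"]
    try:
        report = await run_research(language, state["topic"])
        return {"language_results": [report]}
    except Exception as exc:
        # The errors reducer accumulates this; sibling workers keep running.
        return {"errors": [{"language": language, "error": str(exc)}]}


ok = asyncio.run(research_language({"language": "de", "topic": "heat pumps"}))
failed = asyncio.run(research_language({"language": "fi", "topic": "heat pumps"}))
```

Downstream synthesis nodes can then report partial coverage explicitly rather than failing the whole run.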
Cost Comparison
For a 10-language topic where only four languages have meaningful content:
| Approach | Languages Researched | Estimated Cost |
|---|---|---|
| Research all | 10 | $15-30 |
| Haiku filter first | 4 | $6-12 |
The filtering step costs less than $0.10 and saves 60% or more.
Trade-offs
- Duration: Sequential checkpointing adds overhead for long runs.
- Complexity: More state management than single-language workflows.
- Filter accuracy: Haiku may occasionally miss relevant languages.