Distinctiveness Enforcement in Parallel Content Generation with LangGraph

When generating a series of related articles in parallel, how do you prevent them from covering the same ground? The naive approach, generating sequentially and checking for overlap, defeats the purpose of parallelization. This pattern enforces distinctiveness proactively by injecting “must_avoid” lists at routing time.

The Problem With Parallel Content Generation

Parallel article generation offers significant speed benefits but creates a coordination challenge: each writer operates independently without visibility into what siblings are producing. Without intervention, you’ll often get three articles that hit the same compelling points from your source material.

Sequential generation solves this—each writer can see what came before—but it sacrifices the speed gains. Post-hoc deduplication requires expensive rewrites. We need a third option: proactive prevention at fan-out time.

The Must_Avoid Pattern

The key insight is that distinctiveness can be enforced at routing time, before parallel writers begin. Since a planning phase has already allocated distinct themes to each article, we can inject those theme assignments as constraints:

from langgraph.types import Send
 
 
def route_to_write(state: dict) -> list[Send] | str:
    """Fan out to parallel writers with distinctiveness enforcement.
 
    Each writer receives a must_avoid list containing the themes of
    sibling articles. This prevents overlap without requiring sequential
    generation or post-hoc deduplication.
    """
    assignments = state.get("article_assignments", [])
    enriched_content = state.get("enriched_content", [])
    source_document = state["input"]["source_document"]
 
    if not assignments:
        return "finalize"
 
    # Build theme lookup: {id: theme}
    themes_by_id = {a["id"]: a["theme"] for a in assignments}
 
    sends = []
    for assignment in assignments:
        # must_avoid = themes of all OTHER articles
        must_avoid = [
            f"{other_id}: {theme}"
            for other_id, theme in themes_by_id.items()
            if other_id != assignment["id"]
        ]
 
        # Filter content to only what this writer needs
        writer_content = [
            ec for ec in enriched_content
            if ec["article_id"] == assignment["id"]
        ]
 
        sends.append(
            Send(
                "write_article",
                {
                    "article_id": assignment["id"],
                    "title": assignment["title"],
                    "theme": assignment["theme"],
                    "structural_approach": assignment["structural_approach"],
                    "must_avoid": must_avoid,
                    "enriched_content": writer_content,
                    "source_document": source_document,
                },
            )
        )
 
    return sends

Each Send carries a must_avoid list describing what the sibling articles will cover. The writer’s prompt then tells the model that these themes are covered elsewhere in the series and must not be substantially overlapped.
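The sibling-theme computation can be exercised on its own, without LangGraph. The following is a minimal sketch under that assumption: `build_must_avoid` is a hypothetical helper that mirrors the list comprehension in route_to_write, and the sample themes are invented for illustration.

```python
def build_must_avoid(assignments: list[dict]) -> dict[str, list[str]]:
    """Map each article id to the themes of its sibling articles."""
    themes_by_id = {a["id"]: a["theme"] for a in assignments}
    return {
        article_id: [
            f"{other_id}: {theme}"
            for other_id, theme in themes_by_id.items()
            if other_id != article_id
        ]
        for article_id in themes_by_id
    }


# Hypothetical planner output, for illustration only.
sample_assignments = [
    {"id": "article_1", "theme": "Why the obvious fix fails"},
    {"id": "article_2", "theme": "What the measurements show"},
    {"id": "article_3", "theme": "The case against the consensus"},
]

must_avoid_by_id = build_must_avoid(sample_assignments)
for article_id, avoid in must_avoid_by_id.items():
    print(article_id, "must avoid:", avoid)
```

With N articles, each writer receives N-1 entries. A single-article series yields an empty list, which is the case the fallback message in format_must_avoid handles.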

Structured Planning Enables Enforcement

This pattern requires an upfront planning phase that allocates distinct themes. We use Pydantic schemas to enforce structure:

from typing import Literal
from pydantic import BaseModel, Field
 
 
class ArticleTopicPlan(BaseModel):
    """Plan for a single article in the series."""
 
    id: Literal["article_1", "article_2", "article_3"]
    title: str = Field(description="Evocative, specific title")
    theme: str = Field(description="2-3 sentence theme description")
    structural_approach: Literal["puzzle", "finding", "contrarian"]
    anchor_keys: list[str] = Field(
        description="Citation keys that anchor this article"
    )
 
 
class SeriesPlanningOutput(BaseModel):
    """Complete planning output for article series."""
 
    articles: list[ArticleTopicPlan] = Field(min_length=3, max_length=3)
    overview_scope: str
    series_coherence: str

The structural approach assignment adds another dimension of variety—even if themes have slight overlap, different narrative structures (puzzle vs. finding vs. contrarian) create distinct reading experiences.
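The Literal types constrain which values each field may take, but they do not stop two articles from sharing an id, and a duplicated id would silently collapse in the themes_by_id lookup used during routing. A small check run after planning could catch this. The sketch below assumes the plan has been converted to plain dicts (for example with Pydantic v2's `model_dump()`); `find_plan_conflicts` is a hypothetical helper, not part of the original implementation.

```python
from collections import Counter


def find_plan_conflicts(articles: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means no conflicts."""
    problems = []
    for field in ("id", "structural_approach"):
        counts = Counter(article[field] for article in articles)
        for value, count in counts.items():
            if count > 1:
                problems.append(f"{field} {value!r} is used by {count} articles")
    return problems


# Hypothetical plans, for illustration only.
good_plan = [
    {"id": "article_1", "structural_approach": "puzzle"},
    {"id": "article_2", "structural_approach": "finding"},
    {"id": "article_3", "structural_approach": "contrarian"},
]
bad_plan = [
    {"id": "article_1", "structural_approach": "puzzle"},
    {"id": "article_1", "structural_approach": "puzzle"},
    {"id": "article_3", "structural_approach": "contrarian"},
]

print(find_plan_conflicts(good_plan))
print(find_plan_conflicts(bad_plan))
```

Whether a repeated structural approach should be a hard error or only a warning is a judgment call; the schema above does not require the approaches to differ.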

State Design for Parallel Aggregation

Any state field that more than one parallel node writes in the same step needs a reducer. Without one, LangGraph raises an INVALID_CONCURRENT_GRAPH_UPDATE error:

from operator import add
from typing import Annotated
from typing_extensions import TypedDict
 
 
class ContentSeriesState(TypedDict, total=False):
    """State with reducers for parallel writes."""
 
    # Sequential writes - no reducer needed
    input: dict
    article_assignments: list[dict]
    overview_scope: str
 
    # Parallel writes - MUST have reducers
    enriched_content: Annotated[list[dict], add]
    article_drafts: Annotated[list[dict], add]
    errors: Annotated[list[dict], add]
 
    # Sequential finalization
    overview_draft: dict
    final_outputs: list[dict]

The Annotated[list, add] pattern tells LangGraph to concatenate results from parallel nodes rather than overwriting.
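The effect of the add reducer can be illustrated in plain Python. This sketch only imitates the merge semantics with `functools.reduce`; it is not how LangGraph applies updates internally, and the draft payloads are hypothetical stand-ins.

```python
from functools import reduce
from operator import add

# Each parallel writer returns a one-item LIST for the reduced field,
# so that list + list concatenation is well defined.
branch_updates = [
    {"article_drafts": [{"article_id": "article_1", "text": "..."}]},
    {"article_drafts": [{"article_id": "article_2", "text": "..."}]},
    {"article_drafts": [{"article_id": "article_3", "text": "..."}]},
]

# Fold the per-branch lists together, starting from an empty list.
merged = reduce(add, (u["article_drafts"] for u in branch_updates), [])

print([draft["article_id"] for draft in merged])
```

One practical consequence: a writer node that returns a bare dict instead of a one-item list would fail here, because a list cannot be concatenated with a dict. If downstream code depends on article order, sorting the merged list by article_id avoids relying on the order in which updates were combined.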

Sync Barriers for Phase Coordination

Between fetch and write phases, a sync barrier ensures all content is aggregated before any writer starts:

async def sync_before_write_node(state: dict) -> dict:
    """Sync barrier between fetch and write phases.
 
    This node makes no state modifications; it is purely a
    convergence point for parallel branches.
    """
    enriched = state.get("enriched_content", [])
    print(f"Sync: {len(enriched)} content items ready")
    return {}  # Pass-through

The graph wiring makes this a convergence point:

# All fetch nodes converge here before writers start
builder.add_edge("fetch_content", "sync_before_write")
 
# Then fan out to writers with distinctiveness
builder.add_conditional_edges(
    "sync_before_write",
    route_to_write,
    # Must list every target route_to_write can return
    ["write_article", "finalize"],
)
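The ordering guarantee a sync barrier provides can be shown with a framework-free analogy. The sketch below uses asyncio.gather rather than LangGraph, so it illustrates the phase ordering only; `fetch`, `write`, and `run_series` are hypothetical names, and the sleeps stand in for real I/O.

```python
import asyncio

events: list[str] = []  # records the order in which steps actually run


async def fetch(article_id: str) -> dict:
    await asyncio.sleep(0)  # yield control, standing in for real I/O
    events.append(f"fetch:{article_id}")
    return {"article_id": article_id, "content": "..."}


async def write(article_id: str, enriched: list[dict]) -> dict:
    await asyncio.sleep(0)
    events.append(f"write:{article_id}")
    # Each writer keeps only the content fetched for its own article.
    mine = [ec for ec in enriched if ec["article_id"] == article_id]
    return {"article_id": article_id, "sources": len(mine)}


async def run_series(article_ids: list[str]) -> list[dict]:
    # Phase 1: gather() returns only after every fetch has finished.
    enriched = await asyncio.gather(*(fetch(a) for a in article_ids))
    events.append("sync")  # the barrier: all content is now available
    # Phase 2: writers start only after the barrier.
    return await asyncio.gather(*(write(a, enriched) for a in article_ids))


drafts = asyncio.run(run_series(["article_1", "article_2", "article_3"]))
print(events)
```

Every fetch event is recorded before the sync marker and every write event after it, which is the property the sync_before_write node is meant to provide in the graph.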

Prompt Integration

The must_avoid list integrates directly into writer prompts:

ARTICLE_SYSTEM_PROMPT = """You are writing an article for a series...
 
## Your Focus
Title: {title}
Theme: {theme}
Structural Approach: {structural_approach}
 
## Must Avoid (covered in other articles):
{must_avoid}
 
These themes are covered elsewhere in the series. Do NOT significantly
overlap with them. Brief mentions for context are fine, but the substance
of your piece must be distinct.
"""
 
 
def format_must_avoid(must_avoid: list[str]) -> str:
    if not must_avoid:
        return "(No restrictions - this is the only article)"
    return "\n".join(f"- {item}" for item in must_avoid)
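To see how the pieces fit together, the following sketch renders a prompt from a writer payload. It assumes a shortened stand-in template with only a few of the fields; `render_prompt` and the sample payload are hypothetical, while format_must_avoid is copied from above so the block runs on its own.

```python
# Shortened stand-in for the article prompt; only the fields used here.
PROMPT_TEMPLATE = """## Your Focus
Title: {title}
Theme: {theme}

## Must Avoid (covered in other articles):
{must_avoid}
"""


def format_must_avoid(must_avoid: list[str]) -> str:
    if not must_avoid:
        return "(No restrictions - this is the only article)"
    return "\n".join(f"- {item}" for item in must_avoid)


def render_prompt(payload: dict) -> str:
    """Fill the template from a writer payload (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(
        title=payload["title"],
        theme=payload["theme"],
        must_avoid=format_must_avoid(payload["must_avoid"]),
    )


# Hypothetical payload shaped like the one built in route_to_write.
payload = {
    "title": "The Fix That Wasn't",
    "theme": "Why the obvious fix fails",
    "must_avoid": [
        "article_2: What the measurements show",
        "article_3: The case against the consensus",
    ],
}

prompt = render_prompt(payload)
print(prompt)
```

Formatting the list into a string before substitution keeps the template itself simple, and the empty-list fallback means the same template works for a single-article series.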

When to Use This Pattern

This pattern works well when:

You’re generating multiple related pieces from shared source material
Speed matters (parallel execution required)
An upfront planning phase can allocate distinct themes
Theme overlap would create redundant content

It’s less suitable when:

Content should genuinely overlap (e.g., different perspectives on the same topic)
Themes can’t be clearly delineated upfront
Sequential generation is acceptable

Complete Example

See the full implementation, which includes:

distinctiveness_enforcement.py—the must_avoid routing pattern
structured_planning.py—planning phase with Pydantic schemas
series_state.py—state definition with parallel reducers
series_graph.py—complete graph wiring with sync barriers

The pattern generalizes beyond articles to any parallel generation task where output distinctiveness matters: product descriptions, email variants, test case generation, or any domain where parallel workers might converge on similar solutions.
