Priority-Based Researcher Allocation in LangGraph Multi-Agent Workflows

When building research workflows that span multiple source types—web, academic papers, and books—the allocation of specialized agents becomes critical. A current events topic needs web researchers; a philosophy question benefits from books and academic journals. This pattern provides a clean solution: a priority-based allocation system that combines user control with LLM intelligence.

The Problem

Research topics vary widely in their optimal source types:

  • Current events: Web sources excel (news, blogs, forums)
  • Scientific research: Academic papers are essential
  • Foundational theory: Books provide comprehensive coverage
  • Humanities topics: Academic journals plus books work best

Hardcoding a fixed allocation wastes resources. Letting the LLM decide every time ignores domain expertise from power users. The solution: a priority hierarchy.

The Pattern: User > LLM > Default

The allocation follows a clear priority:

  1. User-specified: Power users can set allocation via simple “210” notation (2 web, 1 academic, 0 book)
  2. LLM-decided: If no user preference, the LLM analyzes the topic and allocates appropriately
  3. Default fallback: If neither is available, use balanced allocation (1,1,1)
The decision flow:

flowchart TD
    A[Research Request] --> B{User allocation provided?}
    B -->|Yes: "210"| C[Parse allocation string]
    B -->|No| D{LLM allocation?}
    D -->|Yes| E[Use LLM decision]
    D -->|No| F[Default: 1,1,1]
    C --> G[Dispatch researchers]
    E --> G
    F --> G
    G --> H[Web Researcher x N]
    G --> I[Academic Researcher x N]
    G --> J[Book Researcher x N]

Implementation

Allocation Parser

The “210” notation provides a simple interface for users to specify allocation. Each digit represents the count for web, academic, and book researchers respectively.

from typing_extensions import TypedDict
 
class ResearcherAllocation(TypedDict):
    web_count: int       # 0-3: Current events, tech, products
    academic_count: int  # 0-3: Peer-reviewed research
    book_count: int      # 0-3: Foundational theory, history
 
def parse_allocation(allocation_str: str) -> ResearcherAllocation:
    """Parse '210' -> {web: 2, academic: 1, book: 0}"""
    if len(allocation_str) != 3 or not allocation_str.isdigit():
        raise ValueError(f"Expected 3 digits, got: {allocation_str!r}")
 
    web, academic, book = (int(c) for c in allocation_str)
 
    if max(web, academic, book) > 3:
        raise ValueError(f"Each digit must be 0-3, got: {allocation_str!r}")
 
    total = web + academic + book
    if not 1 <= total <= 3:
        raise ValueError(f"Total must be 1-3, got {total}")
 
    return ResearcherAllocation(
        web_count=web,
        academic_count=academic,
        book_count=book,
    )

Supervisor Decision Schema

When the LLM makes allocation decisions, it uses a Pydantic model with cross-field validation:

from pydantic import BaseModel, ConfigDict, Field, model_validator
from typing import Literal
 
class SupervisorDecision(BaseModel):
    """Supervisor's structured decision for research allocation."""
 
    action: Literal["conduct_research", "refine_draft", "research_complete"]
    reasoning: str
    research_questions: list[str] = Field(default_factory=list)
 
    # Allocation fields with LLM guidance
    web_researchers: int = Field(
        default=1, ge=0, le=3,
        description="For current events, tech, products, news (0-3)."
    )
    academic_researchers: int = Field(
        default=1, ge=0, le=3,
        description="For peer-reviewed papers across all disciplines (0-3)."
    )
    book_researchers: int = Field(
        default=1, ge=0, le=3,
        description="For foundational theory, historical context (0-3)."
    )
    allocation_reasoning: str | None = None
 
    model_config = ConfigDict(extra="forbid")
 
    @model_validator(mode="after")
    def validate_allocation(self) -> "SupervisorDecision":
        total = self.web_researchers + self.academic_researchers + self.book_researchers
        if self.action == "conduct_research":
            if total == 0:
                raise ValueError("Must allocate at least 1 researcher")
            if total > 3:
                raise ValueError(f"Total ({total}) exceeds limit of 3")
        return self
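To see the cross-field check in action, here is a trimmed-down stand-in for the model (fields reduced for brevity) exercised against a valid and an over-allocated decision:

```python
from typing import Literal

from pydantic import BaseModel, ValidationError, model_validator


class Decision(BaseModel):
    # Trimmed-down stand-in for SupervisorDecision, keeping only the
    # fields the validator needs.
    action: Literal["conduct_research", "research_complete"]
    web_researchers: int = 1
    academic_researchers: int = 1
    book_researchers: int = 1

    @model_validator(mode="after")
    def validate_allocation(self) -> "Decision":
        total = self.web_researchers + self.academic_researchers + self.book_researchers
        if self.action == "conduct_research" and not 1 <= total <= 3:
            raise ValueError(f"Total must be 1-3, got {total}")
        return self


# Valid: totals 3
ok = Decision(action="conduct_research", web_researchers=2,
              academic_researchers=1, book_researchers=0)

# Invalid: totals 5 -> pydantic raises ValidationError at construction
try:
    Decision(action="conduct_research", web_researchers=2,
             academic_researchers=2, book_researchers=1)
    over_allocated_rejected = False
except ValidationError:
    over_allocated_rejected = True
```

Because the validator runs at construction time, a malformed LLM response fails fast instead of silently dispatching too many researchers.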

Type-Specific Query Generation

Different sources require different query strategies. A factory pattern creates specialized generators:

RESEARCHER_QUERY_PROMPTS = {
    "web": """Generate 2-3 web search queries.
Target: Official sites, news, expert blogs, forums.
Avoid: Academic papers (handled separately).""",
 
    "academic": """Generate 2-3 academic database queries.
Include methodology terms: "meta-analysis", "systematic review".
Use domain-specific terminology.""",
 
    "book": """Generate 2-3 book database queries.
Best for: Foundational theory, comprehensive overviews.
Include: "introduction to", "handbook of", "companion to".""",
}
 
def create_generate_queries(researcher_type: str):
    """Create a query generator for a specific researcher type."""
    async def generate_queries(state):
        # get_llm, ModelTier, and SearchQueries are project-level helpers
        # defined elsewhere in the workflow
        llm = get_llm(ModelTier.HAIKU).with_structured_output(SearchQueries)
        prompt = f"{RESEARCHER_QUERY_PROMPTS[researcher_type]}\n\nQuestion: {state['question']}"
        result = await llm.ainvoke([{"role": "user", "content": prompt}])
        return {"search_queries": result.queries}
    return generate_queries
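The `SearchQueries` schema used with `with_structured_output` isn’t shown above; a plausible minimal shape, inferred from the `result.queries` access and the “2-3 queries” instruction in the prompts (the bounds are an assumption), would be:

```python
from pydantic import BaseModel, Field, ValidationError


class SearchQueries(BaseModel):
    """Structured output for the query-generation LLM.

    Shape inferred from result.queries; the 2-3 bound mirrors the
    prompt's instruction and is enforced at parse time.
    """
    queries: list[str] = Field(min_length=2, max_length=3)


# Two queries parse fine; a single query is rejected
qs = SearchQueries(queries=["langgraph send api", "langgraph fan-out"])
try:
    SearchQueries(queries=["only one"])
    too_few_rejected = False
except ValidationError:
    too_few_rejected = True
```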

Routing with Send()

LangGraph’s Send() enables parallel dispatch with isolated state:

from langgraph.types import Send
 
def route_supervisor_action(state):
    allocation = state.get("researcher_allocation", {})
    sends = []
 
    for _ in range(allocation.get("web_count", 1)):
        sends.append(Send("web_researcher", {"question": ...}))
 
    for _ in range(allocation.get("academic_count", 1)):
        sends.append(Send("academic_researcher", {"question": ...}))
 
    for _ in range(allocation.get("book_count", 1)):
        sends.append(Send("book_researcher", {"question": ...}))
 
    return sends
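The fan-out logic can be exercised without running a graph by substituting plain `(node_name, payload)` tuples for `Send()` objects; the `expand_allocation` helper below is an illustrative stand-in, not part of LangGraph:

```python
def expand_allocation(allocation: dict, question: str) -> list[tuple[str, dict]]:
    """Mirror route_supervisor_action, emitting (node, payload) pairs
    in place of langgraph Send objects."""
    targets = [
        ("web_researcher", "web_count"),
        ("academic_researcher", "academic_count"),
        ("book_researcher", "book_count"),
    ]
    return [
        (node, {"question": question})
        for node, key in targets
        for _ in range(allocation.get(key, 1))  # default 1 per type
    ]


# A "210" allocation produces two web dispatches and one academic
sends = expand_allocation(
    {"web_count": 2, "academic_count": 1, "book_count": 0},
    "What is retrieval-augmented generation?",
)
```

Each pair corresponds to one `Send(node, state)` call, so the total number of dispatched researchers always equals the allocation total.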

Supervisor Prompt Guidance

Guide the LLM to make topic-appropriate allocations:

SUPERVISOR_ALLOCATION_GUIDANCE = """
<Researcher Allocation>
Allocate 1-3 researchers total based on topic suitability.
 
**Web**: Current events, tech trends, products, news, practitioner blogs.
**Academic**: Peer-reviewed research (STEM, humanities, social sciences).
**Book**: Foundational theory, historical context, classic works.
 
Guidelines:
- Tech/tools/products: web=2, academic=1, book=0
- Scientific/medical: web=1, academic=2, book=0
- Humanities/arts: web=0, academic=2, book=1
- Historical/theoretical: web=1, academic=1, book=1
- Breaking news: web=3, academic=0, book=0
- Unclear topics: web=1, academic=1, book=1 (balanced)
</Researcher Allocation>
"""

Why This Approach Works

  1. User control without complexity: “210” is easier to understand than JSON configuration.
  2. LLM intelligence when needed: Topic analysis leverages LLM reasoning for unknown domains.
  3. Predictable fallback: Default behavior is always sensible.
  4. Type-optimized queries: Each researcher gets prompts tuned for its source type.
  5. Parallel execution: Send() enables concurrent research across all allocated researchers.

Trade-offs

  • Prompt maintenance: Three query prompts to keep updated
  • LLM variability: Different runs may allocate differently for edge cases
  • User learning: Users need to understand the notation (though it’s simple)