Standalone Book Finding Workflow with Parallel LLM Calls in LangGraph

When building complex AI workflows, certain specialized tasks benefit from extraction into standalone modules. Book discovery, with its unique prompting requirements and multi-category structure, is a prime example. This pattern demonstrates how to create a focused workflow with parallel LLM calls that fan out from a single starting point and converge before downstream processing.

The Problem

In a larger research workflow, book discovery was originally bundled with web and academic research. This created several issues:

  • Different prompting needs: Books benefit from cross-domain thinking, not keyword searches.
  • Configuration complexity: Three-way researcher allocation was hard for users to understand.
  • Processing differences: Books go through PDF extraction (slower than web scraping).
  • No category structure: Relevance ranking alone misses the value of categorized recommendations.

The Solution: Workflow Extraction + Parallel Categories

Extract book finding into a standalone workflow that:

  1. Generates recommendations via three parallel LLM calls (one per category).
  2. Uses state reducers to collect outputs from parallel branches.
  3. Synthesizes categorized markdown output.

The three categories provide complementary perspectives:

  • Analogous: Books from unexpected domains that illuminate the theme.
  • Inspiring: Transformative works that inspire action.
  • Expressive: Fiction that captures what the theme feels like.

The graph fans out from START to the three generators and converges on a single synthesis node:

flowchart LR
    START --> A[generate_analogous]
    START --> B[generate_inspiring]
    START --> C[generate_expressive]
    A --> S[synthesize_output]
    B --> S
    C --> S
    S --> END

Implementation

State with Reducers

The key to parallel execution is using Annotated[list, add] reducers. When multiple nodes write to the same field concurrently, the reducer merges their outputs:

from typing import Annotated
from operator import add
from typing_extensions import TypedDict
from pydantic import BaseModel
 
class BookRecommendation(BaseModel):
    title: str
    author: str | None = None
    explanation: str
    category: str  # analogous | inspiring | expressive
 
class BookFindingState(TypedDict):
    theme: str
    brief: str | None
    # Single list with category field - cleaner than 3 separate lists
    recommendations: Annotated[list[BookRecommendation], add]
    final_output: str | None
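
Because the field is annotated with operator.add, each branch returns only its own partial update and LangGraph folds them together. Conceptually (an illustration of the merge semantics only, not LangGraph's actual scheduler code):

from operator import add

existing: list[BookRecommendation] = []
update_a = [BookRecommendation(title="A", explanation="...", category="analogous")]
update_b = [BookRecommendation(title="B", explanation="...", category="inspiring")]

# LangGraph applies the reducer once per concurrent update, so nothing is overwritten:
merged = add(add(existing, update_a), update_b)
assert [r.category for r in merged] == ["analogous", "inspiring"]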

Category-Specific Prompts

Each category has distinct prompting that encourages different types of thinking:

CATEGORY_PROMPTS = {
    "analogous": """Find books that explore SIMILAR themes but in DIFFERENT domains.
The goal: unexpected connections that provide fresh perspective.
 
Examples:
- For "organizational dysfunction": books about ecological collapse or family systems
- For "creative process": books about jazz improvisation or scientific discovery
 
Return EXACTLY 3 recommendations as JSON.""",
 
    "inspiring": """Find books that INSPIRE ACTION or CHANGE:
- Manifestos and calls to action
- Transformative nonfiction that changes behavior
- Practical wisdom literature
 
EXCLUDE pure fiction (those belong in expressive category).""",
 
    "expressive": """Find works of FICTION that express what a theme FEELS LIKE:
- Novels that capture the phenomenological experience
- Utopian or dystopian explorations
- Stories that make abstract concepts viscerally real
 
EXCLUDE nonfiction (those belong in inspiring category).""",
}
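
The prompts ask for raw JSON, but model output often arrives wrapped in code fences or prose, which is why the node below parses through an extract_json helper. That helper isn't shown in the original; a minimal sketch of one plausible implementation:

import re

def extract_json(text: str) -> str:
    """Pull the first fenced or bracketed JSON payload out of an LLM response."""
    # Prefer an explicit ```json ... ``` fence if the model produced one
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fenced:
        return fenced.group(1).strip()
    # Otherwise grab the outermost bracketed span
    bracketed = re.search(r"(\[.*\]|\{.*\})", text, re.DOTALL)
    if bracketed:
        return bracketed.group(1)
    return text  # fall back to raw text; json.loads will raise if it isn't JSON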

Parallel Node Generation with Partial

Instead of writing three nearly-identical functions, use functools.partial:

import json
import logging
from functools import partial

# get_llm and ModelTier are project-level helpers; extract_json is sketched above.
logger = logging.getLogger(__name__)

async def generate_recommendations(
    category: str,
    state: BookFindingState,
) -> dict[str, list[BookRecommendation]]:
    """Generate recommendations for a category."""
    theme = state["theme"]
    system_prompt = CATEGORY_PROMPTS[category]
 
    try:
        llm = get_llm(ModelTier.OPUS)
        response = await llm.ainvoke([
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": f"Theme: {theme}"},
        ])
 
        # Parse JSON response
        raw_recs = json.loads(extract_json(response.content))
 
        return {"recommendations": [
            BookRecommendation(
                title=r["title"],
                author=r.get("author"),
                explanation=r["explanation"],
                category=category,
            )
            for r in raw_recs[:3]
        ]}
 
    except Exception as e:
        logger.warning(f"Failed {category}: {e}")
        return {"recommendations": []}  # Graceful degradation

Graph Construction

The graph uses static fan-out (multiple edges from START) with automatic fan-in:

from langgraph.graph import StateGraph, START, END
 
def create_book_finding_graph():  # compile() returns a runnable compiled graph, not a StateGraph
    builder = StateGraph(BookFindingState)
 
    # Add parameterized nodes
    for category in ["analogous", "inspiring", "expressive"]:
        builder.add_node(
            f"generate_{category}",
            partial(generate_recommendations, category),
        )
 
    builder.add_node("synthesize_output", synthesize_output)
 
    # Fan-out from START to all generators
    for category in ["analogous", "inspiring", "expressive"]:
        builder.add_edge(START, f"generate_{category}")
        builder.add_edge(f"generate_{category}", "synthesize_output")
 
    builder.add_edge("synthesize_output", END)
    return builder.compile()

LangGraph automatically waits for all three parallel branches to complete before running synthesize_output. The add reducer merges all recommendation lists into one.
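
Invoking the compiled graph is then a single ainvoke call with the initial state (the theme value here is illustrative):

import asyncio

async def main() -> None:
    graph = create_book_finding_graph()
    result = await graph.ainvoke({
        "theme": "organizational dysfunction",
        "brief": None,
        "recommendations": [],
        "final_output": None,
    })
    print(result["final_output"])

asyncio.run(main())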

Workflow Extraction Guidelines

When extracting specialized functionality from a larger workflow:

  1. Identify distinct concerns: Book finding needs different prompting than web/academic search.
  2. Create standalone module: Full package with state, prompts, nodes, graph.
  3. Simplify parent workflow: Remove extracted functionality, update routing (see the subgraph sketch after this list).
  4. Document the split: Explain when to use each workflow.
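
For step 3, one option is to mount the compiled graph as a subgraph node in the parent workflow, since a compiled LangGraph graph is itself runnable. This is a sketch with a hypothetical parent state (ParentResearchState is illustrative, not from the original), assuming the parent schema shares the keys the subgraph reads and writes:

# Hypothetical parent state for illustration; the real parent workflow would
# carry its own research fields alongside these shared keys.
class ParentResearchState(TypedDict):
    theme: str
    brief: str | None
    recommendations: Annotated[list[BookRecommendation], add]
    final_output: str | None

parent_builder = StateGraph(ParentResearchState)
# The compiled book-finding graph runs as a single node in the parent graph.
parent_builder.add_node("book_finding", create_book_finding_graph())
parent_builder.add_edge(START, "book_finding")
parent_builder.add_edge("book_finding", END)
parent_graph = parent_builder.compile()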

Why This Approach Works

  1. Cross-domain discovery: Analogous prompting finds unexpected connections.
  2. Categorized output: Three categories serve different user needs.
  3. Parallel efficiency: Three LLM calls run concurrently.
  4. Graceful degradation: One category failing doesn’t crash the workflow.
  5. Clean separation: Standalone workflow can be invoked independently.

Trade-offs

  • Separate invocation: Must call book_finding() separately from main research.
  • No cross-pollination: Book insights don’t automatically inform other research.
  • LLM cost: Three parallel Opus calls are expensive per invocation.