Standalone Book Finding Workflow with Parallel LLM Calls in LangGraph
When building complex AI workflows, certain specialized tasks benefit from extraction into standalone modules. Book discovery, with its unique prompting requirements and multi-category structure, is a prime example. This pattern demonstrates how to create a focused workflow with parallel LLM calls that fan out from a single starting point and converge before downstream processing.
The Problem
In a larger research workflow, book discovery was originally bundled with web and academic research. This created several issues:
- Different prompting needs: Books benefit from cross-domain thinking, not keyword searches.
- Configuration complexity: Three-way researcher allocation was hard for users to understand.
- Processing differences: Books go through PDF extraction (slower than web scraping).
- No category structure: Relevance ranking alone misses the value of categorized recommendations.
The Solution: Workflow Extraction + Parallel Categories
Extract book finding into a standalone workflow that:
- Generates recommendations via three parallel LLM calls (one per category).
- Uses state reducers to collect outputs from parallel branches.
- Synthesizes categorized markdown output.
The three categories provide complementary perspectives:
- Analogous: Books from unexpected domains that illuminate the theme.
- Inspiring: Transformative works that inspire action.
- Expressive: Fiction that captures what the theme feels like.
```mermaid
flowchart LR
    START --> A[generate_analogous]
    START --> B[generate_inspiring]
    START --> C[generate_expressive]
    A --> S[synthesize_output]
    B --> S
    C --> S
    S --> END
```
Implementation
State with Reducers
The key to parallel execution is using `Annotated[list, add]` reducers. When multiple nodes write to the same field concurrently, the reducer merges their outputs:
```python
from operator import add
from typing import Annotated

from pydantic import BaseModel
from typing_extensions import TypedDict


class BookRecommendation(BaseModel):
    title: str
    author: str | None = None
    explanation: str
    category: str  # analogous | inspiring | expressive


class BookFindingState(TypedDict):
    theme: str
    brief: str | None
    # Single list with a category field - cleaner than 3 separate lists
    recommendations: Annotated[list[BookRecommendation], add]
    final_output: str | None
```
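To make the merge concrete: when two parallel branches each return a value for `recommendations` in the same superstep, LangGraph applies the reducer to combine them, and for lists `operator.add` is plain concatenation. A toy illustration (strings stand in for `BookRecommendation` objects):

```python
from operator import add

# Each branch's partial state update for the "recommendations" key
branch_a = ["analogous-rec-1", "analogous-rec-2"]
branch_b = ["inspiring-rec-1"]

# LangGraph applies the reducer pairwise; for lists this concatenates
merged = add(branch_a, branch_b)
assert merged == ["analogous-rec-1", "analogous-rec-2", "inspiring-rec-1"]
```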
Category-Specific Prompts
Each category has distinct prompting that encourages different types of thinking:
```python
CATEGORY_PROMPTS = {
    "analogous": """Find books that explore SIMILAR themes but in DIFFERENT domains.
The goal: unexpected connections that provide fresh perspective.
Examples:
- For "organizational dysfunction": books about ecological collapse or family systems
- For "creative process": books about jazz improvisation or scientific discovery
Return EXACTLY 3 recommendations as JSON.""",
    "inspiring": """Find books that INSPIRE ACTION or CHANGE:
- Manifestos and calls to action
- Transformative nonfiction that changes behavior
- Practical wisdom literature
EXCLUDE pure fiction (those belong in the expressive category).""",
    "expressive": """Find works of FICTION that express what a theme FEELS LIKE:
- Novels that capture the phenomenological experience
- Utopian or dystopian explorations
- Stories that make abstract concepts viscerally real
EXCLUDE nonfiction (those belong in the inspiring category).""",
}
```
Parallel Node Generation with Partial
Instead of writing three nearly identical functions, use `functools.partial` to bind the category argument:
```python
import json
import logging
from functools import partial

# get_llm, ModelTier, and extract_json are project-specific helpers
logger = logging.getLogger(__name__)


async def generate_recommendations(
    category: str,
    state: BookFindingState,
) -> dict[str, list[BookRecommendation]]:
    """Generate recommendations for a single category."""
    theme = state["theme"]
    system_prompt = CATEGORY_PROMPTS[category]
    try:
        llm = get_llm(ModelTier.OPUS)
        response = await llm.ainvoke([
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": f"Theme: {theme}"},
        ])
        # Parse the JSON array from the model's response
        raw_recs = json.loads(extract_json(response.content))
        return {"recommendations": [
            BookRecommendation(
                title=r["title"],
                author=r.get("author"),
                explanation=r["explanation"],
                category=category,
            )
            for r in raw_recs[:3]  # cap at three per category
        ]}
    except Exception as e:
        logger.warning(f"Failed {category}: {e}")
        return {"recommendations": []}  # Graceful degradation
```
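The node above leans on `extract_json`, which is referenced but never defined here; presumably it strips code fences and surrounding prose from the model's reply. A minimal hypothetical sketch of such a helper (not the source's implementation):

```python
import re


def extract_json(text: str) -> str:
    """Best-effort extraction of a JSON payload from an LLM reply."""
    # Prefer a fenced ```json ... ``` block if the model emitted one
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fenced:
        return fenced.group(1).strip()
    # Otherwise grab the outermost [...] or {...} span
    start = min((i for i in (text.find("["), text.find("{")) if i != -1), default=0)
    end = max(text.rfind("]"), text.rfind("}")) + 1
    return text[start:end] if end > start else text
```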
Graph Construction
The graph uses static fan-out (multiple edges from START) with automatic fan-in:
```python
from langgraph.graph import StateGraph, START, END

CATEGORIES = ["analogous", "inspiring", "expressive"]


def create_book_finding_graph():
    """Build and compile the standalone book-finding graph."""
    builder = StateGraph(BookFindingState)

    # Add one parameterized node per category
    for category in CATEGORIES:
        builder.add_node(
            f"generate_{category}",
            partial(generate_recommendations, category),
        )
    builder.add_node("synthesize_output", synthesize_output)

    # Fan-out from START to all generators, fan-in to the synthesizer
    for category in CATEGORIES:
        builder.add_edge(START, f"generate_{category}")
        builder.add_edge(f"generate_{category}", "synthesize_output")
    builder.add_edge("synthesize_output", END)

    return builder.compile()
```
LangGraph automatically waits for all three parallel branches to complete before running `synthesize_output`; the `add` reducer merges the three recommendation lists into one.
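The synthesizer itself isn't shown above. A plausible sketch, assuming it groups the merged list by category and renders markdown into `final_output` (the section titles are invented for illustration):

```python
# Hypothetical section titles; the source's exact wording isn't shown
CATEGORY_TITLES = {
    "analogous": "Analogous Reads",
    "inspiring": "Inspiring Reads",
    "expressive": "Expressive Reads",
}


def synthesize_output(state: BookFindingState) -> dict[str, str]:
    """Render the merged recommendations as categorized markdown."""
    sections = []
    for category, title in CATEGORY_TITLES.items():
        recs = [r for r in state["recommendations"] if r.category == category]
        if not recs:
            continue  # a failed branch contributed nothing; skip its section
        lines = [f"## {title}"]
        for rec in recs:
            byline = f" by {rec.author}" if rec.author else ""
            lines.append(f"- **{rec.title}**{byline}: {rec.explanation}")
        sections.append("\n".join(lines))
    return {"final_output": "\n\n".join(sections)}
```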
Workflow Extraction Guidelines
When extracting specialized functionality from a larger workflow:
- Identify distinct concerns: Book finding needs different prompting than web/academic search.
- Create standalone module: Full package with state, prompts, nodes, graph.
- Simplify parent workflow: Remove extracted functionality, update routing.
- Document the split: Explain when to use each workflow.
Why This Approach Works
- Cross-domain discovery: Analogous prompting finds unexpected connections.
- Categorized output: Three categories serve different user needs.
- Parallel efficiency: Three LLM calls run concurrently.
- Graceful degradation: One category failing doesn’t crash the workflow.
- Clean separation: The standalone workflow can be invoked independently (see the sketch below).
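Independent invocation is a single `ainvoke` on the compiled graph. A minimal usage sketch, with an illustrative theme:

```python
import asyncio


async def main() -> None:
    graph = create_book_finding_graph()
    result = await graph.ainvoke({
        "theme": "organizational dysfunction",  # illustrative theme
        "brief": None,
        "recommendations": [],
        "final_output": None,
    })
    print(result["final_output"])


asyncio.run(main())
```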
Trade-offs
- Separate invocation: Must call `book_finding()` separately from main research.
- No cross-pollination: Book insights don’t automatically inform other research.
- LLM cost: Three parallel Opus calls are expensive per invocation.