
Research scoping transforms ambiguous user intent into structured research briefs through strategic clarification dialogue. This first phase of Research Workflow Architecture determines what gets investigated, preventing the costly anti-pattern of diving into research without establishing clear boundaries and priorities.

Without scoping, research agents waste computational resources exploring tangentially related topics, accumulate context filled with irrelevant information, and produce outputs misaligned with user needs. A vague question like “research AI agents” could spawn investigation into technical architectures, business applications, ethical considerations, historical development, or implementation frameworks - but users rarely want comprehensive coverage across all dimensions. Scoping identifies which aspects actually matter.

Why Scoping Matters

Token Efficiency: Research consumes massive token budgets. Anthropic’s Multi-Agent Research Systems showed 15x token usage over single-agent baselines - but that investment only pays off when researching the right questions. Mis-scoped research burns tokens on the wrong topics, discovering irrelevant findings at great expense.

Boundary Establishment: Clear scope prevents feature creep during research. Without boundaries, interesting tangents derail focused exploration. The research brief acts as a contract - investigate these aspects, ignore those dimensions. This constraint paradoxically enables better research by maintaining focus.

Context Anchoring: The brief becomes the anchor for all subsequent phases. The research supervisor references it when decomposing work. Worker agents consult it to stay on topic. The writing agent uses it to understand intended scope. This artifact provides shared understanding across phases without requiring massive context sharing.

Clarification Dialogue Techniques

Effective scoping agents employ targeted questions that reveal user intent:

Aspect Prioritization: “What specific aspects of [topic] matter most to you?” This surfaces the dimensions worth investigating deeply versus those that can be mentioned briefly or ignored entirely. For “research machine learning,” responses might indicate interest in practical applications (not theory), specific domains (healthcare, finance), or implementation details (model selection, deployment strategies).

Background Assessment: “What’s your current understanding of [topic]?” This calibrates explanation depth. Experts want cutting-edge developments and nuanced analysis. Beginners need foundational concepts and clear explanations. The brief should encode appropriate sophistication level so research agents retrieve and present information matching user background.

Scope Boundaries: “Are there specific areas you want to exclude or emphasize?” Users often know what they don’t want more clearly than what they do want. “Research quantum computing but skip the physics” or “focus on business implications, not technical details” provides critical constraints.

Depth vs Breadth: “Should we explore fewer topics deeply or cover more topics at surface level?” This classic tradeoff shapes research strategy. Depth means comprehensive investigation of narrow scope. Breadth means survey-level coverage across wider terrain. The brief should encode this preference explicitly.

Output Format: “How will you use this research?” Understanding downstream application shapes appropriate deliverables. Executive summaries differ from technical deep-dives differ from teaching materials. The scoping agent should determine whether the user needs decision support, learning resources, or comprehensive analysis.
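The five clarification dimensions above can be sketched as structured data, so a scoping agent can instantiate them for any topic. The dimension names and question templates here are illustrative assumptions, not part of any specific framework:

```python
# Illustrative encoding of the five clarification dimensions. Only some
# templates reference the topic; str.format ignores unused keyword arguments.
CLARIFICATION_QUESTIONS = {
    "aspects": "What specific aspects of {topic} matter most to you?",
    "background": "What's your current understanding of {topic}?",
    "boundaries": "Are there specific areas you want to exclude or emphasize?",
    "depth": "Should we explore fewer topics deeply or cover more topics at surface level?",
    "usage": "How will you use this research?",
}

def clarification_prompts(topic: str) -> list[str]:
    """Instantiate each question template for a concrete topic."""
    return [q.format(topic=topic) for q in CLARIFICATION_QUESTIONS.values()]
```

Keeping the questions as data rather than prose makes it easy to add, reorder, or skip dimensions per research type.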

Brief Generation

The clarification dialogue produces a verbose conversation filled with context, examples, and tangential discussion. The scoping agent must compress this into a structured brief that contains the essential information without conversational overhead.

Structure Example:

Research Topic: Machine learning applications in medical diagnosis
Scope: Focus on radiology and pathology; exclude genomics
Background: User has CS degree but no medical domain knowledge
Depth: Deep dive on 2-3 specific applications with technical detail
Emphasis: Practical deployment challenges, not theoretical accuracy limits
Output: Technical report for engineering team evaluating feasibility

This brief provides research direction without carrying full conversation history forward. It specifies what to investigate (radiology/pathology ML), what to skip (genomics), appropriate sophistication (technical but not medical), and intended use case (feasibility evaluation).

The compression mirrors Reducing Context strategies - preserve signal, discard noise. Every word in the brief should influence research direction. Conversational pleasantries, reasoning about questions, and meta-discussion about the scoping process itself get pruned.
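One way to keep the brief compact and stable is to represent it as a small structured record that serializes to the text form shown above. The field names here mirror the structure example; treating them as a fixed schema is an assumption about one reasonable design:

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: the brief is a stable artifact, never mutated downstream
class ResearchBrief:
    topic: str
    scope: str
    background: str
    depth: str
    emphasis: str
    output: str

    def render(self) -> str:
        """Serialize to the compact text form included in downstream prompts."""
        return "\n".join(
            f"{label}: {value}"
            for label, value in [
                ("Research Topic", self.topic),
                ("Scope", self.scope),
                ("Background", self.background),
                ("Depth", self.depth),
                ("Emphasis", self.emphasis),
                ("Output", self.output),
            ]
        )
```

A fixed schema also enforces the compression discipline: there is no field for conversational pleasantries or meta-discussion, so they cannot leak into the brief.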

The Brief as Context Anchor

The research brief serves as stable context that persists through all phases without modification. This creates several benefits:

Cache Optimization: Following Caching Context principles, the brief becomes a stable prompt prefix for subsequent agents. The research supervisor, worker agents, and writing agent all include the brief in their prompts. This stable prefix enables KV-cache reuse across multiple agent invocations.

Coordination Signal: In Multi-Agent Research Systems, the brief provides shared understanding without direct agent communication. Each worker agent sees the same brief, ensuring alignment on scope and priorities even though workers can’t see each other’s contexts.

Quality Validation: The brief enables objective evaluation. Did research cover specified aspects? Match requested depth? Address stated priorities? The brief provides ground truth for assessing research quality.

Iteration Foundation: If research proves insufficient, the brief can be refined and research re-executed. This enables controlled iteration without accumulating context from failed attempts.
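The cache-optimization benefit depends on the brief appearing byte-identically at the start of every agent prompt. A minimal sketch of prefix-stable prompt assembly, with hypothetical role instructions:

```python
def build_agent_prompt(brief_text: str, role_instructions: str, task: str) -> str:
    # Stable prefix: byte-identical across supervisor, workers, and writer,
    # so a prefix-keyed KV-cache can be reused across invocations.
    stable_prefix = f"Research brief:\n{brief_text}\n\n"
    # Variable suffix: differs per agent role and per task.
    return stable_prefix + f"{role_instructions}\n\nCurrent task: {task}\n"

brief_text = "Research Topic: ML in medical diagnosis\nScope: radiology and pathology"
supervisor_prompt = build_agent_prompt(
    brief_text, "You decompose research into subtasks.", "Plan the investigation."
)
worker_prompt = build_agent_prompt(
    brief_text, "You investigate one assigned subtopic.", "Survey radiology deployments."
)
```

The design choice that matters is ordering: anything agent-specific goes after the shared prefix, never before it, or the cache key diverges at the first differing byte.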

Anti-Pattern: No Scoping

Diving directly into research creates several failure modes:

Scope Creep: Without boundaries, research expands indefinitely. Every discovered topic reveals related topics. Interesting tangents become rabbit holes. Token budgets exhaust without comprehensive coverage.

Misalignment: Research produces findings the user doesn’t need. Technical deep-dives for users wanting executive summaries. Theoretical background for users needing practical guidance. This misalignment wastes computational resources and user time.

Context Pollution: Irrelevant research clutters context for writing phase. The writing agent must sift through verbose findings about aspects the user doesn’t care about, creating Context Distraction and Context Confusion.

Vague Deliverables: Without clear scope, final outputs lack focus. Reports meander across loosely related topics. Recommendations address questions users didn’t ask. Quality degrades when purpose isn’t defined upfront.

Prompt Patterns for Scoping Agents

Effective scoping prompts guide systematic clarification:

You are a research scoping specialist. Your goal is to transform
the user's initial query into a precise research brief.

Ask targeted questions to understand:
1. Specific aspects they care most about
2. Their background knowledge level
3. Desired depth vs breadth
4. Topics to emphasize or exclude
5. How they'll use the research

After gathering this information, produce a structured brief containing:
- Core research question
- Scope boundaries (what's in/out)
- Background level (novice/intermediate/expert)
- Depth preference (survey/balanced/deep-dive)
- Output requirements (format, length, style)
- Key priorities (what matters most)

Keep the brief concise - 200-400 words maximum.

This prompt establishes the agent’s role (scoping specialist), defines its objective (precise brief), specifies what to discover (5 key questions), and constrains output format (structured, concise).

The pattern follows Prompt Engineering principles: clear role assignment, explicit task definition, structured output requirements, and length constraints preventing verbose briefs.
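Driving this prompt in practice requires a small dialogue loop around the model. The sketch below assumes a generic `ask_model` callable (messages in, reply text out) and a `BRIEF:` completion convention; both are illustrative assumptions, not a specific API:

```python
SCOPING_SYSTEM_PROMPT = (
    "You are a research scoping specialist. Transform the user's initial "
    "query into a precise research brief. Ask targeted questions about "
    "aspects, background, depth vs breadth, exclusions, and intended use. "
    "When ready, reply with the brief prefixed by 'BRIEF:'."
)

def run_scoping_dialogue(ask_model, get_user_answer, user_query: str,
                         max_turns: int = 5) -> str:
    """Alternate clarification questions and answers until the agent emits a brief."""
    messages = [
        {"role": "system", "content": SCOPING_SYSTEM_PROMPT},
        {"role": "user", "content": user_query},
    ]
    for _ in range(max_turns):
        reply = ask_model(messages)
        messages.append({"role": "assistant", "content": reply})
        if reply.startswith("BRIEF:"):  # completion convention (an assumption)
            return reply.removeprefix("BRIEF:").strip()
        answer = get_user_answer(reply)
        messages.append({"role": "user", "content": answer})
    # Turn budget exhausted: force the agent to commit to a brief.
    messages.append({"role": "user", "content": "Produce the brief now."})
    return ask_model(messages).removeprefix("BRIEF:").strip()
```

The `max_turns` cap keeps scoping itself from becoming the token sink it is meant to prevent.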

Integration with Human Cognition

Scoping mirrors Human Metacognition - the awareness and regulation of one’s own thinking. Effective human researchers don’t immediately start gathering information. They first clarify what they’re trying to understand, why it matters, and what success looks like. This metacognitive awareness shapes efficient investigation.

The scoping dialogue makes this metacognitive process explicit and collaborative. Users might not initially know exactly what they want - clarification questions help them discover their actual needs. This connects to principles from OODA Loop, where orientation - making sense of observations through existing mental models - precedes decision and action.

By forcing explicit scoping, the system prevents premature optimization where research executes efficiently but targets the wrong questions. Better to spend tokens on a scoping dialogue that surfaces true intent than to waste a 15x token budget researching irrelevant topics.

Evolution and Refinement

Early scoping attempts might produce briefs that still prove too vague during research. This creates a feedback loop where research execution reveals scope gaps. Future systems could employ:

Iterative Refinement: Research supervisor identifies scope ambiguities and requests clarification before proceeding. “The brief asks about ‘ML applications in medicine’ but doesn’t specify whether to include drug discovery.” This mid-research clarification prevents wasted exploration.

Example-Based Scoping: Rather than abstract questions, present concrete example briefs for different research types. Users select the closest match, then refine. This anchors scoping dialogue in concrete patterns rather than abstract discussion.

Learned Scoping: Models fine-tuned on successful (query, dialogue, brief, research) tuples learn which questions reveal useful information. This specialization produces better scoping through pattern recognition across many research tasks.
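Example-based scoping in particular lends itself to a simple sketch: show a few template briefs and let the user pick the closest as a refinement starting point. The template names and contents below are illustrative assumptions:

```python
# Hypothetical template briefs keyed by research type; a real system would
# curate these from successful past briefs.
TEMPLATE_BRIEFS = {
    "decision support": "Depth: deep dive on 2-3 options\nOutput: recommendation memo",
    "learning resource": "Depth: balanced survey\nOutput: tutorial-style explainer",
    "technical evaluation": "Depth: deep dive with implementation detail\nOutput: feasibility report",
}

def start_from_template(choice: str, topic: str) -> str:
    """Instantiate a template brief for a topic; refinement dialogue follows."""
    return f"Research Topic: {topic}\n{TEMPLATE_BRIEFS[choice]}"
```

Selection plus refinement replaces open-ended questioning with editing a concrete artifact, which is usually easier for users.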

The fundamental insight persists: clarification dialogue that produces structured briefs prevents irrelevant research. Scoping is not overhead to minimize - it’s essential architecture that determines whether subsequent research delivers value.