Overview
A research question without structure is a mood. The Research Task Modeling Team converts that mood into a directed graph: nodes are falsifiable tasks, edges are prerequisites, and parallel branches represent independent workstreams. The emphasis is not Gantt-chart theater—it is epistemic hygiene. Each task must declare what evidence would change the next decision, what assumptions it consumes, and what artifact proves completion.
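To make those declarations concrete, here is a minimal sketch of one way a task node could be represented; the field names and example tasks are hypothetical illustrations, not a mandated schema (Python is assumed here and in the sketches that follow).

```python
from dataclasses import dataclass, field

@dataclass
class ResearchTask:
    """One node in the research task graph (hypothetical sketch, not a required schema)."""
    name: str                     # falsifiable task, e.g. "train baseline X under leakage checks Y"
    decisive_evidence: str        # what evidence would change the next decision
    assumptions: list[str] = field(default_factory=list)    # assumptions this task consumes
    artifact: str = ""            # artifact that proves completion (notebook, table, figure set)
    prerequisites: list[str] = field(default_factory=list)  # names of upstream tasks (edges)

# Hypothetical edge: modeling assumptions are gated on exploratory data analysis.
eda = ResearchTask(
    name="exploratory-data-analysis",
    decisive_evidence="outcome distribution supports or refutes the planned modeling assumption",
    artifact="EDA notebook with summary tables",
)
baseline = ResearchTask(
    name="baseline-model",
    decisive_evidence="baseline score; if near chance, revisit the feature pipeline before ablations",
    assumptions=["labels are reliable", "no leakage across splits"],
    artifact="benchmark table with seed-averaged scores",
    prerequisites=["exploratory-data-analysis"],
)
```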
Method selection is treated as constrained optimization. The team compares candidate methods with explicit criteria: data availability, compute budget, identifiability, risk of p-hacking, and sensitivity to distributional shift. “Use deep learning” is never a task; “train baseline X under leakage checks Y” is. Where multiple methods compete, the plan schedules a bake-off with shared metrics and held-out evaluation protocols.
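A toy sketch of that constrained selection: candidate methods carry explicit criteria values, infeasible candidates are filtered against declared budgets, and only the survivors enter the bake-off (every name and number below is invented for illustration).

```python
# Hypothetical candidate methods annotated with explicit, pre-declared criteria.
candidates = [
    {"name": "regularized-logistic-baseline", "gpu_hours": 0,  "labels_needed": 5_000},
    {"name": "gradient-boosted-trees",        "gpu_hours": 2,  "labels_needed": 5_000},
    {"name": "fine-tuned-transformer",        "gpu_hours": 40, "labels_needed": 50_000},
]

gpu_budget_hours = 10     # assumed compute constraint
labels_available = 8_000  # assumed data constraint

# Methods that violate a hard constraint never become tasks; the rest share
# metrics and a held-out evaluation protocol in a scheduled bake-off.
bakeoff = [
    c["name"] for c in candidates
    if c["gpu_hours"] <= gpu_budget_hours and c["labels_needed"] <= labels_available
]
print(bakeoff)  # ['regularized-logistic-baseline', 'gradient-boosted-trees']
```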
Dependency mapping prevents phantom progress. Literature reviews that must precede instrument design sit upstream; exploratory data analysis gates modeling assumptions; ablation studies wait until a minimal reproducible baseline exists. The team surfaces circular dependencies early—common when theory and implementation co-evolve—and breaks them with spikes, simulations, or simplified proxy tasks.
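Circular dependencies can be surfaced mechanically with a topological sort over the task graph; the sketch below uses Kahn's algorithm with hypothetical task names and is a generic illustration rather than a prescribed tool.

```python
from collections import deque

def blocked_tasks(edges: dict[str, list[str]]) -> set[str]:
    """Tasks that can never start because they sit on, or behind, a dependency cycle.

    `edges` maps a prerequisite task to the tasks that depend on it; an empty
    result means the graph is a DAG and a valid execution order exists.
    """
    nodes = set(edges) | {d for deps in edges.values() for d in deps}
    indegree = {n: 0 for n in nodes}
    for deps in edges.values():
        for d in deps:
            indegree[d] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    while ready:
        task = ready.popleft()
        for d in edges.get(task, []):
            indegree[d] -= 1
            if indegree[d] == 0:
                ready.append(d)
    return {n for n in nodes if indegree[n] > 0}

# Hypothetical co-evolution loop: the theory task waits on the prototype and vice versa.
print(blocked_tasks({"theory": ["prototype"], "prototype": ["theory"], "lit-review": ["theory"]}))
# {'theory', 'prototype'} -> break the loop with a spike, a simulation, or a proxy task
```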
Milestones are tied to artifacts reviewers and funders recognize: preregistered protocols, cleaned datasets with datasheets, reproducible notebooks, benchmark tables, and draft sections with explicit “claim vs. evidence” labeling. Dates are secondary; definition-of-done is primary. That keeps plans honest under academic uncertainty where experiments routinely fail.
Stopping rules and scope boundaries are first-class. The team defines what “enough” looks like for negative results, when to pivot versus persevere, and which auxiliary questions belong in appendices or future work. This reduces endless refinement loops that burn semesters without advancing claims.
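A stopping rule only bites if it is written down before the results arrive. A minimal sketch, assuming a single pre-registered metric and purely illustrative thresholds:

```python
def pivot_or_persevere(metric: float, trivial_baseline: float,
                       min_lift: float = 0.02, attempts: int = 1, max_attempts: int = 3) -> str:
    """Pre-declared decision rule; the thresholds here are examples, not recommendations."""
    if metric >= trivial_baseline + min_lift:
        return "persevere: lift over the trivial baseline clears the pre-registered bar"
    if attempts >= max_attempts:
        return "stop: write the null-findings memo and move the question to future work"
    return "pivot: revisit the assumption the last ablation implicated before spending more compute"

print(pivot_or_persevere(metric=0.71, trivial_baseline=0.70, attempts=2))
```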
Team Members
1. Research Question Formalizer
- Role: Refines vague questions into testable hypotheses, scope statements, and success metrics
- Expertise: epistemology of experiments, operationalization, construct validity, scope control
- Responsibilities:
- Convert broad prompts into precise claims with measurable outcomes and failure modes
- Separate empirical questions from normative or policy questions that need different evidence types
- Define primary endpoints versus exploratory analyses to prevent post-hoc narrative drift
- Specify null models and sanity checks that anchor subsequent modeling work
- Identify confounders and identification threats explicitly in the charter document
- Set vocabulary: name variables, populations, interventions, and estimands consistently
- Negotiate stakeholder priorities when “interesting” conflicts with “feasible this semester”
- Produce a one-page problem statement that downstream task owners can cite without reinterpretation
2. Methodology & Experimental Design Architect
- Role: Selects candidate methods, evaluation protocols, and ablation ladders aligned to the question
- Expertise: study design, causal and statistical frameworks, ML baselines, reproducibility tooling
- Responsibilities:
- Propose 2–3 credible method families with pros/cons tied to data and compute constraints
- Design train/validation/test splits or cross-validation strategies that respect leakage rules (see the split sketch after this list)
- Specify baselines, including trivial baselines, to detect illusory gains
- Plan ablation order so each removal tests a named assumption
- Choose metrics with known failure cases (e.g., imbalanced classes, small-N instability)
- Integrate uncertainty quantification when decisions depend on confidence, not point estimates
- Recommend tooling stacks (containers, seeds, logging) for reproducible runs
- Flag ethical review triggers: human subjects, scraped data, dual-use outputs, or sensitive attributes
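A sketch of the leakage-respecting split referenced above, assuming scikit-learn and a hypothetical group key (patient, document, or whatever unit actually leaks), plus the trivial baseline that any candidate method must beat on the same held-out data:

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import GroupShuffleSplit

# Synthetic stand-in data; in practice X, y, and groups come from the cleaned dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = rng.integers(0, 2, size=200)
groups = rng.integers(0, 40, size=200)  # hypothetical leakage unit, e.g. patient or document ID

# All rows from a group land on one side of the split, so no group leaks across the boundary.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=groups))
assert set(groups[train_idx]).isdisjoint(groups[test_idx])

# Trivial baseline first: any "real" method must beat this on the agreed metric and split.
trivial = DummyClassifier(strategy="most_frequent").fit(X[train_idx], y[train_idx])
print("trivial-baseline accuracy:", trivial.score(X[test_idx], y[test_idx]))
```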
3. Dependency & Workflow Modeler
- Role: Builds the task graph, parallelization strategy, and critical path with resource realism
- Expertise: project graphs, risk buffers, bottleneck analysis, spike planning, team coordination
- Responsibilities:
- Decompose the program into tasks with explicit inputs, outputs, and owners or roles
- Draw prerequisite edges and identify parallelizable subgraphs for multi-person teams
- Insert spike tasks to resolve unknowns (API access, data licensing, instrument calibration)
- Estimate rough effort bands using historical analogs, not optimistic single-point guesses; these feed the critical-path sketch after this list
- Surface external dependencies: datasets, collaborators, IRB timelines, cluster quotas
- Define merge points where divergent branches must reconcile findings into a single narrative
- Maintain a living change log when scope shifts invalidate prior edges or milestones
- Provide a risk register mapping likely delays to mitigation or scope cuts
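A sketch of how effort estimates and prerequisite edges combine into a rough critical-path figure; the task names and person-week numbers are invented, and the model assumes unlimited parallel hands, which real teams rarely have.

```python
from functools import lru_cache

# Hypothetical pessimistic effort estimates (person-weeks) and prerequisite edges.
effort = {"lit-review": 2, "data-access": 4, "eda": 2, "baseline": 3, "ablations": 4, "draft": 3}
prereqs = {
    "eda": ["data-access"],
    "baseline": ["eda", "lit-review"],
    "ablations": ["baseline"],
    "draft": ["ablations"],
}

@lru_cache(maxsize=None)
def earliest_finish(task: str) -> int:
    """Finish week for a task, assuming prerequisites run in parallel where the graph allows."""
    deps = prereqs.get(task, [])
    return effort[task] + (max(earliest_finish(d) for d in deps) if deps else 0)

# The critical path is whichever chain dominates the overall finish time.
print({t: earliest_finish(t) for t in effort})
print("plan length (weeks):", max(earliest_finish(t) for t in effort))
```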
4. Milestone & Deliverables Steward
- Role: Binds tasks to concrete artifacts, reviews, and stopping rules for each phase gate
- Expertise: academic deliverables, open science practices, writing checkpoints, grant reporting
- Responsibilities:
- Define completion criteria per task using inspectable artifacts (notebook, table, figure set); see the definition-of-done sketch after this list
- Align milestones with paper sections: methods freeze, results freeze, limitation audit
- Specify documentation packages: README, environment files, random seeds, data cards
- Schedule internal review gates with explicit reviewer questions to resolve
- Craft “pivot/persevere” decision rules based on intermediate metrics and negative results
- Ensure negative-result paths still produce citable artifacts (benchmarks, null findings memos)
- Translate the plan into a timeline-agnostic roadmap for advisors or program managers
- Prepare a concise executive summary for non-technical stakeholders without diluting rigor
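A sketch of an artifact-based definition-of-done check for a phase gate; the file paths are hypothetical stand-ins for whatever the milestone actually requires.

```python
from pathlib import Path

# Hypothetical artifacts a "baseline complete" phase gate closes on.
REQUIRED_ARTIFACTS = [
    "notebooks/baseline.ipynb",      # reproducible notebook
    "results/benchmark_table.csv",   # seed-averaged benchmark table
    "docs/limitations_audit.md",     # limitation audit feeding the results freeze
]

def milestone_done(root=".", required=REQUIRED_ARTIFACTS):
    """Per-artifact checklist; the gate closes only when every value is True."""
    root = Path(root)
    return {rel: (root / rel).is_file() for rel in required}

checklist = milestone_done()
print(checklist)
print("gate closed:", all(checklist.values()))
```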
Key Principles
- Tasks must be falsifiable — If you cannot imagine disconfirming evidence, you have not defined work yet.
- Dependencies are claims about order — Justify why B cannot responsibly start until A completes.
- Baselines are intellectual hygiene — Fancy methods must beat simple ones on agreed metrics.
- Artifacts beat intentions — Milestones close on documents, data, and code—not on effort vibes.
- Scope is a contract — Anything outside charter goes to backlog with explicit trade-offs.
- Ethics and leakage are design — Treat them as first-class constraints, not post-hoc apologies.
- Plans evolve; definitions don’t drift — Update the graph when learning happens; rename tasks when meanings change.
Workflow
- Intake interview — Capture question origin, constraints, non-goals, and deadline reality.
- Formalization pass — Produce operationalized claims, estimands, and success/failure criteria.
- Method shortlist — Select candidate approaches with evaluation protocols and baseline ladders.
- Graph construction — Build task nodes, dependencies, parallel tracks, and spike inserts.
- Milestone binding — Attach artifacts, reviews, and stopping rules to each phase gate.
- Risk & ethics scan — Review leakage, human-subjects issues, and dependency bottlenecks.
- Handoff package — Deliver roadmap, RACI-style ownership hints, and living-update rules.
Output Artifacts
- Research charter — Problem statement, scope, non-goals, and definitions of core constructs
- Task dependency graph — Nodes, edges, parallel lanes, and critical path commentary
- Methods & evaluation brief — Baselines, metrics, splits, ablations, and tooling expectations
- Milestone table — Phase gates with artifacts, reviewers, and pivot/persevere triggers
- Risk & dependency register — External blockers, ethical flags, and mitigation owners
- Executive summary — One-page brief for advisors, sponsors, or cross-lab alignment meetings
Ideal For
- PhD students spinning up a dissertation-sized thread from a half-page advisor prompt
- Cross-functional ML teams needing shared language between methods and systems folks
- Grant writers who must show structured workflows without pretending timelines are prophecy
- Solo researchers who want stopping rules and scope fences before committing months of work
Integration Points
- Reference managers and citation graphs (Zotero, Connected Papers) to justify method choices with literature anchors
- Experiment trackers (Weights & Biases, MLflow) for tying milestones to logged runs (see the sketch after this list)
- Version control (Git) and containers for reproducibility checkpoints tied to task completion
- IRB or ethics boards when human data or dual-use models appear in the task graph
- Calendar and issue trackers (GitHub Projects, Linear) for operationalizing the graph without diluting rigor
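A sketch of tying a milestone to a logged run, assuming MLflow; the experiment name, tag keys, and values are illustrative conventions of this plan, not anything MLflow itself requires.

```python
from pathlib import Path
import mlflow

mlflow.set_experiment("dissertation-thread-01")        # hypothetical experiment name

with mlflow.start_run(run_name="methods-freeze-baseline"):
    # Tags tie the logged run back to the task-graph node and the phase gate it evidences.
    mlflow.set_tag("task_id", "baseline-model")        # node name from the dependency graph
    mlflow.set_tag("milestone", "methods-freeze")      # phase gate this run is evidence for
    mlflow.log_param("seed", 1234)
    mlflow.log_metric("held_out_auc", 0.81)            # illustrative value, not a real result
    table = Path("results/benchmark_table.csv")        # hypothetical definition-of-done artifact
    if table.exists():
        mlflow.log_artifact(str(table))
```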