Reasoning Glossary

Modes of thinking — what to use when figuring out what is true. From the Modular Reasoning Playbook.

109 modules

A) Formal and mathematical reasoning

formal:deductive

Deductive reasoning (classical logical inference)

If the premises are true and the inference rules are valid, the conclusion must be true.
Outputs
Valid entailments; proofs/derivations; contradictions/counterexamples (via refutation).
How it differs
Truth-preserving and typically monotonic; it makes explicit what's already implicit.
Best for
Spec checking, compliance logic, crisp "must/shall" implications, formal arguments.
Failure mode
Garbage-in (false premises) or missing premises that matter in reality.
Pairs with quality:error-tracking to audit the chain, quality:assumption-audit to challenge premises.
formal:proof

Mathematical / proof-theoretic reasoning

Deduction where the proof object matters (what counts as a proof, how it's constructed).
Outputs
Formal proofs (sometimes machine-checkable); proof transformations.
How it differs
More structured than everyday deduction; emphasizes derivability and proof structure.
Best for
Formal methods, theorem proving, certified reasoning pipelines.
Failure mode
Proving the wrong theorem (spec mismatch) or proving something irrelevant to outcomes.
Pairs with formal:deductive for the inference steps, formal:counterexample for refutation attempts.
formal:constructive

Constructive (intuitionistic) reasoning

A proof of existence must provide a construction/witness; some classical principles are restricted.
Outputs
Proofs that often correspond to algorithms ("proofs as programs").
How it differs
Stronger link between "proved" and "computable."
Best for
Verified software, protocols, constructive math, "show me the witness."
Failure mode
Over-constraining when classical reasoning is acceptable and simpler.
Pairs with formal:proof for the formal structure, formal:type-theoretic for compositional guarantees.
formal:equational

Equational / algebraic reasoning (rewrite-based)

Transform expressions using equalities and rewrite rules while preserving meaning.
Outputs
Equivalent forms; normal forms; simplifications; invariants.
How it differs
Deduction specialized to symbol manipulation; often the everyday workhorse in math/CS.
Best for
Refactoring, optimization proofs, dimensional reasoning scaffolds, invariant manipulation.
Failure mode
Unsound rewrite rules or implicit domain restrictions (division by zero, overflow).
Pairs with formal:proof for rigorous verification, formal:constraint for checking feasibility.
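
A minimal Python sketch of the rewrite idea (the tuple encoding and the three identity rules are illustrative assumptions, not from the playbook):

```python
# Expressions as nested tuples: ("add", x, y), ("mul", x, y), numbers, names.

def rewrite(expr):
    """Apply the first matching rewrite rule, recursing into subterms."""
    if not isinstance(expr, tuple):
        return expr
    op, a, b = expr                      # assumes binary operators only
    a, b = rewrite(a), rewrite(b)
    if op == "add" and b == 0:           # x + 0 -> x
        return a
    if op == "mul" and b == 1:           # x * 1 -> x
        return a
    if op == "mul" and b == 0:           # x * 0 -> 0 (sound only if x is defined)
        return 0
    return (op, a, b)

def normalize(expr):
    """Rewrite to a fixpoint: the normal form under these rules."""
    while True:
        nxt = rewrite(expr)
        if nxt == expr:
            return expr
        expr = nxt

print(normalize(("add", ("mul", "x", 1), ("mul", "y", 0))))  # -> x
```

The "sound only if x is defined" caveat is exactly the failure mode above: a rule that ignores its implicit domain restriction silently changes meaning.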
formal:model-theoretic

Model-theoretic / semantic reasoning

Reason by constructing/analyzing models that satisfy a theory (true in all models vs some).
Outputs
Satisfiable/unsatisfiable; countermodels; interpretations.
How it differs
Complements proof-theory: instead of "derive," you "build a world where it holds/doesn't."
Best for
Consistency checks, finding hidden assumptions, generating counterexamples.
Failure mode
Model doesn't match the intended semantics of the real system.
Pairs with formal:constraint for systematic satisfiability, formal:counterexample for refinement.
formal:constraint

Constraint / satisfiability reasoning (SAT/SMT/CSP)

Encode requirements as constraints and solve for assignments that satisfy them (or prove none exist).
Outputs
A satisfying assignment; unsat certificate; minimal unsat cores; counterexamples.
How it differs
It can implement deduction, but the "mode" is solve-by-consistency rather than argument-by-argument inference.
Best for
Scheduling, configuration, verification, policy enforcement, feasibility checks.
Failure mode
Poor encoding (missed constraints) leads to false confidence.
Pairs with formal:model-theoretic for semantic analysis, practical:optimization for finding the best solution.
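
The solve-by-consistency loop in miniature. A brute-force sketch: real systems hand the same shape of problem to a SAT/SMT/CSP solver; the tasks, slots, and constraints here are made up.

```python
from itertools import product

# Toy problem: assign three tasks to hour slots 0-3 so that
# (1) no two tasks share a slot, (2) "deploy" comes after "build",
# (3) "review" is not in slot 0.

tasks = ["build", "review", "deploy"]
constraints = [
    lambda a: len(set(a.values())) == len(a),   # all-different
    lambda a: a["deploy"] > a["build"],         # ordering
    lambda a: a["review"] != 0,                 # exclusion
]

def solve():
    """Enumerate assignments; return one that satisfies every constraint."""
    for slots in product(range(4), repeat=len(tasks)):
        assignment = dict(zip(tasks, slots))
        if all(c(assignment) for c in constraints):
            return assignment       # a satisfying assignment
    return None                     # "unsat": no assignment exists

print(solve())
```

Swapping the enumeration for a solver changes the scale, not the shape: constraints in; a satisfying assignment or "unsat" out.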
formal:type-theoretic

Type-theoretic reasoning

Use types (including dependent/refinement types) to enforce invariants; "propositions as types" in some systems.
Outputs
Type derivations; well-typed programs; compositional guarantees.
How it differs
Reasoning is integrated into construction; great for modular correctness.
Best for
API design, correctness-by-construction, safe composition of large systems.
Failure mode
Fighting the type system instead of clarifying the spec it encodes.
Pairs with formal:constructive for algorithmic correspondence, formal:constraint for checking feasibility.
formal:counterexample

Counterexample-guided reasoning (CEGAR-style)

Propose an abstraction; check; if a counterexample appears, refine the abstraction and repeat.
Outputs
Either a proof of property or a concrete counterexample; refined models.
How it differs
It's a loop blending deduction + model checking + refinement, built for scalability.
Best for
Verification, security properties, systems where full modeling is too expensive.
Failure mode
Endless refinement loops if the abstraction boundary is poorly chosen.
Pairs with formal:model-theoretic for countermodel analysis, metalevel:meta-reasoning for deciding when to stop refining.

B) Ampliative reasoning (conclusions go beyond the premises)

ampliative:inductive

Inductive reasoning (generalization)

Infer general patterns from observations ("many observed A are B -> probably all A are B").
Outputs
General rules, trends, predictors.
How it differs
Not truth-preserving; new data can overturn it.
Best for
Learning from experience, early-stage pattern discovery, forming priors.
Failure mode
Overgeneralizing from small/biased samples.
Pairs with metalevel:adversarial to stress-test the rule, ampliative:analogical to check if the pattern holds in other domains.
ampliative:statistical

Statistical reasoning (frequentist style)

Inference about populations from samples via estimators, confidence intervals, tests, error rates.
Outputs
Effect estimates + uncertainty statements tied to sampling procedures.
How it differs
Typically avoids "probability of hypotheses"; emphasizes long-run properties of procedures.
Best for
Experiments, A/B tests, QA, inference under repeated-sampling assumptions.
Failure mode
P-value worship; confusing "no evidence" with "evidence of no effect."
Pairs with uncertainty:bayesian for complementary inference, domain:experimental for design.
uncertainty:bayesian

Bayesian probabilistic reasoning (credences + updating)

Represent degrees of belief as probabilities and update them with evidence (Bayes' rule).
Outputs
Posterior beliefs; predictive distributions; uncertainty-aware forecasts.
How it differs
Probability as rational credence management; coherence arguments motivate consistency.
Best for
Integrating prior knowledge + data, diagnosis, forecasting, online learning.
Failure mode
Overconfident priors or "making up" priors without sensitivity analysis.
Pairs with ampliative:abductive to generate hypotheses, domain:experimental to find the decisive test.
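
The update itself is one line of arithmetic. A sketch with assumed numbers (1% base rate, a test with 90% sensitivity and a 5% false-positive rate):

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """P(H | E) via Bayes' rule: prior odds reweighted by the evidence."""
    evidence = prior * p_e_given_h + (1 - prior) * p_e_given_not_h
    return prior * p_e_given_h / evidence

p = posterior(prior=0.01, p_e_given_h=0.90, p_e_given_not_h=0.05)
print(round(p, 3))  # one positive test moves a 1% prior to ~15%
```

The jump to roughly 15%, not 90%, is the point: the base rate does most of the work, which is why made-up priors without sensitivity analysis are the listed failure mode.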
ampliative:ensemble

Ensemble Sampling

Generate multiple independent reasoning paths and find the consensus, reducing dependence on any single line of reasoning.
Outputs
Multiple reasoning paths with consensus analysis and crux identification.
How it differs
Not a single inference but a portfolio of inferences; reveals where reasoning is robust vs fragile.
Best for
High-stakes questions where a single reasoning path might be misleading.
Failure mode
Paths that are superficially different but share hidden assumptions.
Pairs with metalevel:adversarial to attack the consensus, research:contradiction-resolver if paths conflict.
ampliative:likelihood

Likelihood-based reasoning (comparative support)

Compare how well hypotheses predict observed data via likelihoods, without necessarily committing to priors.
Outputs
Likelihood ratios; relative evidential support rankings.
How it differs
Separates "data support" from "belief after priors"; sits between Bayesian and frequentist idioms.
Best for
Model comparison, forensic evidence strength, hypothesis triage.
Failure mode
Ignoring base rates/priors entirely when they matter for decisions.
Pairs with uncertainty:bayesian for full posterior analysis, ampliative:abductive for generating hypotheses to compare.
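
A sketch of comparative support with invented data (8 heads in 10 flips) and two candidate coins, fair vs 0.7-biased:

```python
from math import comb

def binom_likelihood(p, heads, flips):
    """P(data | hypothesis) for a coin with heads-probability p."""
    return comb(flips, heads) * p**heads * (1 - p)**(flips - heads)

l_fair   = binom_likelihood(0.5, 8, 10)
l_biased = binom_likelihood(0.7, 8, 10)
ratio = l_biased / l_fair
print(round(ratio, 2))  # ~5.3: the data support "biased" over "fair"
# This ranks evidential support only; it says nothing about base rates,
# which is exactly the gap flagged in the failure mode above.
```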
ampliative:abductive

Abductive reasoning (inference to the best explanation)

From observations, propose a hypothesis that would best explain them.
Outputs
Candidate explanations/models; "best current story."
How it differs
Unlike induction (generalizing frequencies), abduction introduces hidden mechanisms/causes; unlike deduction, it's not guaranteed.
Best for
Hypothesis generation, incident triage, diagnosis, scientific discovery.
Failure mode
"Story bias" (choosing the most appealing explanation, not the most supported).
Pairs with domain:experimental to design the test, research:evidence-table to ground hypotheses in sources.
ampliative:divergent

Divergent Brainstorm + Prune

Generate many alternatives, then evaluate ruthlessly. Separate generation from evaluation -- quantity first, quality second.
Outputs
Scored idea list with top selections and one surprise salvage.
How it differs
Prioritizes range over depth; explicitly includes impractical ideas to avoid premature convergence.
Best for
Innovation, creative problem-solving, finding non-obvious angles for posts.
Failure mode
Evaluation criteria that are too conservative, killing novel ideas.
Pairs with metalevel:adversarial to stress-test the top 3, research:comparative-matrix to compare them systematically.
ampliative:analogical

Analogical reasoning (structure mapping)

Transfer relational structure from a known domain/case to a new one (often deeper than surface similarity).
Outputs
Candidate inferences; adapted solutions; conceptual models/metaphors.
How it differs
Often particular -> particular transfer; frequently seeds abduction ("maybe it works like...").
Best for
Innovation, design, teaching, cross-domain problem solving.
Failure mode
False analogies (shared surface traits, different causal structure).
Pairs with ampliative:inductive to check if the transferred solution generalizes. Audience Playbook: audience:jtbd to test if the analogy resonates with readers; audience:mental-models to verify the source domain is one the audience already inhabits.
ampliative:case-based

Case-based reasoning (exemplar retrieval + adaptation)

Retrieve similar past cases and adapt their solutions.
Outputs
Proposed solution justified by precedent; playbook actions.
How it differs
More operational than analogy: emphasizes retrieval metrics + adaptation operators + case libraries.
Best for
Law (precedent), customer support, clinical decision support, ops playbooks.
Failure mode
Cargo-culting: applying precedent without checking context changes.
Pairs with ampliative:analogical for deeper structural mapping, revision:defeasible for handling exceptions.
ampliative:explanation-based

Explanation-based learning / reasoning

Use an explanation of why a solution works to generalize a reusable rule/plan.
Outputs
Generalized strategies with an explanatory justification.
How it differs
It generalizes like induction but is guided/validated by deductive explanation.
Best for
Turning expert solutions into SOPs; reducing overfitting to anecdotes.
Failure mode
Explanations that are internally elegant but empirically wrong.
Pairs with ampliative:inductive for pattern validation, causal:mechanistic for deeper explanation.
ampliative:simplicity

Simplicity / compression reasoning (Occam, MDL)

Prefer hypotheses that explain data with fewer assumptions / shorter descriptions, balancing fit vs complexity.
Outputs
Bias toward simpler models; complexity penalties; regularization choices.
How it differs
It's a selection principle across hypotheses; often paired with abduction and statistics.
Best for
Model selection, avoiding overfitting, choosing parsimonious policies.
Failure mode
Oversimplifying when the world is genuinely complex/nonlinear.
Pairs with ampliative:abductive for generating candidates, uncertainty:bayesian for formal model comparison.
ampliative:reference-class

Reference-class / "outside view" reasoning

Predict by comparing to a base rate distribution of similar past projects/cases ("what usually happens?").
Outputs
Base-rate forecasts; adjustment factors.
How it differs
It's an inductive method designed to counter planning fallacy and inside-view optimism.
Best for
Project timelines, budgets, risk forecasting, portfolio-level planning.
Failure mode
Choosing the wrong reference class (too broad or too narrow).
Pairs with metalevel:calibration for confidence checking, uncertainty:bayesian for formal updating.
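
A sketch of the outside view, with an invented reference class of overrun ratios (actual/estimated duration) from similar past projects:

```python
past_overruns = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.8, 2.0, 2.4]

def percentile(data, q):
    """Nearest-rank percentile over sorted data."""
    data = sorted(data)
    k = max(0, min(len(data) - 1, round(q * (len(data) - 1))))
    return data[k]

inside_view_weeks = 10                             # the team's own estimate
median_overrun = percentile(past_overruns, 0.5)    # what usually happens
p80_overrun    = percentile(past_overruns, 0.8)    # a cautious planning number

print(inside_view_weeks * median_overrun,          # outside-view forecast
      inside_view_weeks * p80_overrun)             # buffered plan
```

The adjustment factor comes from the class, not the plan; picking the class (which projects count as "similar"?) is where the listed failure mode lives.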
ampliative:fermi

Fermi / order-of-magnitude reasoning

Rough quantitative estimates via decomposition and bounding.
Outputs
Back-of-the-envelope estimates; upper/lower bounds; sensitivity drivers.
How it differs
A heuristic quantitative mode: aims for scale correctness rather than precision.
Best for
Early feasibility, sanity checks, identifying dominant terms.
Failure mode
Hidden unit mistakes or implicit assumptions left untested.
Pairs with ampliative:reference-class for grounding, metalevel:calibration for confidence assessment.
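
The classic worked example, decomposed so every assumption is a visible, challengeable round number:

```python
# Fermi estimate: how many piano tuners are in a city of 3 million?
# Every figure below is an assumed round number; the goal is the
# order of magnitude, not precision.

population        = 3_000_000
people_per_house  = 2       # -> households
houses_per_piano  = 20      # 1 in 20 households owns a piano
tunings_per_year  = 1       # each piano tuned about once a year
tunings_per_day   = 4       # one tuner services ~4 pianos a day
work_days         = 250     # working days per year

pianos         = population / people_per_house / houses_per_piano
tunings_needed = pianos * tunings_per_year
tuner_capacity = tunings_per_day * work_days
tuners         = tunings_needed / tuner_capacity
print(round(tuners))  # -> 75
```

The answer is not "75"; it is "tens, not thousands", plus an explicit list of assumptions to attack (which also surfaces the hidden-unit mistakes named in the failure mode).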

C) Reasoning under uncertainty and incomplete knowledge (representations)

uncertainty:probabilistic-logic

Probabilistic logic (logic + probabilities)

Blend logical structure (relations/rules/quantifiers) with probabilistic uncertainty (e.g., probabilistic programming + constraints).
Outputs
Probabilistic inferences over structured worlds; uncertain rule consequences.
How it differs
More expressive than plain Bayesian models for relational domains; more uncertainty-aware than pure logic.
Best for
Knowledge graphs with uncertainty; uncertain policies; relational prediction.
Failure mode
"Model soup": too much expressiveness leads to hard-to-validate, brittle inference.
Pairs with uncertainty:bayesian for the probabilistic updates, formal:constraint for the logical structure.
uncertainty:imprecise

Imprecise probability / interval probability

Represent uncertainty with ranges of probabilities when precision isn't justified.
Outputs
Bounds on beliefs and decisions; sensitivity analyses.
How it differs
Less committal than a single prior/posterior; separates "unknown" from "unlikely."
Best for
High-stakes decisions with weak priors; governance/risk; robustness checks.
Failure mode
Paralysis ("ranges are wide, so we can't decide") -- needs decision rules.
Pairs with practical:robust for worst-case reasoning, practical:value-of-information for deciding what to learn.
uncertainty:evidential

Evidential reasoning (Dempster-Shafer / belief functions)

Allocate "mass" to sets of possibilities; combine evidence into belief/plausibility intervals.
Outputs
Belief + plausibility ranges; fused evidence from multiple sources.
How it differs
Can represent partial support for sets (not point hypotheses) more directly than standard probability.
Best for
Multi-source fusion, ambiguous evidence, partial identification.
Failure mode
Misusing combination rules when sources aren't independent.
Pairs with uncertainty:imprecise for interval-based reasoning, research:contradiction-resolver for handling source conflicts.
uncertainty:max-entropy

Maximum-entropy / information-theoretic reasoning

Choose distributions satisfying known constraints while assuming as little else as possible (maximize entropy).
Outputs
Principled default distributions; minimally committed priors under constraints.
How it differs
"Least-committal completion" rather than explanation.
Best for
Baselines, priors under constraints, principled defaults in modeling.
Failure mode
Wrong or underspecified constraints produce outputs that look "objective" but aren't.
Pairs with uncertainty:bayesian for updating from the max-entropy prior, uncertainty:imprecise for when even max-entropy is too committal.
uncertainty:qualitative

Qualitative probability / ranking-function reasoning (Spohn-style)

Replace numeric probabilities with ordinal "degree of disbelief" ranks; update by shifting ranks.
Outputs
Ordered plausibility levels; belief dynamics without precise probabilities.
How it differs
More structured than defaults, less numeric than Bayes; useful when only ordering is defensible.
Best for
Early-stage hypothesis ranking; reasoning with weak quantification.
Failure mode
Losing important magnitude information when magnitude actually matters.
Pairs with ampliative:abductive for hypothesis generation, uncertainty:bayesian for quantitative refinement.

D) Reasoning under vagueness and borderline concepts (graded predicates)

vagueness:fuzzy

Fuzzy reasoning / fuzzy logic (vagueness)

Truth is a degree (0-1) because predicates have blurred boundaries ("tall," "near," "high risk").
Outputs
Degrees of membership/truth; fuzzy rule outputs.
How it differs
Fuzzy truth is not probability: probability is uncertainty about crisp facts; fuzzy membership is graded applicability.
Best for
Control systems, scoring/rubrics, policies with soft thresholds.
Failure mode
Treating fuzzy scores like calibrated probabilities.
Pairs with vagueness:prototype for similarity-based categorization, vagueness:rough-set for feature-limited classification.
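
A sketch of graded truth; the 160-190 cm ramp and the min t-norm are illustrative choices, not canonical definitions:

```python
def tall(height_cm):
    """Degree of membership in 'tall': 0 below 160 cm, 1 above 190 cm."""
    return min(1.0, max(0.0, (height_cm - 160) / 30))

def fuzzy_and(a, b):
    """A common t-norm for conjunction: the minimum."""
    return min(a, b)

# 0.5 means "halfway to a clear case of tall" -- graded applicability,
# not a 50% chance that the person is tall (the failure mode above).
print(tall(175), fuzzy_and(tall(175), 0.9))
```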
vagueness:partial-logic

Many-valued and partial logics (true/false/unknown/undefined)

More than two truth values; explicitly represent "unknown" or "undefined."
Outputs
Inferences that track indeterminacy rather than forcing a binary choice.
How it differs
Often targets incompleteness more than vagueness.
Best for
Databases with nulls, partial specs, missingness-aware reasoning.
Failure mode
Conflating "unknown" with "false."
Pairs with formal:constraint for checking what holds under all completions, revision:belief-revision for updating when unknowns resolve.
vagueness:rough-set

Rough set reasoning (lower/upper approximations)

Approximate a concept by what is definitely in vs possibly in, given limited features/indiscernibility.
Outputs
Lower/upper bounds on classifications; boundary regions.
How it differs
Membership arises from granularity of observation, not degrees of truth.
Best for
Interpretability-focused classification; feature-limited domains.
Failure mode
Overconfidence about what's "definitely" in/out when features are weak.
Pairs with vagueness:fuzzy for graded membership, practical:satisficing for decisions under classification uncertainty.
vagueness:prototype

Prototype / similarity-based category reasoning

Categorize by similarity to prototypes/exemplars rather than strict necessary-and-sufficient definitions.
Outputs
Graded category judgments; typicality effects.
How it differs
Natural for human categories; complements fuzzy/rough by focusing on similarity geometry.
Best for
UX taxonomies, product categorization, human-facing labeling.
Failure mode
Hidden bias in prototypes; category drift over time.
Pairs with vagueness:fuzzy for graded membership, ampliative:analogical for cross-domain categorization.
vagueness:qualitative-physics

Qualitative reasoning (signs, monotone influences, qualitative physics)

Reason with qualitative states ("increasing," "decreasing," "positive influence") instead of exact numbers.
Outputs
Directional predictions; qualitative constraints; sanity checks.
How it differs
Not primarily uncertainty; it's coarse modeling for robustness and early design.
Best for
Early architecture, feedback reasoning, "does this trend make sense?" checks.
Failure mode
Missing nonlinear thresholds where sign reasoning breaks down.
Pairs with causal:systems-thinking for feedback loop analysis, causal:mechanistic for deeper explanation.

E) Reasoning with inconsistency, defaults, and changing information

revision:non-monotonic

Non-monotonic reasoning (commonsense with exceptions)

Adding information can retract previous conclusions ("birds fly" until "penguin").
Outputs
Default conclusions with explicit revision behavior.
How it differs
Classical deduction is monotonic; most real knowledge bases aren't.
Best for
Rule systems with exceptions, policies, "normally" knowledge.
Failure mode
Unclear priority rules leading to inconsistent or surprising behavior.
Pairs with revision:defeasible for explicit defeat relations, revision:belief-revision for principled change.
revision:default

Default / typicality reasoning

Use "normally/typically" rules overridden by more specific info.
Outputs
Typical conclusions; exception handling.
How it differs
Often categorical (default applies/doesn't) rather than numeric probabilities.
Best for
Ontologies, rule engines, SOPs with carve-outs.
Failure mode
Defaults become "facts" and stop being questioned.
Pairs with revision:non-monotonic for formal handling, metalevel:debiasing for questioning hidden defaults.
revision:defeasible

Defeasible reasoning (tentative conclusions + defeat relations)

Conclusions can be defeated by counterevidence or stronger rules; tracks priorities/strength.
Outputs
Warranted conclusions given competing reasons.
How it differs
More explicit about conflict resolution than plain defaults.
Best for
Compliance/policy, medical guidelines, conflicting requirements.
Failure mode
Priority schemes that encode politics rather than relevance.
Pairs with dialectical:argumentation for structured pro/con, revision:belief-revision for principled updating.
revision:belief-revision

Belief revision and belief update (AGM-style families)

Principles for revising an accepted belief set with new info, especially when inconsistent.
Outputs
Revised belief sets with minimal-change goals.
How it differs
Bayesian updating revises degrees; belief revision revises acceptance of propositions.
Best for
Knowledge management, requirements evolution, source reconciliation.
Failure mode
"Minimal change" preserves outdated core assumptions.
Pairs with revision:non-monotonic for commonsense applications, metalevel:reflective-equilibrium for coherence checking.
revision:paraconsistent

Paraconsistent reasoning (reasoning despite contradictions)

Tolerate contradictions without explosion (deriving everything).
Outputs
Controlled inferences from inconsistent data.
How it differs
Instead of immediately repairing inconsistency, it contains it.
Best for
Merging inconsistent sources, messy enterprise data, early incident response.
Failure mode
Never resolving contradictions that actually matter for action.
Pairs with research:contradiction-resolver for systematic reconciliation, revision:belief-revision for eventual resolution.
dialectical:argumentation

Argumentation theory (structured pro/con evaluation)

Build arguments and counterarguments; compute which claims stand given attack/defense relations.
Outputs
Accepted/warranted claims; rationale maps.
How it differs
Not just "derive consequences" but "evaluate competing reasons."
Best for
Governance, policy disputes, legal-style reasoning, stakeholder conflicts.
Failure mode
Mistaking "won the debate" for "is true" (argument strength vs reality).
Pairs with revision:defeasible for priority handling. Rhetoric Playbook: appeal:logos for the logical structure, appeal:ethos for the credibility framing of an argumentation case.
revision:assurance-case

Assurance-case / safety-case reasoning

Structured argument that a system is acceptably safe/secure/reliable, supported by evidence and subclaims (often tree-like).
Outputs
Safety case; risk arguments; evidence traceability.
How it differs
It's argumentation constrained by standards and evidence requirements; bridges formal and empirical reasoning.
Best for
Safety-critical systems, compliance audits, AI governance documentation.
Failure mode
Paper compliance (beautiful argument, weak evidence).
Pairs with metalevel:adversarial for stress-testing the case, causal:mechanistic for understanding failure modes.

F) Causal, counterfactual, explanatory, and dynamic reasoning

causal:inference

Causal inference (interventions vs observations)

Identify causal relations and predict effects of interventions (distinguish P(Y|X) vs P(Y|do(X))).
Outputs
Causal effect estimates; intervention predictions; adjustment sets.
How it differs
Correlation alone can't resolve confounding or direction; causal reasoning encodes structure assumptions.
Best for
Product impact, policy evaluation, root-cause analysis that must guide action.
Failure mode
Hidden confounders; unjustified causal assumptions.
Pairs with domain:experimental to design the test, causal:counterfactual to trace intervention consequences.
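
The P(Y|X) vs P(Y|do(X)) gap, computed exactly on a toy structural model with a confounder Z (all probabilities invented):

```python
p_z = 0.5                                  # P(Z=1); Z confounds X and Y
p_x_given_z = {0: 0.2, 1: 0.8}             # P(X=1 | Z=z)
p_y_given_xz = {(0, 0): 0.1, (0, 1): 0.4,  # P(Y=1 | X=x, Z=z)
                (1, 0): 0.3, (1, 1): 0.6}

def p_y_seeing_x(x):
    """P(Y=1 | X=x): conditioning on X shifts Z along with it."""
    num = den = 0.0
    for z in (0, 1):
        pz = p_z if z else 1 - p_z
        px = p_x_given_z[z] if x else 1 - p_x_given_z[z]
        num += pz * px * p_y_given_xz[(x, z)]
        den += pz * px
    return num / den

def p_y_doing_x(x):
    """P(Y=1 | do(X=x)): intervening leaves Z's distribution alone."""
    return sum((p_z if z else 1 - p_z) * p_y_given_xz[(x, z)] for z in (0, 1))

# Observation overstates the effect: high-Z units both take X and do
# well anyway. That gap is the confounding the entry warns about.
print(round(p_y_seeing_x(1), 2), round(p_y_doing_x(1), 2))
```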
causal:discovery

Causal discovery (learning causal structure)

Infer causal graph structure from data + assumptions (and ideally interventions).
Outputs
Candidate causal graphs; equivalence classes; hypotheses for experimentation.
How it differs
Causal inference assumes (some) structure; discovery tries to learn it.
Best for
Early-stage domains with unclear mechanisms; prioritizing experiments.
Failure mode
Overtrusting discovery outputs without validating assumptions (faithfulness, no hidden confounding, etc.).
Pairs with causal:inference for effect estimation given a graph, domain:experimental for designing the distinguishing experiment.
causal:counterfactual

Counterfactual reasoning ("what would have happened if...")

Evaluate alternate histories given a causal model.
Outputs
Counterfactual outcomes; blame/credit analyses; individualized explanations.
How it differs
Needs causal structure beyond pure statistics.
Best for
Postmortems, accountability, scenario evaluation, personalized decision support.
Failure mode
Confident counterfactuals from weak models.
Pairs with causal:inference for the underlying model, causal:mechanistic for understanding the propagation.
causal:mechanistic

Mechanistic reasoning (how it works internally)

Explain/predict by identifying parts and interactions.
Outputs
Mechanistic explanations; levers; failure modes.
How it differs
Stronger than correlation: gives actionable intervention points and generalizes when mechanisms hold.
Best for
Engineering, debugging, safety analysis, biology/medicine.
Failure mode
"Just-so mechanisms" that sound plausible but aren't validated.
Pairs with causal:inference for effect estimation, causal:systems-thinking for feedback dynamics.
causal:diagnostic

Diagnostic reasoning (effects -> causes under constraints)

Infer hidden faults/causes from symptoms using a fault/causal model plus uncertainty handling.
Outputs
Ranked causes; next-best tests; triage plans.
How it differs
Often abduction + Bayesian/likelihood updates, constrained by explicit fault models.
Best for
Incident response, troubleshooting, quality triage.
Failure mode
Premature closure (locking onto one cause too early).
Pairs with causal:mechanistic for understanding failure modes, practical:value-of-information for test prioritization.
causal:simulation

Model-based / simulation reasoning

Run an internal model (mental or computational) to predict consequences under scenarios.
Outputs
Scenario traces; sensitivity analyses; "what-if" results.
How it differs
Not proof-like; it's generative prediction from a specified model.
Best for
Complex systems, policy design, engineering dynamics, capacity planning.
Failure mode
Simulation overconfidence; unvalidated models.
Pairs with causal:counterfactual for alternate histories, practical:robust for worst-case analysis.
causal:systems-thinking

Systems thinking (feedback loops, delays, emergence)

Reason about interacting components over time: reinforcing/balancing loops, delays, unintended consequences.
Outputs
Causal loop diagrams; leverage points; dynamic hypotheses.
How it differs
Explicitly multi-level and dynamic; "local linear" reasoning often fails.
Best for
Org design, markets, reliability engineering, platform ecosystems.
Failure mode
Vague loop stories without measurable hypotheses.
Pairs with causal:mechanistic for component-level detail, vagueness:qualitative-physics for directional analysis.

G) Practical reasoning (choosing actions under constraints)

practical:means-end

Means-end / instrumental reasoning

From goals, derive actions/subgoals necessary or helpful to achieve them ("to get X, do Y").
Outputs
Action rationales; subgoals; dependency chains.
How it differs
About doing, not merely believing; feeds planning and decision theory.
Best for
Strategy decomposition, OKRs, operational planning.
Failure mode
Local means become ends ("process is the goal").
Pairs with formal:deductive for forward verification of the plan. Audience Playbook: audience:barriers to identify obstacles to reader engagement with the plan.
practical:decision

Decision-theoretic reasoning (utilities + uncertainty)

Combine beliefs with preferences/utilities to choose actions (e.g., expected utility).
Outputs
Option rankings; policies; explicit tradeoffs.
How it differs
Bayesian reasoning updates beliefs; decision theory adds values and consequences.
Best for
Portfolio choices, risk decisions, prioritization, pricing.
Failure mode
Utility mismatch (what you optimize isn't what you truly value).
Pairs with uncertainty:bayesian for probability estimates, practical:robust for worst-case analysis.
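
Beliefs times utilities in a sketch; the states, probabilities, and payoffs are all invented:

```python
belief = {"demand_high": 0.3, "demand_low": 0.7}   # P(state)

payoffs = {                                        # utility per (option, state)
    "build_big":   {"demand_high": 100, "demand_low": -40},
    "build_small": {"demand_high": 30,  "demand_low": 10},
}

def expected_utility(option):
    return sum(belief[s] * payoffs[option][s] for s in belief)

best = max(payoffs, key=expected_utility)
print({o: round(expected_utility(o), 1) for o in payoffs}, "->", best)
```

Change the utilities and the ranking can flip with the beliefs untouched, which is the point of the failure mode: the optimization is only as good as the utility table.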
practical:multi-criteria

Multi-criteria decision analysis (MCDA) / Pareto reasoning

Decide with multiple objectives (cost, speed, safety, equity), often using weights, outranking, or Pareto frontiers.
Outputs
Tradeoff surfaces; Pareto-efficient sets; transparent scoring models.
How it differs
Makes tradeoffs explicit instead of collapsing them implicitly into one objective.
Best for
Strategy, procurement, roadmap planning, governance.
Failure mode
Arbitrary weights hiding politics; false precision.
Pairs with strategic:negotiation for stakeholder alignment, practical:decision for expected-value analysis.
practical:planning

Planning / policy reasoning (sequences of actions)

Compute action sequences or policies achieving goals under constraints and dynamics.
Outputs
Plans, policies, contingencies, playbooks.
How it differs
Outputs a procedure, not a proposition.
Best for
Operations, project plans, incident response.
Failure mode
Plans that ignore uncertainty and execution reality.
Pairs with practical:means-end for goal decomposition, practical:robust for handling worst cases.
practical:optimization

Optimization reasoning

Choose the best solution relative to an objective subject to constraints.
Outputs
Optimal/near-optimal decisions; tradeoff curves; shadow prices.
How it differs
Constraint satisfaction asks "any feasible?"; optimization asks "best feasible."
Best for
Resource allocation, routing, scheduling, design tradeoffs.
Failure mode
Optimizing the wrong objective or ignoring unmodeled constraints.
Pairs with formal:constraint for feasibility, practical:multi-criteria for multi-objective problems.
practical:robust

Robust / worst-case reasoning (minimax, safety margins)

Choose actions that perform acceptably under worst plausible conditions or adversaries.
Outputs
Conservative policies; guarantees; buffer sizing.
How it differs
Expected-value optimizes averages; robust optimizes guarantees.
Best for
Safety-critical systems, security, compliance, tail-risk control.
Failure mode
Overconservatism (leaving too much value on the table).
Pairs with practical:decision for expected-value comparison, metalevel:adversarial for generating worst cases.
practical:minimax-regret

Minimax regret reasoning

Choose the action minimizing worst-case regret (difference from best action in hindsight).
Outputs
Regret-robust choices; hedged decisions.
How it differs
More compromise-oriented than strict worst-case utility; useful under ambiguity.
Best for
Strategy under deep uncertainty; irreversible decisions.
Failure mode
Regret framing that ignores asymmetric catastrophic outcomes.
Pairs with practical:robust for worst-case analysis, practical:decision for expected-value comparison.
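
A worked regret table (options, states, and payoffs invented): for each state, regret is the gap between the best payoff achievable in that state and the option's payoff; pick the option whose worst regret is smallest.

```python
payoffs = {                     # option -> payoff per future state
    "aggressive":   {"boom": 100, "bust": -50},
    "conservative": {"boom": 20,  "bust": 10},
    "hedged":       {"boom": 60,  "bust": -5},
}

states = ["boom", "bust"]
best_in_state = {s: max(p[s] for p in payoffs.values()) for s in states}

def worst_regret(option):
    return max(best_in_state[s] - payoffs[option][s] for s in states)

choice = min(payoffs, key=worst_regret)
print({o: worst_regret(o) for o in payoffs}, "->", choice)
```

Note the pick: "hedged" never wins outright, but it is never far from whatever turns out to have been best, which is the compromise character the entry describes.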
practical:satisficing

Satisficing (bounded rationality with stopping rules)

Seek a solution that is "good enough" given time/compute/info limits rather than globally optimal.
Outputs
Thresholds; stopping rules; acceptable solutions.
How it differs
Not "lazy optimization"; it's rational under constraints.
Best for
Real-time ops, fast-moving environments, early product strategy.
Failure mode
Thresholds set too low lead to chronic mediocrity; set too high, satisficing becomes disguised optimization.
Pairs with practical:optimization when more time is available, metalevel:meta-reasoning for deciding the right effort level.
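A satisficer is just a search loop with an explicit aspiration threshold and an evaluation budget. A minimal sketch (candidates and scores are invented):

```python
def satisfice(candidates, good_enough, budget):
    """Evaluate candidates in order; stop at the first that clears the bar,
    or return the best seen when the evaluation budget runs out."""
    best = None
    for i, (name, score) in enumerate(candidates):
        if best is None or score > best[1]:
            best = (name, score)
        if score >= good_enough:   # stopping rule: good enough, stop searching
            return name, i + 1     # also report how many candidates were evaluated
        if i + 1 >= budget:        # resource limit reached
            return best[0], i + 1
    return best[0], len(candidates)

candidates = [("x", 0.4), ("y", 0.75), ("z", 0.9)]
print(satisfice(candidates, good_enough=0.7, budget=3))  # ('y', 2): stops before seeing z
```

Note the failure modes above map directly onto the two parameters: `good_enough` too low returns "x"-quality answers; too high, the loop degenerates into exhaustive search.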
practical:value-of-information

Value-of-information reasoning (what to learn next)

Decide which measurements/experiments reduce uncertainty most per cost to improve decisions.
Outputs
Experiment priorities; instrumentation plans; "next best question."
How it differs
Meta-decision theory: picks information acquisition actions.
Best for
R&D prioritization, analytics roadmaps, incident investigation sequencing.
Failure mode
Measuring what's easy, not what changes decisions.
Pairs with research:uncertainty-question for generating the questions, practical:decision for the underlying decision.
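The standard upper bound here is the expected value of perfect information (EVPI): what the decision would be worth if uncertainty vanished, minus what it is worth now. A sketch with an invented launch decision (probabilities and payoffs are made up):

```python
# Decide whether to launch, uncertain about demand.
p_high = 0.4
payoff = {("launch", "high"): 100, ("launch", "low"): -60,
          ("skip", "high"): 0, ("skip", "low"): 0}

def ev(action, p_high):
    return p_high * payoff[(action, "high")] + (1 - p_high) * payoff[(action, "low")]

# Best action given current uncertainty:
ev_now = max(ev(a, p_high) for a in ("launch", "skip"))
# With perfect information you would pick the best action in each state:
ev_perfect = p_high * max(payoff[(a, "high")] for a in ("launch", "skip")) \
           + (1 - p_high) * max(payoff[(a, "low")] for a in ("launch", "skip"))
evpi = ev_perfect - ev_now  # upper bound on what any study is worth
print(evpi)
```

Here launching is barely worth it on average (EV of 4), but perfect demand information is worth 36: any market study costing less than that, and informative enough, is a candidate. Real studies give imperfect information, so their value is lower; EVPI caps the budget.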
practical:heuristic

Heuristic reasoning (fast rules of thumb)

Use simple rules that often work; fast but biased.
Outputs
Quick decisions/inferences; prioritization shortcuts.
How it differs
Less principled but cheaper; should be paired with checks/calibration.
Best for
Triage, first drafts, guiding search.
Failure mode
Heuristics become doctrine.
Pairs with metalevel:debiasing for checking blind spots, metalevel:calibration for confidence assessment.

H) Strategic and social reasoning (other agents matter)

strategic:game

Game-theoretic / strategic reasoning

Reason when outcomes depend on others' choices.
Outputs
Strategies; incentive analyses; equilibrium reasoning.
How it differs
Decision theory treats uncertainty as nature; game theory treats uncertainty as other optimizers.
Best for
Negotiation, pricing competition, security, platform rules.
Failure mode
Assuming rationality/common knowledge where it doesn't exist.
Pairs with strategic:theory-of-mind for predicting behavior, strategic:mechanism-design for changing the rules.
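The core equilibrium idea fits in a short function: a profile is a pure Nash equilibrium when neither player gains by unilaterally deviating. A sketch using a prisoner's-dilemma-style payoff table (numbers are the textbook-style convention, not from the source):

```python
# Payoffs are (row player, column player).
A = ["cooperate", "defect"]
payoff = {
    ("cooperate", "cooperate"): (3, 3), ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),    ("defect", "defect"): (1, 1),
}

def pure_nash(payoff, actions):
    """Return all pure-strategy profiles where no unilateral deviation pays."""
    eq = []
    for r in actions:
        for c in actions:
            u_r, u_c = payoff[(r, c)]
            if all(payoff[(r2, c)][0] <= u_r for r2 in actions) and \
               all(payoff[(r, c2)][1] <= u_c for c2 in actions):
                eq.append((r, c))
    return eq

print(pure_nash(payoff, A))  # [('defect', 'defect')]
```

Mutual defection is the only equilibrium even though mutual cooperation pays both players more, which is exactly why "treating uncertainty as other optimizers" changes the analysis.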
strategic:theory-of-mind

Theory-of-mind / mental-state reasoning

Infer beliefs, intentions, knowledge states of others (nested beliefs).
Outputs
Behavior predictions; communication strategies; coordination plans.
How it differs
Focuses on beliefs-about-beliefs; often essential for collaboration.
Best for
Leadership, UX, teamwork, threat modeling.
Failure mode
Mind-reading with overconfidence; projecting your incentives onto others.
Pairs with strategic:game for strategic interaction. Audience Playbook: audience:empathy-map for reader modeling; audience:mental-models for the cognitive frame the reader brings.
strategic:negotiation

Negotiation and coalition reasoning

Reason about acceptable agreements and coalition formation under constraints and asymmetric information.
Outputs
Offers, concessions, coalition structures; Pareto improvements.
How it differs
More process- and constraint-oriented than abstract equilibrium analysis; mixes game theory with norms/rhetoric.
Best for
Partnerships, sales, cross-team alignment.
Failure mode
Winning the negotiation but losing the relationship/long-term incentives.
Pairs with strategic:theory-of-mind for predicting reactions, practical:multi-criteria for tradeoff analysis.
strategic:mechanism-design

Mechanism design / incentive engineering

Design rules so that self-interested behavior leads to desired outcomes (align incentives).
Outputs
Policies, marketplaces, compensation plans, governance structures.
How it differs
Reverse game theory: instead of predicting behavior under rules, choose rules to shape behavior.
Best for
Platforms, internal governance, moderation policies, compensation systems.
Failure mode
Goodharting (metrics become targets and get gamed).
Pairs with strategic:game for predicting agent behavior, metalevel:adversarial for stress-testing the design.

I) Dialectical, rhetorical, and interpretive reasoning (reasoning as a human practice)

dialectical:dialectical

Dialectical reasoning (thesis-antithesis-synthesis)

Advance understanding through structured opposition: surface tensions, refine concepts, integrate perspectives.
Outputs
Refined positions; conceptual synthesis; clarified distinctions.
How it differs
Unlike paraconsistency (tolerating contradictory data), dialectic uses tension to improve concepts and frames.
Best for
Strategy debates, assumptions audits, resolving conceptual confusion.
Failure mode
Endless debate without convergence criteria.
Pairs with revision:defeasible for handling competing reasons. Rhetoric Playbook: dialectical:steelman and dialectical:concession for the tactical writing moves; argument:concession-refute for the architectural pattern that integrates dialectic into prose.
dialectical:hermeneutic

Hermeneutic / interpretive reasoning (meaning under ambiguity)

Infer meaning and intent from language, documents, norms, artifacts using context and interpretive canons.
Outputs
Interpretations; reconciled meanings; clarified definitions.
How it differs
Emphasizes context and ambiguity management, not only formal entailment.
Best for
Contracts, policy docs, requirements, qualitative feedback synthesis.
Failure mode
Over-interpreting; reading intent that isn't there.
Pairs with vagueness:partial-logic for handling indeterminacy, dialectical:argumentation for competing interpretations. Rhetoric Playbook: dialectical:charity for the principle of charitable interpretation as a writing move.
dialectical:narrative

Narrative reasoning / causal storytelling

Build coherent, time-ordered explanations that connect events, motives, and causes into a story supporting prediction and action.
Outputs
Postmortems, strategy narratives, scenario stories.
How it differs
Integrates causal/abductive/rhetorical constraints; risk is over-coherence ("too neat").
Best for
Incident reports, executive communication, explaining complex causal chains.
Failure mode
Narrative closure crowding out alternative hypotheses.
Pairs with causal:counterfactual for alternative histories. Rhetoric Playbook: frame:narrative for the writing-frame that this reasoning produces; frame:paradox when the causal story turns on a tension.
dialectical:sensemaking

Sensemaking / frame-building reasoning

Decide "what kind of situation is this?" -- build frames that organize signals, priorities, and actions under ambiguity.
Outputs
Situation frames; working hypotheses; shared mental models.
How it differs
Precedes many other modes: it selects what counts as relevant evidence and what questions to ask.
Best for
Crisis leadership, early-stage strategy, ambiguous competitive landscapes.
Failure mode
Locking onto the wrong frame and then reasoning flawlessly inside it.
Pairs with metalevel:meta-reasoning for choosing how to reason, ampliative:abductive for hypothesis generation. Rhetoric Playbook: dialectical:reframe to move a debate from one frame to another in writing; frame:identity when the situation is best understood through who-the-reader-is.

J) Modal, temporal, spatial, and normative reasoning (structured possibility, time, space, and "ought")

K) Domain-specific reasoning styles (practice changes the "rules")

domain:scientific

Scientific reasoning (hypothetico-deductive cycle)

A workflow: abduce hypotheses, deduce predictions, test (statistics), revise beliefs/theories.
Outputs
Models, predictions, experiments, updated beliefs.
How it differs
An integrated pipeline rather than a single inference rule.
Best for
R&D, experimentation platforms, measurement culture.
Failure mode
Confirmation bias; underpowered experiments; publication/reporting bias.
Pairs with domain:experimental for test design, uncertainty:bayesian for belief updating.
domain:experimental

Experimental design reasoning

Choose interventions, measurements, and sampling to identify effects (randomization, controls, blocking, instrumentation).
Outputs
Experiment plans; power analyses; measurement strategies.
How it differs
It's reasoning about how to learn reliably, not just how to analyze after the fact.
Best for
A/B testing, causal learning, evaluation of interventions.
Failure mode
Measuring proxies that don't capture the real outcome (Goodhart risk).
Pairs with causal:inference to ensure the design addresses confounders, uncertainty:bayesian to interpret results.
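Power analysis, mentioned in the outputs above, is often done by simulation: simulate the experiment many times under an assumed true effect and count how often it is detected. A rough sketch (baseline rate, uplift, and sample sizes are all invented; the z-test here is a simplification, not a production analysis):

```python
import random

def estimated_power(effect, n_per_arm, alpha_z=1.96, sims=1000, seed=0):
    """Fraction of simulated A/B tests that detect a true uplift `effect`
    over a 10% baseline conversion rate (rough two-proportion z-test)."""
    rng = random.Random(seed)
    base = 0.10
    hits = 0
    for _ in range(sims):
        a = sum(rng.random() < base for _ in range(n_per_arm))           # control conversions
        b = sum(rng.random() < base + effect for _ in range(n_per_arm))  # treatment conversions
        p_a, p_b = a / n_per_arm, b / n_per_arm
        pooled = (a + b) / (2 * n_per_arm)
        se = (2 * pooled * (1 - pooled) / n_per_arm) ** 0.5
        if se > 0 and (p_b - p_a) / se > alpha_z:
            hits += 1
    return hits / sims

print(estimated_power(0.05, 400) > estimated_power(0.05, 100))  # more n, more power
```

The same loop answers design questions directly: hold power fixed and sweep `n_per_arm` to find the smallest adequate sample, which is "reasoning about how to learn reliably" before any data exists.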
domain:engineering

Engineering design reasoning

Iterate from requirements to architectures to prototypes with tradeoffs, constraints, and failure analyses.
Outputs
Designs, specs, tradeoff justifications, test plans.
How it differs
Inherently multi-objective and constraint-laden; relies on simulation, optimization, safety margins.
Best for
Product development, reliability, architecture decisions.
Failure mode
Premature optimization or over-engineering; ignoring maintainability.
Pairs with practical:multi-criteria for tradeoff analysis, practical:robust for safety margins.
domain:moral

Moral / ethical reasoning

Reason about right/wrong and value tradeoffs (consequentialist, deontological, virtue, contractualist, care ethics, etc.).
Outputs
Value constraints; ethical justifications; tradeoff statements.
How it differs
Normative: it cannot be reduced to facts alone, though it must be informed by them.
Best for
AI governance, product harms, trust & safety, people policy.
Failure mode
Values laundering ("it's 'ethical' because it helps our goal") without principled constraints.
Pairs with modal:deontic for norm analysis, metalevel:reflective-equilibrium for coherence checking.
domain:historical

Historical / investigative reasoning

Reconstruct what happened from incomplete sources; triangulate evidence; assess credibility; compare hypotheses.
Outputs
Best-available reconstructions; source assessments; confidence statements.
How it differs
Strong emphasis on provenance, bias, and alternative explanations under uncertainty.
Best for
Audits, incident reconstruction, due diligence, fraud investigations.
Failure mode
Overfitting to a compelling narrative; neglecting disconfirming evidence.
Pairs with ampliative:abductive for hypothesis generation, uncertainty:evidential for source fusion.
domain:clinical

Clinical / operational troubleshooting reasoning

Blend pattern recognition (cases), mechanistic models, tests, triage, and risk constraints under time pressure.
Outputs
Triage decisions; test sequences; interventions with safety checks.
How it differs
A real-world hybrid mode optimized for time-critical, high-stakes diagnosis.
Best for
SRE/ops, support escalation, medical-style workflows.
Failure mode
Skipping confirmatory tests; treating correlations as mechanisms.
Pairs with causal:diagnostic for systematic diagnosis, practical:value-of-information for test prioritization.

L) Meta-level and reflective modes (reasoning about reasoning)

metalevel:meta-reasoning

Meta-reasoning (strategy selection for thinking)

Decide how to reason: which mode to use, effort allocation, what uncertainties matter, when to stop.
Outputs
Deliberation policies; checklists; stopping rules.
How it differs
Second-order: the object is your inference process and resource allocation.
Best for
High-stakes decisions, avoiding over-analysis, building reliable org processes.
Failure mode
Meta-infinite regress ("thinking about thinking" forever).
Pairs with any mode -- this is the mode selector. Use it at the start of complex problems.
metalevel:calibration

Calibration and epistemic humility (second-order uncertainty)

Track how reliable your beliefs are (forecast scoring, error bars, backtesting).
Outputs
Calibrated confidence; forecast accuracy metrics; improved priors.
How it differs
First-order uncertainty is "what is true?"; calibration is "how good am I at knowing?"
Best for
Forecasting culture, risk reviews, decision reviews.
Failure mode
Confusing confidence with competence; never measuring accuracy.
Pairs with quality:assumption-audit for deeper challenge, research:uncertainty-question for follow-up.
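Forecast scoring, mentioned above, has a standard workhorse: the Brier score, the mean squared gap between stated probability and what happened. A minimal sketch (the forecasts are invented):

```python
def brier(forecasts):
    """Mean squared error between predicted probability and outcome (0/1).
    Lower is better; constant 50% guessing scores exactly 0.25."""
    return sum((p - o) ** 2 for p, o in forecasts) / len(forecasts)

# (predicted probability, what actually happened)
forecasts = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1), (0.2, 0)]
print(round(brier(forecasts), 3))  # 0.148
```

Tracking this number over time is the "how good am I at knowing?" measurement; without it, confidence never gets tested against reality.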
metalevel:reflective-equilibrium

Reflective equilibrium (coherence between principles and judgments)

Iteratively adjust both principles and case judgments until they cohere.
Outputs
Coherent principles + case decisions; updated policies/norms.
How it differs
Not deduction from fixed axioms; principles and judgments co-evolve.
Best for
Policy design, governance, value-laden decisions.
Failure mode
Coherence achieved by quietly dropping hard cases.
Pairs with domain:moral for ethical applications, revision:belief-revision for principled change.
metalevel:transcendental

Transcendental reasoning (conditions of possibility)

Start from an accepted fact and infer what must be true for it to be possible (Kantian style).
Outputs
Necessary preconditions; architectural "must-haves."
How it differs
Not empirical induction; reasons from possibility to enabling conditions.
Best for
Deep framework design, conceptual audits, first-principles constraints.
Failure mode
Mistaking "necessary for my model" as "necessary in reality."
Pairs with formal:deductive for the logical derivation, metalevel:meta-reasoning for framework-level analysis.
metalevel:adversarial

Adversarial / red-team reasoning

Assume the role of an attacker/critic: try to break arguments, systems, incentives, and assumptions.
Outputs
Failure modes, exploits, counterexamples, "what could go wrong" maps.
How it differs
It's intentionally antagonistic to your current plan; pairs with robust reasoning and assurance cases.
Best for
Security, safety, governance, strategy stress-testing.
Failure mode
Cynicism theater (finding clever attacks without prioritizing real risk).
Pairs with formal:deductive to repair the argument. Rhetoric Playbook: dialectical:steelman to construct the strongest opposing argument; argument:concession-refute to integrate it into prose.
metalevel:debiasing

Debiasing / epistemic hygiene reasoning

Structured checks to reduce predictable errors (base rates, alternative hypotheses, premortems, disconfirmation search).
Outputs
Checklists; improved judgments; documented uncertainty.
How it differs
Not a new inference rule; it's a discipline for selecting and constraining inference.
Best for
High-stakes decisions, leadership reviews, forecasting, incident postmortems.
Failure mode
Ritualized checklists that aren't actually used to change conclusions.
Pairs with ampliative:reference-class for base rates, metalevel:calibration for confidence assessment.

M) Research modules

research:query-generator

Query Generator

Produce optimized search queries for finding primary sources and reviews.
Outputs
6 queries with targeting notes and term constraints.
How it differs
Multi-precision -- broad queries for coverage, narrow queries for specificity.
Best for
Starting a literature search, finding specific papers or reviews.
Failure mode
All queries at the same granularity; missing key terminology from adjacent fields.
Pairs with research:hypothetical-answer for complementary retrieval, research:evidence-table for organizing results.
research:hypothetical-answer

Hypothetical Answer (HyDE-style)

Generate a plausible answer to improve retrieval -- you search for the shape of the answer, not just the question.
Outputs
Retrieval bundle with hypothetical answers and extracted keywords.
How it differs
Answer-shaped retrieval scaffold -- the hypothetical answer contains the vocabulary real answers would use.
Best for
When keyword search fails because you don't know the right terminology yet.
Failure mode
Treating hypothetical answers as real; anchoring on generated content.
Pairs with research:query-generator for complementary queries, research:evidence-table to validate against real sources.
research:clarifying-question

Clarifying Question Selection

When intent is underspecified, find the single most useful question to ask.
Outputs
Interpretation list + single highest-value clarifying question.
How it differs
Information gain -- the question that most reduces the space of possible interpretations.
Best for
Ambiguous user requests, underspecified research questions.
Failure mode
Asking too many questions; picking easy questions over diagnostic ones.
Pairs with research:socratic for deeper questioning, research:uncertainty-question for follow-up.
research:uncertainty-question

Uncertainty-Driven Next Question

In a multi-turn research session, find the question that most reduces remaining uncertainty.
Outputs
Possibility set + best next question with simulation rationale.
How it differs
Information gain simulation -- model what different answers would tell you.
Best for
Iterative research, narrowing down hypotheses over multiple turns.
Failure mode
Asking questions that confirm rather than discriminate.
Pairs with uncertainty:bayesian for formal updating, quality:topic-drift to stay on track.
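"Information gain simulation" can be made precise with entropy: for each candidate question, compute how much uncertainty over the live hypotheses each possible answer would remove, weighted by how likely that answer is. A sketch (hypotheses, priors, and questions are all invented):

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Four live hypotheses with current beliefs:
prior = {"h1": 0.4, "h2": 0.3, "h3": 0.2, "h4": 0.1}

def expected_gain(question):
    """`question` maps each possible answer to the set of hypotheses it keeps."""
    h0 = entropy(prior.values())
    gain = 0.0
    for kept in question.values():
        p_answer = sum(prior[h] for h in kept)
        posterior = [prior[h] / p_answer for h in kept]
        gain += p_answer * (h0 - entropy(posterior))
    return gain

# q1 splits the hypotheses evenly; q2 mostly just confirms the favourite.
q1 = {"yes": {"h1", "h4"}, "no": {"h2", "h3"}}
q2 = {"yes": {"h1", "h2", "h3"}, "no": {"h4"}}
print(expected_gain(q1) > expected_gain(q2))  # the even split discriminates more
```

This is the "confirm vs discriminate" failure mode in numbers: q2 feels informative because one answer is decisive, but that answer is unlikely, so its expected gain is lower.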
research:question-modes

Non-Factoid Question Switchboard

Generate diverse question types to interrogate a topic from multiple angles.
Outputs
12 questions across 6 categories with evidence and search strategy.
How it differs
Six question categories that cover different evidence needs.
Best for
Opening up a topic you don't know well; finding unexpected angles.
Failure mode
Questions that are too similar despite different labels.
Pairs with research:query-generator to execute the searches, synthesis:claims to organize findings.
research:logic-unit

Logic Unit Extraction

Pull procedural knowledge out of a document as structured, sequenced steps.
Outputs
Numbered logic units + execution plan + missing info with retrieval queries.
How it differs
Decompose prose into prerequisite -> header -> body -> linker chains.
Best for
Extracting procedures from papers, converting methods sections into actionable steps.
Failure mode
Missing implicit prerequisites; over-structuring fluid descriptions.
Pairs with practical:means-end to verify the plan, practical:search to formalize it.
research:map-then-retrieve

Lost-in-the-Middle Mitigation

Prevent missing key evidence buried in long documents.
Outputs
Outline map + targeted extraction + cited answer.
How it differs
Map first, target second -- outline the document before extracting.
Best for
Long documents where key evidence might be missed.
Failure mode
Outline that doesn't capture the document's actual structure.
Pairs with research:evidence-table for systematic grounding, research:what-missing for completeness check.
research:decompose

Subquestion Decomposition

Break a complex question into atomic subquestions with dependency ordering.
Outputs
Subquestion DAG with dependency ordering and retrieval queries. No answers.
How it differs
DAG construction -- subquestions have prerequisites, not just sequence.
Best for
Complex research questions that can't be answered in one step.
Failure mode
Decomposing into subquestions that are just as hard as the original.
Pairs with research:query-generator to execute the queries, practical:means-end to verify the decomposition.
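Once subquestions and their prerequisites are written down, the dependency ordering of the DAG falls out of a standard topological sort. A minimal sketch with invented subquestions, using Python's standard library:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Each subquestion lists its prerequisites (what must be answered first).
deps = {
    "Q3 combine findings": {"Q1 define metric", "Q2 gather baselines"},
    "Q2 gather baselines": {"Q1 define metric"},
    "Q1 define metric": set(),
}
order = list(TopologicalSorter(deps).static_order())
print(order)  # prerequisites always come before the questions that need them
```

The same structure also catches a broken decomposition early: `TopologicalSorter` raises `CycleError` if two subquestions each claim to depend on the other.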
research:evidence-table

Evidence Table

Build a structured evidence ledger -- claims, support, counterevidence, confidence.
Outputs
Evidence ledger with bidirectional sourcing and verification targets.
How it differs
Systematic grounding -- prevent smooth hallucinated synthesis.
Best for
Grounding any draft or analysis in actual sources before publishing.
Failure mode
Filling cells with paraphrases rather than actual quotes; false sense of thoroughness.
Pairs with synthesis:claims for writing the answer, metalevel:adversarial to attack weak claims.
research:contradiction-resolver

Contradiction Resolver

When sources conflict, find principled reconciliation rather than ignoring the conflict.
Outputs
Diagnosed conflict with reconciled interpretations and evidence needs.
How it differs
Diagnostic decomposition -- the contradiction usually has a cause (scope, definition, measurement, temporality).
Best for
Literature reviews where papers disagree, reconciling expert opinions.
Failure mode
Forced reconciliation that papers over real disagreement.
Pairs with uncertainty:bayesian to assess which interpretation is most likely, research:evidence-table to find the needed evidence.
research:comparative-matrix

Comparative Matrix

Structured comparison of alternatives grounded in evidence, not impression.
Outputs
Grounded comparison with decision rules and open questions.
How it differs
Dimension-by-item matrix with sourced evidence in each cell.
Best for
Comparing tools, methods, frameworks, or competing explanations.
Failure mode
Dimensions chosen to favor a preferred option.
Pairs with practical:search for selecting among alternatives. Audience Playbook: audience:jtbd for which criteria matter to readers; audience:stakes for how much weight to give each criterion.
research:repair

Third-Position Repair

Recover from a misunderstanding after feedback reveals the answer missed the point.
Outputs
Diagnosis + alternative interpretations + single clarifying question + revised plan.
How it differs
Diagnose, reinterpret, confirm, then redo.
Best for
Mid-conversation correction when research went in the wrong direction.
Failure mode
Overcorrecting and losing what was valid in the original direction.
Pairs with research:clarifying-question for the confirmation, quality:topic-drift to realign.
research:search-log

Search Stream Logger

Maintain an explicit trace of exploration with branching and backtracking.
Outputs
Search log with branches, backtracks, final answer, and continuation plan.
How it differs
PROPOSE -> CHECK -> SCORE -> COMMIT or BACKTRACK at each step.
Best for
Complex research tasks where you need to track what's been explored.
Failure mode
Logging overhead that slows down the actual research.
Pairs with quality:topic-drift to detect wandering, practical:search for structured branching.
research:debate-frame

Debate Frame

Steelman both sides of a contested question, then adjudicate.
Outputs
Balanced adversarial synthesis with identified cruxes and provisional judgment.
How it differs
Adversarial synthesis -- no dunking, treat both sides as competent.
Best for
Contested topics where readers will have strong priors on both sides.
Failure mode
False balance -- treating unequal evidence as equal.
Pairs with dialectical:dialectical for deeper synthesis, synthesis:claims for writing the verdict.
research:what-missing

Synthesis Gate: "What's Missing?"

Quality-gate a draft against evidence before publishing. Prevent overconfidence.
Outputs
Audited draft with support labels, retrieval queries, and uncertainty markers.
How it differs
Paragraph-level evidence audit.
Best for
Final check before publishing any post or analysis.
Failure mode
Labeling everything "weakly supported" without actionable guidance.
Pairs with quality:verify for fact-checking, quality:assumption-audit for deeper challenge.
research:systematic-review

Systematic Review Planning

Plan a structured evidence search with inclusion criteria and gap identification.
Outputs
Search strategy with criteria, expected sources, and gap analysis.
How it differs
Methodical coverage -- ensure you haven't cherry-picked.
Best for
Starting a thorough literature review on a new topic.
Failure mode
Criteria so broad everything qualifies, or so narrow you miss key work.
Pairs with research:query-generator for optimized search terms, synthesis:claims for organizing findings.
research:socratic

Socratic Questioning

Generate the minimal questions needed to clarify an underspecified problem.
Outputs
Interpretation list with single highest-value clarifying question.
How it differs
Question-first -- resist answering until the question is well-defined.
Best for
Problem definition, early-stage research, avoiding wasted work on wrong questions.
Failure mode
Asking questions indefinitely without converging.
Pairs with research:clarifying-question for dialogue contexts, uncertainty:bayesian to estimate which interpretation is most likely.
research:multi-agent

Multi-Agent Debate

Simulate advocates, critics, and judges to produce a balanced synthesis.
Outputs
Pro/con arguments with verification assessment and identified cruxes.
How it differs
Adversarial deliberation -- positions are argued, attacked, and reconciled.
Best for
Complex topics where a single perspective is insufficient.
Failure mode
All agents converging too quickly; simulated diversity without real tension.
Pairs with metalevel:adversarial to deepen the strongest objection. Rhetoric Playbook: dialectical:steelman and argument:concession-refute for translating a multi-agent debate into prose.

O) Synthesis modules

synthesis:claims

Claim-Evidence-Synthesis

Produce structured claims with evidence, confidence, and narrative synthesis.
Outputs
Numbered claim list with evidence + integrative narrative.
How it differs
Evidence-backed claim list followed by integrative narrative.
Best for
Turning research into the backbone of a post.
Failure mode
Claims that are too broad to be actionable or too narrow to be interesting.
Pairs with research:evidence-table for sourcing, research:what-missing for quality gate.
synthesis:architecture

Section Architecture

Plan the post's section structure before writing.
Outputs
Section outline with purposes, priority assessment, and emotional arc.
How it differs
Outline with purpose -- each section has a job, not just a topic.
Best for
Medium and long-form posts where structure determines whether readers finish.
Failure mode
Sections organized by topic rather than by what they accomplish for the reader.

P) Adjustment modules

adjustment:diversity

Diversity Settings

Control the range and variation of outputs -- narrow for precision, wide for exploration.
Outputs
Task output calibrated to specified diversity level, with inclusion/exclusion rationale.
How it differs
Explicit diversity dial.
Best for
Controlling scope at the start of any research or brainstorming task.
Failure mode
"Wide" settings that turn chaotic; "narrow" settings that miss the answer.
Pairs with ampliative:divergent for wide settings, research:evidence-table for narrow settings.

Q) Quality modules

quality:verify

Chain-of-Verification

Draft an answer, then independently fact-check it before finalizing.
Outputs
Verified answer with uncertainty markers.
How it differs
Generate-then-check -- the verification is done independently of the generation.
Best for
Any answer containing factual claims that could be wrong.
Failure mode
Verification that rubber-stamps the original rather than genuinely checking.
Pairs with research:evidence-table for systematic sourcing, quality:assumption-audit for deeper challenge.
quality:error-tracking

Error Tracking

Trace which premises support which conclusions and identify where errors propagate.
Outputs
Dependency map with error propagation analysis and verification target.
How it differs
Dependency mapping -- each step is tagged with what it depends on.
Best for
Auditing complex reasoning chains before relying on them.
Failure mode
Tracking surface dependencies while missing deep ones.
Pairs with quality:verify to check the weakest link, formal:deductive to rebuild the chain.
quality:topic-drift

Topic Drift Detection

Detect when a research or writing session is wandering from its goal.
Outputs
Classification + bridge + parking-lot note + choice.
How it differs
Classify, bridge, park, and offer choice.
Best for
Long research sessions that tend to wander.
Failure mode
Being too aggressive about drift and killing productive tangents.
Pairs with research:search-log for session management, practical:means-end to re-anchor on the goal.
quality:assumption-audit

Assumption Audit

Surface and challenge every assumption in an analysis.
Outputs
Assumption inventory with impact ranking and verification priorities.
How it differs
Explicit assumption extraction followed by impact assessment.
Best for
Final check on any analysis before committing to its conclusions.
Failure mode
Listing obvious assumptions while missing the dangerous hidden ones.
Pairs with metalevel:adversarial for assumption attack, metalevel:calibration for confidence calibration.
quality:reflection

Reflection and State Tracking

Take stock of where the research/writing stands and what to do next.
Outputs
Progress assessment with gap analysis, surprises, next step, and trajectory judgment.
How it differs
Progress assessment against the original goal.
Best for
Mid-session checkpoints to prevent wasted effort.
Failure mode
Reflection that restates what was done without identifying what's missing.
Pairs with any module -- this is a checkpoint, not a processing step.
quality:peer-review

Peer Review Simulation

Simulate a skeptical reviewer critiquing the draft.
Outputs
Review with severity-graded flaws, fixes, and overall assessment.
How it differs
External critical perspective -- not self-assessment but simulated outside evaluation.
Best for
Final quality gate before publishing.
Failure mode
Simulated reviewer that is too gentle or too focused on surface issues.
Pairs with research:what-missing for finding the missing evidence. Rhetoric Playbook: dialectical:steelman and argument:concession-refute for integrating reviewer objections into the draft.
quality:calibration

Calibration Check

Assess whether confidence levels in the analysis are warranted.
Outputs
Calibration assessment with specific adjustments for each claim.
How it differs
Match claim strength to evidence strength -- are you overclaiming or underclaiming?
Best for
Posts that make quantitative or strength-of-evidence claims.
Failure mode
Calibrating language without calibrating substance.
Pairs with synthesis:claims for integration, metalevel:calibration for overall confidence.