Reasoning is a way of using language. We think in words, argue in words, evaluate in words; the form a piece of thinking takes tends to shape the thinking itself. Working with LLMs brings that quietly into focus. When you direct a model to do research, what you say, how you say it, and how you specify the kind of thinking you want all shape the output in ways that are not always obvious. Modular reasoning, for me, is an extension of the language-is-the-interface view that runs through the rest of this site. Reasoning is a mode of language, and designing how a model reasons is, in the end, designing the language it uses to reason with.
Most prompting does not make this explicit. A user describes a topic, the model produces a response, and the reasoning style defaults to whatever the training distribution supplies — usually a smooth, ampliative pattern-match that reads as confident but quietly hides what it did and did not consider. Modular reasoning treats reasoning as a palette instead. Rather than asking a model to “think about X,” you specify what kind of thinking you want from it: causal tracing, adversarial critique, analogical mapping, dialectical synthesis, hermeneutic interpretation. Each mode is a named module with a small, executable prompt scaffold, and modules compose into workflows.
I have been interested in how language models “think” for some time, and have kept my own notes on reasoning modes — the different ways a piece of inquiry can be structured, each producing a different kind of argument. Some of the inspiration for the playbook I have been building comes from Jeffrey Emanuel’s taxonomy of 80 reasoning modes, which helped me see the space as a palette rather than a grab-bag. I extended that collection with five families of writing-workflow modules — research, audience, synthesis, adjustment, and quality — into a working playbook of 123 modules across 17 sections. The playbook itself lives in my Obsidian vault, alongside the research it draws on, rather than as a public artifact on this site. Each module has an ID such as causal:mechanistic, dialectical:dialectical, ampliative:analogical, research:evidence-table, quality:peer-review. The ID names the operation; the module contains the prompt.
Each module is a short prompt template with the same four parts. An ID names the operation. A one-line description states what cognitive work the module does. Inputs are placeholders the user fills in ({question}, {domain}, {symptoms}, {claim}). Steps are an ordered sequence the model must follow to produce a structured output. A module is self-contained: one module, one job, one output shape.
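The four-part format can be sketched as a small data structure. This is a hypothetical implementation for illustration; the playbook itself stores modules as prompt text, not code, and the example module body here is mine, not one from the playbook.

```python
# Minimal sketch of the module format: ID, description, inputs, steps.
from dataclasses import dataclass


@dataclass
class Module:
    id: str            # names the operation, e.g. "causal:mechanistic"
    description: str   # one line: what cognitive work the module does
    inputs: list[str]  # placeholder names the user fills in
    steps: list[str]   # ordered sequence the model must follow

    def render(self, **values: str) -> str:
        """Fill the placeholders and emit the prompt the model sees."""
        missing = set(self.inputs) - set(values)
        if missing:
            raise ValueError(f"missing inputs: {sorted(missing)}")
        lines = [f"# {self.id}", self.description, ""]
        lines += [f"{name}: {values[name]}" for name in self.inputs]
        lines += ["", "Steps:"]
        lines += [f"{i}. {step}" for i, step in enumerate(self.steps, 1)]
        return "\n".join(lines)


causal = Module(
    id="causal:mechanistic",
    description="Trace the mechanism behind an observed behavior.",
    inputs=["question", "domain"],
    steps=[
        "Name the observed behavior.",
        "Propose candidate mechanisms.",
        "Rank them by the evidence each would predict.",
    ],
)
prompt = causal.render(
    question="Why do summaries sound confident?",
    domain="LLM decoding",
)
```

The point of the structure is the constraint: one module, one job, one output shape, and a `render` that refuses to run with inputs missing.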
The output of one module can feed the next. A reasoning module surfaces a question; a research module grounds it in evidence; a synthesis module turns findings into claims; a quality module audits the result for consistency. The playbook records these pairings explicitly. Each module lists the modules it pairs with and why. Chains are what make the approach useful for real writing, because no single module is enough to investigate a substantive topic.
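The chaining itself can be sketched as a pipeline where each module's output becomes the next module's input. This is a hypothetical wiring diagram, not the playbook's actual tooling: the lambdas below stand in for real model calls so the flow of structure, and the audit trail, is visible.

```python
# Hypothetical sketch of a module chain: output of one step feeds the next,
# and every intermediate result is labeled with the module ID that made it.
from typing import Callable


def run_chain(question: str,
              chain: list[tuple[str, Callable[[str], str]]]) -> dict[str, str]:
    """Run modules in order; record which module produced which output."""
    trace: dict[str, str] = {}
    current = question
    for module_id, run in chain:
        current = run(current)
        trace[module_id] = current  # auditable: claim traced back to its lens
    return trace


# Stub "modules" that just tag their work; a real chain would render each
# module's prompt and call the model.
chain = [
    ("causal:mechanistic", lambda x: f"mechanism({x})"),
    ("research:evidence-table", lambda x: f"evidence({x})"),
    ("quality:peer-review", lambda x: f"audit({x})"),
]
trace = run_chain("why does retrieval miss nuance?", chain)
```

The `trace` dict is the audit property in miniature: every claim in the final output can be attributed to the module that produced it.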
The default mode of a language model is ampliative pattern-matching. It generalizes from the training distribution to the prompt and produces the most probable continuation. When the prompt is specific, that works well. When the prompt is broad, the model reaches for whatever reasoning is most frequently represented in its training corpus — which is usually a mix of summary, explanation, and mild hedging. This produces prose that reads as competent but is not doing the particular work the writer needs.
Modular reasoning solves four problems at once. It gives the lens a name — every module has an ID that labels the cognitive operation. It makes the lens selectable — you pick the module for what you are trying to find, not for what the model defaults to. It makes the lens composable — modules chain into workflows, and the output of each step is a structured object the next step can use. And it makes the lens auditable — because the module is named, you can see which lens produced which claim. If the same question produces different outputs under different modules, something was hidden in the original framing. The difference between the outputs is information.
My Obsidian vault contains excerpts from over 2,500 white papers and ~900 topic notes. That is too much to read and too nuanced to search flatly. Keyword search finds statistically prominent matches. LLM-native retrieval finds semantically similar content. Neither of those attends to what kind of question you are asking. A vault can give you everything relevant to a topic; it cannot by itself give you the lens you want to look through.
Modular reasoning fills that gap. A causal module traces mechanisms — it reaches for topic notes on architecture, training dynamics, and attention distribution. A dialectical module steelmans and breaks — it reaches for evidence on both sides and looks for the boundary where one side becomes the other. An analogical module maps parallels — it reaches for human-comparison research and interpretive frames. Same vault, same question, different traversals. The experiment at The Reader Who Wasn’t Reading demonstrates this directly: one guiding question, three reasoning lenses, three different paths through the same pool of ~24 topic notes, three posts that make different arguments from the same research base.
A single module is rarely enough for a substantive piece of writing. The playbook is designed for combinations. A typical stack has three parts. A question-type module determines what kinds of inquiries to pursue. A logic module determines how to investigate them. A research module determines how to ground the investigation in sources from the vault. The three modules pass a prompt between them: the question produces a claim, the logic interrogates the claim, the research substantiates it.
Other stacks solve other writing problems. dialectical:dialectical + metalevel:adversarial + research:evidence-table produces a thesis-antithesis-synthesis post that has survived its strongest attacks. ampliative:analogical + dialectical:sensemaking + research:question-modes produces a recognition-based post that opens with a familiar experience and maps it onto unfamiliar mechanism. The combination is the design choice. The writing follows from it.
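The two stacks above can be written down as data, which is the whole point: the design choice is explicit before any prose exists. The stack names below are my labels for illustration; the module IDs are the ones from the text.

```python
# Hypothetical sketch: stacks recorded as data, so the design choice
# (which lenses, in which order) is visible and reusable.
STACKS = {
    "thesis-antithesis-synthesis": [
        "dialectical:dialectical",
        "metalevel:adversarial",
        "research:evidence-table",
    ],
    "recognition-based": [
        "ampliative:analogical",
        "dialectical:sensemaking",
        "research:question-modes",
    ],
}


def pick_stack(goal: str) -> list[str]:
    """Selecting a stack is the design choice; the writing follows."""
    return STACKS[goal]
```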
Below are two filled-in prompts from the Reader Who Wasn’t Reading experiment. Each investigates the same guiding question — when you use an LLM to read research papers and extract key points, what is it actually doing, and what kind of trust should that earn? — but through different modules. Note how each module frames the inputs and what structured output the steps produce.
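As a stand-in for the shape such filled-in prompts take — these are illustrative sketches of mine, not the actual prompts from the experiment — here are the two module IDs filled with the same guiding question:

```python
# Hypothetical illustration: two modules, one question, two framings.
QUESTION = ("When you use an LLM to read research papers and extract "
            "key points, what is it actually doing, and what kind of "
            "trust should that earn?")

CAUSAL = """\
# causal:mechanistic
question: {question}
Steps:
1. Name the observable behavior (extraction that reads as reading).
2. Trace the mechanism that produces it.
3. State what the mechanism does and does not warrant trusting.
"""

ANALOGICAL = """\
# ampliative:analogical
question: {question}
Steps:
1. Find a familiar human experience with the same shape.
2. Map its parts onto the model's mechanism.
3. Mark where the analogy breaks down.
"""

prompts = [t.format(question=QUESTION) for t in (CAUSAL, ANALOGICAL)]
```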
Both prompts operate on the same topic. The first produces a technical account grounded in mechanism. The second produces an accessible account grounded in recognition. Neither is right or wrong; each finds what its lens is tuned to find. The act of picking a module is the act of deciding what you want the reader to take away.
I use the modular reasoning playbook as a skill inside Claude Code, running against my Obsidian vault. A typical session: I pick a question, pick a module or a small stack of modules, and run them over the vault. The output is a structured document — claims with evidence, arguments with counterarguments, analogies with breakdown points. I then edit for voice and argumentative arc. The model drafts; I decide what is worth saying. Modular reasoning is the piece that makes the collaboration tractable. Without it, I would be prompting a generalist. With it, I am prompting a specialist whose specialty I can name.
The playbook is open-ended. If a topic needs a mode the playbook doesn't have, I write one. The format is the same as the existing modules: ID, description, inputs, steps, pairs-with. The discipline is additive. Each new module extends the palette.