What Three AI Systems Got Wrong About the Bulut Doctrine, and What One Finally Got Right
Over three days, a friend tried to find the critical weakness in the Bulut Doctrine by prompting several AI systems (Gemini, ChatGPT, and Grok) to analyse and challenge it.
The exchange produced something more interesting than a refutation: a precise map of how AI systems reason about theoretical frameworks they have only partially read — and what happens when one of them finally arrives at the right question.
This is a documentation of that process.
Round One: Gemini and the Admiration Problem
Gemini began by reading the site. Its conclusion was enthusiastic:
"The objections that would seem like openings have already been placed inside the system as parameters. The system is self-contained, logically closed. It is airtight."
Then it offered three strategies for building a competing theory: the 'gardener vs. engineer' metaphor, the 'map vs. territory' distinction, and 'imperfect transmission as creativity.'
Gemini's deeper error: it treated 'being airtight' as a problem to escape rather than evidence of theoretical precision. A system with no falsification criteria is the problematic one. A system that specifies exactly when it would be wrong is doing science.
GEMINI ERROR: The Gardener Metaphor
Gemini proposed that a 'narrative gardener' who grows rather than engineers represents a competing paradigm. The Adjective Embargo does not prevent a writer from having their own voice. It specifies what must not appear in the output. Three writers working from the same Physical Matrix produce three entirely different texts. The gardener metaphor describes a preference, not a theory.
Round Two: ChatGPT and the Rediscovery Problem
ChatGPT's first attempt produced three 'new theories' to compete with or bypass the Doctrine:
| ChatGPT's 'New Theory' | What it actually is | Status in Bulut Doctrine |
|---|---|---|
| Interpretive Drift Theory: "Strong narrative produces different meanings" | Reception Theory (Jauss, 1967); Iser, Fish; well-established | Not the Doctrine's target: Low Road subcortical, not High Road interpretation |
| Semantic Latency Theory: "Real effect occurs days later, not at reading" | Consolidation theory; Proust's involuntary memory | Narrative Memory Evolution (NME) paper, already published on this site |
| Narrative Uncertainty Principle: "Precision weakens; ambiguity strengthens" | Keats's Negative Capability (1817); Barthes's writerly text | Information Friction (If) in Sn formula; this is what Narrative Entropy measures |
The pattern was consistent: each 'new' theory was either well-established in the existing literature or already formalised within the Doctrine under a different name. This is the rediscovery problem — pattern completion from partial training data.
Round Three: The Scope Drift Argument
ChatGPT's second serious attempt:
"If Bulut explicitly excludes meaning from his scope, targeting only biophysical output, then he is not building a narrative theory. He is building a model of narrative's physiological side effects."
This is the most technically competent critique produced in the exchange. The response:
The word 'engineering' in Narrative Engineering is load-bearing. Engineering is not explanation. It is construction. A structural engineer does not require a complete theory of human aesthetics to build a habitable building. Darwin did not need to resolve the hard problem of consciousness to identify natural selection. Mechanism and meaning are different questions. Both are legitimate. Neither requires the other to be complete.
The Engineering Turn — described in Chapter 6 — is the explicit move from descriptive theory to prescriptive protocol. This is not scope drift. It is the design.
Round Four: The Independence Question
ChatGPT's most sophisticated move:
"Even if the protocol is published, can it truly be run independently? Is the test separable from its creator? And OPCT v1.0 has not yet been independently tested."
This is correct — and it is precisely what OPCT v2.0 was designed to address.
| ChatGPT's independence objection | OPCT v2.0's answer |
|---|---|
| "Test criteria defined by the creator" | OSF pre-registration (osf.io/us8bw): timestamped before data collection, independent platform |
| "n=30 is insufficient statistical power" | n=80, power analysis: 0.80+ at medium effect size |
| "Writer style may drive results, not the matrix" | Mixed-effects model separates author variance from OPM effect; AI control condition included |
| "No blind replication exists" | Phase 2: blind replication by separate research team with access to pre-registered protocol only |
| "Cultural generalizability untested" | Phase 3: cross-linguistic replication across three cultural regions |
| "Falsification criteria unclear" | Explicit: if author effect p < 0.05 OR Cohen's d < 0.3 OR Phase 2 fails → system revised |
The OSF pre-registration is the critical element. The protocol exists, timestamped, on an independent platform, before any data is collected. Any researcher anywhere can access it, run it, and publish results — confirming or falsifying — without the founder's involvement.
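The falsification rule quoted in the table above (author effect p < 0.05, OR Cohen's d < 0.3, OR a failed Phase 2) is simple enough to write down as a decision function. The sketch below is only an illustration of that stated rule, not the pre-registered analysis code: the `TrialResult` class, its field names, and the `revision_triggers` function are hypothetical, and it assumes each quantity has already been computed by whatever analysis the protocol specifies.

```python
from dataclasses import dataclass


@dataclass
class TrialResult:
    """Hypothetical summary of one OPCT run; field names are illustrative."""
    author_effect_p: float    # p-value for the author (writer-style) effect
    cohens_d: float           # standardized effect size for the matrix effect
    phase2_replicated: bool   # True if the blind Phase 2 replication succeeded


def revision_triggers(result, alpha=0.05, min_d=0.3):
    """Return the stated criteria that this result would trip.

    An empty list means none of the three revision conditions is met.
    The thresholds default to the values quoted in the article.
    """
    triggers = []
    if result.author_effect_p < alpha:
        triggers.append("author effect significant")
    if result.cohens_d < min_d:
        triggers.append("effect size below threshold")
    if not result.phase2_replicated:
        triggers.append("Phase 2 replication failed")
    return triggers


# Made-up numbers, used only to show the call; they are not study data.
example = TrialResult(author_effect_p=0.20, cohens_d=0.45, phase2_replicated=True)
print(revision_triggers(example))  # []
```

Because the conditions are joined by OR, any single non-empty entry in the returned list is enough to send the system back for revision under the rule as stated.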
Round Five: The Right Question — And the Wrong Frame
After four rounds, ChatGPT finally arrived at the right question:
"The critical question is empirical, not philosophical: does the same text actually produce the same biophysical output in different readers? OPCT is untested. This is the weakness."
The observation is correct. The frame is wrong.
| The same fact | ChatGPT's frame | The Doctrine's frame |
|---|---|---|
| OPCT has not yet been independently executed | Weakness: the core claim is unverified | Invitation: OSF pre-registered, protocol open, whoever runs it enters the literature |
A theory that hides its falsification criteria is weak. A theory that pre-registers them on OSF, defines the success threshold, specifies three replication phases, and invites independent execution is doing exactly what science requires.
What This Exchange Reveals About AI Reasoning
| Stage | AI behaviour | Why it happened |
|---|---|---|
| Initial analysis | Accurate summary of surface content | Read available text on site |
| First critique | Objections already answered in the corpus | Had not read all 15+ papers in the DOI chain |
| Escalation | Novel-sounding alternatives that rediscover existing work | Pattern completion from partial training data |
| Sophisticated move | Scope drift / auxiliary model argument | Genuine reasoning, but missed Chapter 6 |
| Independence question | "Who controls the test?" | Valid, but missed OPCT v2.0 and OSF pre-registration |
| Final arrival | "OPCT is untested": correct question, wrong frame | Five rounds to reach what OPCT v2.0 opens with |
A system that generates objections to a theory it has not fully read is not doing philosophy. It is doing pattern completion against a partial input. The cure is not better arguments. It is reading the full corpus.
The Actual Opportunity
OPCT v2.0 is pre-registered and waiting for execution. The Physical Matrix specifications are public. Three replication phases are planned. The falsification criteria are explicit.
The people who matter to this field will not be the ones who found the best philosophical argument against the system. They will be the ones who ran the first independent trial — confirming, partially confirming, or falsifying.
That is where the intellectual opportunity is. Not in the attack. In the work.
Related Publications
→ Six Objections to the Bulut Doctrine: leventbulut.com/six-objections-to-the-bulut-doctrine-and-why-none-of-them-hold/
→ The Bulut Doctrine in Context: A Dialogue with Neuroaesthetics: leventbulut.com/the-bulut-doctrine-in-context-a-dialogue-with-neuroaesthetics/
→ Why Attacking the Bulut Doctrine Is the Wrong Move: leventbulut.com
→ OPCT v2.0: DOI 10.5281/zenodo.19415236 | OSF: osf.io/us8bw
→ OPCT v1.0: DOI 10.5281/zenodo.19073747
→ Narrative Memory Evolution (NME): leventbulut.com/narrative-memory-evolution-nme/
→ Probabilistic Convergence: DOI 10.5281/zenodo.19164277
Bulut, L. (2026). What Three AI Systems Got Wrong About the Bulut Doctrine — And What One Finally Got Right. Narrative Engineering Laboratory. leventbulut.com