Narrative Entropy (Sn) and Shannon Entropy (H): A Formal Comparison
Technical Report | Narrative Engineering Laboratory
Author: Levent Bulut | ORCID: 0009-0007-7500-2261
DOI: 10.5281/zenodo.19421808 | License: CC BY-NC-ND 4.0
Shannon entropy is everywhere.
Language models are evaluated with it. Literary stylometry uses it. Narrative complexity research cites it. If you have spent time in computational linguistics or information theory, H = −Σ pᵢ log pᵢ is a formula you know.
It is also a formula that cannot tell you why Pulp Fiction produces maximum tension.
That is the problem this paper addresses.
Two Metrics, One Confusion
When the Bulut Doctrine introduced Narrative Entropy (Sn) in early 2026, the most common objection was immediate: "Isn't this just Shannon entropy applied to literature?"
It is not. And the difference is not terminological.
Shannon entropy measures the statistical uncertainty of a discrete symbol distribution at a single point in time. It answers one question: given the preceding context, how unpredictable is the next symbol? Perplexity in language models is an exponentiated form of this quantity. Lexical entropy in stylometry applies it to word distributions.
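As a minimal sketch of the quantity being discussed, the formula H = −Σ pᵢ log pᵢ can be computed from symbol frequencies. The function name `shannon_entropy` and the toy strings are illustrative assumptions, and the sketch uses the simplest context-free (unigram) estimate rather than the context-conditioned form a language model would use.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Return H = -sum(p_i * log2(p_i)) in bits for a sequence of symbols.

    Probabilities are estimated from raw frequencies (unigram estimate);
    no preceding context is taken into account.
    """
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical toy inputs, chosen only to show the two extremes.
uniform = shannon_entropy("abcd")  # four equally likely symbols: 2.0 bits
skewed = shannon_entropy("aaab")   # one dominant symbol: about 0.81 bits

print(uniform, skewed)
```

The result depends only on how often each symbol occurs, which is the property the rest of this comparison turns on.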
Sn measures something categorically different: the dynamic accumulation of cognitive resistance and causal uncertainty across the temporal dimension of narrative experience.
The distinction sounds abstract. The Pulp Fiction case makes it concrete.
The Pulp Fiction Problem
Run a Shannon-based entropy analysis on Pulp Fiction.
The dialogue is colloquial. The lexicon is simple. The sentences are short. By Shannon-derived lexical metrics, this is at most a moderate-entropy text.
Now ask a different question: why does this film produce maximum sustained tension across radically different cultural contexts, for over thirty years?
Shannon has no answer. The metric measured the wrong object.
Sn = ∫(t₀ → t₁) (If × Cb) dt
Under the Sn framework:
Information Friction (If) = 0.80 — maximum temporal reordering (chronological non-linearity: 1.0), complete withholding of the Briefcase content (information withholding: 1.0), full causal opacity across non-linear threads.
Causal Branching (Cb) = 4.5 — near-maximum simultaneous unresolved fate vectors: Vincent, Jules, Mia, Butch, the Briefcase, the Wolf, the Gimp.
Sn (raw) = 18.0 — structural maximum. The film is operating at the limit of the engagement zone, held from Heat Death solely by the gravitational mass of the Vacuum Variable (the Briefcase: Io = 2.5, maximum informational opacity).
Shannon H: moderate. Sn: maximum.
They point in opposite directions. Only one of them correctly identifies what the film is doing.
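The reported figures can be related to the integral under one simplifying assumption that the source does not state: that If and Cb stay constant over the interval. In that case the integral reduces to a product, and the reported Sn (raw) of 18.0 corresponds to an interval length of 5 in whatever time unit the protocol uses. This is a reading of the arithmetic only. The protocol's six-step pipeline may aggregate per segment or weight values differently.

```latex
\[
S_n = \int_{t_0}^{t_1} I_f \, C_b \, dt
    = I_f \, C_b \, (t_1 - t_0)
    \quad \text{(assuming constant } I_f \text{ and } C_b \text{)}
\]
\[
I_f \, C_b = 0.80 \times 4.5 = 3.6,
\qquad
t_1 - t_0 = \frac{18.0}{3.6} = 5
\]
```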
The Six Dimensions of Divergence
This is not a minor technical distinction. The two metrics differ across six fundamental dimensions:
| Dimension | Shannon Entropy (H) | Narrative Entropy (Sn) |
|---|---|---|
| Domain | Symbol distributions in communication channels | Structural complexity of narrative systems |
| Temporal structure | Static — point-in-time | Dynamic — temporal integral |
| Measurement object | Probability distribution over symbols | If × Cb product over narrative time |
| Operational unit | Bits (log₂) or nats | Accumulated cognitive resistance |
| Directionality | Non-directional | Directional — accumulation and trajectory |
| Engineering function | Compression optimization | Tension management; biophysical output |
The deepest divergence is temporal. Shannon entropy has no memory. It measures the disorder at a single moment. Sn is constitutively temporal — it measures what accumulates. A narrative that maintains If = 0.75 and Cb = 3 for its entire duration produces a fundamentally different biophysical state than one that reaches those values briefly at the climax.
This is why the integral formulation is not decorative. It is the structural claim.
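The difference between a sustained and a momentary peak can be sketched numerically. The example below approximates the integral with the trapezoidal rule, a stand-in chosen for illustration and not the protocol's own calculation pipeline. The function name `narrative_entropy`, the ten-unit timeline, and the baseline values of the second trajectory are all hypothetical.

```python
def narrative_entropy(times, friction, branching):
    """Trapezoidal approximation of the integral of If * Cb over time."""
    product = [f * b for f, b in zip(friction, branching)]
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        total += 0.5 * (product[i] + product[i - 1]) * dt
    return total

# Hypothetical timeline: 11 sample points, one time unit apart.
times = list(range(11))

# Trajectory A: If = 0.75 and Cb = 3 held for the entire duration.
sustained = narrative_entropy(times, [0.75] * 11, [3.0] * 11)

# Trajectory B: a low baseline (assumed If = 0.25, Cb = 1) that reaches
# If = 0.75 and Cb = 3 only at a single sample point near the end.
peak_if = [0.25] * 11
peak_cb = [1.0] * 11
peak_if[8], peak_cb[8] = 0.75, 3.0
brief = narrative_entropy(times, peak_if, peak_cb)

print(sustained)  # 22.5
print(brief)      # 4.5
```

Both trajectories reach the same peak value of If × Cb (2.25), yet the accumulated totals differ by a factor of five. A point-in-time reading taken at the peak would not separate them.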
What Shannon Gets Right — And Where It Stops
Shannon entropy is not wrong for narrative. It is incomplete.
It correctly captures lexical unpredictability. Melville's prose in Moby Dick scores maximum H: the lexical density and syntactic complexity are genuinely high. Dostoevsky's Crime and Punishment scores moderate-to-high H: the prose is sophisticated but more constrained.
But Sn tells a different story about the same texts:
| Work | Shannon H | Sn (raw) | Relationship |
|---|---|---|---|
| Crime & Punishment | Moderate-high | 3.0 | Moderate-high H, low Sn: constrained structure despite complex prose |
| Moby Dick | Maximum | 7.6 | Both high: lexical complexity and structural opacity align |
| Pulp Fiction | Moderate | 18.0 | Moderate H, maximum Sn: the critical divergence case |
Crime and Punishment is the revealing case on the other side. Dostoevsky's prose is lexically sophisticated — Shannon correctly identifies this. But the narrative structure is tightly constrained: near-linear chronology, single dominant causal path, no unresolved fate vectors beyond Raskolnikov's inevitable confession. Sn = 3.0. The structural tension is low despite the surface complexity.
Shannon measured the prose. Sn measured the architecture.
The Two Failure States Shannon Cannot See
Shannon entropy has no failure states for narrative. There is no concept of a narrative losing its reader through over-entropy or under-entropy.
Sn defines two:
Narrative Cold Death (Sn → 0): If ≈ 0, Cb ≈ 0. The narrative is perfectly predictable. Information flows without resistance. No cognitive heat is generated. The reader processes without engagement. Shannon H can be high in a Cold Death narrative — a lexically varied but structurally inert text.
Narrative Heat Death (Sn → ∞): If and Cb at maximum, sustained without resolution. The cognitive load exceeds structural capacity. The reader detaches. System collapse. Shannon H can be low — simple colloquial text — while the narrative is in Heat Death through structural overload.
A complete computational model of narrative engagement requires these boundary conditions. Shannon-derived metrics produce neither.
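The two boundary conditions can be expressed as a simple zone check. This is a sketch under stated assumptions: the text above mentions normalised score zones but gives no numeric boundaries, so the thresholds are left to the caller, and the values in the usage lines are placeholders, not figures from the protocol. The function name `classify_zone` is likewise hypothetical.

```python
def classify_zone(sn_normalised, cold_threshold, heat_threshold):
    """Label a normalised Sn score relative to caller-supplied boundaries.

    The thresholds are NOT taken from the Sn Measurement Protocol; they
    must be provided by whoever applies the check.
    """
    if not 0.0 <= cold_threshold < heat_threshold:
        raise ValueError("expected 0 <= cold_threshold < heat_threshold")
    if sn_normalised <= cold_threshold:
        return "cold death"       # structure too predictable to engage
    if sn_normalised >= heat_threshold:
        return "heat death"       # structural load exceeds capacity
    return "engagement zone"

# Placeholder thresholds, for illustration only.
low = classify_zone(0.05, cold_threshold=0.1, heat_threshold=0.9)
mid = classify_zone(0.50, cold_threshold=0.1, heat_threshold=0.9)
high = classify_zone(0.95, cold_threshold=0.1, heat_threshold=0.9)

print(low, mid, high, sep=" | ")
```

Keeping the thresholds as parameters makes it explicit that the zone boundaries are an empirical choice, separate from the definition of Sn itself.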
The Sn Measurement Protocol v1.0 (DOI: 10.5281/zenodo.19410663, published April 4, 2026) specifies the full execution procedure: Narrative Segment boundaries, If anchor scale, Cb branch taxonomy, six-step calculation pipeline, normalised score zones, and inter-rater reliability requirements. That protocol operationalises what this paper establishes theoretically.
For Computational Narrative Researchers
The Sn framework identifies three structural gaps in current Shannon-based approaches:
Gap 1 — Temporal structure is lost. Shannon metrics treat narrative as a bag-of-words or symbol sequence, discarding cumulative dynamics. Sn's integral formulation captures what point-in-time metrics cannot.
Gap 2: Lexical and structural entropy can vary independently. Pulp Fiction is the clearest case in this comparison. Using H as a proxy for narrative complexity produces systematic errors wherever surface and structural complexity diverge.
Gap 3 — No failure state model. Shannon entropy cannot identify when a narrative is losing its reader through structural overload or structural inertia. Sn's Cold Death / Heat Death framework provides these boundary conditions.
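Gap 1 can be illustrated for the bag-of-words case: a frequency-based entropy estimate is unchanged when the same tokens are presented in a different order. The sketch below uses a hypothetical sentence and helper name (`unigram_entropy`). It demonstrates only the order-invariance of unigram H, not any property of Sn.

```python
import math
from collections import Counter

def unigram_entropy(tokens):
    """Shannon entropy in bits, estimated from token frequencies only."""
    counts = Counter(tokens)
    total = sum(counts.values())
    # Sorting the counts fixes the summation order, so two inputs with
    # the same frequencies give bit-identical floating-point results.
    return -sum((c / total) * math.log2(c / total)
                for c in sorted(counts.values()))

# Hypothetical token sequence and a reordering of the same tokens.
tokens = "the end comes first and the start comes last".split()
reordered = tokens[::-1]

h_original = unigram_entropy(tokens)
h_reordered = unigram_entropy(reordered)

print(h_original == h_reordered)  # True: ordering does not affect unigram H
```

Context-conditioned estimates such as n-gram or language-model entropy are sensitive to local order, but they still report a per-symbol quantity. They do not report an accumulation over narrative time.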
Sn is not a replacement for Shannon-based metrics. It is a complement operating at a different level of narrative structure. Both are needed. Only one of them explains Pulp Fiction.
Full Paper
Zenodo: https://doi.org/10.5281/zenodo.19421808
Cite as: Bulut, L. (2026). Narrative Entropy (Sn) and Shannon Entropy (H): A Formal Comparison — From Information Theory to Narrative Physics. Zenodo. https://doi.org/10.5281/zenodo.19421808