There’s been a lot of talk recently — including this article by The Bookseller — about when AI models might be able to write a bestselling novel.
It’s a compelling idea, especially with the rapid advances in large language models (LLMs). But there’s a hidden trap in that question, one that reveals how we often misunderstand what these models are actually doing.
The problem isn’t creativity. It’s context.
LLMs don’t write like humans. And more importantly, we don’t often ask them to. If we treat a model like it can do anything just because it’s a computer, and then judge it by human standards, it’ll fail — not because it’s incapable, but because we haven’t asked it to operate under human rules.
One of the hardest tasks for a language model is generating cohesive, long-form narratives. By the time you get 60,000 to 70,000 words deep into a novel, the story’s early beats are often lost to the model.

Characters forget what they know. Plots fray. The ending doesn’t land. That’s not a failure of intelligence — it’s a failure of memory architecture. As writers know, even we forget what we wrote in Chapter 2 by the time we get to Chapter 20.
So we asked: what could be done differently?
We’ve been experimenting with new ways to manage narrative continuity in generative systems. Not just for books, but for interactive fiction and adaptive gameplay. To support our upcoming gaming platform, we built a choose-your-own-adventure authoring tool that could generate over 100 distinct narrative arcs — each one coherent, self-contained, and consistent with a fixed world state.
Rather than starting from scratch, we fed the system a completed manuscript — Fortunes Told (A Voyager’s Guide to Life Between Worlds), a new novel published by our in-house imprint, Adventures of the Persistently Impaired (…and Other Tales). The goal was to generate branching versions of that novel’s storyline: alternate decisions, consequences, character paths — all while retaining the original story’s tone, logic, and internal consistency.
That meant solving two problems at once:
How do you fragment a single linear story into hundreds of possible branches?
And how do you rebuild those branches into fully contextualized narratives that still feel authored, with continuity so complete and accurate that each continuation is indistinguishable from the original work?
And this is what we did:

Process Flow for Semantics
1. Semantic Similarity Analysis
Technical Implementation: Sentence transformer models (BERT-based) for semantic embedding generation and cosine similarity computation.
Methodology: The system employs pre-trained sentence transformers to encode narrative segments into high-dimensional vector representations. The analyser computes semantic similarity between generated content and the original manuscript’s thematic elements using cosine similarity measures.
Key Components:
- Sentence Transformer Models: Pre-trained for semantic embedding generation
- Cosine Similarity Computation: Mathematical similarity measurement between semantic vectors
- Thematic Consistency Scoring: Quantitative assessment of narrative adherence to original themes
Algorithm Flow:
Theme Extraction: Extract themes and concepts from story bible using semantic parsing
Embedding Generation: Generate embeddings for both original themes and generated content
Similarity Computation: Compute cosine similarity matrix between theme vectors and content vectors
Scoring Calculation: Calculate weighted thematic consistency score
Semantic Drift Analysis: Calculate semantic drift to measure consistency with established narrative patterns
Mathematical Foundation:
- Theme Alignment:
theme_alignment = mean(cosine_similarity(content_embedding, theme_embeddings))
- Concept Coherence:
concept_coherence = mean(cosine_similarity(content_embedding, concept_embeddings))
- Semantic Drift: Measures deviation from established narrative patterns using author voice fingerprint
Fallback Mechanism: When models are unavailable, the system employs keyword-based semantic analysis using TF-IDF vectorization and Jaccard similarity coefficients.
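As a rough illustration, here is a minimal sketch of that scoring flow, assuming the sentence-transformers library and a generic MiniLM model; the model name and the final 0.7/0.3 weighting are illustrative stand-ins, not the production values:

```python
# A minimal sketch of the thematic consistency scoring described above.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model choice

def thematic_consistency(content: str, themes: list[str], concepts: list[str]) -> float:
    """Score generated content against story-bible themes and concepts."""
    content_emb = model.encode(content, convert_to_tensor=True)
    theme_embs = model.encode(themes, convert_to_tensor=True)
    concept_embs = model.encode(concepts, convert_to_tensor=True)

    # theme_alignment = mean cosine similarity to every theme vector
    theme_alignment = util.cos_sim(content_emb, theme_embs).mean().item()
    # concept_coherence = mean cosine similarity to every concept vector
    concept_coherence = util.cos_sim(content_emb, concept_embs).mean().item()

    # Weighted combination; the weights here are placeholders.
    return 0.7 * theme_alignment + 0.3 * concept_coherence
```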
2. Stylometric Fingerprinting for Voice Authenticity

Technical Implementation: A comprehensive stylometric analysis using multi-dimensional linguistic feature extraction for author voice authentication.
Methodology: The analyser implements a sophisticated feature extraction pipeline that captures author-specific linguistic patterns including lexical diversity, syntactic complexity, and discourse markers.
Feature Extraction Pipeline:
- Lexical Features: Type-token ratio, vocabulary sophistication, word frequency distributions
- Syntactic Features: Average sentence length, POS tag distributions & dependency parsing patterns
- Discourse Features: Transition word usage, paragraph structure, narrative flow indicators
- Readability Metrics: Flesch reading ease scores & text complexity measures
- Punctuation Patterns: Character-level stylistic signatures with frequency analysis
Mathematical Foundation:
- Lexical Diversity:
D = |V| / |W|
where V = unique vocabulary and W = total words
- Syntactic Complexity: Multi-dimensional feature vectors incorporating sentence length variance, subordination indices, and dependency depth
- Style Authenticity Score: Weighted combination calculated as:
consistency_score = (1 - min(sentence_length_diff, 1)) * 0.2 + (1 - lexical_diversity_diff) * 0.2 + (1 - readability_diff) * 0.15 + pos_similarity * 0.25 + punct_similarity * 0.1 + discourse_similarity * 0.1
- POS Similarity: Jensen-Shannon divergence approximation for part-of-speech tag distribution comparison
- Complexity Score:
complexity_score = (sentence_length_variance * 0.3 + lexical_diversity * 0.4 + readability_normalized * 0.3)
Discourse Marker Analysis: The system identifies and analyses discourse markers including ‘however’, ‘therefore’, ‘meanwhile’, ‘subsequently’, ‘furthermore’, ‘moreover’, ‘nevertheless’, ‘consequently’, ‘thus’, ‘hence’, ‘additionally’, ‘alternatively’, ‘specifically’, ‘particularly’, ‘indeed’, ‘certainly’, ‘obviously’, ‘clearly’, ‘evidently’.
Fallback Mechanism: Basic stylometric analysis using sentence length statistics and punctuation frequency patterns.
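For concreteness, a minimal sketch of the consistency formula above, with the POS, punctuation, and discourse similarities assumed to be precomputed elsewhere in the pipeline; the helper functions and the 20-word sentence-length normalizer are illustrative assumptions:

```python
# A sketch of the style-consistency score from the formula above.
import re

def lexical_diversity(text: str) -> float:
    """Type-token ratio D = |V| / |W|."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0

def mean_sentence_length(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return sum(len(s.split()) for s in sentences) / len(sentences) if sentences else 0.0

def consistency_score(original: str, generated: str,
                      pos_sim: float, punct_sim: float, disc_sim: float,
                      readability_diff: float) -> float:
    # Normalised differences: 0.0 means identical style on that dimension.
    # Dividing by 20 words is an assumed normalizer, not the production one.
    sent_len_diff = abs(mean_sentence_length(original) - mean_sentence_length(generated)) / 20.0
    lex_div_diff = abs(lexical_diversity(original) - lexical_diversity(generated))
    return ((1 - min(sent_len_diff, 1)) * 0.2
            + (1 - lex_div_diff) * 0.2
            + (1 - readability_diff) * 0.15
            + pos_sim * 0.25
            + punct_sim * 0.1
            + disc_sim * 0.1)
```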
3. Discourse Coherence Analysis for Narrative Flow

Technical Implementation: A multi-layered discourse analysis using entity continuity tracking, topic modelling, and transition quality assessment.
Methodology: The analyser employs a three-tier analysis framework.
Tier 1: Entity Continuity Analysis
- Coreference Resolution: Tracking entity mentions across narrative segments using named entity recognition
- Entity Consistency: Validation of character and object references with introduction detection
- Reference Chain Analysis: Mapping pronoun and entity relationships
Tier 2: Topic Continuity Assessment
- Topic Modeling: Extracts key topics using lemmatization for nouns and proper nouns
- Semantic Coherence: Vector space analysis of topic transitions using set intersection
- Contextual Relevance: Topic overlap calculation:
topic_overlap = len(current_topics & preceding_topics) / len(current_topics | preceding_topics)
Tier 3: Discourse Transition Quality
- Transition Marker Analysis: Identification and classification of discourse connectives including ‘however’, ‘therefore’, ‘meanwhile’, ‘subsequently’, ‘then’, ‘next’, ‘finally’, ‘first’, ‘second’, ‘later’, ‘after’, ‘before’, ‘during’, ‘while’, ‘since’, ‘because’, ‘so’, ‘thus’, ‘hence’, ‘consequently’, ‘as a result’
- Narrative Flow Scoring: Quantitative assessment of logical progression with optimal transition ratio of 30%
- Coherence Metrics: Mathematical models for discourse quality measurement with normalization for over-transitioning
Entity Introduction: The system uses heuristic patterns to detect proper entity introduction:
- a\s+{entity}
- an\s+{entity}
- the\s+{entity}
- called\s+{entity}
- named\s+{entity}
- known\s+as\s+{entity}
Transition Quality Assessment: Implements adaptive scoring: when transition_score ≤ optimal_ratio (0.3), the score is transition_score / optimal_ratio; otherwise it is 1.0 - (transition_score - optimal_ratio) / (1.0 - optimal_ratio), penalizing excessive transitions.
Fallback Mechanism: Basic coherence analysis using transition word detection and sentence structure analysis.
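A compact sketch of the Tier 2 topic overlap and Tier 3 transition scoring described above, with an abridged marker list (the full list appears earlier in this section):

```python
# Topic-overlap and transition-quality scoring, as sketched from the text.
TRANSITIONS = {"however", "therefore", "meanwhile", "then", "next", "finally",
               "thus", "hence", "consequently"}  # abridged
OPTIMAL_RATIO = 0.3  # optimal transition ratio from the text

def topic_overlap(current_topics: set[str], preceding_topics: set[str]) -> float:
    """Jaccard overlap between topic sets of adjacent segments."""
    union = current_topics | preceding_topics
    return len(current_topics & preceding_topics) / len(union) if union else 0.0

def transition_quality(sentences: list[str]) -> float:
    """Reward transitions up to the optimal ratio; penalize over-transitioning."""
    if not sentences:
        return 0.0
    with_marker = sum(
        1 for s in sentences
        if any(w in TRANSITIONS for w in s.lower().split())
    )
    score = with_marker / len(sentences)
    if score <= OPTIMAL_RATIO:
        return score / OPTIMAL_RATIO
    return 1.0 - (score - OPTIMAL_RATIO) / (1.0 - OPTIMAL_RATIO)
```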

Process Flow
4. Entity Relationship Tracking with Knowledge Graphs
Technical Implementation: Dynamic knowledge graph construction for entity relationship modelling and consistency validation.
Methodology: The system constructs a comprehensive knowledge graph representing characters, objects, locations, and their interdependencies.
Graph Construction Algorithm:
- Entity Extraction: Named entity recognition using transformer models
- Relationship Identification: Dependency parsing for relationship extraction
- Graph Construction: Graph instantiation with weighted edges
- Relationship Validation: Consistency checking using graph traversal algorithms
Relationship Types:
- Character Relationships: Social connections, familial bonds, hierarchical structures
- Spatial Relationships: Geographic locations, spatial proximity, containment
- Temporal Relationships: Event sequences, causality chains, temporal dependencies
- Thematic Relationships: Conceptual connections, symbolic associations
Consistency Validation:
- Contradiction Detection: Graph-based inconsistency identification
- Relationship Strength: Weighted edge analysis for relationship confidence
- Temporal Consistency: Timeline validation using directed acyclic graphs
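A minimal sketch of how such a graph and one contradiction check might look, assuming networkx; the entities and the `located_in` relation are invented for illustration:

```python
# Knowledge-graph construction and a simple spatial-contradiction check.
import networkx as nx

g = nx.DiGraph()

def add_relation(subj: str, rel: str, obj: str, weight: float = 1.0) -> None:
    """Insert a weighted, typed edge such as ('Mira', 'located_in', 'Harbour')."""
    g.add_edge(subj, obj, relation=rel, weight=weight)

def contradicts_location(character: str) -> bool:
    """A character asserted to be 'located_in' two places at once is a contradiction."""
    places = [o for _, o, d in g.out_edges(character, data=True)
              if d.get("relation") == "located_in"]
    return len(set(places)) > 1

add_relation("Mira", "located_in", "The Harbour")
add_relation("Mira", "located_in", "The Observatory")
print(contradicts_location("Mira"))  # True: flag this branch for review
```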
5. Temporal Consistency Validation with Timeline Graphs
Technical Implementation: Temporal event extraction and timeline graph construction for chronological consistency validation.
Methodology: The system applies temporal information extraction techniques to construct comprehensive timeline graphs and validate narrative chronology.
Temporal Analysis Components:
- Temporal Expression Recognition: Extraction of time references, dates, and temporal markers
- Event Sequence Modeling: Construction of temporal event chains
- Timeline Graph Construction: Directed graph representation of temporal relationships
- Consistency Validation: Contradiction detection in temporal sequences
Validation Algorithms:
Consistency Scoring: Quantitative assessment of temporal coherence
Temporal Ordering: Topological sorting for event sequence validation
Contradiction Detection: Graph cycle detection for temporal impossibilities
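A sketch of the DAG validation step, assuming networkx; the event names are invented:

```python
# Timeline validation: events are nodes, 'happens before' edges point
# forward in time, and any cycle is a temporal impossibility.
import networkx as nx

timeline = nx.DiGraph()
timeline.add_edges_from([
    ("letter_written", "letter_read"),
    ("letter_read", "reply_sent"),
    # A generated branch that asserts the reply preceded the letter:
    ("reply_sent", "letter_written"),
])

if not nx.is_directed_acyclic_graph(timeline):
    print("Temporal contradiction:", list(nx.simple_cycles(timeline)))
else:
    # Topological sort gives a valid chronological ordering of events.
    print("Valid order:", list(nx.topological_sort(timeline)))
```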

6. Impossible Transition Detection
Technical Implementation: Multi-modal analysis system for identifying logically impossible narrative transitions in world state, character development, and plot progression.
Methodology: The system maintains comprehensive world state representations and employs rule-based and statistical methods for impossibility detection.
World State Components:
- Character States: Physical condition, location, emotional state, knowledge
- Environmental States: Weather, time of day, location conditions
- Object States: Availability, condition, ownership, location
- Plot States: Revealed information, completed actions, story progression
Impossibility Detection Categories:
- Physical Impossibilities: Spatial-temporal contradictions, impossible character actions
- Logical Impossibilities: Knowledge contradictions, causal violations
- Narrative Impossibilities: Character development inconsistencies, plot holes
Detection Algorithm:
State Extraction: Parse narrative for world state information
State Comparison: Compare current state with previous states
Rule Application: Apply logical consistency rules
Impossibility Scoring: Quantitative assessment of transition validity
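A rule-based sketch of one physical and one logical check; the state fields and rules here are illustrative, not the full rule set:

```python
# Impossible-transition detection over plain-dict world states.
def check_transition(prev: dict, curr: dict) -> list[str]:
    violations = []
    # Physical: a character cannot change location without a travel action.
    if (prev.get("location") != curr.get("location")
            and not curr.get("travel_action")):
        violations.append("spatial jump without travel")
    # Logical: a character cannot use knowledge never revealed to them.
    unknown = set(curr.get("knowledge", [])) - set(prev.get("knowledge", []))
    if unknown and not curr.get("revelation_event"):
        violations.append(f"unexplained knowledge: {unknown}")
    return violations

prev_state = {"location": "ship", "knowledge": ["map"]}
curr_state = {"location": "island", "knowledge": ["map", "treasure_site"]}
print(check_transition(prev_state, curr_state))
```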
7. Attention-Based Context Selection for Optimal Relevance
Technical Implementation: Advanced attention mechanism implementation for intelligent context selection using transformer-based relevance scoring.
Methodology: The system uses attention mechanisms to dynamically select the most relevant context for narrative generation, optimizing for both coherence and efficiency.
Context Candidate Generation:
- Story Bible Segmentation: Hierarchical segmentation of narrative context
- Character Profile Extraction: Dynamic character information compilation
- Temporal Context: Relevant preceding and succeeding narrative segments
- Thematic Context: Theme-relevant story elements
Attention Mechanism:
Thematic Alignment: Content-theme similarity scoring
Relevance Scoring: Transformer-based attention weights for context importance
Dynamic Selection: Adaptive context window sizing based on narrative complexity
Temporal Weighting: Recency-based importance scaling
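A sketch of the selection step as softmax-weighted relevance over candidate segments, again assuming a generic sentence transformer as the scorer; the model name and top-k default are assumptions:

```python
# Attention-style context selection: score candidates against the query,
# softmax the scores, and keep the highest-weight segments.
import torch
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model choice

def select_context(query: str, candidates: list[str], k: int = 5) -> list[str]:
    q = model.encode(query, convert_to_tensor=True)
    c = model.encode(candidates, convert_to_tensor=True)
    # Attention weights: softmax over cosine relevance scores.
    weights = torch.softmax(util.cos_sim(q, c).squeeze(0), dim=0)
    top = torch.topk(weights, k=min(k, len(candidates)))
    return [candidates[i] for i in top.indices.tolist()]
```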
8. Dynamic Character Voice Modelling for Authentic Dialogue
Technical Implementation: Comprehensive character voice modeling using linguistic pattern analysis and sentiment profiling for authentic dialogue generation.
Methodology: The Modeller analyses character-specific linguistic patterns to create detailed voice profiles for maintaining dialogue authenticity.
Voice Profile Components:
- Vocabulary Preferences: Character-specific word choice patterns
- Syntactic Patterns: Sentence structure preferences and complexity
- Emotional Markers: Sentiment patterns and emotional expression styles
- Relationship Dynamics: Communication patterns based on character relationships
Analysis Techniques:
- Vocabulary Analysis: Frequency-based word preference modeling
- Syntactic Analysis: Sentence structure pattern recognition
- Emotional Profiling: Sentiment analysis and emotional range characterization
- Relationship Modeling: Context-dependent communication style analysis
Voice Authenticity Scoring:
Emotional Consistency: Sentiment alignment with character emotional profile
Vocabulary Consistency: Lexical choice alignment with character profile
Syntactic Consistency: Sentence structure pattern matching
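A toy sketch of a voice profile tracking only vocabulary preferences (the full profile also covers syntax, sentiment, and relationships); the character and threshold-free scoring are illustrative:

```python
# A minimal character voice profile with a vocabulary-consistency check.
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class VoiceProfile:
    vocabulary: Counter = field(default_factory=Counter)

    def update(self, dialogue: str) -> None:
        """Accumulate word-choice counts from a character's dialogue."""
        self.vocabulary.update(dialogue.lower().split())

    def vocabulary_consistency(self, new_line: str) -> float:
        """Share of new-line tokens already seen in this character's speech."""
        tokens = new_line.lower().split()
        if not tokens:
            return 1.0
        return sum(1 for t in tokens if t in self.vocabulary) / len(tokens)

captain = VoiceProfile()
captain.update("Aye, the tide waits for no one, lad.")
print(captain.vocabulary_consistency("The tide turns, lad."))
```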
9. Causal Chain Analysis for Consequence Tracking
Technical Implementation: Sophisticated causal relationship modeling using directed acyclic graphs for narrative consequence tracking and logical flow validation.
Methodology: The Analyser constructs comprehensive causal chains to track narrative consequences and ensure logical story progression.
Causal Event Modeling:
- Event Extraction: Identification of significant narrative events
- Causal Relationship Identification: Cause-effect relationship mapping
- Consequence Prediction: Forward-chaining causal inference
- Chain Validation: Logical consistency checking
- Theme Relevance: Causal event thematic alignment
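A sketch of consequence tracking over a causal DAG, assuming networkx; forward chaining here is simply graph reachability, and the event names are invented:

```python
# Causal-chain tracking: events as nodes, cause -> effect edges.
import networkx as nx

causes = nx.DiGraph()
causes.add_edges_from([
    ("storm_hits", "ship_wrecked"),
    ("ship_wrecked", "crew_stranded"),
    ("crew_stranded", "signal_fire_lit"),
])

def downstream_consequences(event: str) -> list[str]:
    """Every event that causally depends on `event` (forward chaining)."""
    return list(nx.descendants(causes, event))

# If a branch removes the storm, all of these scenes need regeneration:
print(downstream_consequences("storm_hits"))
```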
10. Coreference Resolution for Entity Tracking
Technical Implementation: Advanced coreference resolution using neural network models and heuristic approaches for comprehensive entity tracking across narrative segments.
Methodology: The Resolver employs both neural coreference models and rule-based approaches to maintain entity consistency throughout the narrative.
Coreference Resolution Pipeline:
- Mention Detection: Identification of entity mentions and pronouns
- Coreference Clustering: Grouping coreferent mentions
- Neural Resolution: Deep learning models for coreference decisions
- Heuristic Fallback: Rule-based resolution for edge cases
Neural Coreference Models:
- Mention Pair Scoring: Neural networks for coreference likelihood
- Cluster Ranking: Antecedent selection using attention mechanisms
- Gender and Number Agreement: Linguistic constraint validation
Heuristic Rules:
Number Agreement: Singular/plural consistency checking
Proximity Constraints: Distance-based coreference likelihood
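A sketch of the heuristic fallback only (the neural path is model-dependent); it combines number agreement with a nearest-antecedent proximity rule:

```python
# Heuristic coreference fallback: nearest preceding mention that agrees in number.
SINGULAR = {"he", "she", "it"}
PLURAL = {"they"}

def resolve(pronoun: str, prior_mentions: list[tuple[str, bool]]) -> str | None:
    """prior_mentions: (name, is_plural) pairs, oldest first."""
    want_plural = pronoun.lower() in PLURAL
    # Proximity constraint: scan from the most recent mention backwards.
    for name, is_plural in reversed(prior_mentions):
        if is_plural == want_plural:  # number agreement
            return name
    return None

mentions = [("Mira", False), ("the twins", True)]
print(resolve("she", mentions))   # Mira
print(resolve("they", mentions))  # the twins
```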
11. Sentiment Analysis for Emotional Consistency
Technical Implementation: Multi-layered sentiment analysis using transformer-based models and lexicon-based approaches for emotional consistency validation.
Methodology: The Analyser employs VADER sentiment analysis and custom emotional profiling to ensure narrative emotional consistency.
Sentiment Analysis Components:
- Valence Analysis: Positive/negative sentiment quantification
- Arousal Assessment: Emotional intensity measurement
- Emotion Classification: Discrete emotion category identification
- Temporal Trajectory: Emotional progression tracking
Emotional Inconsistency Detection:
Sudden Shift Detection: Abrupt emotional transition identification
Trajectory Analysis: Emotional arc consistency validation
Character Consistency: Individual character emotional pattern validation
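A sketch of sudden-shift detection using VADER's compound score; the jump threshold of 1.0 (on the score's -1 to 1 range) is an illustrative choice:

```python
# Sudden-shift detection over segment-level VADER compound scores.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

def sudden_shifts(segments: list[str], threshold: float = 1.0) -> list[int]:
    """Indices where the emotional valence jumps implausibly between segments."""
    scores = [analyzer.polarity_scores(s)["compound"] for s in segments]
    return [i for i in range(1, len(scores))
            if abs(scores[i] - scores[i - 1]) > threshold]

chapter = ["She laughed, giddy with relief.",
           "Everything was ruined, and she wept alone."]
print(sudden_shifts(chapter))  # [1] if the swing exceeds the threshold
```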

12. Named Entity Recognition for World Consistency
Technical Implementation: Named entity recognition using transformer-based models for comprehensive world consistency validation.
Methodology: The module employs transformer models enhanced with custom entity relationship tracking for world consistency maintenance.
Entity Recognition Pipeline:
- Entity Extraction: Multi-class named entity identification
- Attribute Extraction: Entity-specific attribute identification
- Relationship Mapping: Inter-entity relationship identification
- Consistency Validation: World state consistency checking
Entity Categories:
- Characters: People, fictional beings, character names
- Locations: Geographic locations, fictional places, spatial references
- Organizations: Groups, institutions, fictional organizations
- Temporal Entities: Time references, historical periods, events
Consistency Validation:
World State Coherence: Global consistency checking
Attribute Contradiction Detection: Inconsistent entity descriptions
Relationship Validation: Impossible entity relationships