There’s been a lot of talk recently — including this article by The Bookseller — about when AI models might be able to write a bestselling novel.
It’s a compelling idea, especially with the rapid advances in large language models (LLMs). But there’s a hidden trap in that question, one that reveals how we often misunderstand what these models are actually doing.
The problem isn’t creativity. It’s context.
LLMs don’t write like humans. And more importantly, we don’t often ask them to. If we treat a model like it can do anything just because it’s a computer, and then judge it by human standards, it’ll fail — not because it’s incapable, but because we haven’t asked it to operate under human rules.
One of the hardest tasks for a language model is generating cohesive, long-form narratives. By the time you get 60,000 to 70,000 words deep into a novel, the story’s early beats are often lost to the model.

Characters forget what they know. Plots fray. The ending doesn’t land. That’s not a failure of intelligence — it’s a failure of memory architecture. As writers know, even we forget what we wrote in Chapter 2 by the time we get to Chapter 20.
So we asked: what could be done differently?
We’ve been experimenting with new ways to manage narrative continuity in generative systems. Not just for books, but for interactive fiction and adaptive gameplay. To support our upcoming gaming platform, we built a choose-your-own-adventure authoring tool that could generate over 100 distinct narrative arcs — each one coherent, self-contained, and consistent with a fixed world state.
Rather than starting from scratch, we fed the system a completed manuscript — Fortunes Told (A Voyager’s Guide to Life Between Worlds), a new novel published by our in-house imprint, Adventures of the Persistently Impaired (…and Other Tales). The goal was to generate branching versions of that novel’s storyline: alternate decisions, consequences, character paths — all while retaining the original story’s tone, logic, and internal consistency.
That meant solving two problems at once:
How do you fragment a single linear story into hundreds of possible branches?
And how do you rebuild those branches into fully contextualized narratives that still feel authored, with continuity so complete and accurate that each continuation is indistinguishable from the original work?
And this is what we did:

Process Flow for Semantics
1. Semantic Similarity Analysis
Technical Implementation: Sentence transformer models (BERT-based) for semantic embedding generation and cosine similarity computation.
Methodology: The system employs pre-trained sentence transformers to encode narrative segments into high-dimensional vector representations. The analyser computes semantic similarity between generated content and the original manuscript’s thematic elements using cosine similarity measures.
Key Components:
- Sentence Transformer Models: Pre-trained for semantic embedding generation
- Cosine Similarity Computation: Mathematical similarity measurement between semantic vectors
- Thematic Consistency Scoring: Quantitative assessment of narrative adherence to original themes
Algorithm Flow:
Theme Extraction: Extract themes and concepts from story bible using semantic parsing
Embedding Generation: Generate embeddings for both original themes and generated content
Similarity Computation: Compute cosine similarity matrix between theme vectors and content vectors
Scoring Calculation: Calculate weighted thematic consistency score
Semantic Drift Analysis: Calculate semantic drift to measure consistency with established narrative patterns
Mathematical Foundation:
- Theme Alignment:
theme_alignment = mean(cosine_similarity(content_embedding, theme_embeddings))
- Concept Coherence:
concept_coherence = mean(cosine_similarity(content_embedding, concept_embeddings))
- Semantic Drift: Measures deviation from established narrative patterns using author voice fingerprint
Fallback Mechanism: When models are unavailable, the system employs keyword-based semantic analysis using TF-IDF vectorization and Jaccard similarity coefficients.
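As a rough illustration, here is a minimal sketch of that scoring flow, assuming the sentence-transformers library and a generic MiniLM model; the model name and the final 0.7/0.3 weighting are illustrative stand-ins, not the production values:

```python
# A minimal sketch of the thematic consistency scoring described above.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model choice

def thematic_consistency(content: str, themes: list[str], concepts: list[str]) -> float:
    """Score generated content against story-bible themes and concepts."""
    content_emb = model.encode(content, convert_to_tensor=True)
    theme_embs = model.encode(themes, convert_to_tensor=True)
    concept_embs = model.encode(concepts, convert_to_tensor=True)

    # theme_alignment = mean cosine similarity to every theme vector
    theme_alignment = util.cos_sim(content_emb, theme_embs).mean().item()
    # concept_coherence = mean cosine similarity to every concept vector
    concept_coherence = util.cos_sim(content_emb, concept_embs).mean().item()

    # Weighted combination; the weights here are placeholders.
    return 0.7 * theme_alignment + 0.3 * concept_coherence
```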
2. Stylometric Fingerprinting for Voice Authenticity

Technical Implementation: A comprehensive stylometric analysis using multi-dimensional linguistic feature extraction for author voice authentication.
Methodology: The analyser implements a sophisticated feature extraction pipeline that captures author-specific linguistic patterns including lexical diversity, syntactic complexity, and discourse markers.
Feature Extraction Pipeline:
- Lexical Features: Type-token ratio, vocabulary sophistication, word frequency distributions
- Syntactic Features: Average sentence length, POS tag distributions & dependency parsing patterns
- Discourse Features: Transition word usage, paragraph structure, narrative flow indicators
- Readability Metrics: Flesch reading ease scores & text complexity measures
- Punctuation Patterns: Character-level stylistic signatures with frequency analysis
Mathematical Foundation:
- Lexical Diversity:
D = |V| / |W|
where V = unique vocabulary and W = total words
- Syntactic Complexity: Multi-dimensional feature vectors incorporating sentence length variance, subordination indices, and dependency depth
- Style Authenticity Score: Weighted combination calculated as:
consistency_score = (1 - min(sentence_length_diff, 1)) * 0.2 + (1 - lexical_diversity_diff) * 0.2 + (1 - readability_diff) * 0.15 + pos_similarity * 0.25 + punct_similarity * 0.1 + discourse_similarity * 0.1
- POS Similarity: Jensen-Shannon divergence approximation for part-of-speech tag distribution comparison
- Complexity Score:
complexity_score = (sentence_length_variance * 0.3 + lexical_diversity * 0.4 + readability_normalized * 0.3)
Discourse Marker Analysis: The system identifies and analyses discourse markers including ‘however’, ‘therefore’, ‘meanwhile’, ‘subsequently’, ‘furthermore’, ‘moreover’, ‘nevertheless’, ‘consequently’, ‘thus’, ‘hence’, ‘additionally’, ‘alternatively’, ‘specifically’, ‘particularly’, ‘indeed’, ‘certainly’, ‘obviously’, ‘clearly’, ‘evidently’.
Fallback Mechanism: Basic stylometric analysis using sentence length statistics and punctuation frequency patterns.
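For concreteness, a minimal sketch of the consistency formula above, with the POS, punctuation, and discourse similarities assumed to be precomputed elsewhere in the pipeline; the helper functions and the 20-word sentence-length normalizer are illustrative assumptions:

```python
# A sketch of the style-consistency score from the formula above.
import re

def lexical_diversity(text: str) -> float:
    """Type-token ratio D = |V| / |W|."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0

def mean_sentence_length(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return sum(len(s.split()) for s in sentences) / len(sentences) if sentences else 0.0

def consistency_score(original: str, generated: str,
                      pos_sim: float, punct_sim: float, disc_sim: float,
                      readability_diff: float) -> float:
    # Normalised differences: 0.0 means identical style on that dimension.
    # Dividing by 20 words is an assumed normalizer, not the production one.
    sent_len_diff = abs(mean_sentence_length(original) - mean_sentence_length(generated)) / 20.0
    lex_div_diff = abs(lexical_diversity(original) - lexical_diversity(generated))
    return ((1 - min(sent_len_diff, 1)) * 0.2
            + (1 - lex_div_diff) * 0.2
            + (1 - readability_diff) * 0.15
            + pos_sim * 0.25
            + punct_sim * 0.1
            + disc_sim * 0.1)
```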
3. Discourse Coherence Analysis for Narrative Flow

Technical Implementation: A multi-layered discourse analysis using entity continuity tracking, topic modelling, and transition quality assessment.
Methodology: The analyser employs a three-tier analysis framework.
Tier 1: Entity Continuity Analysis
- Coreference Resolution: Tracking entity mentions across narrative segments using named entity recognition
- Entity Consistency: Validation of character and object references with introduction detection
- Reference Chain Analysis: Mapping pronoun and entity relationships
Tier 2: Topic Continuity Assessment
- Topic Modeling: Extracts key topics using lemmatization for nouns and proper nouns
- Semantic Coherence: Vector space analysis of topic transitions using set intersection
- Contextual Relevance: Topic overlap calculation:
topic_overlap = len(current_topics & preceding_topics) / len(current_topics | preceding_topics)
Tier 3: Discourse Transition Quality
- Transition Marker Analysis: Identification and classification of discourse connectives including ‘however’, ‘therefore’, ‘meanwhile’, ‘subsequently’, ‘then’, ‘next’, ‘finally’, ‘first’, ‘second’, ‘later’, ‘after’, ‘before’, ‘during’, ‘while’, ‘since’, ‘because’, ‘so’, ‘thus’, ‘hence’, ‘consequently’, ‘as a result’
- Narrative Flow Scoring: Quantitative assessment of logical progression with optimal transition ratio of 30%
- Coherence Metrics: Mathematical models for discourse quality measurement with normalization for over-transitioning
Entity Introduction: The system uses heuristic patterns to detect proper entity introduction:
- a\s+{entity}
- an\s+{entity}
- the\s+{entity}
- called\s+{entity}
- named\s+{entity}
- known\s+as\s+{entity}
Transition Quality Assessment: Implements adaptive scoring: when transition_score ≤ optimal_ratio (0.3), the score is transition_score / optimal_ratio; otherwise it is 1.0 - (transition_score - optimal_ratio) / (1.0 - optimal_ratio), penalizing excessive transitions.
Fallback Mechanism: Basic coherence analysis using transition word detection and sentence structure analysis.
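A compact sketch of the Tier 2 topic overlap and Tier 3 transition scoring described above, with an abridged marker list (the full list appears earlier in this section):

```python
# Topic-overlap and transition-quality scoring, as sketched from the text.
TRANSITIONS = {"however", "therefore", "meanwhile", "then", "next", "finally",
               "thus", "hence", "consequently"}  # abridged
OPTIMAL_RATIO = 0.3  # optimal transition ratio from the text

def topic_overlap(current_topics: set[str], preceding_topics: set[str]) -> float:
    """Jaccard overlap between topic sets of adjacent segments."""
    union = current_topics | preceding_topics
    return len(current_topics & preceding_topics) / len(union) if union else 0.0

def transition_quality(sentences: list[str]) -> float:
    """Reward transitions up to the optimal ratio; penalize over-transitioning."""
    if not sentences:
        return 0.0
    with_marker = sum(
        1 for s in sentences
        if any(w in TRANSITIONS for w in s.lower().split())
    )
    score = with_marker / len(sentences)
    if score <= OPTIMAL_RATIO:
        return score / OPTIMAL_RATIO
    return 1.0 - (score - OPTIMAL_RATIO) / (1.0 - OPTIMAL_RATIO)
```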

Process Flow
4. Entity Relationship Tracking with Knowledge Graphs
Technical Implementation: Dynamic knowledge graph construction for entity relationship modelling and consistency validation.
Methodology: The system constructs a comprehensive knowledge graph representing characters, objects, locations, and their interdependencies.
Graph Construction Algorithm:
- Entity Extraction: Named entity recognition using transformer models
- Relationship Identification: Dependency parsing for relationship extraction
- Graph Construction: Graph instantiation with weighted edges
- Relationship Validation: Consistency checking using graph traversal algorithms
Relationship Types:
- Character Relationships: Social connections, familial bonds, hierarchical structures
- Spatial Relationships: Geographic locations, spatial proximity, containment
- Temporal Relationships: Event sequences, causality chains, temporal dependencies
- Thematic Relationships: Conceptual connections, symbolic associations
Consistency Validation:
- Contradiction Detection: Graph-based inconsistency identification
- Relationship Strength: Weighted edge analysis for relationship confidence
- Temporal Consistency: Timeline validation using directed acyclic graphs
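A minimal sketch of how such a graph and one contradiction check might look, assuming networkx; the entities and the `located_in` relation are invented for illustration:

```python
# Knowledge-graph construction and a simple spatial-contradiction check.
import networkx as nx

g = nx.DiGraph()

def add_relation(subj: str, rel: str, obj: str, weight: float = 1.0) -> None:
    """Insert a weighted, typed edge such as ('Mira', 'located_in', 'Harbour')."""
    g.add_edge(subj, obj, relation=rel, weight=weight)

def contradicts_location(character: str) -> bool:
    """A character asserted to be 'located_in' two places at once is a contradiction."""
    places = [o for _, o, d in g.out_edges(character, data=True)
              if d.get("relation") == "located_in"]
    return len(set(places)) > 1

add_relation("Mira", "located_in", "The Harbour")
add_relation("Mira", "located_in", "The Observatory")
print(contradicts_location("Mira"))  # True: flag this branch for review
```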
5. Temporal Consistency Validation with Timeline Graphs
Technical Implementation: Temporal event extraction and timeline graph construction for chronological consistency validation.
Methodology: The system applies temporal information extraction techniques to construct comprehensive timeline graphs and validate narrative chronology.
Temporal Analysis Components:
- Temporal Expression Recognition: Extraction of time references, dates, and temporal markers
- Event Sequence Modeling: Construction of temporal event chains
- Timeline Graph Construction: Directed graph representation of temporal relationships
- Consistency Validation: Contradiction detection in temporal sequences
Validation Algorithms:
Consistency Scoring: Quantitative assessment of temporal coherence
Temporal Ordering: Topological sorting for event sequence validation
Contradiction Detection: Graph cycle detection for temporal impossibilities
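A sketch of the DAG validation step, assuming networkx; the event names are invented:

```python
# Timeline validation: events are nodes, 'happens before' edges point
# forward in time, and any cycle is a temporal impossibility.
import networkx as nx

timeline = nx.DiGraph()
timeline.add_edges_from([
    ("letter_written", "letter_read"),
    ("letter_read", "reply_sent"),
    # A generated branch that asserts the reply preceded the letter:
    ("reply_sent", "letter_written"),
])

if not nx.is_directed_acyclic_graph(timeline):
    print("Temporal contradiction:", list(nx.simple_cycles(timeline)))
else:
    # Topological sort gives a valid chronological ordering of events.
    print("Valid order:", list(nx.topological_sort(timeline)))
```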

6. Impossible Transition Detection
Technical Implementation: Multi-modal analysis system for identifying logically impossible narrative transitions in world state, character development, and plot progression.
Methodology: The system maintains comprehensive world state representations and employs rule-based and statistical methods for impossibility detection.
World State Components:
- Character States: Physical condition, location, emotional state, knowledge
- Environmental States: Weather, time of day, location conditions
- Object States: Availability, condition, ownership, location
- Plot States: Revealed information, completed actions, story progression
Impossibility Detection Categories:
- Physical Impossibilities: Spatial-temporal contradictions, impossible character actions
- Logical Impossibilities: Knowledge contradictions, causal violations
- Narrative Impossibilities: Character development inconsistencies, plot holes
Detection Algorithm:
State Extraction: Parse narrative for world state information
State Comparison: Compare current state with previous states
Rule Application: Apply logical consistency rules
Impossibility Scoring: Quantitative assessment of transition validity
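A rule-based sketch of one physical and one logical check; the state fields and rules here are illustrative, not the full rule set:

```python
# Impossible-transition detection over plain-dict world states.
def check_transition(prev: dict, curr: dict) -> list[str]:
    violations = []
    # Physical: a character cannot change location without a travel action.
    if (prev.get("location") != curr.get("location")
            and not curr.get("travel_action")):
        violations.append("spatial jump without travel")
    # Logical: a character cannot use knowledge never revealed to them.
    unknown = set(curr.get("knowledge", [])) - set(prev.get("knowledge", []))
    if unknown and not curr.get("revelation_event"):
        violations.append(f"unexplained knowledge: {unknown}")
    return violations

prev_state = {"location": "ship", "knowledge": ["map"]}
curr_state = {"location": "island", "knowledge": ["map", "treasure_site"]}
print(check_transition(prev_state, curr_state))
```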
7. Attention-Based Context Selection for Optimal Relevance
Technical Implementation: Advanced attention mechanism implementation for intelligent context selection using transformer-based relevance scoring.
Methodology: The system uses attention mechanisms to dynamically select the most relevant context for narrative generation, optimizing for both coherence and efficiency.
Context Candidate Generation:
- Story Bible Segmentation: Hierarchical segmentation of narrative context
- Character Profile Extraction: Dynamic character information compilation
- Temporal Context: Relevant preceding and succeeding narrative segments
- Thematic Context: Theme-relevant story elements
Attention Mechanism:
Thematic Alignment: Content-theme similarity scoring
Relevance Scoring: Transformer-based attention weights for context importance
Dynamic Selection: Adaptive context window sizing based on narrative complexity
Temporal Weighting: Recency-based importance scaling
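A sketch of the selection step as softmax-weighted relevance over candidate segments, again assuming a generic sentence transformer as the scorer; the model name and top-k default are assumptions:

```python
# Attention-style context selection: score candidates against the query,
# softmax the scores, and keep the highest-weight segments.
import torch
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model choice

def select_context(query: str, candidates: list[str], k: int = 5) -> list[str]:
    q = model.encode(query, convert_to_tensor=True)
    c = model.encode(candidates, convert_to_tensor=True)
    # Attention weights: softmax over cosine relevance scores.
    weights = torch.softmax(util.cos_sim(q, c).squeeze(0), dim=0)
    top = torch.topk(weights, k=min(k, len(candidates)))
    return [candidates[i] for i in top.indices.tolist()]
```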
8. Dynamic Character Voice Modelling for Authentic Dialogue
Technical Implementation: Comprehensive character voice modeling using linguistic pattern analysis and sentiment profiling for authentic dialogue generation.
Methodology: The Modeller analyses character-specific linguistic patterns to create detailed voice profiles for maintaining dialogue authenticity.
Voice Profile Components:
- Vocabulary Preferences: Character-specific word choice patterns
- Syntactic Patterns: Sentence structure preferences and complexity
- Emotional Markers: Sentiment patterns and emotional expression styles
- Relationship Dynamics: Communication patterns based on character relationships
Analysis Techniques:
- Vocabulary Analysis: Frequency-based word preference modeling
- Syntactic Analysis: Sentence structure pattern recognition
- Emotional Profiling: Sentiment analysis and emotional range characterization
- Relationship Modeling: Context-dependent communication style analysis
Voice Authenticity Scoring:
Emotional Consistency: Sentiment alignment with character emotional profile
Vocabulary Consistency: Lexical choice alignment with character profile
Syntactic Consistency: Sentence structure pattern matching
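A toy sketch of a voice profile tracking only vocabulary preferences (the full profile also covers syntax, sentiment, and relationships); the character and threshold-free scoring are illustrative:

```python
# A minimal character voice profile with a vocabulary-consistency check.
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class VoiceProfile:
    vocabulary: Counter = field(default_factory=Counter)

    def update(self, dialogue: str) -> None:
        """Accumulate word-choice counts from a character's dialogue."""
        self.vocabulary.update(dialogue.lower().split())

    def vocabulary_consistency(self, new_line: str) -> float:
        """Share of new-line tokens already seen in this character's speech."""
        tokens = new_line.lower().split()
        if not tokens:
            return 1.0
        return sum(1 for t in tokens if t in self.vocabulary) / len(tokens)

captain = VoiceProfile()
captain.update("Aye, the tide waits for no one, lad.")
print(captain.vocabulary_consistency("The tide turns, lad."))
```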
9. Causal Chain Analysis for Consequence Tracking
Technical Implementation: Sophisticated causal relationship modeling using directed acyclic graphs for narrative consequence tracking and logical flow validation.
Methodology: The Analyser constructs comprehensive causal chains to track narrative consequences and ensure logical story progression.
Causal Event Modeling:
- Event Extraction: Identification of significant narrative events
- Causal Relationship Identification: Cause-effect relationship mapping
- Consequence Prediction: Forward-chaining causal inference
- Chain Validation: Logical consistency checking
- Theme Relevance: Causal event thematic alignment
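A sketch of consequence tracking over a causal DAG, assuming networkx; forward chaining here is simply graph reachability, and the event names are invented:

```python
# Causal-chain tracking: events as nodes, cause -> effect edges.
import networkx as nx

causes = nx.DiGraph()
causes.add_edges_from([
    ("storm_hits", "ship_wrecked"),
    ("ship_wrecked", "crew_stranded"),
    ("crew_stranded", "signal_fire_lit"),
])

def downstream_consequences(event: str) -> list[str]:
    """Every event that causally depends on `event` (forward chaining)."""
    return list(nx.descendants(causes, event))

# If a branch removes the storm, all of these scenes need regeneration:
print(downstream_consequences("storm_hits"))
```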
10. Coreference Resolution for Entity Tracking
Technical Implementation: Advanced coreference resolution using neural network models and heuristic approaches for comprehensive entity tracking across narrative segments.
Methodology: The Resolver employs both neural coreference models and rule-based approaches to maintain entity consistency throughout the narrative.
Coreference Resolution Pipeline:
- Mention Detection: Identification of entity mentions and pronouns
- Coreference Clustering: Grouping coreferent mentions
- Neural Resolution: Deep learning models for coreference decisions
- Heuristic Fallback: Rule-based resolution for edge cases
Neural Coreference Models:
- Mention Pair Scoring: Neural networks for coreference likelihood
- Cluster Ranking: Antecedent selection using attention mechanisms
- Gender and Number Agreement: Linguistic constraint validation
Heuristic Rules:
Number Agreement: Singular/plural consistency checking
Proximity Constraints: Distance-based coreference likelihood
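A sketch of the heuristic fallback only (the neural path is model-dependent); it combines number agreement with a nearest-antecedent proximity rule:

```python
# Heuristic coreference fallback: nearest preceding mention that agrees in number.
SINGULAR = {"he", "she", "it"}
PLURAL = {"they"}

def resolve(pronoun: str, prior_mentions: list[tuple[str, bool]]) -> str | None:
    """prior_mentions: (name, is_plural) pairs, oldest first."""
    want_plural = pronoun.lower() in PLURAL
    # Proximity constraint: scan from the most recent mention backwards.
    for name, is_plural in reversed(prior_mentions):
        if is_plural == want_plural:  # number agreement
            return name
    return None

mentions = [("Mira", False), ("the twins", True)]
print(resolve("she", mentions))   # Mira
print(resolve("they", mentions))  # the twins
```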
11. Sentiment Analysis for Emotional Consistency
Technical Implementation: Multi-layered sentiment analysis using transformer-based models and lexicon-based approaches for emotional consistency validation.
Methodology: The Analyser employs VADER sentiment analysis and custom emotional profiling to ensure narrative emotional consistency.
Sentiment Analysis Components:
- Valence Analysis: Positive/negative sentiment quantification
- Arousal Assessment: Emotional intensity measurement
- Emotion Classification: Discrete emotion category identification
- Temporal Trajectory: Emotional progression tracking
Emotional Inconsistency Detection:
Sudden Shift Detection: Abrupt emotional transition identification
Trajectory Analysis: Emotional arc consistency validation
Character Consistency: Individual character emotional pattern validation
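A sketch of sudden-shift detection using VADER's compound score; the jump threshold of 1.0 (on the score's -1 to 1 range) is an illustrative choice:

```python
# Sudden-shift detection over segment-level VADER compound scores.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

def sudden_shifts(segments: list[str], threshold: float = 1.0) -> list[int]:
    """Indices where the emotional valence jumps implausibly between segments."""
    scores = [analyzer.polarity_scores(s)["compound"] for s in segments]
    return [i for i in range(1, len(scores))
            if abs(scores[i] - scores[i - 1]) > threshold]

chapter = ["She laughed, giddy with relief.",
           "Everything was ruined, and she wept alone."]
print(sudden_shifts(chapter))  # [1] if the swing exceeds the threshold
```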

12. Named Entity Recognition for World Consistency
Technical Implementation: Named entity recognition using transformer-based models for comprehensive world consistency validation.
Methodology: The module employs transformer models enhanced with custom entity relationship tracking for world consistency maintenance.
Entity Recognition Pipeline:
- Entity Extraction: Multi-class named entity identification
- Attribute Extraction: Entity-specific attribute identification
- Relationship Mapping: Inter-entity relationship identification
- Consistency Validation: World state consistency checking
Entity Categories:
- Characters: People, fictional beings, character names
- Locations: Geographic locations, fictional places, spatial references
- Organizations: Groups, institutions, fictional organizations
- Temporal Entities: Time references, historical periods, events
Consistency Validation:
World State Coherence: Global consistency checking
Attribute Contradiction Detection: Inconsistent entity descriptions
Relationship Validation: Impossible entity relationships