D&D RAG System - Implementation Progress

Project Start Date: November 6, 2024
Status: ✅ Production Ready
Last Updated: November 6, 2024


📊 Overall Progress

| Phase | Status | Progress | Notes |
|---|---|---|---|
| Phase 1: Core Infrastructure | ✅ Complete | 5/5 | All core systems operational |
| Phase 2: Data Processors | ✅ Complete | 4/4 | All parsers with name weighting |
| Phase 3: Initialization | ✅ Complete | 2/2 | Full system initialization working |
| Phase 4: Query Interface | ✅ Complete | 1/1 | Interactive CLI tool added |
| Phase 5: GM Dialogue | ✅ Complete | 2/2 | RAG-enhanced AI GM working |
| Phase 6: Character Creation | ✅ Complete | 2/2 | Full character creator with RAG |
| Phase 7: Testing & Validation | ✅ Complete | 3/3 | 26+ comprehensive tests passing |
| Phase 8: Game Mechanics Engine | 🚧 In Progress | 0/5 | Character-aware gameplay enhancements |

Legend: ✅ Complete | 🚧 In Progress | ⏳ Pending | ❌ Blocked


📁 Phase 1: Core Infrastructure ✅ COMPLETE

✅ 1.1 Project Structure

  • Created dnd_rag_system/ directory
  • Created config/, core/, parsers/, systems/ subdirectories
  • Created __init__.py files for all packages

✅ 1.2 Configuration System

File: config/settings.py

  • ChromaDB configuration
  • Ollama model settings
  • Embedding model settings (all-MiniLM-L6-v2)
  • Collection naming conventions
  • Data source paths
  • Chunk size parameters (400 tokens)

✅ 1.3 Base Parser

File: core/base_parser.py

  • BaseParser abstract class
  • PDF extraction utilities (pdfplumber)
  • Text extraction utilities
  • Common validation methods
  • Error handling framework

✅ 1.4 Base Chunker

File: core/base_chunker.py

  • BaseChunker abstract class
  • Token estimation function
  • Chunk splitting with overlap
  • Metadata generation helpers
  • Chunk validation
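
The token estimation and overlap splitting listed above could look roughly like the sketch below. It is not taken from core/base_chunker.py: the function names, the 4-characters-per-token heuristic, and the choice to split on words are all assumptions made for illustration.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Rough estimate assuming ~4 characters per token (a heuristic, not a real tokenizer)."""
    return max(1, len(text) // 4)


def split_with_overlap(words: List[str], chunk_size: int, overlap: int) -> List[List[str]]:
    """Split words into windows of chunk_size that share `overlap` words with the previous one."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # this window already reaches the end of the text
    return chunks


# Hypothetical usage: windows of 4 words, each sharing 1 word with its neighbour.
chunks = split_with_overlap("a b c d e f g h i j".split(), chunk_size=4, overlap=1)
print(chunks)
```

The overlap keeps a sentence that straddles a boundary retrievable from either neighbouring chunk, at the cost of storing some text twice.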

✅ 1.5 ChromaDB Manager

File: core/chroma_manager.py

  • ChromaDBManager class
  • Collection management (create, get, delete)
  • Batch add operations
  • Single/multi-collection search
  • Statistics and reporting
  • Connection pooling

📚 Phase 2: Data Processors ✅ COMPLETE

✅ 2.1 Spell Parser ⭐ ENHANCED

File: parsers/spell_parser.py

  • Parse spells.txt (detailed descriptions)
  • Parse all_spells.txt (class/level associations)
  • Merge spell data
  • Name weighting - spell names appear 2-3× in chunks
  • Create spell chunks (full_spell, quick_reference, by_class)
  • Generate spell metadata (level, school, components, classes)
  • OCR error handling
  • ~86 spells → 250+ chunks

✅ 2.2 Monster Parser ⭐ ENHANCED

File: initialize_rag.py (inline loader)

  • Load from extracted_monsters.txt
  • Name weighting - monster names appear 2-3× in chunks
  • Monster stat block parsing
  • Combat stats extraction (CR, AC, HP)
  • Monster type extraction (e.g., "Large dragon", "Medium humanoid")
  • Generate monster metadata
  • Type tags for filtering (dragon, undead, beast, etc.)
  • ~332 monsters loaded
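
As an illustration of the monster type extraction, the sketch below pulls the size and creature type out of a stat-block line such as "Large dragon". The regular expression and the function name `parse_monster_type` are hypothetical and not copied from initialize_rag.py. The size list is the standard D&D 5e set.

```python
import re
from typing import Dict, Optional

SIZES = ("Tiny", "Small", "Medium", "Large", "Huge", "Gargantuan")
# Matches a size word followed by a creature type at the start of a line.
TYPE_LINE = re.compile(
    r"^(?P<size>" + "|".join(SIZES) + r")\s+(?P<type>[a-z]+)", re.IGNORECASE
)


def parse_monster_type(line: str) -> Optional[Dict[str, str]]:
    """Return {'size': ..., 'type': ...} for a stat-block type line, or None if it does not match."""
    match = TYPE_LINE.match(line.strip())
    if match is None:
        return None
    return {
        "size": match.group("size").capitalize(),
        "type": match.group("type").lower(),  # usable as a filter tag, e.g. "dragon"
    }


print(parse_monster_type("Large dragon, chaotic evil"))
```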

✅ 2.3 Class Parser ⭐ ENHANCED

File: initialize_rag.py (inline loader)

  • Load from extracted_classes.txt
  • Name weighting - class names appear 2-3× in chunks
  • Class feature extraction
  • Generate class metadata
  • ~12 classes loaded (all core D&D classes)

✅ 2.4 Race Parser ⭐ NEW!

File: initialize_rag.py (inline loader with PDF extraction)

  • PDF extraction from Player's Handbook (pages 18-46)
  • Race traits extraction
  • Ability score bonuses
  • Name weighting - race names appear 2-3× in chunks
  • Create race chunks (description, traits)
  • Generate race metadata (ability_increases, size, speed, darkvision, languages)
  • ~9 core races → 18 chunks

🚀 Phase 3: Initialization System ✅ COMPLETE

✅ 3.1 Master Init Script

File: initialize_rag.py

  • Command-line argument parsing
  • ChromaDB initialization
  • Collection creation/verification
  • Selective data loading (--only flag)
  • Clear existing data (--clear flag)
  • Progress reporting
  • Error handling and recovery
  • Summary statistics report
  • All 4 collections: spells, monsters, classes, races
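
A minimal sketch of how the `--only` and `--clear` flags might be declared with argparse. Only the flag names and the four collection names come from this document. Whether `--only` accepts several values, the help strings, and `build_parser` itself are assumptions.

```python
import argparse

COLLECTIONS = ["spells", "monsters", "classes", "races"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Initialize the D&D RAG collections")
    # Assumption: --only accepts one or more collection names and defaults to all four.
    parser.add_argument("--only", nargs="+", choices=COLLECTIONS, default=COLLECTIONS,
                        help="load only the listed collections")
    parser.add_argument("--clear", action="store_true",
                        help="delete existing data before loading")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--only", "spells", "races", "--clear"])
print(args.only, args.clear)
```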

✅ 3.2 Data Validation

  • Verify all source files present
  • Test full initialization
  • Benchmark loading times (~30s first run, ~5s subsequent)
  • 600+ total chunks loaded

🔍 Phase 4: Query Interface ✅ COMPLETE

✅ 4.1 Interactive Query Tool ⭐ NEW!

File: query_rag.py

  • Interactive CLI mode
  • Single-query mode
  • Collection-specific search (--spell, --monster, --class, --race)
  • Search all collections
  • Result formatting with metadata
  • Relevance scores
  • Commands: /spell, /monster, /class, /race, /stats, /help, /quit
  • Cleanly formatted output

Usage:

python query_rag.py                    # Interactive mode
python query_rag.py "fireball"         # Quick search
python query_rag.py --monster "dragon" # Search monsters

🎮 Phase 5: GM Dialogue System ✅ COMPLETE

✅ 5.1 RAG-Enhanced GM

File: systems/gm_dialogue.py

  • RAG-powered rule lookups in real-time
  • GM searches ChromaDB for spells, monsters, classes
  • Ollama integration
  • Context window management
  • Session state management

✅ 5.2 Dialogue Manager

File: run_gm_dialogue.py

  • Interactive game session
  • Commands: /help, /context, /history, /rag, /save, /quit
  • Turn tracking
  • Scene state persistence

👤 Phase 6: Character Creation ✅ COMPLETE

✅ 6.1 Character Creator

File: systems/character_creator.py

  • Interactive CLI interface
  • Race selection with RAG lookup
  • Class selection with RAG lookup
  • Ability score generation (standard array, roll, point buy)
  • Background selection
  • Equipment selection
  • Spell selection (for casters)
  • Character validation
  • JSON export
  • Character sheet display
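
The three ability score methods can be sketched as follows. The standard array and the 27-point buy table are the D&D 5e Player's Handbook values. Using "4d6, drop the lowest" for the roll method is an assumption about what "roll" means here, and the function names are illustrative rather than taken from systems/character_creator.py.

```python
import random
from typing import List, Optional

STANDARD_ARRAY = [15, 14, 13, 12, 10, 8]
POINT_BUY_COST = {8: 0, 9: 1, 10: 2, 11: 3, 12: 4, 13: 5, 14: 7, 15: 9}
POINT_BUY_BUDGET = 27


def roll_ability_scores(rng: Optional[random.Random] = None) -> List[int]:
    """Roll six scores, each the sum of the best three of 4d6, sorted high to low."""
    rng = rng or random.Random()
    scores = []
    for _ in range(6):
        dice = sorted(rng.randint(1, 6) for _ in range(4))
        scores.append(sum(dice[1:]))  # drop the lowest die
    return sorted(scores, reverse=True)


def point_buy_cost(scores: List[int]) -> int:
    """Total point cost of a score list; scores must be in the 8-15 range."""
    try:
        return sum(POINT_BUY_COST[score] for score in scores)
    except KeyError as exc:
        raise ValueError(f"score {exc.args[0]} is outside the 8-15 point-buy range") from exc


def ability_modifier(score: int) -> int:
    """Standard modifier: (score - 10) / 2, rounded down."""
    return (score - 10) // 2


print(point_buy_cost(STANDARD_ARRAY) <= POINT_BUY_BUDGET)
```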

✅ 6.2 Character Management

File: create_character.py

  • Save/load character files
  • Character sheet viewer
  • Integration with RAG system

🧪 Phase 7: Testing & Validation ✅ COMPLETE ⭐ NEW!

✅ 7.1 Comprehensive Test Suite

File: test_all_collections.py

  • 26+ automated tests
  • Name weighting validation - exact names rank first
  • Semantic search tests - related concepts found
  • Metadata extraction tests - CR, level, abilities validated
  • All 4 collections tested - spells, monsters, classes, races
  • Cross-collection search - multi-type queries
  • Pass/fail reporting with statistics
  • Detailed error messages

✅ 7.2 Manual Test Scripts

File: test_spell_search.py

  • Detailed search results for all collections
  • Distance/relevance scores
  • Metadata display
  • Preview of results

✅ 7.3 Integration Tests

  • Full initialization test
  • End-to-end query test
  • GM dialogue integration
  • Character creation flow

🎮 Phase 8: Game Mechanics Engine 🚧 IN PROGRESS

Goal: Transform AI from rule-maker to narrator by implementing programmatic game mechanics

✅ 8.0 Character-Aware Dialogue System ⭐ NEW!

File: play_with_character.py

  • Load or create characters for gameplay
  • Character context passed to GM (stats, equipment, spells)
  • Three character modes: Create new, Load JSON, Quick test
  • Commands: /character, /stats, /context for character info
  • Fixed tokenizer warning suppression
  • Dynamic character support (not hardcoded to one character)
  • Proper first/second person context ("The player is X" → AI uses "you")
  • Integration testing completed (Dec 1, 2024)

🚧 8.1 Spell System Enhancement

File: play_with_character.py (to be refactored)

  • Programmatic spell validation (check if player owns spell)
  • Spell slot tracking by level (1st: 3 slots, 2nd: 2 slots, etc.)
  • Auto-decrement slots when casting
  • Rest mechanic to restore slots
  • Spell lookup from RAG before AI narration
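
One possible shape for the planned spell validation and slot tracking is sketched below. This work is still in progress, so nothing here reflects existing code: the `SpellBook` class and its method names are hypothetical, and cantrips (which need no slot) are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class SpellBook:
    known_spells: Set[str]
    max_slots: Dict[int, int]                  # spell level -> slots after a rest
    slots: Dict[int, int] = field(init=False)  # spell level -> slots remaining

    def __post_init__(self) -> None:
        # Store names lowercased so "magic missile" and "Magic Missile" match.
        self.known_spells = {name.lower() for name in self.known_spells}
        self.slots = dict(self.max_slots)

    def can_cast(self, spell: str, level: int) -> bool:
        return spell.lower() in self.known_spells and self.slots.get(level, 0) > 0

    def cast(self, spell: str, level: int) -> None:
        """Validate ownership and slot availability, then consume one slot."""
        if spell.lower() not in self.known_spells:
            raise ValueError(f"{spell} is not a known spell")
        if self.slots.get(level, 0) <= 0:
            raise ValueError(f"no level {level} spell slots remaining")
        self.slots[level] -= 1

    def rest(self) -> None:
        """Restore every slot to its maximum."""
        self.slots = dict(self.max_slots)


book = SpellBook({"Magic Missile"}, {1: 3, 2: 2})
book.cast("magic missile", 1)
print(book.slots)
```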

🚧 8.2 Combat Mechanics

Status: ⏳ Pending

  • HP tracking and damage application
  • Attack roll automation (d20 + modifiers)
  • Damage roll automation (weapon dice + STR/DEX)
  • AC checks (hit/miss determination)
  • Death saves and unconsciousness
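
The attack, AC check, and damage items above could be automated along these lines. This is a sketch of pending work, not existing code: `Combatant` and `attack` are hypothetical names. The natural 20 / natural 1 handling and doubled damage dice on a critical hit follow the standard 5e rules. Death saves are not covered.

```python
import random
from dataclasses import dataclass


@dataclass
class Combatant:
    name: str
    armor_class: int
    hp: int


def attack(target: Combatant, attack_bonus: int, dice: int, sides: int,
           damage_bonus: int, rng: random.Random) -> dict:
    """Resolve one attack: d20 + bonus against AC, then roll and apply damage on a hit."""
    roll = rng.randint(1, 20)
    critical = roll == 20
    # A natural 20 always hits and a natural 1 always misses.
    hit = critical or (roll != 1 and roll + attack_bonus >= target.armor_class)
    damage = 0
    if hit:
        rolled_dice = dice * 2 if critical else dice  # a critical hit doubles the damage dice
        damage = max(0, sum(rng.randint(1, sides) for _ in range(rolled_dice)) + damage_bonus)
        target.hp = max(0, target.hp - damage)  # HP never drops below zero
    return {"roll": roll, "hit": hit, "critical": critical, "damage": damage}


goblin = Combatant("goblin", armor_class=15, hp=7)
result = attack(goblin, attack_bonus=5, dice=1, sides=8, damage_bonus=3, rng=random.Random())
print(result, goblin.hp)
```

Passing the random number generator in as a parameter keeps the function testable with a seeded or scripted source of rolls.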

🚧 8.3 Turn & Initiative System

Status: ⏳ Pending

  • Initiative roller (d20 + DEX modifier)
  • Turn order tracking
  • Action economy (action, bonus action, movement, reaction)
  • Combat state management
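
A rough sketch of the initiative roller and turn order tracking, assuming a simple name-to-DEX-modifier mapping as input. `roll_initiative` and `TurnTracker` are hypothetical names, and breaking ties by the higher DEX modifier is a common table convention assumed here, not something this plan specifies. Action economy is not modelled.

```python
import random
from typing import Dict, List, Optional


def roll_initiative(dex_modifiers: Dict[str, int],
                    rng: Optional[random.Random] = None) -> List[str]:
    """Return combatant names ordered by d20 + DEX modifier, highest first."""
    rng = rng or random.Random()
    totals = {name: rng.randint(1, 20) + mod for name, mod in dex_modifiers.items()}
    # Assumption: ties go to the combatant with the higher DEX modifier.
    return sorted(totals, key=lambda name: (totals[name], dex_modifiers[name]), reverse=True)


class TurnTracker:
    """Cycle through the initiative order and count rounds."""

    def __init__(self, order: List[str]) -> None:
        if not order:
            raise ValueError("initiative order is empty")
        self.order = order
        self.index = 0
        self.round = 1

    @property
    def current(self) -> str:
        return self.order[self.index]

    def next_turn(self) -> str:
        """Advance to the next combatant, starting a new round after the last one."""
        self.index += 1
        if self.index == len(self.order):
            self.index = 0
            self.round += 1
        return self.current


tracker = TurnTracker(roll_initiative({"Elara": 2, "goblin": 1}))
print(tracker.current, tracker.round)
```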

🚧 8.4 Inventory System

Status: ⏳ Pending

  • Add/remove items programmatically
  • Equipment weight tracking
  • Item usage validation
  • Gold/currency management
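
The inventory items above might be covered by a small class like the one below. This is a sketch of pending work under assumed names (`Inventory`, `add`, `remove`, `spend`). Gold is tracked as a single integer, which ignores other coin types.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Inventory:
    gold: int = 0
    quantities: Dict[str, int] = field(default_factory=dict)      # item name -> count
    unit_weights: Dict[str, float] = field(default_factory=dict)  # item name -> weight of one item

    def add(self, name: str, quantity: int = 1, unit_weight: float = 0.0) -> None:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self.quantities[name] = self.quantities.get(name, 0) + quantity
        self.unit_weights.setdefault(name, unit_weight)

    def remove(self, name: str, quantity: int = 1) -> None:
        """Remove or use items, refusing to take more than the character holds."""
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        held = self.quantities.get(name, 0)
        if quantity > held:
            raise ValueError(f"cannot remove {quantity} x {name}: only {held} held")
        if quantity == held:
            del self.quantities[name]
        else:
            self.quantities[name] = held - quantity

    def spend(self, amount: int) -> None:
        if amount < 0 or amount > self.gold:
            raise ValueError(f"cannot spend {amount} gold with {self.gold} available")
        self.gold -= amount

    def total_weight(self) -> float:
        return sum(count * self.unit_weights[name] for name, count in self.quantities.items())


bag = Inventory(gold=10)
bag.add("Healing Potion", 2, unit_weight=0.5)
bag.add("Rope", 1, unit_weight=10.0)
print(bag.total_weight())
```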

🚧 8.5 Integration & Testing

Status: ⏳ Pending

  • Test spell casting flow end-to-end
  • Test combat scenarios
  • Test inventory management
  • Update documentation

🏗️ Phase 8 Architecture Notes

Option B: Hybrid AI + Rules Engine (SELECTED)

Problem: AI is unreliable at following D&D rules consistently

  • Ignores spells player casts
  • Allows spells player doesn't know
  • Doesn't track resources (HP, spell slots)
  • Makes up mechanics on the fly

Solution: Intercept player actions BEFORE AI sees them

Flow:

  1. Player Input: "I cast Magic Missile at the goblin"
  2. Rules Engine (Python code):
    • Parse: Detect spell casting intent
    • Validate: Check if player owns "Magic Missile" ✓
    • Validate: Check if player has 1st-level spell slot ✓
    • Retrieve: Get spell details from RAG (3 darts, 1d4+1 each)
    • Roll: 3d4+3 = 11 damage (programmatically)
    • Deduct: Spell slot consumed
    • Update: Target HP reduced by 11
  3. AI Prompt: "You successfully cast Magic Missile dealing 11 force damage to the goblin. The goblin now has 5 HP remaining. Describe the magical missiles striking the goblin."
  4. AI Response: (Just narrates the flavor, mechanics already handled)
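
The flow above can be sketched end to end as follows. This is an illustrative outline under stated assumptions, not the project's rules engine: the regular expression, `resolve_cast`, and the `SPELL_DATA` dictionary (a stand-in for the RAG lookup) are all hypothetical.

```python
import random
import re
from typing import Dict, Optional, Set

CAST_PATTERN = re.compile(
    r"\bcast\s+(?P<spell>.+?)\s+(?:at|on)\s+(?:the\s+)?(?P<target>\w+)", re.IGNORECASE
)

# Stand-in for the RAG lookup: the real system would retrieve these details from ChromaDB.
SPELL_DATA = {"magic missile": {"level": 1, "dice": 3, "sides": 4, "bonus": 3, "type": "force"}}


def resolve_cast(text: str, known: Set[str], slots: Dict[int, int],
                 hp: Dict[str, int], rng: random.Random) -> Optional[str]:
    """Apply spell mechanics in code and return the prompt for the AI narrator."""
    match = CAST_PATTERN.search(text)
    if match is None:
        return None  # not a spell cast, so the input can go to the AI unchanged
    name, target = match.group("spell").strip().lower(), match.group("target").lower()
    spell = SPELL_DATA.get(name)
    if spell is None or name not in known:
        return f"The player tries to cast {name.title()} but does not know it. Narrate the failure."
    if slots.get(spell["level"], 0) <= 0:
        return f"The player has no level {spell['level']} spell slots left. Narrate the fizzle."
    if target not in hp:
        return f"There is no {target} to target. Ask the player to clarify."
    damage = sum(rng.randint(1, spell["sides"]) for _ in range(spell["dice"])) + spell["bonus"]
    slots[spell["level"]] -= 1                # deduct the spell slot
    hp[target] = max(0, hp[target] - damage)  # update the target's HP
    return (f"You successfully cast {name.title()} dealing {damage} {spell['type']} damage "
            f"to the {target}. The {target} now has {hp[target]} HP remaining. "
            f"Describe the spell striking the {target}.")


prompt = resolve_cast("I cast Magic Missile at the goblin", {"magic missile"},
                      slots={1: 3}, hp={"goblin": 16}, rng=random.Random())
print(prompt)
```

Every number the AI sees in the prompt has already been computed in code, so the model only has to narrate the outcome.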

Benefits:

  • AI becomes a narrator, not a rules engine
  • Mechanics are deterministic and accurate
  • AI can focus on storytelling
  • Players can trust the rules

Alternatives Rejected:

  • Option A (Pure AI): Too unreliable, tested and failed
  • Option C (Post-process AI): Too hard to fix bad outputs

📦 Supporting Files ✅ COMPLETE

✅ Dependencies

File: requirements.txt

  • chromadb
  • sentence-transformers
  • pdfplumber
  • ollama (Python client)
  • All dependencies working

✅ Documentation

  • README.md with full installation instructions
  • Quick start guide
  • Usage examples for all tools
  • Troubleshooting section
  • plan_progress.md (this file)

🎯 Success Metrics

| Metric | Target | Current | Status |
|---|---|---|---|
| Init Time (full) | < 5 min | ~30s | ✅ Exceeded |
| Query Latency | < 500ms | ~100-200ms | ✅ Exceeded |
| Name Weighting | Exact match ranks #1 | ✅ Working | ✅ Complete |
| Total Chunks | ~600 | 612+ | ✅ Complete |
| Test Coverage | > 80% | 26+ tests | ✅ Complete |
| Collections | 4 collections | 4 active | ✅ Complete |

🎨 Key Features Implemented

✅ Name-Weighted Retrieval

  • Spells: Name appears 3× (SPELL: name, name, name)
  • Monsters: Name appears 3× (MONSTER: name, name, name) + type extraction
  • Classes: Name appears 3× (CLASS: name, name, name)
  • Races: Name appears 3× (RACE: name, name, name) + trait extraction
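
A minimal sketch of the name-weighting idea: prefix each chunk with a header that repeats the entity name so that exact-name queries embed close to it. The header layout follows the "SPELL: name, name, name" pattern listed above. The function name and the newline separator are assumptions.

```python
def build_weighted_chunk(entity_type: str, name: str, body: str, repeats: int = 3) -> str:
    """Prefix a chunk with '<TYPE>: name, name, name' to bias retrieval toward exact names."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    header = f"{entity_type.upper()}: " + ", ".join([name] * repeats)
    return f"{header}\n{body.strip()}"


chunk = build_weighted_chunk("spell", "Fireball", "A bright streak flashes from your finger...")
print(chunk)
```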

✅ Multiple Chunk Types Per Entity

  • Spells: full_spell, quick_reference, by_class
  • Monsters: monster_stats with type tags
  • Classes: class_features
  • Races: race_description, race_traits

✅ Rich Metadata

  • Spells: level, school, casting_time, range, components, duration, classes, ritual, concentration
  • Monsters: challenge_rating, monster_type (size + type), type tags
  • Classes: name, content_type
  • Races: ability_increases, size, speed, darkvision, languages

📝 Notes & Decisions

Design Decisions

  • Database: ChromaDB for persistence and semantic search
  • Embeddings: sentence-transformers/all-MiniLM-L6-v2 for speed/quality balance
  • LLM: Ollama with Qwen3-4B-RPG-Roleplay-V2 for D&D-tuned responses
  • Collection Strategy: Separate collections per content type for clean organization
  • Name Weighting: Entity names repeated 2-3× at chunk start for better exact-match retrieval
  • Multiple Chunks: Each entity creates multiple specialized chunks for different use cases

Key Improvements (Nov 6, 2024)

  1. ✅ Spell Parser Upgrade - Now uses sophisticated SpellParser class instead of inline code
  2. ✅ Name Weighting - All entity types now have weighted names for better retrieval
  3. ✅ Race Extraction - Full race data extracted from PDF with traits and metadata
  4. ✅ Monster Type Extraction - Automatic extraction of size and creature type
  5. ✅ Interactive Query Tool - New CLI for exploring the RAG system
  6. ✅ Comprehensive Tests - 26+ automated tests validating all functionality

Known Issues (Phase 8 Discovery - Dec 1, 2024)

  • AI Unreliability: Pure AI approach fails to consistently enforce D&D rules
    • Ignores valid spell casts (Magic Missile cast was turned into melee combat)
    • Allows invalid spells (Let Elara cast Fireball, which she doesn't know)
    • No resource tracking (spell slots, HP, gold)
  • Solution: Moving to Hybrid Architecture (Option B) with programmatic rules engine

Current Work (Phase 8)

  • 🚧 Spell Validation System: Programmatically check spell ownership before AI generation
  • 🚧 Resource Tracking: HP, spell slots, inventory management
  • 🚧 Combat Mechanics: Attack/damage rolls, initiative, turn tracking
  • 🚧 Rules Engine: Intercept player actions, apply mechanics, then AI narrates

Future Enhancements (Post-Phase 8)

  • ⏳ Subrace Support: High Elf, Mountain Dwarf, etc. with specific abilities
  • ⏳ Advanced Filtering: Search by CR range, spell level range, class, type
  • ⏳ Web UI: Web interface for GM dialogue
  • ⏳ Multiplayer Support: Multi-player sessions
  • ⏳ Custom Content Import: User-created monsters/spells
  • ⏳ Voice Interface: Voice commands for GM dialogue
  • ⏳ Map/Battle Grid Integration: Visual battle maps

📅 Timeline

| Date | Milestone |
|---|---|
| 2024-11-06 09:00 | Project started, directory structure created |
| 2024-11-06 12:00 | Phase 1-2 complete (core infrastructure + basic parsers) |
| 2024-11-06 15:00 | Phase 3-6 complete (initialization, query, GM, character creator) |
| 2024-11-06 18:00 | Major upgrades: Name weighting, race extraction, comprehensive tests |
| 2024-11-06 20:00 | Phase 7 complete: All tests passing, documentation updated |
| 2024-11-06 21:00 | V1.0 COMPLETE - Production ready! |
| 2024-12-01 18:00 | Phase 8 started: Character-aware dialogue system created |
| 2024-12-01 19:00 | Testing reveals AI reliability issues with game mechanics |
| 2024-12-01 19:30 | Architecture decision: Option B (Hybrid Rules Engine) selected |

🚀 Production Deployment Checklist

  • All 4 collections operational
  • Name weighting implemented for all entity types
  • Comprehensive test suite (26+ tests passing)
  • Interactive query tool
  • Documentation complete
  • GM dialogue system working
  • Character creator working
  • All dependencies installed
  • Error handling in place
  • Performance targets met

📊 Statistics

Collection Counts

  • Spells: 86 spells → 250+ chunks
  • Monsters: 332 monsters → 332 chunks
  • Classes: 12 classes → 12 chunks
  • Races: 9 races → 18 chunks
  • Total: 612+ chunks in ChromaDB

Test Results

  • Total Tests: 26+
  • Pass Rate: 100%
  • Collections Tested: 4/4
  • Features Validated: Name weighting, semantic search, metadata extraction, cross-collection search

Status: 🚧 V2.0 IN DEVELOPMENT (Phase 8: Game Mechanics Engine)
Current Focus: Implementing hybrid AI + Rules Engine architecture
Next Steps:

  1. Spell validation system
  2. Spell slot tracking
  3. HP and damage mechanics
  4. Combat turn system
  5. Inventory management

Last Updated: December 1, 2024 19:30