D&D RAG System - Implementation Progress
Project Start Date: November 6, 2024
Status: ✅ Production Ready
Last Updated: November 6, 2024
Overall Progress
| Phase | Status | Progress | Notes |
|---|---|---|---|
| Phase 1: Core Infrastructure | ✅ Complete | 5/5 | All core systems operational |
| Phase 2: Data Processors | ✅ Complete | 4/4 | All parsers with name weighting |
| Phase 3: Initialization | ✅ Complete | 2/2 | Full system initialization working |
| Phase 4: Query Interface | ✅ Complete | 1/1 | Interactive CLI tool added |
| Phase 5: GM Dialogue | ✅ Complete | 2/2 | RAG-enhanced AI GM working |
| Phase 6: Character Creation | ✅ Complete | 2/2 | Full character creator with RAG |
| Phase 7: Testing & Validation | ✅ Complete | 3/3 | 26+ comprehensive tests passing |
| Phase 8: Game Mechanics Engine | 🚧 In Progress | 0/5 | Character-aware gameplay enhancements |
Legend: ✅ Complete | 🚧 In Progress | ⏳ Pending | ❌ Blocked
Phase 1: Core Infrastructure ✅ COMPLETE
✅ 1.1 Project Structure
- Created `dnd_rag_system/` directory
- Created `config/`, `core/`, `parsers/`, `systems/` subdirectories
- Created `__init__.py` files for all packages
✅ 1.2 Configuration System
File: config/settings.py
- ChromaDB configuration
- Ollama model settings
- Embedding model settings (all-MiniLM-L6-v2)
- Collection naming conventions
- Data source paths
- Chunk size parameters (400 tokens)
✅ 1.3 Base Parser
File: core/base_parser.py
- `BaseParser` abstract class
- PDF extraction utilities (pdfplumber)
- Text extraction utilities
- Common validation methods
- Error handling framework
✅ 1.4 Base Chunker
File: core/base_chunker.py
- `BaseChunker` abstract class
- Token estimation function
- Chunk splitting with overlap
- Metadata generation helpers
- Chunk validation
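The token estimation and overlap splitting listed above could look roughly like the sketch below. It is not the project's actual `BaseChunker`: the function names `estimate_tokens` and `split_with_overlap`, the four-characters-per-token heuristic, the one-word-per-token simplification, and the 50-token overlap are all assumptions. Only the 400-token chunk size comes from the configuration section.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (assumed heuristic)."""
    return (len(text) + 3) // 4


def split_with_overlap(text: str, chunk_size: int = 400, overlap: int = 50) -> List[str]:
    """Split text into word windows; consecutive windows share `overlap` words.

    Simplification (assumption): one whitespace-separated word counts as one token.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once this window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of storing some text twice.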
✅ 1.5 ChromaDB Manager
File: core/chroma_manager.py
- `ChromaDBManager` class
- Collection management (create, get, delete)
- Batch add operations
- Single/multi-collection search
- Statistics and reporting
- Connection pooling
Phase 2: Data Processors ✅ COMPLETE
✅ 2.1 Spell Parser ⭐ ENHANCED
File: parsers/spell_parser.py
- Parse `spells.txt` (detailed descriptions)
- Parse `all_spells.txt` (class/level associations)
- Merge spell data
- Name weighting - spell names appear 2-3× in chunks
- Create spell chunks (full_spell, quick_reference, by_class)
- Generate spell metadata (level, school, components, classes)
- OCR error handling
- ~86 spells → 250+ chunks
✅ 2.2 Monster Parser ⭐ ENHANCED
File: initialize_rag.py (inline loader)
- Load from `extracted_monsters.txt`
- Name weighting - monster names appear 2-3× in chunks
- Monster stat block parsing
- Combat stats extraction (CR, AC, HP)
- Monster type extraction (e.g., "Large dragon", "Medium humanoid")
- Generate monster metadata
- Type tags for filtering (dragon, undead, beast, etc.)
- ~332 monsters loaded
✅ 2.3 Class Parser ⭐ ENHANCED
File: initialize_rag.py (inline loader)
- Load from `extracted_classes.txt`
- Name weighting - class names appear 2-3× in chunks
- Class feature extraction
- Generate class metadata
- ~12 classes loaded (all core D&D classes)
✅ 2.4 Race Parser ⭐ NEW!
File: initialize_rag.py (inline loader with PDF extraction)
- PDF extraction from Player's Handbook (pages 18-46)
- Race traits extraction
- Ability score bonuses
- Name weighting - race names appear 2-3× in chunks
- Create race chunks (description, traits)
- Generate race metadata (ability_increases, size, speed, darkvision, languages)
- ~9 core races → 18 chunks
Phase 3: Initialization System ✅ COMPLETE
✅ 3.1 Master Init Script
File: initialize_rag.py
- Command-line argument parsing
- ChromaDB initialization
- Collection creation/verification
- Selective data loading (--only flag)
- Clear existing data (--clear flag)
- Progress reporting
- Error handling and recovery
- Summary statistics report
- All 4 collections: spells, monsters, classes, races
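As an illustration of the command-line handling described above, here is a minimal `argparse` sketch. The `--only` and `--clear` flags and the four collection names come from this document. The helper names `build_parser` and `collections_to_load`, and the assumption that `--only` accepts a single collection name, are hypothetical and may not match `initialize_rag.py`.

```python
import argparse
from typing import List

# Collection names as listed in this document.
COLLECTIONS = ["spells", "monsters", "classes", "races"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two flags mentioned above."""
    parser = argparse.ArgumentParser(description="Initialize the D&D RAG collections")
    # Assumption: --only takes exactly one collection name.
    parser.add_argument("--only", choices=COLLECTIONS,
                        help="load a single collection instead of all four")
    parser.add_argument("--clear", action="store_true",
                        help="delete existing data before loading")
    return parser


def collections_to_load(args: argparse.Namespace) -> List[str]:
    """Return the selected collection, or every collection when --only is absent."""
    return [args.only] if args.only else list(COLLECTIONS)
```

With this shape, `build_parser().parse_args(["--only", "spells", "--clear"])` selects only the spell collection and requests a clean reload.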
✅ 3.2 Data Validation
- Verify all source files present
- Test full initialization
- Benchmark loading times (~30s first run, ~5s subsequent)
- 600+ total chunks loaded
Phase 4: Query Interface ✅ COMPLETE
✅ 4.1 Interactive Query Tool ⭐ NEW!
File: query_rag.py
- Interactive CLI mode
- Single-query mode
- Collection-specific search (--spell, --monster, --class, --race)
- Search all collections
- Result formatting with metadata
- Relevance scores
- Commands: /spell, /monster, /class, /race, /stats, /help, /quit
- Beautiful formatted output
Usage:
python query_rag.py # Interactive mode
python query_rag.py "fireball" # Quick search
python query_rag.py --monster "dragon" # Search monsters
Phase 5: GM Dialogue System ✅ COMPLETE
✅ 5.1 RAG-Enhanced GM
File: systems/gm_dialogue.py
- RAG-powered rule lookups in real-time
- GM searches ChromaDB for spells, monsters, classes
- Ollama integration
- Context window management
- Session state management
✅ 5.2 Dialogue Manager
File: run_gm_dialogue.py
- Interactive game session
- Commands: /help, /context, /history, /rag, /save, /quit
- Turn tracking
- Scene state persistence
Phase 6: Character Creation ✅ COMPLETE
✅ 6.1 Character Creator
File: systems/character_creator.py
- Interactive CLI interface
- Race selection with RAG lookup
- Class selection with RAG lookup
- Ability score generation (standard array, roll, point buy)
- Background selection
- Equipment selection
- Spell selection (for casters)
- Character validation
- JSON export
- Character sheet display
✅ 6.2 Character Management
File: create_character.py
- Save/load character files
- Character sheet viewer
- Integration with RAG system
Phase 7: Testing & Validation ✅ COMPLETE ⭐ NEW!
✅ 7.1 Comprehensive Test Suite
File: test_all_collections.py
- 26+ automated tests
- Name weighting validation - exact names rank first
- Semantic search tests - related concepts found
- Metadata extraction tests - CR, level, abilities validated
- All 4 collections tested - spells, monsters, classes, races
- Cross-collection search - multi-type queries
- Pass/fail reporting with statistics
- Detailed error messages
✅ 7.2 Manual Test Scripts
File: test_spell_search.py
- Detailed search results for all collections
- Distance/relevance scores
- Metadata display
- Preview of results
✅ 7.3 Integration Tests
- Full initialization test
- End-to-end query test
- GM dialogue integration
- Character creation flow
Phase 8: Game Mechanics Engine 🚧 IN PROGRESS
Goal: Transform AI from rule-maker to narrator by implementing programmatic game mechanics
✅ 8.0 Character-Aware Dialogue System ⭐ NEW!
File: play_with_character.py
- Load or create characters for gameplay
- Character context passed to GM (stats, equipment, spells)
- Three character modes: Create new, Load JSON, Quick test
- Commands: `/character`, `/stats`, `/context` for character info
- Fixed tokenizer warning suppression
- Dynamic character support (not hardcoded to one character)
- Proper first/second person context ("The player is X" → AI uses "you")
- Integration testing completed (Dec 1, 2024)
🚧 8.1 Spell System Enhancement
File: play_with_character.py (to be refactored)
- Programmatic spell validation (check if player owns spell)
- Spell slot tracking by level (1st: 3 slots, 2nd: 2 slots, etc.)
- Auto-decrement slots when casting
- Rest mechanic to restore slots
- Spell lookup from RAG before AI narration
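The validation, slot tracking, and rest items above might be modeled along these lines. This is a hedged sketch rather than the planned implementation: the `SpellBook` class, its method names, and the choice to restore every slot on a long rest are assumptions. Cantrips, which do not consume slots, are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class SpellBook:
    """Hypothetical tracker for known spells and per-level spell slots."""
    known_spells: Set[str]
    max_slots: Dict[int, int]                      # e.g. {1: 3, 2: 2}
    current_slots: Dict[int, int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Start fully rested unless explicit current slots were supplied.
        if not self.current_slots:
            self.current_slots = dict(self.max_slots)

    def can_cast(self, spell: str, level: int) -> bool:
        """True if the spell is known and a slot of that level remains."""
        return spell in self.known_spells and self.current_slots.get(level, 0) > 0

    def cast(self, spell: str, level: int) -> None:
        """Validate the cast, then decrement the slot count."""
        if spell not in self.known_spells:
            raise ValueError(f"{spell} is not a known spell")
        if self.current_slots.get(level, 0) <= 0:
            raise ValueError(f"no level-{level} slots remaining")
        self.current_slots[level] -= 1

    def long_rest(self) -> None:
        """Restore all slots to their maximum (assumed rest behaviour)."""
        self.current_slots = dict(self.max_slots)
```

Raising an error on an invalid cast, instead of silently ignoring it, lets the caller turn the failure into a narration prompt for the AI.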
🚧 8.2 Combat Mechanics
Status: ⏳ Pending
- HP tracking and damage application
- Attack roll automation (d20 + modifiers)
- Damage roll automation (weapon dice + STR/DEX)
- AC checks (hit/miss determination)
- Death saves and unconsciousness
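A minimal sketch of the attack, damage, and HP steps listed above follows. All function names are hypothetical. The ability modifier formula and the natural 1 / natural 20 rule follow standard D&D 5e, while critical-hit damage and death saves are deliberately omitted.

```python
import random


def ability_modifier(score: int) -> int:
    """Standard 5e modifier: (score - 10) / 2, rounded down."""
    return (score - 10) // 2


def resolve_attack(d20: int, attack_bonus: int, target_ac: int) -> bool:
    """Return True on a hit. A natural 1 always misses, a natural 20 always hits."""
    if d20 == 1:
        return False
    if d20 == 20:
        return True
    return d20 + attack_bonus >= target_ac


def roll_damage(dice_count: int, dice_sides: int, modifier: int, rng: random.Random) -> int:
    """Roll weapon dice plus the STR/DEX modifier, never returning less than 0."""
    total = sum(rng.randint(1, dice_sides) for _ in range(dice_count)) + modifier
    return max(0, total)


def apply_damage(current_hp: int, damage: int) -> int:
    """Reduce HP, clamping at 0 (unconsciousness handling not shown)."""
    return max(0, current_hp - damage)
```

Passing the d20 result into `resolve_attack` as a plain number keeps the hit logic deterministic and easy to test, with randomness isolated in the rolling helpers.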
🚧 8.3 Turn & Initiative System
Status: ⏳ Pending
- Initiative roller (d20 + DEX modifier)
- Turn order tracking
- Action economy (action, bonus action, movement, reaction)
- Combat state management
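The initiative roller and turn order tracking could be sketched as below. The names `roll_initiative` and `next_turn` are hypothetical, and breaking ties by the higher Dexterity score is an assumption, since tables handle ties in different ways. Action economy is not modeled here.

```python
import random
from typing import Dict, List, Tuple


def roll_initiative(dex_scores: Dict[str, int], rng: random.Random) -> List[Tuple[str, int]]:
    """Roll d20 + DEX modifier for each combatant and sort highest first."""
    rolls = []
    for name, dex in dex_scores.items():
        total = rng.randint(1, 20) + (dex - 10) // 2
        rolls.append((name, total, dex))
    # Assumed tie-break: the higher raw Dexterity score acts first.
    rolls.sort(key=lambda r: (r[1], r[2]), reverse=True)
    return [(name, total) for name, total, _ in rolls]


def next_turn(order: List[Tuple[str, int]], current_index: int) -> int:
    """Advance to the next combatant, wrapping around at the end of the round."""
    return (current_index + 1) % len(order)
```

A call such as `roll_initiative({"Elara": 14, "Goblin": 12}, random.Random())` returns the combatants in acting order together with their initiative totals.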
🚧 8.4 Inventory System
Status: ⏳ Pending
- Add/remove items programmatically
- Equipment weight tracking
- Item usage validation
- Gold/currency management
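One possible shape for the inventory items above is sketched here. The `Inventory` class and its fields are hypothetical, currency is simplified to a single gold counter, and weights are assumed to be per-unit values in pounds.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Inventory:
    """Hypothetical inventory with item counts, unit weights, and gold."""
    gold: int = 0
    quantities: Dict[str, int] = field(default_factory=dict)
    unit_weights: Dict[str, float] = field(default_factory=dict)

    def add_item(self, name: str, weight: float, count: int = 1) -> None:
        """Add `count` copies of an item and record its per-unit weight."""
        self.quantities[name] = self.quantities.get(name, 0) + count
        self.unit_weights[name] = weight

    def has_item(self, name: str) -> bool:
        """Usage validation: the item must be present before it can be used."""
        return self.quantities.get(name, 0) > 0

    def remove_item(self, name: str, count: int = 1) -> None:
        """Remove items, refusing to go below zero."""
        if self.quantities.get(name, 0) < count:
            raise ValueError(f"not enough {name} to remove")
        self.quantities[name] -= count
        if self.quantities[name] == 0:
            del self.quantities[name]

    def total_weight(self) -> float:
        """Sum of unit weight times quantity over all carried items."""
        return sum(self.unit_weights[n] * q for n, q in self.quantities.items())

    def spend_gold(self, amount: int) -> None:
        """Deduct gold, refusing to overspend."""
        if amount > self.gold:
            raise ValueError("not enough gold")
        self.gold -= amount
```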
🚧 8.5 Integration & Testing
Status: ⏳ Pending
- Test spell casting flow end-to-end
- Test combat scenarios
- Test inventory management
- Update documentation
Phase 8 Architecture Notes
Option B: Hybrid AI + Rules Engine (SELECTED)
Problem: AI is unreliable at following D&D rules consistently
- Ignores spells player casts
- Allows spells player doesn't know
- Doesn't track resources (HP, spell slots)
- Makes up mechanics on the fly
Solution: Intercept player actions BEFORE AI sees them
Flow:
- Player Input: "I cast Magic Missile at the goblin"
- Rules Engine (Python code):
  - Parse: Detect spell casting intent
  - Validate: Check if player owns "Magic Missile" ✓
  - Validate: Check if player has 1st-level spell slot ✓
  - Retrieve: Get spell details from RAG (3 darts, 1d4+1 each)
  - Roll: 3d4+3 = 11 damage (programmatically)
  - Deduct: Spell slot consumed
  - Update: Target HP reduced by 11
- AI Prompt: "You successfully cast Magic Missile dealing 11 force damage to the goblin. The goblin now has 5 HP remaining. Describe the magical missiles striking the goblin."
- AI Response: (Just narrates the flavor, mechanics already handled)
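The flow above can be sketched end to end as a single interception function. Everything here is illustrative: `handle_player_input`, the regular expression, and the in-memory `SPELL_DATA` dictionary (standing in for the RAG lookup) are assumptions, and only the Magic Missile figures (3 darts, 1d4+1 each, force damage) come from this document.

```python
import random
import re
from typing import Dict, Set

# Stand-in for the RAG lookup (assumption): spell name -> mechanics.
SPELL_DATA = {
    "magic missile": {"level": 1, "darts": 3, "die": 4, "bonus": 1, "damage_type": "force"},
}

# Matches inputs such as "I cast Magic Missile at the goblin".
CAST_PATTERN = re.compile(r"i cast (?P<spell>.+?) at (?:the )?(?P<target>.+)", re.IGNORECASE)


def handle_player_input(text: str, known_spells: Set[str], slots: Dict[int, int],
                        target_hp: Dict[str, int], rng: random.Random) -> str:
    """Apply spell mechanics in code and return the prompt to send to the AI."""
    match = CAST_PATTERN.match(text.strip())
    if match is None:
        return text  # Not a spell cast: pass the input through unchanged.
    spell = match.group("spell").strip().lower()
    target = match.group("target").strip().rstrip(".!")
    name = spell.title()
    if spell not in known_spells or spell not in SPELL_DATA:
        return f"The player tried to cast {name} but does not know that spell. Narrate the failed attempt."
    data = SPELL_DATA[spell]
    if slots.get(data["level"], 0) <= 0:
        return f"The player tried to cast {name} but has no spell slots left. Narrate the failed attempt."
    if target not in target_hp:
        return f"The player tried to cast {name} at an unknown target. Ask the player to clarify."
    # Roll each dart, consume the slot, and update the target's HP.
    damage = sum(rng.randint(1, data["die"]) + data["bonus"] for _ in range(data["darts"]))
    slots[data["level"]] -= 1
    target_hp[target] = max(0, target_hp[target] - damage)
    return (f"You successfully cast {name} dealing {damage} {data['damage_type']} damage to the {target}. "
            f"The {target} now has {target_hp[target]} HP remaining. Describe the spell striking the {target}.")
```

Because the slot and HP dictionaries are updated before the prompt is built, the AI only ever sees the outcome and has no opportunity to change the numbers.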
Benefits:
- AI becomes a narrator, not a rules engine
- Mechanics are deterministic and accurate
- AI can focus on storytelling
- Players can trust the rules
Alternatives Rejected:
- Option A (Pure AI): Too unreliable, tested and failed
- Option C (Post-process AI): Too hard to fix bad outputs
Supporting Files ✅ COMPLETE
✅ Dependencies
File: requirements.txt
- chromadb
- sentence-transformers
- pdfplumber
- ollama (Python client)
- All dependencies working
✅ Documentation
- README.md with full installation instructions
- Quick start guide
- Usage examples for all tools
- Troubleshooting section
- plan_progress.md (this file)
Success Metrics
| Metric | Target | Current | Status |
|---|---|---|---|
| Init Time (full) | < 5 min | ~30s | ✅ Exceeded |
| Query Latency | < 500ms | ~100-200ms | ✅ Exceeded |
| Name Weighting | Exact match ranks #1 | ✅ Working | ✅ Complete |
| Total Chunks | ~600 | 612+ | ✅ Complete |
| Test Coverage | > 80% | 26+ tests | ✅ Complete |
| Collections | 4 collections | 4 active | ✅ Complete |
Key Features Implemented
✅ Name-Weighted Retrieval
- Spells: Name appears 3× (SPELL: name, name, name)
- Monsters: Name appears 3× (MONSTER: name, name, name) + type extraction
- Classes: Name appears 3× (CLASS: name, name, name)
- Races: Name appears 3× (RACE: name, name, name) + trait extraction
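The name-weighting pattern above is simple to express in code. The helper below is a sketch with a hypothetical name, `weight_name`. It builds the `TYPE: name, name, name` header shown in the list and places it ahead of the chunk body, so the entity name carries more weight in the chunk's embedding.

```python
def weight_name(entity_type: str, name: str, body: str, repeats: int = 3) -> str:
    """Prefix a chunk with a header that repeats the entity name `repeats` times."""
    header = f"{entity_type.upper()}: " + ", ".join([name] * repeats)
    return f"{header}\n{body}"
```

For example, `weight_name("spell", "Fireball", "A bright streak flashes...")` starts with the line `SPELL: Fireball, Fireball, Fireball`.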
✅ Multiple Chunk Types Per Entity
- Spells: full_spell, quick_reference, by_class
- Monsters: monster_stats with type tags
- Classes: class_features
- Races: race_description, race_traits
✅ Rich Metadata
- Spells: level, school, casting_time, range, components, duration, classes, ritual, concentration
- Monsters: challenge_rating, monster_type (size + type), type tags
- Classes: name, content_type
- Races: ability_increases, size, speed, darkvision, languages
Notes & Decisions
Design Decisions
- Database: ChromaDB for persistence and semantic search
- Embeddings: sentence-transformers/all-MiniLM-L6-v2 for speed/quality balance
- LLM: Ollama with Qwen3-4B-RPG-Roleplay-V2 for D&D-tuned responses
- Collection Strategy: Separate collections per content type for clean organization
- Name Weighting: Entity names repeated 2-3× at chunk start for better exact-match retrieval
- Multiple Chunks: Each entity creates multiple specialized chunks for different use cases
Key Improvements (Nov 6, 2024)
- ✅ Spell Parser Upgrade - Now uses sophisticated `SpellParser` class instead of inline code
- ✅ Name Weighting - All entity types now have weighted names for better retrieval
- ✅ Race Extraction - Full race data extracted from PDF with traits and metadata
- ✅ Monster Type Extraction - Automatic extraction of size and creature type
- ✅ Interactive Query Tool - New CLI for exploring the RAG system
- ✅ Comprehensive Tests - 26+ automated tests validating all functionality
Known Issues (Phase 8 Discovery - Dec 1, 2024)
- AI Unreliability: Pure AI approach fails to consistently enforce D&D rules
  - Ignores valid spell casts (Magic Missile cast was turned into melee combat)
  - Allows invalid spells (let Elara cast Fireball, which she doesn't know)
  - No resource tracking (spell slots, HP, gold)
- Solution: Moving to Hybrid Architecture (Option B) with programmatic rules engine
Current Work (Phase 8)
- 🚧 Spell Validation System: Programmatically check spell ownership before AI generation
- 🚧 Resource Tracking: HP, spell slots, inventory management
- 🚧 Combat Mechanics: Attack/damage rolls, initiative, turn tracking
- 🚧 Rules Engine: Intercept player actions, apply mechanics, then AI narrates
Future Enhancements (Post-Phase 8)
- ⏳ Subrace Support: High Elf, Mountain Dwarf, etc. with specific abilities
- ⏳ Advanced Filtering: Search by CR range, spell level range, class, type
- ⏳ Web UI: Web interface for GM dialogue
- ⏳ Multiplayer Support: Multi-player sessions
- ⏳ Custom Content Import: User-created monsters/spells
- ⏳ Voice Interface: Voice commands for GM dialogue
- ⏳ Map/Battle Grid Integration: Visual battle maps
Timeline
| Date | Milestone |
|---|---|
| 2024-11-06 09:00 | Project started, directory structure created |
| 2024-11-06 12:00 | Phase 1-2 complete (core infrastructure + basic parsers) |
| 2024-11-06 15:00 | Phase 3-6 complete (initialization, query, GM, character creator) |
| 2024-11-06 18:00 | Major upgrades: Name weighting, race extraction, comprehensive tests |
| 2024-11-06 20:00 | Phase 7 complete: All tests passing, documentation updated |
| 2024-11-06 21:00 | V1.0 COMPLETE - Production ready! |
| 2024-12-01 18:00 | Phase 8 started: Character-aware dialogue system created |
| 2024-12-01 19:00 | Testing reveals AI reliability issues with game mechanics |
| 2024-12-01 19:30 | Architecture decision: Option B (Hybrid Rules Engine) selected |
Production Deployment Checklist
- All 4 collections operational
- Name weighting implemented for all entity types
- Comprehensive test suite (26+ tests passing)
- Interactive query tool
- Documentation complete
- GM dialogue system working
- Character creator working
- All dependencies installed
- Error handling in place
- Performance targets met
Statistics
Collection Counts
- Spells: 86 spells → 250+ chunks
- Monsters: 332 monsters → 332 chunks
- Classes: 12 classes → 12 chunks
- Races: 9 races → 18 chunks
- Total: ~612+ chunks in ChromaDB
Test Results
- Total Tests: 26+
- Pass Rate: 100%
- Collections Tested: 4/4
- Features Validated: Name weighting, semantic search, metadata extraction, cross-collection search
Status: 🚧 V2.0 IN DEVELOPMENT (Phase 8: Game Mechanics Engine)
Current Focus: Implementing hybrid AI + Rules Engine architecture
Next Steps:
- Spell validation system
- Spell slot tracking
- HP and damage mechanics
- Combat turn system
- Inventory management
Last Updated: December 1, 2024 19:30