dnd-rag-g

Build error

App Files Files Community

dnd-rag-g / plan_progress.md

alexchilton

feat: Add character-aware gameplay and Gradio web UI (Phase 8)

84acbaa 5 months ago

preview code

raw

history blame contribute delete

16.8 kB

D&D RAG System - Implementation Progress

Project Start Date: November 6, 2024 Status: ✅ Production Ready Last Updated: November 6, 2024

📊 Overall Progress

Phase	Status	Progress	Notes
Phase 1: Core Infrastructure	✅ Complete	5/5	All core systems operational
Phase 2: Data Processors	✅ Complete	4/4	All parsers with name weighting
Phase 3: Initialization	✅ Complete	2/2	Full system initialization working
Phase 4: Query Interface	✅ Complete	1/1	Interactive CLI tool added
Phase 5: GM Dialogue	✅ Complete	2/2	RAG-enhanced AI GM working
Phase 6: Character Creation	✅ Complete	2/2	Full character creator with RAG
Phase 7: Testing & Validation	✅ Complete	3/3	26+ comprehensive tests passing
Phase 8: Game Mechanics Engine	🚧 In Progress	0/5	Character-aware gameplay enhancements

Legend: ✅ Complete | 🚧 In Progress | ⏳ Pending | ❌ Blocked

📁 Phase 1: Core Infrastructure ✅ COMPLETE

✅ 1.1 Project Structure

Created dnd_rag_system/ directory
Created config/, core/, parsers/, systems/ subdirectories
Created __init__.py files for all packages

✅ 1.2 Configuration System

File: config/settings.py

ChromaDB configuration
Ollama model settings
Embedding model settings (all-MiniLM-L6-v2)
Collection naming conventions
Data source paths
Chunk size parameters (400 tokens)

✅ 1.3 Base Parser

File: core/base_parser.py

BaseParser abstract class
PDF extraction utilities (pdfplumber)
Text extraction utilities
Common validation methods
Error handling framework

✅ 1.4 Base Chunker

File: core/base_chunker.py

BaseChunker abstract class
Token estimation function
Chunk splitting with overlap
Metadata generation helpers
Chunk validation

✅ 1.5 ChromaDB Manager

File: core/chroma_manager.py

ChromaDBManager class
Collection management (create, get, delete)
Batch add operations
Single/multi-collection search
Statistics and reporting
Connection pooling

📚 Phase 2: Data Processors ✅ COMPLETE

✅ 2.1 Spell Parser ⭐ ENHANCED

File: parsers/spell_parser.py

Parse spells.txt (detailed descriptions)
Parse all_spells.txt (class/level associations)
Merge spell data
Name weighting - spell names appear 2-3× in chunks
Create spell chunks (full_spell, quick_reference, by_class)
Generate spell metadata (level, school, components, classes)
OCR error handling
~86 spells → 250+ chunks

✅ 2.2 Monster Parser ⭐ ENHANCED

File: initialize_rag.py (inline loader)

Load from extracted_monsters.txt
Name weighting - monster names appear 2-3× in chunks
Monster stat block parsing
Combat stats extraction (CR, AC, HP)
Monster type extraction (e.g., "Large dragon", "Medium humanoid")
Generate monster metadata
Type tags for filtering (dragon, undead, beast, etc.)
~332 monsters loaded

✅ 2.3 Class Parser ⭐ ENHANCED

File: initialize_rag.py (inline loader)

Load from extracted_classes.txt
Name weighting - class names appear 2-3× in chunks
Class feature extraction
Generate class metadata
~12 classes loaded (all core D&D classes)

✅ 2.4 Race Parser ⭐ NEW!

File: initialize_rag.py (inline loader with PDF extraction)

PDF extraction from Player's Handbook (pages 18-46)
Race traits extraction
Ability score bonuses
Name weighting - race names appear 2-3× in chunks
Create race chunks (description, traits)
Generate race metadata (ability_increases, size, speed, darkvision, languages)
~9 core races → 18 chunks

🚀 Phase 3: Initialization System ✅ COMPLETE

✅ 3.1 Master Init Script

File: initialize_rag.py

Command-line argument parsing
ChromaDB initialization
Collection creation/verification
Selective data loading (--only flag)
Clear existing data (--clear flag)
Progress reporting
Error handling and recovery
Summary statistics report
All 4 collections: spells, monsters, classes, races

✅ 3.2 Data Validation

Verify all source files present
Test full initialization
Benchmark loading times (~30s first run, ~5s subsequent)
600+ total chunks loaded

🔍 Phase 4: Query Interface ✅ COMPLETE

✅ 4.1 Interactive Query Tool ⭐ NEW!

File: query_rag.py

Interactive CLI mode
Single-query mode
Collection-specific search (--spell, --monster, --class, --race)
Search all collections
Result formatting with metadata
Relevance scores
Commands: /spell, /monster, /class, /race, /stats, /help, /quit
Beautiful formatted output

Usage:

python query_rag.py                    # Interactive mode
python query_rag.py "fireball"         # Quick search
python query_rag.py --monster "dragon" # Search monsters

🎮 Phase 5: GM Dialogue System ✅ COMPLETE

✅ 5.1 RAG-Enhanced GM

File: systems/gm_dialogue.py

RAG-powered rule lookups in real-time
GM searches ChromaDB for spells, monsters, classes
Ollama integration
Context window management
Session state management

✅ 5.2 Dialogue Manager

File: run_gm_dialogue.py

Interactive game session
Commands: /help, /context, /history, /rag, /save, /quit
Turn tracking
Scene state persistence

👤 Phase 6: Character Creation ✅ COMPLETE

✅ 6.1 Character Creator

File: systems/character_creator.py

Interactive CLI interface
Race selection with RAG lookup
Class selection with RAG lookup
Ability score generation (standard array, roll, point buy)
Background selection
Equipment selection
Spell selection (for casters)
Character validation
JSON export
Character sheet display

✅ 6.2 Character Management

File: create_character.py

Save/load character files
Character sheet viewer
Integration with RAG system

🧪 Phase 7: Testing & Validation ✅ COMPLETE ⭐ NEW!

✅ 7.1 Comprehensive Test Suite

File: test_all_collections.py

26+ automated tests
Name weighting validation - exact names rank first
Semantic search tests - related concepts found
Metadata extraction tests - CR, level, abilities validated
All 4 collections tested - spells, monsters, classes, races
Cross-collection search - multi-type queries
Pass/fail reporting with statistics
Detailed error messages

✅ 7.2 Manual Test Scripts

File: test_spell_search.py

Detailed search results for all collections
Distance/relevance scores
Metadata display
Preview of results

✅ 7.3 Integration Tests

Full initialization test
End-to-end query test
GM dialogue integration
Character creation flow

🎮 Phase 8: Game Mechanics Engine 🚧 IN PROGRESS

Goal: Transform AI from rule-maker to narrator by implementing programmatic game mechanics

✅ 8.0 Character-Aware Dialogue System ⭐ NEW!

File: play_with_character.py

Load or create characters for gameplay
Character context passed to GM (stats, equipment, spells)
Three character modes: Create new, Load JSON, Quick test
Commands: /character, /stats, /context for character info
Fixed tokenizer warning suppression
Dynamic character support (not hardcoded to one character)
Proper first/second person context ("The player is X" → AI uses "you")
Integration testing completed (Dec 1, 2024)

🚧 8.1 Spell System Enhancement

File: play_with_character.py (to be refactored)

Programmatic spell validation (check if player owns spell)
Spell slot tracking by level (1st: 3 slots, 2nd: 2 slots, etc.)
Auto-decrement slots when casting
Rest mechanic to restore slots
Spell lookup from RAG before AI narration

🚧 8.2 Combat Mechanics

Status: ⏳ Pending

HP tracking and damage application
Attack roll automation (d20 + modifiers)
Damage roll automation (weapon dice + STR/DEX)
AC checks (hit/miss determination)
Death saves and unconsciousness

🚧 8.3 Turn & Initiative System

Status: ⏳ Pending

Initiative roller (d20 + DEX modifier)
Turn order tracking
Action economy (action, bonus action, movement, reaction)
Combat state management

🚧 8.4 Inventory System

Status: ⏳ Pending

Add/remove items programmatically
Equipment weight tracking
Item usage validation
Gold/currency management

🚧 8.5 Integration & Testing

Status: ⏳ Pending

Test spell casting flow end-to-end
Test combat scenarios
Test inventory management
Update documentation

🏗️ Phase 8 Architecture Notes

Option B: Hybrid AI + Rules Engine (SELECTED)

Problem: AI is unreliable at following D&D rules consistently

Ignores spells player casts
Allows spells player doesn't know
Doesn't track resources (HP, spell slots)
Makes up mechanics on the fly

Solution: Intercept player actions BEFORE AI sees them

Flow:

Player Input: "I cast Magic Missile at the goblin"
Rules Engine (Python code):
- Parse: Detect spell casting intent
- Validate: Check if player owns "Magic Missile" ✓
- Validate: Check if player has 1st-level spell slot ✓
- Retrieve: Get spell details from RAG (3 darts, 1d4+1 each)
- Roll: 3d4+3 = 11 damage (programmatically)
- Deduct: Spell slot consumed
- Update: Target HP reduced by 11
AI Prompt: "You successfully cast Magic Missile dealing 11 force damage to the goblin. The goblin now has 5 HP remaining. Describe the magical missiles striking the goblin."
AI Response: (Just narrates the flavor, mechanics already handled)

Benefits:

AI becomes a narrator, not a rules engine
Mechanics are deterministic and accurate
AI can focus on storytelling
Players can trust the rules

Alternatives Rejected:

Option A (Pure AI): Too unreliable, tested and failed
Option C (Post-process AI): Too hard to fix bad outputs

📦 Supporting Files ✅ COMPLETE

✅ Dependencies

File: requirements.txt

chromadb
sentence-transformers
pdfplumber
ollama (Python client)
All dependencies working

✅ Documentation

README.md with full installation instructions
Quick start guide
Usage examples for all tools
Troubleshooting section
plan_progress.md (this file)

🎯 Success Metrics

Metric	Target	Current	Status
Init Time (full)	< 5 min	~30s	✅ Exceeded
Query Latency	< 500ms	~100-200ms	✅ Exceeded
Name Weighting	Exact match ranks #1	✅ Working	✅ Complete
Total Chunks	~600	612+	✅ Complete
Test Coverage	> 80%	26+ tests	✅ Complete
Collections	4 collections	4 active	✅ Complete

🎨 Key Features Implemented

✅ Name-Weighted Retrieval

Spells: Name appears 3× (SPELL: name, name, name)
Monsters: Name appears 3× (MONSTER: name, name, name) + type extraction
Classes: Name appears 3× (CLASS: name, name, name)
Races: Name appears 3× (RACE: name, name, name) + trait extraction

✅ Multiple Chunk Types Per Entity

Spells: full_spell, quick_reference, by_class
Monsters: monster_stats with type tags
Classes: class_features
Races: race_description, race_traits

✅ Rich Metadata

Spells: level, school, casting_time, range, components, duration, classes, ritual, concentration
Monsters: challenge_rating, monster_type (size + type), type tags
Classes: name, content_type
Races: ability_increases, size, speed, darkvision, languages

📝 Notes & Decisions

Design Decisions

Database: ChromaDB for persistence and semantic search
Embeddings: sentence-transformers/all-MiniLM-L6-v2 for speed/quality balance
LLM: Ollama with Qwen3-4B-RPG-Roleplay-V2 for D&D-tuned responses
Collection Strategy: Separate collections per content type for clean organization
Name Weighting: Entity names repeated 2-3× at chunk start for better exact-match retrieval
Multiple Chunks: Each entity creates multiple specialized chunks for different use cases

Key Improvements (Nov 6, 2024)

✅ Spell Parser Upgrade - Now uses sophisticated SpellParser class instead of inline code
✅ Name Weighting - All entity types now have weighted names for better retrieval
✅ Race Extraction - Full race data extracted from PDF with traits and metadata
✅ Monster Type Extraction - Automatic extraction of size and creature type
✅ Interactive Query Tool - New CLI for exploring the RAG system
✅ Comprehensive Tests - 26+ automated tests validating all functionality

Known Issues (Phase 8 Discovery - Dec 1, 2024)

AI Unreliability: Pure AI approach fails to consistently enforce D&D rules
- Ignores valid spell casts (Magic Missile cast was turned into melee combat)
- Allows invalid spells (Let Elara cast Fireball, which she doesn't know)
- No resource tracking (spell slots, HP, gold)
Solution: Moving to Hybrid Architecture (Option B) with programmatic rules engine

Current Work (Phase 8)

🚧 Spell Validation System: Programmatically check spell ownership before AI generation
🚧 Resource Tracking: HP, spell slots, inventory management
🚧 Combat Mechanics: Attack/damage rolls, initiative, turn tracking
🚧 Rules Engine: Intercept player actions, apply mechanics, then AI narrates

Future Enhancements (Post-Phase 8)

⏳ Subrace Support: High Elf, Mountain Dwarf, etc. with specific abilities
⏳ Advanced Filtering: Search by CR range, spell level range, class, type
⏳ Web UI: Web interface for GM dialogue
⏳ Multiplayer Support: Multi-player sessions
⏳ Custom Content Import: User-created monsters/spells
⏳ Voice Interface: Voice commands for GM dialogue
⏳ Map/Battle Grid Integration: Visual battle maps

📅 Timeline

Date	Milestone
2024-11-06 09:00	Project started, directory structure created
2024-11-06 12:00	Phase 1-2 complete (core infrastructure + basic parsers)
2024-11-06 15:00	Phase 3-6 complete (initialization, query, GM, character creator)
2024-11-06 18:00	Major upgrades: Name weighting, race extraction, comprehensive tests
2024-11-06 20:00	Phase 7 complete: All tests passing, documentation updated
2024-11-06 21:00	V1.0 COMPLETE - Production ready!
2024-12-01 18:00	Phase 8 started: Character-aware dialogue system created
2024-12-01 19:00	Testing reveals AI reliability issues with game mechanics
2024-12-01 19:30	Architecture decision: Option B (Hybrid Rules Engine) selected

🚀 Production Deployment Checklist

All 4 collections operational
Name weighting implemented for all entity types
Comprehensive test suite (26+ tests passing)
Interactive query tool
Documentation complete
GM dialogue system working
Character creator working
All dependencies installed
Error handling in place
Performance targets met

📊 Statistics

Collection Counts

Spells: 86 spells → 250+ chunks
Monsters: 332 monsters → 332 chunks
Classes: 12 classes → 12 chunks
Races: 9 races → 18 chunks
Total: ~612+ chunks in ChromaDB

Test Results

Total Tests: 26+
Pass Rate: 100%
Collections Tested: 4/4
Features Validated: Name weighting, semantic search, metadata extraction, cross-collection search

Status: 🚧 V2.0 IN DEVELOPMENT (Phase 8: Game Mechanics Engine) Current Focus: Implementing hybrid AI + Rules Engine architecture Next Steps:

Spell validation system
Spell slot tracking
HP and damage mechanics
Combat turn system
Inventory management

Last Updated: December 1, 2024 19:30