# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Development Commands

### Running the Application

```bash
python app.py
```

The application will start a Gradio web interface at http://127.0.0.1:7860.

### Environment Setup

```bash
# Using uv (recommended)
uv venv -p 3.12
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install -r requirements.txt

# Using pip
pip install -r requirements.txt
```

### Environment Variables

Create a `.env` file with:

```
OPENAI_API_KEY=your_api_key_here
```

## Architecture Overview

This is an AI Course Assessment Generator that creates learning objectives and multiple-choice questions from course materials. The system uses OpenAI's language models with structured output generation via the `instructor` library.

### Core Workflow

1. **Content Processing**: Upload course materials (`.vtt`, `.srt`, `.ipynb`) → extract and tag content with XML source references
2. **Learning Objective Generation**: Generate base objectives → group and rank → enhance with incorrect-answer suggestions
3. **Question Generation**: Create multiple-choice questions from objectives → assess quality → rank and group
4. **Assessment Export**: Save the final assessment to JSON format

### Key Architecture Patterns

**Modular Prompt System**: The `prompts/` directory contains reusable prompt components that are imported and combined in the generation modules. This allows for consistent quality standards across different generation tasks.

**Orchestrator Pattern**: Both `LearningObjectiveGenerator` and `QuizGenerator` act as orchestrators that coordinate calls to specialized generation functions rather than implementing generation logic directly.

**Structured Output**: All LLM interactions use Pydantic models with the `instructor` library to ensure consistent, validated output formats.
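As a rough illustration of this structured-output pattern, the sketch below uses `instructor` to patch an OpenAI client so a completion is parsed and validated into a Pydantic model. The field names and prompt text are hypothetical, not the repository's actual definitions (the real models live in `models/`):

```python
from pydantic import BaseModel


class BaseLearningObjective(BaseModel):
    """Hypothetical shape; the real fields are defined in models/."""
    objective: str
    source_file: str


def generate_objective(content_xml: str) -> BaseLearningObjective:
    """Sketch of one structured LLM call via instructor (assumed pattern)."""
    import instructor
    from openai import OpenAI

    # instructor wraps the client so create() returns a validated Pydantic model
    client = instructor.from_openai(OpenAI())
    return client.chat.completions.create(
        model="gpt-5",
        response_model=BaseLearningObjective,  # output is validated against this
        messages=[
            {"role": "user", "content": f"Write one learning objective for:\n{content_xml}"}
        ],
    )
```

Because `response_model` enforces the schema, downstream pipeline stages can rely on typed, validated objects rather than free-form text.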
**Source Tracking**: Content is wrapped in XML tags (e.g., `<content>`) throughout the pipeline to maintain traceability from source files to generated questions.

## Key Components

### Main Generators

- `LearningObjectiveGenerator` (`learning_objective_generator/generator.py`): Orchestrates learning objective generation, grouping, and enhancement
- `QuizGenerator` (`quiz_generator/generator.py`): Orchestrates question generation, quality assessment, and ranking

### Data Models (`models/`)

- Learning objectives progress from `BaseLearningObjective` → `LearningObjective` (with incorrect answers) → `GroupedLearningObjective`
- Questions progress from `MultipleChoiceQuestion` → `RankedMultipleChoiceQuestion` → `GroupedMultipleChoiceQuestion`
- The final output is an `Assessment` containing both objectives and questions

### Generation Pipeline

1. **Base Generation**: Create initial learning objectives from content
2. **Grouping & Ranking**: Group similar objectives and select the best in each group
3. **Enhancement**: Add incorrect-answer suggestions to the selected objectives
4. **Question Generation**: Create multiple-choice questions with feedback
5. **Quality Assessment**: Use an LLM judge to evaluate question quality
6. **Final Ranking**: Rank and group questions for output

### UI Structure (`ui/`)

- `app.py`: Gradio interface with tabs for objectives, questions, and export
- Handler modules process user interactions and coordinate with the generators
- State management tracks data between UI components

## Development Notes

### Model Configuration

- Default model: `gpt-5` with temperature `1.0`
- Separate model selection for incorrect-answer generation (typically `o1`)
- Quality assessment often uses `gpt-5-mini` for cost efficiency

### Content Processing

- Supports `.vtt`/`.srt` subtitle files and `.ipynb` Jupyter notebooks
- All content is tagged with XML source references for traceability
- The content processor handles multiple file formats uniformly

### Quality Standards

The system enforces educational quality through modular prompt components:

- General quality standards apply to all generated content
- Specific standards for questions, correct answers, and incorrect answers
- Bloom's taxonomy integration for appropriate learning levels
- Example-based prompting for consistency
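The content-processing and source-tagging steps described above can be sketched as follows. This is a minimal illustration under assumptions: the function names, the `<content source="...">` tag format, and the VTT-filtering heuristics are hypothetical, not the repository's actual implementation:

```python
def extract_vtt_text(vtt: str) -> str:
    """Drop the WEBVTT header, cue numbers, and timing lines,
    keeping only the spoken text (simplified heuristic)."""
    kept = []
    for line in vtt.splitlines():
        line = line.strip()
        if not line or line == "WEBVTT" or "-->" in line or line.isdigit():
            continue
        kept.append(line)
    return " ".join(kept)


def tag_content(text: str, source: str) -> str:
    """Wrap extracted text in an XML tag carrying its source reference,
    so generated questions stay traceable to the original file."""
    return f'<content source="{source}">{text}</content>'


sample = "WEBVTT\n\n1\n00:00:00.000 --> 00:00:04.000\nWelcome to the course."
tagged = tag_content(extract_vtt_text(sample), "intro.vtt")
# tagged == '<content source="intro.vtt">Welcome to the course.</content>'
```

A production processor would additionally handle `.srt` timing formats and `.ipynb` cell extraction behind the same uniform interface.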