---
title: Quiz Generator V3
emoji: πŸ“š
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.32.1
app_file: app.py
pinned: false
license: apache-2.0
---
# AI Course Assessment Generator
An AI-powered tool that creates learning objectives and multiple-choice quiz questions from course materials. It supports both **automatic generation** from uploaded content and **manual entry** of learning objectives; in either mode it produces fully enriched output — each objective paired with a correct answer and incorrect-answer suggestions — ready for question generation.
---
## Features
### Tab 1 β€” Generate Learning Objectives
**Two modes of operation:**
- **Generate from course materials** β€” Upload course files and let the AI extract and generate learning objectives automatically through a multi-run, multi-stage pipeline.
- **Use my own learning objectives** β€” Enter your own learning objectives in a text field (one per line). The app searches the uploaded course materials for relevant source references, generates a correct answer for each objective, and produces incorrect answer options β€” the same full pipeline as automatic generation.
**Always-visible controls:**
- Mode selector (Generate / Use my own)
- Upload Course Materials
- Number of Learning Objectives per Run *(generate mode)* / Learning Objectives text field *(manual mode)*
- Generate Learning Objectives / Process Learning Objectives button
- Generate all button *(works in both modes)*
**Advanced Options** *(collapsible, closed by default):*
- Number of Generation Runs
- Model
- Model for Incorrect Answer Suggestions
- Temperature
**Shared capabilities (both modes):**
- All output in the same JSON format, ready to feed directly into Tab 2
- "Generate all" button runs the full end-to-end pipeline (learning objectives β†’ quiz questions) in a single click, in either mode
### Tab 2 β€” Generate Questions
- Takes the learning objectives JSON produced in Tab 1 as input
- Generates multiple-choice questions with 4 options, per-option feedback, and source references
- Automatic ranking and grouping of generated questions by quality
- Outputs: ranked best-in-group questions, all grouped questions, and a human-readable formatted quiz
**Always-visible controls:**
- Learning Objectives JSON input
- Number of questions
- Generate Questions button
**Advanced Options** *(collapsible, closed by default):*
- Model
- Temperature
- Number of Question Generation Runs
### Tab 3 β€” Propose / Edit Question
- Load the formatted quiz from Tab 2 or upload a `.md` / `.yml` quiz file *(file upload is inside a collapsible section)*
- Review and edit questions one at a time with Previous / Accept & Next navigation
- Download the final edited quiz
---
## Generation Pipeline (Learning Objectives)
### Automatic generation mode
1. **Content extraction** β€” Uploads are parsed (`.vtt`, `.srt`, `.ipynb`, `.md`) and wrapped with XML source tags for full traceability
2. **Multi-run base generation** β€” Multiple independent runs produce candidate objectives (Bloom's taxonomy aware, one action verb, multiple-choice assessable)
3. **Correct answer generation** β€” A concise correct answer (~20 words) is generated for each objective from the course content
4. **Grouping & ranking** β€” Similar objectives are clustered; the best representative in each group is selected
5. **Incorrect answer generation** β€” Three plausible distractors are generated for each best-in-group objective, matching the correct answer in length, style, and complexity
6. **Iterative improvement** β€” Each distractor is evaluated and regenerated until it meets quality standards
### User-provided objectives mode
1. **Objective parsing** β€” Text is split by newlines; common leading labels are stripped automatically:
- Numbered: `1.`, `2)`, `3:`
- Lettered: `a.`, `b)`, `c:`
- Plain (no label)
2. **Source finding** β€” For each objective, the LLM searches the uploaded course materials to identify the most relevant source file(s)
3. **Correct answer generation** β€” Same function as the automatic flow, grounded in the course content
4. **Incorrect answer generation** β€” Same three-distractor generation as automatic flow
5. **Iterative improvement** β€” Same quality improvement loop
6. **No grouping** — All objectives are treated as best-in-group (the user has already curated them), so the grouping/filtering step is skipped
**Example accepted input formats:**
```
Identify key upstream and downstream collaborators for data engineers
Identify the stages of the data engineering lifecycle
Articulate a mental framework for building data engineering solutions
```
```
1. Identify key upstream and downstream collaborators for data engineers
2. Identify the stages of the data engineering lifecycle
3. Articulate a mental framework for building data engineering solutions
```
```
a. Identify key upstream and downstream collaborators for data engineers
b. Identify the stages of the data engineering lifecycle
c. Articulate a mental framework for building data engineering solutions
```
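The label stripping described in step 1 can be approximated with a single regular expression. This is an illustrative sketch, not the project's actual parsing code:

```python
import re

# Matches optional leading labels such as "1.", "2)", "3:", "a.", "b)", "c:"
LABEL = re.compile(r"^\s*(?:\d+|[A-Za-z])[.):]\s*")

def parse_objectives(text: str) -> list[str]:
    """Split input on newlines, strip common leading labels, drop blanks."""
    lines = [LABEL.sub("", line).strip() for line in text.splitlines()]
    return [line for line in lines if line]
```

All three example formats above parse to the same list of plain objectives.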
---
## Setup
### Prerequisites
- Python 3.8 or newer (3.12 recommended)
- An OpenAI API key
### Installation
**Using uv (recommended):**
```bash
uv venv -p 3.12
source .venv/bin/activate # Windows: .venv\Scripts\activate
uv pip install -r requirements.txt
```
**Using pip:**
```bash
pip install -r requirements.txt
```
### Environment variables
Create a `.env` file in the project root:
```
OPENAI_API_KEY=your_api_key_here
```
---
## Running the app
```bash
python app.py
```
Opens the Gradio interface at [http://127.0.0.1:7860](http://127.0.0.1:7860).
---
## Supported file formats
| Format | Description |
|--------|-------------|
| `.vtt` | WebVTT subtitle files (timestamps stripped) |
| `.srt` | SRT subtitle files (timestamps stripped) |
| `.ipynb` | Jupyter notebooks (markdown and code cells extracted) |
| `.md` | Markdown files |
All content is wrapped with XML source tags (`<source file="filename">…</source>`) so every generated objective and question can be traced back to its origin file.
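A sketch of what this extraction and wrapping might look like for a `.vtt` file. This is illustrative only; the project's `content_processor.py` handles more formats and edge cases:

```python
import re

# WebVTT/SRT cue timing lines, e.g. "00:00:01.000 --> 00:00:03.000"
TIMESTAMP = re.compile(
    r"^\d{2}:\d{2}:\d{2}[.,]\d{3} --> \d{2}:\d{2}:\d{2}[.,]\d{3}.*$"
)

def strip_vtt(text: str) -> str:
    """Drop the WEBVTT header, cue numbers, and timestamp lines."""
    kept = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "WEBVTT" or line.isdigit() or TIMESTAMP.match(line):
            continue
        kept.append(line)
    return "\n".join(kept)

def wrap_source(filename: str, content: str) -> str:
    """Wrap extracted content in an XML source tag for traceability."""
    return f'<source file="{filename}">\n{content}\n</source>'
```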
---
## Project structure
```
quiz_generator_ECM/
β”‚
β”œβ”€β”€ app.py # Entry point β€” loads .env and launches Gradio
β”‚
β”œβ”€β”€ models/ # Pydantic data models
β”‚ β”œβ”€β”€ learning_objectives.py # BaseLearningObjective β†’ LearningObjective β†’ Grouped*
β”‚ β”œβ”€β”€ questions.py # MultipleChoiceQuestion β†’ Ranked* β†’ Grouped*
β”‚ β”œβ”€β”€ assessment.py # Assessment (objectives + questions)
β”‚ └── config.py # Model list and temperature availability map
β”‚
β”œβ”€β”€ prompts/ # Reusable prompt components
β”‚ β”œβ”€β”€ learning_objectives.py # Bloom's taxonomy, quality standards, examples
β”‚ β”œβ”€β”€ incorrect_answers.py # Distractor guidelines and examples
β”‚ β”œβ”€β”€ questions.py # Question and answer quality standards
β”‚ └── all_quality_standards.py # General quality standards
β”‚
β”œβ”€β”€ learning_objective_generator/ # Learning objective pipeline
β”‚ β”œβ”€β”€ generator.py # LearningObjectiveGenerator orchestrator
β”‚ β”œβ”€β”€ base_generation.py # Base generation, correct answers, source finding
β”‚ β”œβ”€β”€ enhancement.py # Incorrect answer generation
β”‚ β”œβ”€β”€ grouping_and_ranking.py # Similarity grouping and best-in-group selection
β”‚ └── suggestion_improvement.py # Iterative distractor quality improvement
β”‚
β”œβ”€β”€ quiz_generator/ # Question generation pipeline
β”‚ β”œβ”€β”€ generator.py # QuizGenerator orchestrator
β”‚ β”œβ”€β”€ question_generation.py # Multiple-choice question generation
β”‚ β”œβ”€β”€ question_improvement.py # Question quality assessment and improvement
β”‚ β”œβ”€β”€ question_ranking.py # Ranking and grouping of questions
β”‚ β”œβ”€β”€ feedback_questions.py # Feedback-based question regeneration
β”‚ └── assessment.py # Assessment compilation and export
β”‚
└── ui/ # Gradio interface and handlers
β”œβ”€β”€ app.py # UI layout, mode toggle, event wiring
β”œβ”€β”€ objective_handlers.py # Handlers for both objective modes + Generate all
β”œβ”€β”€ question_handlers.py # Question generation handler
β”œβ”€β”€ content_processor.py # File parsing and XML source tagging
β”œβ”€β”€ edit_handlers.py # Question editing flow (Tab 3)
β”œβ”€β”€ formatting.py # Quiz formatting for UI display
β”œβ”€β”€ state.py # Global state (file contents, objectives)
└── run_manager.py # Run tracking and output saving
```
---
## Data models
Learning objectives progress through these stages:
```
BaseLearningObjectiveWithoutCorrectAnswer
└─ id, learning_objective, source_reference
↓
BaseLearningObjective
└─ + correct_answer
↓
LearningObjective (output of Tab 1, input to Tab 2)
└─ + incorrect_answer_options, in_group, group_members, best_in_group
```
Questions follow an equivalent progression:
```
MultipleChoiceQuestion
└─ id, question_text, options (text + is_correct + feedback),
learning_objective_id, correct_answer, source_reference
↓
RankedMultipleChoiceQuestion
└─ + rank, ranking_reasoning, in_group, group_members, best_in_group
```
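The project defines these stages as Pydantic models; as a stdlib approximation, the objective progression can be sketched with `dataclasses` (field names follow the diagram above, but this is not the actual code in `models/learning_objectives.py`):

```python
from dataclasses import dataclass, field

@dataclass
class BaseLearningObjectiveWithoutCorrectAnswer:
    id: int
    learning_objective: str
    source_reference: str

@dataclass
class BaseLearningObjective(BaseLearningObjectiveWithoutCorrectAnswer):
    correct_answer: str = ""

@dataclass
class LearningObjective(BaseLearningObjective):
    # Output of Tab 1, input to Tab 2
    incorrect_answer_options: list[str] = field(default_factory=list)
    in_group: int = 0
    group_members: list[int] = field(default_factory=list)
    best_in_group: bool = False
```

Each stage inherits the previous one, so a fully enriched `LearningObjective` still carries the base fields (`id`, `learning_objective`, `source_reference`).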
---
## Model configuration
Default model: `gpt-5.2`
Default temperature: `1.0` (ignored for models that do not support it, such as `o1`, `o3-mini`, `gpt-5`, `gpt-5.1`, `gpt-5.2`)
You can set different models for the main generation step and the incorrect answer suggestion step — useful when you want a more creative model generating distractors.
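The temperature availability map in `models/config.py` can be imagined roughly as follows (a sketch using the model names listed above; the real map and helper may differ):

```python
# Hypothetical sketch of a temperature-availability map.
NO_TEMPERATURE = {"o1", "o3-mini", "gpt-5", "gpt-5.1", "gpt-5.2"}

def build_request_params(model: str, temperature: float = 1.0) -> dict:
    """Drop the temperature parameter for models that do not support it."""
    params = {"model": model}
    if model not in NO_TEMPERATURE:
        params["temperature"] = temperature
    return params
```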
---
## Requirements
| Package | Version |
|---------|---------|
| Python | 3.8+ (3.12 recommended) |
| gradio | 4.19.2+ |
| pydantic | 2.8.0+ |
| openai | 1.52.0+ |
| nbformat | 5.9.2+ |
| instructor | 1.7.9+ |
| python-dotenv | 1.0.0+ |