AIO2025 Final Exam Dataset
- Version: BUILD_V2_STRICT
- Total Records: 145
- Compliance: PASS_10_10 (verified against Rules.md)
- Last Updated: 2026-06-12
Dataset Description
A Vietnamese AI/ML exam and quiz dataset from the AIO2025 program: 145 four-option multiple-choice questions covering modules on Data Science, Machine Learning, Deep Learning, Computer Vision, NLP, and MLOps.
Dataset Summary
| Metric | Value |
|---|---|
| Total Questions | 145 |
| Sources | vietnamese_exam, exam, quiz |
| Modules | 12 |
| Difficulty Levels | EASY=17, MEDIUM=127, HARD=1 |
Module Distribution
- Module_3A: 41 questions
- Module_4: 29 questions
- Module_5: 13 questions
- Module_6: 8 questions
- Module_7: 7 questions
- Module_8: 7 questions
- Module_9: 8 questions
- Module_10: 5 questions
- Module_11: 8 questions
- Module_12: 7 questions
- Module_13: 4 questions
- Module_14: 8 questions
Answer Distribution
| Answer | Count |
|---|---|
| A | 43 |
| B | 45 |
| C | 41 |
| D | 16 |
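The counts above can be checked against the total of 145 and turned into shares to see how uneven the answer key is. The snippet below is a small sketch that hard-codes the published counts; the variable names are illustrative and not part of the dataset.

```python
# Published answer counts, copied from the table above.
answer_counts = {"A": 43, "B": 45, "C": 41, "D": 16}

total = sum(answer_counts.values())

# Share of each answer letter as a percentage, rounded to one decimal place.
shares = {
    letter: round(100 * count / total, 1)
    for letter, count in answer_counts.items()
}

# The most frequent letter gives the accuracy of an "always pick X" baseline.
majority_letter = max(answer_counts, key=answer_counts.get)

print(total)
print(shares)
print(majority_letter)
```

Since D is the correct answer for only about 11% of questions, a model that always answers B would score about 31%, so that figure is a more useful floor than the 25% expected from a uniform answer key.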
Dataset Structure
Each record contains:
| Field | Type | Description |
|---|---|---|
| id | string | Unique identifier |
| question | string | Question text (may contain code/math) |
| option_a | string | Option A |
| option_b | string | Option B |
| option_c | string | Option C |
| option_d | string | Option D |
| answer | string | Correct answer: A, B, C, or D |
| answer_index | integer | 0=A, 1=B, 2=C, 3=D |
| source | string | Source type: exam, quiz, vietnamese_exam |
| module | string | Module taxonomy |
| difficulty | string | EASY / MEDIUM / HARD |
| pdf_page | integer | Source PDF page number |
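Because `answer` and `answer_index` encode the same information, a record can be checked for internal consistency before it is used. The following is a minimal sketch under that reading of the schema. The field names come from the table above, while `correct_option_text` and the sample record are hypothetical and made up for illustration.

```python
# Field names are taken from the table above; everything else is illustrative.
REQUIRED_FIELDS = (
    "id", "question",
    "option_a", "option_b", "option_c", "option_d",
    "answer", "answer_index",
    "source", "module", "difficulty", "pdf_page",
)
LETTERS = ("A", "B", "C", "D")


def correct_option_text(record):
    """Return the text of the correct option after basic consistency checks."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise KeyError(f"missing fields: {missing}")

    letter = record["answer"]
    if letter not in LETTERS:
        raise ValueError(f"answer must be one of {LETTERS}, got {letter!r}")

    # answer_index mirrors answer: 0=A, 1=B, 2=C, 3=D.
    if LETTERS.index(letter) != record["answer_index"]:
        raise ValueError("answer and answer_index disagree")

    return record["option_" + letter.lower()]


# Hypothetical record, invented for this example (not taken from the dataset).
sample = {
    "id": "example-001",
    "question": "Which activation outputs values in (0, 1)?",
    "option_a": "ReLU",
    "option_b": "Sigmoid",
    "option_c": "Tanh",
    "option_d": "Leaky ReLU",
    "answer": "B",
    "answer_index": 1,
    "source": "quiz",
    "module": "Module_5",
    "difficulty": "EASY",
    "pdf_page": 1,
}

print(correct_option_text(sample))
```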
Module Taxonomy
| Module | Type | Description |
|---|---|---|
| Module_3A | vietnamese_exam | Vietnamese-format exam questions |
| Module_4 | exam | AIO25M04 exam |
| Module_5-10 | quiz | Quiz modules (qz_1 through qz_6) |
| Module_11-14 | quiz | Quiz modules (qz_7 through qz_10) |
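The quiz rows above suggest a simple offset between module number and quiz number. The sketch below assumes the ranges map in order (Module_5 to qz_1, up to Module_14 to qz_10), which the table implies but does not state row by row. `quiz_label` is a hypothetical helper, not part of the dataset.

```python
def quiz_label(module):
    """Map 'Module_5'..'Module_14' to 'qz_1'..'qz_10'; return None otherwise.

    Assumption: quiz numbers follow module numbers in order (offset of 4).
    """
    prefix, _, suffix = module.partition("_")
    if prefix != "Module" or not suffix.isdigit():
        return None  # e.g. 'Module_3A', which is not a quiz module
    number = int(suffix)
    if 5 <= number <= 14:
        return f"qz_{number - 4}"
    return None  # e.g. 'Module_4', the AIO25M04 exam


for name in ("Module_3A", "Module_4", "Module_5", "Module_10", "Module_14"):
    print(name, quiz_label(name))
```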
Quality Assurance
- Compliance Score: 100% overall against Rules.md (benchmark verified)
- Benchmark: PASS_10_10, C=0, M=0, m=0
- All questions have exactly 4 answer options (100% coverage)
- All answers are verified and consistent with options (145/145)
- No duplicate IDs
- No UNKNOWN difficulty labels
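Several of the checks listed above can be reproduced locally. The sketch below is one possible way to do so and is not the benchmark that produced PASS_10_10; `qa_problems`, `make_row`, and the sample rows are hypothetical and exist only to make the example runnable offline.

```python
from collections import Counter

OPTION_FIELDS = ("option_a", "option_b", "option_c", "option_d")
LETTERS = ("A", "B", "C", "D")
DIFFICULTIES = {"EASY", "MEDIUM", "HARD"}


def qa_problems(records):
    """Return a dict mapping each check name to the list of offending ids."""
    records = list(records)
    id_counts = Counter(r["id"] for r in records)
    return {
        # Any id that appears more than once.
        "duplicate_id": sorted(i for i, n in id_counts.items() if n > 1),
        # Records where one of the four options is missing or blank.
        "missing_option": [
            r["id"] for r in records
            if any(not str(r.get(f) or "").strip() for f in OPTION_FIELDS)
        ],
        # Records where the letter is invalid or disagrees with answer_index.
        "inconsistent_answer": [
            r["id"] for r in records
            if r.get("answer") not in LETTERS
            or LETTERS.index(r["answer"]) != r.get("answer_index")
        ],
        # Records whose difficulty is not EASY, MEDIUM, or HARD.
        "bad_difficulty": [
            r["id"] for r in records if r.get("difficulty") not in DIFFICULTIES
        ],
    }


def make_row(row_id, answer="A", answer_index=0, difficulty="EASY", option_d="d"):
    """Build a made-up row containing only the fields the checks read."""
    return {
        "id": row_id,
        "option_a": "a", "option_b": "b", "option_c": "c", "option_d": option_d,
        "answer": answer, "answer_index": answer_index,
        "difficulty": difficulty,
    }


rows = [
    make_row("q1"),
    make_row("q2", answer="C", answer_index=1),  # letter and index disagree
    make_row("q2", difficulty="UNKNOWN"),        # repeated id, bad label
    make_row("q3", option_d=""),                 # blank option
]
problems = qa_problems(rows)
print(problems)
```

On a dataset that satisfies the claims above, every list in the returned dict would be empty.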
Usage
from datasets import load_dataset
ds = load_dataset("vudang449/AIO2025-Final-Exam-Dataset")
print(ds["train"][0])
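Once the data is loaded, it is often useful to narrow it to a single module or difficulty level. The helper below is a sketch that works on any iterable of record dicts; `select_questions` is a hypothetical name, and the rows are made up so the snippet runs without downloading anything.

```python
def select_questions(records, module=None, difficulty=None):
    """Keep records matching the given module and/or difficulty (None = any)."""
    return [
        r for r in records
        if (module is None or r["module"] == module)
        and (difficulty is None or r["difficulty"] == difficulty)
    ]


# Made-up rows carrying only the two fields used for filtering.
rows = [
    {"id": "q1", "module": "Module_4", "difficulty": "MEDIUM"},
    {"id": "q2", "module": "Module_4", "difficulty": "EASY"},
    {"id": "q3", "module": "Module_3A", "difficulty": "MEDIUM"},
]

picked = select_questions(rows, module="Module_4", difficulty="MEDIUM")
print([r["id"] for r in picked])
```

With the real data, passing `ds["train"]` in place of `rows` should behave the same way, since iterating over a split yields one dict per record. The library's own `filter` method on the split is an alternative when the result should stay a dataset object.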
Citation
@misc{AIO2025_Final_Exam,
title={AIO2025 Final Exam Dataset},
author={AIO2025},
year={2025},
url={https://huggingface.co/datasets/vudang449/AIO2025-Final-Exam-Dataset}
}
License
CC BY 4.0