Update README.md
README.md
CHANGED
|
@@ -1,3 +1,89 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: mit
|
| 3 |
-
|
| 1 |
+
---
|
| 2 |
+
license: mit
|
| 3 |
+
datasets:
|
| 4 |
+
- sweatSmile/neet-biology-qa
|
| 5 |
+
language:
|
| 6 |
+
- en
|
| 7 |
+
base_model:
|
| 8 |
+
- distilbert/distilbert-base-uncased
|
| 9 |
+
pipeline_tag: question-answering
|
| 10 |
+
library_name: transformers
|
| 11 |
+
tags:
|
| 12 |
+
- neet
|
| 13 |
+
- biology
|
| 14 |
+
- exam
|
| 15 |
+
- bio
|
| 16 |
+
---
|
| 17 |
+
|
| 18 |
+
DistilBERT NEET Biology MCQ Classifier
|
| 19 |
+
|
| 20 |
+
This model is a fine-tuned version of DistilBERT (base uncased) that selects the correct option for NEET-style multiple-choice biology questions. Given a question and four choices (A, B, C, D), it predicts the single best answer.
|
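As a rough illustration of the input layout such a four-option classifier typically works with, the sketch below pairs one question with each of its options and maps a predicted index back to a letter. The helper names `build_choice_pairs` and `index_to_letter` and the sample question are hypothetical; they are not taken from this model's training code.

```python
from typing import List, Tuple

LETTERS = ["A", "B", "C", "D"]


def build_choice_pairs(question: str, options: List[str]) -> List[Tuple[str, str]]:
    """Pair the question with every option, one pair per candidate answer."""
    if len(options) != len(LETTERS):
        raise ValueError(f"expected {len(LETTERS)} options, got {len(options)}")
    return [(question, option) for option in options]


def index_to_letter(index: int) -> str:
    """Map a predicted option index (0-3) to its letter (A-D)."""
    return LETTERS[index]


# Hypothetical example question, for illustration only.
pairs = build_choice_pairs(
    "Which organelle is the site of aerobic respiration?",
    ["Ribosome", "Mitochondrion", "Golgi apparatus", "Lysosome"],
)
```

Each pair would then be tokenized and scored, and the highest-scoring index converted with `index_to_letter`.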
| 21 |
+
|
| 22 |
+
-------------------------------------------------------------------------
|
| 23 |
+
Training Data
|
| 24 |
+
|
| 25 |
+
Source: sweatSmile/neet-biology-qa (NEET Biology QA dataset)
|
| 26 |
+
|
| 27 |
+
Domain: NEET (Undergraduate Medical Entrance Exam) – Biology
|
| 28 |
+
|
| 29 |
+
Format: Each question has 4 options with one correct answer
|
| 30 |
+
|
| 31 |
+
Dataset Size: 793 questions
|
| 32 |
+
|
| 33 |
+
Split: 80% train / 20% validation
|
| 34 |
+
|
| 35 |
+
-------------------------------------------------------------------------
|
| 36 |
+
Training Configuration
|
| 37 |
+
|
| 38 |
+
Base Model: distilbert-base-uncased
|
| 39 |
+
|
| 40 |
+
Epochs: 10
|
| 41 |
+
|
| 42 |
+
Batch Size: 4
|
| 43 |
+
|
| 44 |
+
Learning Rate: 5e-5
|
| 45 |
+
|
| 46 |
+
Weight Decay: 0.01
|
| 47 |
+
|
| 48 |
+
Task Type: Multiple Choice Classification
|
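The hyperparameters above imply a rough training length, which the following sketch works out. It assumes the 20% validation share is rounded up (giving 159 validation and 634 training questions), a single device, and no gradient accumulation; none of these details is stated in this card.

```python
import math

DATASET_SIZE = 793   # from the Training Data section
VAL_FRACTION = 0.20  # 80% train / 20% validation
BATCH_SIZE = 4
EPOCHS = 10

# Assumption: the validation share is rounded up, the remainder is training data.
val_size = math.ceil(DATASET_SIZE * VAL_FRACTION)
train_size = DATASET_SIZE - val_size

# Assumption: the last partial batch of each epoch is kept, not dropped.
steps_per_epoch = math.ceil(train_size / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

print(train_size, val_size, steps_per_epoch, total_steps)
```

Under these assumptions the run is about 1,590 optimizer steps, a small budget consistent with the dataset size noted under Limitations.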
| 49 |
+
|
| 50 |
+
-------------------------------------------------------------------------
|
| 51 |
+
Results
|
| 52 |
+
|
| 53 |
+
Validation Accuracy: 72.96% (~73%)
|
| 54 |
+
|
| 55 |
+
Final Training Loss: ~0.35
|
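To put the accuracy figure in absolute terms, the sketch below converts it into a count of correct answers. It assumes a validation set of 159 questions (20% of 793, rounded up), which this card does not state explicitly.

```python
VAL_SIZE = 159             # assumption: 20% of 793 questions, rounded up
REPORTED_ACCURACY = 0.7296  # 72.96% from the Results section

# Nearest whole number of correct answers implied by the reported accuracy.
correct = round(REPORTED_ACCURACY * VAL_SIZE)

# Recompute the percentage from that count as a consistency check.
recomputed_pct = round(correct / VAL_SIZE * 100, 2)

print(correct, recomputed_pct)
```

Under that assumption the reported figure corresponds to 116 of 159 validation questions, so a single question shifts the accuracy by roughly 0.6 percentage points.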
| 56 |
+
|
| 57 |
+
-------------------------------------------------------------------------
|
| 58 |
+
Limitations
|
| 59 |
+
|
| 60 |
+
Trained on a relatively small dataset (793 questions).
|
| 61 |
+
|
| 62 |
+
Limited to NEET-level biology content; not suitable for physics or chemistry.
|
| 63 |
+
|
| 64 |
+
Does not support:
|
| 65 |
+
|
| 66 |
+
Assertion-reasoning questions
|
| 67 |
+
|
| 68 |
+
Diagram-based questions
|
| 69 |
+
|
| 70 |
+
Paragraph/Case study type questions
|
| 71 |
+
|
| 72 |
+
-------------------------------------------------------------------------
|
| 73 |
+
Intended Use
|
| 74 |
+
|
| 75 |
+
Educational Research
|
| 76 |
+
|
| 77 |
+
AI-powered NEET Biology assistants
|
| 78 |
+
|
| 79 |
+
MCQ practice evaluation
|
| 80 |
+
|
| 81 |
+
Baseline model for future fine-tuning with larger datasets
|
| 82 |
+
|
| 83 |
+
-------------------------------------------------------------------------
|
| 84 |
+
NOTE:
|
| 85 |
+
|
| 86 |
+
This model is not recommended as an exam-ready solution without further fine-tuning and validation.
|
| 87 |
+
|
| 88 |
+
-------------------------------------------------------------------------
|
| 89 |
+
License: MIT
|