🎓 Math Misconception Classifier (Llama-3 8B)

This model is fine-tuned to identify specific mathematical misconceptions in student explanations. It was developed as part of a Final Year Computer Science Major Project.

🚀 Model Details

Developed by: Priyanshu Bhusan
Model Type: Large Language Model (Fine-tuned for Classification)
Base Model: Meta Llama-3 8B (4-bit Quantized)
Task: Mapping student math misunderstandings to specific categories.
Language: English
Fine-tuning Technique: LoRA (Low-Rank Adaptation) via Unsloth.
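
To give a sense of why LoRA keeps fine-tuning cheap, here is a rough parameter count for a single weight matrix. LoRA freezes the original matrix and learns two small factors, B (d_out x r) and A (r x d_in), whose product is added to it. The rank r = 16 below is a hypothetical value for illustration only, since the rank used for this model is not stated here. The 4096 x 4096 shape corresponds to a square projection at the Llama-3 8B hidden size.

```python
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # LoRA learns B (d_out x rank) and A (rank x d_in); the update is B @ A.
    return rank * (d_out + d_in)


hidden = 4096   # Llama-3 8B hidden size
rank = 16       # hypothetical rank, NOT taken from this model card

full_params = hidden * hidden                        # frozen base matrix
lora_params = lora_param_count(hidden, hidden, rank)  # trainable adapter

print(full_params, lora_params, lora_params / full_params)
```

Under these assumptions the adapter trains under 1% of the parameters of the matrix it modifies.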

📊 Performance Metrics

Based on the validation set (MAP Kaggle dataset):

Top-1 Accuracy: 88.00%
MAP@3 Score: 0.9090
Weighted F1-Score: 0.7448
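
For readers unfamiliar with MAP@3, the sketch below shows how the metric is commonly computed when each sample has exactly one correct label: a prediction scores 1, 1/2, or 1/3 if the correct label is ranked first, second, or third, and 0 otherwise. The function name and the toy labels are illustrative assumptions, not the evaluation code used for this model.

```python
from typing import Sequence


def map_at_3(y_true: Sequence[str], y_pred: Sequence[Sequence[str]]) -> float:
    """Mean average precision at 3, assuming one correct label per sample."""
    total = 0.0
    for truth, ranked in zip(y_true, y_pred):
        # Only the first three predictions count. The score is 1/rank of the
        # first hit, so ranks 1, 2, 3 earn 1, 1/2, 1/3 and a miss earns 0.
        for rank, label in enumerate(ranked[:3], start=1):
            if label == truth:
                total += 1.0 / rank
                break
    return total / len(y_true)


# Toy example with made-up labels: hits at rank 1, 2, 3, and one miss.
truth = ["A", "B", "C", "D"]
preds = [["A", "X", "Y"], ["X", "B", "Y"], ["X", "Y", "C"], ["X", "Y", "Z"]]
score = map_at_3(truth, preds)
print(round(score, 4))
```

With a single correct label per sample, MAP@3 can never fall below Top-1 accuracy, because every rank-1 hit contributes a full point and lower-ranked hits add partial credit on top.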

🛠️ Training Setup

Hardware: NVIDIA L4/T4 GPU (Lightning AI Studio)
Optimization: Unsloth 4-bit kernels for memory efficiency.
Training Steps: 60 steps (Initial Trial)
Learning Rate: 2e-4 with Linear Scheduler.
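
As an illustration of what a linear scheduler does over such a short run, the sketch below computes the learning rate at a given step for a peak of 2e-4 decaying to zero over 60 steps. The warmup length is not stated above, so it is assumed to be zero by default. The helper name is hypothetical and this is not the training script.

```python
def linear_lr(step: int, base_lr: float = 2e-4,
              total_steps: int = 60, warmup_steps: int = 0) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if warmup_steps and step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for s in (0, 30, 60):
    print(s, linear_lr(s))
```

With no warmup, the rate starts at 2e-4, is halved by step 30, and reaches zero at step 60.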

🧠 How to Use

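from unsloth import FastLanguageModel

# Load the fine-tuned adapter on top of the 4-bit base model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "vijigishu/math-misconception-llama-3",
    max_seq_length = 512,   # maximum context length in tokens
    load_in_4bit = True,    # load 4-bit quantized weights to save GPU memory
)

# Switch to Unsloth's inference mode before generating.
FastLanguageModel.for_inference(model)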
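
Once the model is loaded, a prediction can be generated roughly as follows. This is a minimal sketch: the prompt template used during fine-tuning is not documented here, so `build_prompt` is a hypothetical layout that you should replace with the format the model was actually trained on. Both `build_prompt` and `classify` are illustrative names.

```python
def build_prompt(question: str, student_explanation: str) -> str:
    """Hypothetical prompt layout; the real training template is an assumption."""
    return (
        f"Question: {question.strip()}\n"
        f"Student explanation: {student_explanation.strip()}\n"
        "Misconception:"
    )


def classify(question: str, student_explanation: str, max_new_tokens: int = 16) -> str:
    """Load the model and generate a misconception label for one explanation."""
    # Imported here so build_prompt can be used on machines without a GPU.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="vijigishu/math-misconception-llama-3",
        max_seq_length=512,
        load_in_4bit=True,
    )
    FastLanguageModel.for_inference(model)  # enable Unsloth's inference mode

    prompt = build_prompt(question, student_explanation)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Greedy decoding keeps the predicted label deterministic.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


# Example call (made-up input, requires a GPU and the unsloth package):
# print(classify("Which is larger, 0.5 or 0.25?", "0.25, because 25 is bigger than 5."))
```

For MAP@3-style evaluation you would need a ranked list of three candidate labels rather than a single generation, for example by scoring each candidate label's likelihood. That step is left out of this sketch.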

Model tree for vijigishu/math-misconception-llama-3

Base model: meta-llama/Meta-Llama-3-8B
Quantized: unsloth/llama-3-8b-bnb-4bit
Adapter: this model