🎓 Math Misconception Classifier (Llama-3 8B)

This model is fine-tuned to identify specific mathematical misconceptions in student explanations. It was developed as part of a Final Year Computer Science Major Project.

🚀 Model Details

  • Developed by: Priyanshu Bhusan
  • Model Type: Large Language Model (Fine-tuned for Classification)
  • Base Model: Meta Llama-3 8B (4-bit Quantized)
  • Task: Classifying student math explanations into specific misconception categories.
  • Language: English
  • Fine-tuning Technique: LoRA (Low-Rank Adaptation) via Unsloth.

📊 Performance Metrics

Measured on the validation set of the MAP Kaggle dataset:

  • Top-1 Accuracy: 88.00%
  • MAP@3 Score: 0.9090
  • Weighted F1-Score: 0.7448
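For reference, MAP@3 credits each example with 1, 1/2, or 1/3 depending on whether the correct category is ranked first, second, or third among the top three predictions, and 0 otherwise, then averages over all examples. The sketch below assumes exactly one correct label per example. The function name `map_at_3` and the category labels are illustrative only and are not taken from this project's evaluation code.

```python
from typing import Sequence


def map_at_3(true_labels: Sequence[str],
             ranked_predictions: Sequence[Sequence[str]]) -> float:
    """Mean Average Precision at 3, assuming one correct label per example."""
    if len(true_labels) != len(ranked_predictions):
        raise ValueError("true_labels and ranked_predictions must have equal length")
    if not true_labels:
        raise ValueError("at least one example is required")

    total = 0.0
    for truth, preds in zip(true_labels, ranked_predictions):
        # Only the first three ranked predictions are scored.
        for rank, pred in enumerate(preds[:3], start=1):
            if pred == truth:
                total += 1.0 / rank  # 1, 1/2, or 1/3 depending on rank
                break                # only the first hit counts
    return total / len(true_labels)


# Hypothetical category labels, for illustration only.
truths = ["A", "B", "C"]
preds = [
    ["A", "B", "C"],  # correct at rank 1 -> 1
    ["A", "B", "C"],  # correct at rank 2 -> 1/2
    ["A", "B", "D"],  # not in top 3     -> 0
]
score = map_at_3(truths, preds)  # (1 + 1/2 + 0) / 3 = 0.5
```

Because lower ranks still earn partial credit, MAP@3 can be higher than Top-1 accuracy on the same predictions.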

🛠️ Training Setup

  • Hardware: NVIDIA L4/T4 GPU (Lightning AI Studio)
  • Optimization: Unsloth 4-bit kernels for memory efficiency.
  • Training Steps: 60 steps (Initial Trial)
  • Learning Rate: 2e-4 with Linear Scheduler.
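As a rough illustration of the schedule, a linear scheduler decays the learning rate from its peak toward zero over the run. The sketch below assumes no warmup, since the warmup setting is not stated here. `linear_lr` is a hypothetical helper that mirrors the usual linear-decay rule; it is not the training script itself.

```python
def linear_lr(step: int,
              base_lr: float = 2e-4,
              total_steps: int = 60,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay.

    warmup_steps=0 is an assumption; the card does not state a warmup value.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, midpoint, and end of the 60-step run.
schedule = [linear_lr(s) for s in (0, 30, 60)]
```

With these assumptions the rate starts at 2e-4, is halved by step 30, and reaches 0 at step 60.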

🧠 How to Use

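from unsloth import FastLanguageModel

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "vijigishu/math-misconception-llama-3",
    max_seq_length = 512,   # maximum sequence length in tokens
    load_in_4bit = True,    # load 4-bit quantized weights to save GPU memory
)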
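Once the model is loaded, a prediction could be generated roughly as follows. This is a minimal sketch, not the author's inference code: the prompt layout in `build_prompt` is a guess (the format used during fine-tuning is not documented here), and both helper names are hypothetical. For useful output, the prompt should match whatever template the model was trained on.

```python
def build_prompt(question: str, student_explanation: str) -> str:
    # Hypothetical prompt layout; replace with the template used in fine-tuning.
    return (
        f"Question: {question}\n"
        f"Student explanation: {student_explanation}\n"
        "Misconception:"
    )


def classify(model, tokenizer, prompt: str, max_new_tokens: int = 32) -> str:
    # Imported lazily so that build_prompt can be used without unsloth installed.
    from unsloth import FastLanguageModel

    FastLanguageModel.for_inference(model)  # switch the model to inference mode
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


prompt = build_prompt(
    "What is 1/2 + 1/3?",
    "I added the tops and the bottoms, so the answer is 2/5.",
)
# prediction = classify(model, tokenizer, prompt)  # needs the loaded model
```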