Space: ayushsaun/AutoGrader

ayushsaun committed on Jan 26

Commit b340140

1 Parent(s): 39445dc

Initial AutoGrader setup

Files changed (5)

.gitattributes +0 -35
README.md +62 -7
app.py +52 -0
main.py +115 -0
requirements.txt +4 -0

.gitattributes DELETED

@@ -1,35 +0,0 @@
-*.7z filter=lfs diff=lfs merge=lfs -text
-*.arrow filter=lfs diff=lfs merge=lfs -text
-*.bin filter=lfs diff=lfs merge=lfs -text
-*.bz2 filter=lfs diff=lfs merge=lfs -text
-*.ckpt filter=lfs diff=lfs merge=lfs -text
-*.ftz filter=lfs diff=lfs merge=lfs -text
-*.gz filter=lfs diff=lfs merge=lfs -text
-*.h5 filter=lfs diff=lfs merge=lfs -text
-*.joblib filter=lfs diff=lfs merge=lfs -text
-*.lfs.* filter=lfs diff=lfs merge=lfs -text
-*.mlmodel filter=lfs diff=lfs merge=lfs -text
-*.model filter=lfs diff=lfs merge=lfs -text
-*.msgpack filter=lfs diff=lfs merge=lfs -text
-*.npy filter=lfs diff=lfs merge=lfs -text
-*.npz filter=lfs diff=lfs merge=lfs -text
-*.onnx filter=lfs diff=lfs merge=lfs -text
-*.ot filter=lfs diff=lfs merge=lfs -text
-*.parquet filter=lfs diff=lfs merge=lfs -text
-*.pb filter=lfs diff=lfs merge=lfs -text
-*.pickle filter=lfs diff=lfs merge=lfs -text
-*.pkl filter=lfs diff=lfs merge=lfs -text
-*.pt filter=lfs diff=lfs merge=lfs -text
-*.pth filter=lfs diff=lfs merge=lfs -text
-*.rar filter=lfs diff=lfs merge=lfs -text
-*.safetensors filter=lfs diff=lfs merge=lfs -text
-saved_model/**/* filter=lfs diff=lfs merge=lfs -text
-*.tar.* filter=lfs diff=lfs merge=lfs -text
-*.tar filter=lfs diff=lfs merge=lfs -text
-*.tflite filter=lfs diff=lfs merge=lfs -text
-*.tgz filter=lfs diff=lfs merge=lfs -text
-*.wasm filter=lfs diff=lfs merge=lfs -text
-*.xz filter=lfs diff=lfs merge=lfs -text
-*.zip filter=lfs diff=lfs merge=lfs -text
-*.zst filter=lfs diff=lfs merge=lfs -text
-*tfevents* filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,14 +1,69 @@
 ---
 title: AutoGrader
-emoji: 🚀
-colorFrom: red
-colorTo: pink
 sdk: gradio
-sdk_version: 6.4.0
 app_file: app.py
 pinned: false
-license: apache-2.0
-short_description: CPU-only LLM autograder
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
 title: AutoGrader
+emoji: 🧠
+colorFrom: indigo
+colorTo: purple
 sdk: gradio
+sdk_version: 4.44.0
 app_file: app.py
 pinned: false
 ---
+# AutoGrader
+AutoGrader is a **CPU-only, LLM-based academic grading system** that evaluates student submissions using a provided **question paper and rubric**, while also awarding marks for **logically correct alternative solutions**.
+## Key Features
+- Runs entirely on **CPU** (Hugging Face Spaces compatible)
+- **Rubric-aware** grading with flexibility for alternative correct answers
+- **Prompt-controlled evaluation** (e.g. grade only Q2, grade Q2 & Q4, custom marks)
+- **Multiple model options** (user-selectable)
+- **Sampled generation** (temperature and top-p) rather than greedy decoding, so grades may vary slightly between runs
+- **Structured JSON output** for reliable parsing
+- Works via **Hugging Face API** (can be called from Kaggle or other platforms)
+## Supported Models
+- Phi-3-mini (fast, CPU-friendly)
+- Mistral-7B-Instruct (higher quality, slower on CPU)
+Model weights are **not stored in this repository** and are automatically downloaded from the Hugging Face Hub at runtime.
+## How It Works
+1. The student submission is first analyzed to extract key ideas.
+2. A second evaluation step grades the answer using the rubric and grading instructions.
+3. Marks are assigned fairly, even for solutions not explicitly listed in the rubric.
+4. Output is returned as **strict JSON**.
+## Input Fields
+- **Question Paper**
+- **Rubric**
+- **Grading Instruction**
+  Example:
+  `Grade only Question 2 out of 20 marks`
+- **Student Submission**
+## Output
+A structured JSON object containing:
+- Total marks (`total_marks`)
+- Per-question marks (`per_question`)
+- Short justification (`reasoning`)
+## Limitations
+- CPU-only inference means **higher latency** for larger models
+- LLM-based grading is **not fully deterministic**
+- Designed for academic assistance, not high-stakes automated grading without review
+## License & Usage
+This project uses open-source models from the Hugging Face Hub.
+Please ensure model licenses are respected when deploying or redistributing.
+---
+Built for flexible, research-oriented automated assessment.
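
Note on the README's Output section: since the README recommends human review, a caller may want to sanity-check the returned JSON before trusting it. The following is a minimal sketch, not part of this commit. The function name `check_grading_output` is hypothetical, the key names are taken from the grading prompt in `main.py`, and the rule that `total_marks` should equal the sum of `per_question` is an assumption that the source does not state.

```python
import json


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_grading_output(raw):
    """Return a list of problems found in a grading JSON string (empty if none).

    Hypothetical helper: the checks below are assumptions about what a
    well-formed grading result looks like, not rules enforced by the app.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]

    problems = []
    total = data.get("total_marks")
    per_question = data.get("per_question")

    if not _is_number(total):
        problems.append("total_marks is missing or not a number")

    if not isinstance(per_question, dict) or not all(
        _is_number(mark) for mark in per_question.values()
    ):
        problems.append("per_question is missing or has non-numeric marks")
    elif _is_number(total) and abs(sum(per_question.values()) - total) > 1e-9:
        # Assumed consistency rule: the total should match the per-question sum.
        problems.append("total_marks does not equal the sum of per_question")

    if not isinstance(data.get("reasoning"), str):
        problems.append("reasoning is missing or not a string")

    return problems
```

An empty list means the output has the expected shape; any entries can be shown to the reviewer alongside the grade.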

app.py ADDED

	@@ -0,0 +1,52 @@

+import gradio as gr
+from main import grade_submission, MODEL_MAP
+with gr.Blocks() as demo:
+    gr.Markdown("## AutoGrader (CPU-only, Rubric-Aware, Flexible)")
+    model = gr.Dropdown(
+        choices=list(MODEL_MAP.keys()),
+        value="Phi-3-mini",
+        label="Select Model"
+    )
+    question_paper = gr.Textbox(
+        label="Question Paper",
+        lines=5
+    )
+    rubric = gr.Textbox(
+        label="Rubric",
+        lines=5
+    )
+    grading_instruction = gr.Textbox(
+        label="Grading Instruction (e.g. 'Grade only Q2 out of 20')",
+        lines=2
+    )
+    student_answer = gr.Textbox(
+        label="Student Submission",
+        lines=6
+    )
+    output = gr.Textbox(
+        label="Grading Output (JSON)",
+        lines=12
+    )
+    grade = gr.Button("Grade")
+    grade.click(
+        fn=grade_submission,
+        inputs=[
+            model,
+            question_paper,
+            rubric,
+            student_answer,
+            grading_instruction
+        ],
+        outputs=output
+    )
+demo.launch()

main.py ADDED

	@@ -0,0 +1,115 @@

+import json
+import re
+from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
+MODEL_MAP = {
+    "Phi-3-mini": "microsoft/Phi-3-mini-4k-instruct",
+    "Mistral-7B-Instruct": "mistralai/Mistral-7B-Instruct-v0.2",
+}
+_pipelines = {}
+def load_pipeline(model_name):
+    if model_name in _pipelines:
+        return _pipelines[model_name]
+    model_id = MODEL_MAP[model_name]
+    tokenizer = AutoTokenizer.from_pretrained(model_id)
+    model = AutoModelForCausalLM.from_pretrained(
+        model_id,
+        device_map="cpu",
+        torch_dtype="auto"
+    )
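+    pipe = pipeline(
+        "text-generation",
+        model=model,
+        tokenizer=tokenizer,
+        max_new_tokens=600,
+        temperature=0.5,
+        top_p=0.9,
+        do_sample=True,
+        # Return only the completion; by default the prompt is echoed back,
+        # and its JSON template would confuse extract_json.
+        return_full_text=False
+    )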
+    _pipelines[model_name] = pipe
+    return pipe
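+def extract_json(text):
+    """Return the JSON value found in text, or None if nothing parses."""
+    # Greedy match: spans from the first "{" to the last "}" in the text.
+    match = re.search(r"\{[\s\S]*\}", text)
+    if not match:
+        return None
+    try:
+        return json.loads(match.group())
+    except json.JSONDecodeError:
+        return None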
+def grade_submission(
+    model_name,
+    question_paper,
+    rubric,
+    student_answer,
+    grading_instruction
+):
+    pipe = load_pipeline(model_name)
+    understanding_prompt = f"""
+Read the student submission and extract the key ideas and steps used to answer the questions.
+Student Submission:
+{student_answer}
+Output STRICT JSON:
+{{
+  "key_points": "concise summary of the student's approach and ideas"
+}}
+"""
+    understanding_raw = pipe(understanding_prompt)[0]["generated_text"]
+    understanding = extract_json(understanding_raw)
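+    # Fall back if parsing failed or the model omitted the expected key.
+    if not isinstance(understanding, dict) or "key_points" not in understanding:
+        understanding = {"key_points": "Unable to reliably extract"}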
+    grading_prompt = f"""
+You are an academic autograder.
+Question Paper:
+{question_paper}
+Rubric:
+{rubric}
+Grading Instruction:
+{grading_instruction}
+Student Key Points:
+{understanding["key_points"]}
+Rules:
+- Follow the rubric
+- Award marks for logically correct alternative solutions
+- Do not penalize different notation or ordering
+- Grade only what is requested
+- Be fair and consistent
+Output STRICT JSON ONLY:
+{{
+  "total_marks": number,
+  "per_question": {{
+    "Q1": number,
+    "Q2": number
+  }},
+  "reasoning": "short justification"
+}}
+"""
+    grading_raw = pipe(grading_prompt)[0]["generated_text"]
+    grading = extract_json(grading_raw)
+    if grading is None:
+        return json.dumps({
+            "error": "Failed to generate valid grading output"
+        }, indent=2)
+    return json.dumps(grading, indent=2)
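
Note on `extract_json`: the greedy pattern matches from the first `{` to the last `}`, so any stray brace in the model's trailing chatter makes the whole span unparseable. One possible alternative, sketched here under the assumption that the input is only the model's completion (not the echoed prompt), is to scan for the first position where the standard library's `json.JSONDecoder.raw_decode` succeeds. The name `extract_first_json_object` is hypothetical and not part of this commit.

```python
import json


def extract_first_json_object(text):
    """Return the first parseable JSON object embedded in text, or None.

    Tries each "{" in turn as a candidate start and lets the decoder
    determine where the object ends, so surrounding text is ignored.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at index `start`
            # and returns (value, end_index); trailing text is allowed.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)
    return None


if __name__ == "__main__":
    reply = 'Here is the grade: {"total_marks": 7, "per_question": {"Q2": 7}} (see {rubric})'
    print(extract_first_json_object(reply))
```

The trade-off is that this returns the first valid object rather than the largest one, which matters if the completion contains more than one JSON object.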

requirements.txt ADDED

	@@ -0,0 +1,4 @@

+torch
+transformers
+accelerate
+gradio