dalinstone committed on
Commit c95db0c · 1 Parent(s): 4d436a8

this adds the main file

Files changed (3)
  1. main_grader.py +334 -0
  2. readme.md +6 -7
  3. requirements.txt +6 -0
main_grader.py ADDED
@@ -0,0 +1,334 @@
+ # main_grader.py
+
+ import gradio as gr
+ import docx
+ import asyncio
+ from concurrent.futures import ThreadPoolExecutor
+ from google import generativeai as genai
+ import json
+ from dataclasses import dataclass, field
+ from typing import List, Dict, Optional
+ import os
+ from pathlib import Path
+ import time
+
+ # --- Configuration and Prompts ---
+
+ # This rubric is embedded in the prompt sent to the Gemini API.
+ GRADING_RUBRIC = """
+ ### NURSING ESSAY GRADING RUBRIC
+
+ 1. **Content & Analysis (40 points):**
+     * **Thesis/Argument:** Clear, focused, and relevant thesis statement. (10 points)
+     * **Evidence & Support:** Strong use of credible, evidence-based sources to support claims. (15 points)
+     * **Critical Thinking:** Demonstrates deep analysis, synthesis of ideas, and understanding of complex health science concepts. (15 points)
+
+ 2. **Organization & Structure (20 points):**
+     * **Introduction:** Engaging introduction with a clear purpose and thesis. (5 points)
+     * **Body Paragraphs:** Logical flow, clear topic sentences, and well-developed paragraphs. (10 points)
+     * **Conclusion:** Effectively summarizes the argument and provides a sense of closure. (5 points)
+
+ 3. **APA Formatting & Citations (25 points):**
+     * **Reference List:** Correctly formatted according to APA 7th edition. (10 points)
+     * **In-Text Citations:** Accurate and correctly placed in-text citations. (10 points)
+     * **General Formatting:** Correct title page, running head, font, and margins. (5 points)
+
+ 4. **Clarity & Mechanics (15 points):**
+     * **Grammar & Spelling:** Free of significant errors in grammar, punctuation, and spelling. (5 points)
+     * **Sentence Structure:** Clear, varied, and concise sentence structure. (5 points)
+     * **Professional Tone:** Maintains a scholarly and professional tone appropriate for nursing. (5 points)
+ """
+
+ # The main prompt for the Gemini API.
+ # It instructs the AI on its role, the rubric, and the required JSON output format.
+ GEMINI_PROMPT = f"""
+ You are an expert-level university grader for a nursing department. Your task is to evaluate a short health science essay based on a strict rubric and provide your feedback in a structured JSON format.
+
+ **DO NOT** provide any introductory text, conversational pleasantries, or explanations outside of the requested JSON structure. Your entire response must be a single, valid JSON object.
+
+ **Use the following rubric to grade the essay:**
+ {GRADING_RUBRIC}
+
+ **Instructions:**
+ 1. Read the entire essay provided below.
+ 2. Assess the essay against each category in the rubric.
+ 3. Calculate the total points lost and the final grade out of 100.
+ 4. Provide brief, specific comments explaining why points were deducted in each category.
+ 5. Write a 2-3 sentence summary of the overall grade.
+ 6. Format your entire output as a single JSON object with the following keys and value types:
+     - `finalGrade`: (Integer) The final score from 0-100.
+     - `pointDeductions`: (Object) An object where keys are the main rubric categories ("Content & Analysis", "Organization & Structure", "APA Formatting & Citations", "Clarity & Mechanics") and values are the integer number of points lost for that category.
+     - `feedback`: (Object) An object with the same keys as `pointDeductions`, where values are brief string comments explaining the point deductions for that category. If no points are lost, the comment should be "No points deducted."
+     - `summary`: (String) A 2-3 sentence summary of the paper's performance and the rationale for the grade.
+
+ **Example of the required JSON output format:**
+ {{
+     "finalGrade": 88,
+     "pointDeductions": {{
+         "Content & Analysis": 2,
+         "Organization & Structure": 0,
+         "APA Formatting & Citations": 8,
+         "Clarity & Mechanics": 2
+     }},
+     "feedback": {{
+         "Content & Analysis": "The thesis was slightly unfocused, but the evidence used was strong.",
+         "Organization & Structure": "No points deducted.",
+         "APA Formatting & Citations": "Multiple errors in the reference list formatting and three missing in-text citations.",
+         "Clarity & Mechanics": "Minor grammatical errors and occasional awkward phrasing."
+     }},
+     "summary": "This is a strong paper with excellent critical analysis. The final grade was primarily impacted by significant APA formatting errors, which should be the main focus for improvement."
+ }}
+
+ ---
+ **ESSAY TO GRADE:**
+
+ """
+
+
+ # --- Data Structures ---
+
+ @dataclass
+ class GradingResult:
+     """Holds the structured result of a single graded essay."""
+     file_name: str
+     success: bool
+     grade: Optional[int] = None
+     deductions: Dict[str, int] = field(default_factory=dict)
+     feedback: Dict[str, str] = field(default_factory=dict)
+     summary: Optional[str] = None
+     error_message: Optional[str] = None
+
+
+ # --- Core Logic Classes ---
+
+ class EssayParser:
+     """Parses text content from a .docx file."""
+     @staticmethod
+     def parse_docx(file_path: str) -> str:
+         """Extracts all text from a Word document."""
+         try:
+             doc = docx.Document(file_path)
+             return "\n".join([para.text for para in doc.paragraphs if para.text])
+         except Exception as e:
+             # Handles cases where the file is corrupted or not a valid docx
+             raise IOError(f"Could not read file: {os.path.basename(file_path)}. Error: {e}")
+
+
+ class GeminiGrader:
+     """Manages interaction with the Google Gemini API for grading."""
+     def __init__(self, api_key: str):
+         """Initializes the Gemini model."""
+         try:
+             genai.configure(api_key=api_key)
+             # Configuration for safer, more deterministic output
+             generation_config = {
+                 "temperature": 0.1,
+                 "top_p": 0.95,
+                 "top_k": 40,
+             }
+             # Safety settings to prevent the model from refusing to grade
+             safety_settings = [
+                 {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_NONE"},
+                 {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"},
+                 {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_NONE"},
+                 {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE"},
+             ]
+             self.model = genai.GenerativeModel(
+                 model_name="gemini-1.5-pro-latest",
+                 generation_config=generation_config,
+                 safety_settings=safety_settings
+             )
+         except Exception as e:
+             raise ValueError(f"Failed to configure Gemini API: {e}")
+
+     def grade_essay(self, essay_text: str, file_name: str) -> GradingResult:
+         """
+         Sends the essay to Gemini for grading and parses the JSON response.
+         This is a synchronous method designed to be run in a thread pool.
+         """
+         prompt_with_essay = f"{GEMINI_PROMPT}\n{essay_text}"
+         try:
+             response = self.model.generate_content(prompt_with_essay)
+             # Clean the response to ensure it's valid JSON
+             cleaned_response = response.text.strip().replace("```json", "").replace("```", "")
+             data = json.loads(cleaned_response)
+
+             # Validate the structure of the returned JSON
+             required_keys = ["finalGrade", "pointDeductions", "feedback", "summary"]
+             if not all(key in data for key in required_keys):
+                 raise KeyError("The model's response was missing one or more required keys.")
+
+             return GradingResult(
+                 file_name=file_name,
+                 success=True,
+                 grade=data["finalGrade"],
+                 deductions=data["pointDeductions"],
+                 feedback=data["feedback"],
+                 summary=data["summary"]
+             )
+         except json.JSONDecodeError:
+             return GradingResult(
+                 file_name=file_name,
+                 success=False,
+                 error_message="Failed to parse the model's response. The output was not valid JSON."
+             )
+         except Exception as e:
+             return GradingResult(
+                 file_name=file_name,
+                 success=False,
+                 error_message=f"An API or model error occurred: {str(e)}"
+             )
+
+
+ # --- Gradio Application ---
+
+ async def grade_papers_concurrently(
+     files: List[gr.File], api_key: str, progress=gr.Progress(track_tqdm=True)
+ ) -> tuple[str, str]:
+     """
+     The main asynchronous function that orchestrates the grading process.
+     It's triggered by the Gradio button click.
+     """
+     start_time = time.time()
+
+     if not api_key:
+         raise gr.Error("Google API Key is required.")
+     if not files:
+         raise gr.Error("Please upload at least one Word document.")
+
+     try:
+         grader = GeminiGrader(api_key)
+     except ValueError as e:
+         raise gr.Error(str(e))
+
+     file_paths = [file.name for file in files]
+     total_files = len(file_paths)
+
+     # Use a ThreadPoolExecutor to run synchronous tasks concurrently
+     with ThreadPoolExecutor(max_workers=10) as executor:
+         # Create a future for each file processing task
+         loop = asyncio.get_running_loop()
+         tasks = [
+             loop.run_in_executor(
+                 executor,
+                 process_single_file,
+                 file_path,
+                 grader
+             )
+             for file_path in file_paths
+         ]
+
+         results = []
+         # Process results as they are completed; progress takes (completed, total)
+         for i, future in enumerate(asyncio.as_completed(tasks)):
+             result = await future
+             results.append(result)
+             progress((i + 1, total_files), desc=f"Graded paper {i+1}/{total_files}")
+
+     # --- Format the final output ---
+     successful_grades = [res for res in results if res.success]
+     failed_grades = [res for res in results if not res.success]
+
+     output_markdown = ""
+     for result in successful_grades:
+         output_markdown += f"### ✅ Grade for: **{result.file_name}**\n"
+         output_markdown += f"**Final Grade:** {result.grade}/100\n\n"
+
+         # Format point deductions
+         deductions_str = ""
+         for category, points in result.deductions.items():
+             if points > 0:
+                 deductions_str += f"- **{category}:** Lost {points} points. *Reason: {result.feedback.get(category, 'N/A')}*\n"
+         if not deductions_str:
+             deductions_str = "Excellent work! No points were deducted.\n"
+
+         output_markdown += "**Point Deductions Breakdown:**\n" + deductions_str + "\n"
+         output_markdown += f"**Summary:** {result.summary}\n"
+         output_markdown += "---\n"
+
+     if failed_grades:
+         output_markdown += "### ❌ Failed Papers\n"
+         for result in failed_grades:
+             output_markdown += f"- **File:** {result.file_name}\n"
+             output_markdown += f"  - **Error:** {result.error_message}\n"
+         output_markdown += "---\n"
+
+     end_time = time.time()
+     runtime = f"Total runtime: {end_time - start_time:.2f} seconds."
+
+     status = (
+         f"Grading complete. {len(successful_grades)} papers graded successfully, "
+         f"{len(failed_grades)} failed."
+     )
+
+     return output_markdown, f"{status}\n{runtime}"
+
+
+ def process_single_file(file_path: str, grader: GeminiGrader) -> GradingResult:
+     """
+     Synchronous wrapper function to parse and grade one file.
+     This function is what runs in each thread of the ThreadPoolExecutor.
+     """
+     file_name = os.path.basename(file_path)
+     try:
+         essay_text = EssayParser.parse_docx(file_path)
+         if not essay_text.strip():
+             return GradingResult(
+                 file_name=file_name,
+                 success=False,
+                 error_message="The document is empty or contains no readable text."
+             )
+         return grader.grade_essay(essay_text, file_name)
+     except Exception as e:
+         return GradingResult(file_name=file_name, success=False, error_message=str(e))
+
+
+ # --- Build the Gradio Interface ---
+
+ with gr.Blocks(theme=gr.themes.Soft(), title="Nursing Essay Grader") as demo:
+     gr.Markdown(
+         """
+         # 📝 Gemini-Powered Nursing Essay Grader
+         Upload one or more student essays in Word format (`.docx`) to have them graded by AI.
+         1. Enter your Google API Key (enabling the Gemini API in your Google Cloud project is required).
+         2. Upload the `.docx` files.
+         3. Click "Grade All Papers". The results will appear below.
+         """
+     )
+
+     with gr.Row():
+         api_key_input = gr.Textbox(
+             label="Google API Key",
+             placeholder="Enter your Google API Key here",
+             type="password",
+             scale=1
+         )
+
+     file_uploads = gr.File(
+         label="Upload Word Document Essays",
+         file_count="multiple",
+         file_types=[".docx"],
+         type="filepath"  # Use filepath for easier handling
+     )
+
+     grade_button = gr.Button("🚀 Grade All Papers", variant="primary")
+
+     gr.Markdown("---")
+     gr.Markdown("## 📊 Grading Results")
+
+     results_output = gr.Markdown(label="Formatted Grades")
+
+     status_output = gr.Textbox(
+         label="Runtime Status",
+         lines=2,
+         interactive=False
+     )
+
+     grade_button.click(
+         fn=grade_papers_concurrently,
+         inputs=[file_uploads, api_key_input],
+         outputs=[results_output, status_output]
+     )
+
+ if __name__ == "__main__":
+     demo.launch(debug=True)
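
Review note on `grade_essay`: the method cleans the model reply by deleting Markdown fence markers before calling `json.loads`, then checks only that the four top-level keys exist. A slightly more defensive variant is sketched below. It keeps only the outermost `{...}` span and also checks value types. The helper name `extract_grading_json` and the specific type checks are assumptions made for illustration; they are not part of this commit.

```python
import json

REQUIRED_KEYS = ("finalGrade", "pointDeductions", "feedback", "summary")


def extract_grading_json(raw_text: str) -> dict:
    """Hypothetical helper: parse the grader's JSON object out of a raw model reply."""
    # Keep only the outermost {...} span so code fences or stray prose are ignored.
    start = raw_text.find("{")
    end = raw_text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No JSON object found in the model response.")
    data = json.loads(raw_text[start:end + 1])

    # Same key check as grade_essay, but report which keys are missing.
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise KeyError(f"Missing required keys: {missing}")

    # Assumed extra validation: types and range that the prompt asks the model for.
    grade = data["finalGrade"]
    if not isinstance(grade, int) or not 0 <= grade <= 100:
        raise ValueError("finalGrade must be an integer from 0 to 100.")
    if not isinstance(data["pointDeductions"], dict) or not isinstance(data["feedback"], dict):
        raise ValueError("pointDeductions and feedback must be JSON objects.")
    return data


# Example reply wrapped in a Markdown code fence, as models sometimes return it.
fence = "`" * 3
reply = (
    fence + "json\n"
    '{"finalGrade": 88, "pointDeductions": {"Clarity & Mechanics": 2}, '
    '"feedback": {"Clarity & Mechanics": "Minor errors."}, "summary": "Solid paper."}\n'
    + fence
)
parsed = extract_grading_json(reply)
print(parsed["finalGrade"])
```

Because `json.JSONDecodeError` subclasses `ValueError`, a caller could catch `ValueError` and `KeyError` together and turn either into a failed `GradingResult`.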
readme.md CHANGED
@@ -1,13 +1,12 @@
  ---
- title: My First Space
- emoji: 🚀
- colorFrom: blue
- colorTo: red
+ title: Arbiterscripti
+ emoji: 📉
+ colorFrom: yellow
+ colorTo: purple
  sdk: gradio
- sdk_version: 4.0.0
+ sdk_version: 5.35.0
  app_file: app.py
  pinned: false
  ---
 
- # My First Hugging Face Space
- This is a simple Gradio app that greets users.
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
requirements.txt ADDED
@@ -0,0 +1,6 @@
+ gradio
+ google-generativeai==0.5.*
+ python-docx
+ asyncio
+ json
+