Space: UIIAmerica/MedVidBench-Leaderboard

MedGRPO Team committed 11 days ago

Commit 6d8dbb2

1 Parent(s): c8f4cad

Fix CVS_acc to use raw accuracy instead of component_balanced_accuracy

- Change extraction to match 'accuracy:' but exclude 'component_balanced_accuracy'
- Now extracts 0.9136 (raw accuracy) instead of 0.8956 (component_balanced)
- Matches expected table value of 0.914
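
The effect of the changed condition can be reproduced with a minimal sketch. The two sample lines below are an assumed log format (only the values 0.9136 and 0.8956 come from the commit message), and `extract_old` / `extract_new` are hypothetical helpers that isolate the condition rather than the actual `parse_evaluation_output` function:

```python
from typing import Dict, List

# Assumed evaluator output for the cvs_assessment task. The real log format
# is not shown in this commit; only the two values are taken from it.
SAMPLE_LINES: List[str] = [
    "accuracy: 0.9136",
    "component_balanced_accuracy: 0.8956",
]


def extract_old(lines: List[str]) -> Dict[str, float]:
    """Old condition: any line containing 'accuracy' (case-insensitive)."""
    metrics: Dict[str, float] = {}
    for line in lines:
        if "accuracy" in line.lower():
            # A later matching line overwrites an earlier one.
            metrics["cvs_acc"] = float(line.split(":")[-1].strip())
    return metrics


def extract_new(lines: List[str]) -> Dict[str, float]:
    """New condition: 'accuracy:' present, 'component_balanced' absent."""
    metrics: Dict[str, float] = {}
    for line in lines:
        if "accuracy:" in line and "component_balanced" not in line:
            metrics["cvs_acc"] = float(line.split(":")[-1].strip())
    return metrics


print(extract_old(SAMPLE_LINES))  # {'cvs_acc': 0.8956}
print(extract_new(SAMPLE_LINES))  # {'cvs_acc': 0.9136}
```

Under the assumed line order, the old check matches both lines and the balanced value wins because it is read last. Note that the new check is also case-sensitive, unlike the old `line.lower()` test, so a line such as `Accuracy: 0.9136` would no longer match.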

Files changed (1)

app.py +2 -2

@@ -725,8 +725,8 @@ def parse_evaluation_output(output: str) -> Dict[str, float]:
                 except:
                     pass
-            # CVS Assessment: Extract accuracy
-            elif current_task == "cvs_assessment" and "accuracy" in line.lower():
+            # CVS Assessment: Extract accuracy (not component_balanced_accuracy)
+            elif current_task == "cvs_assessment" and "accuracy:" in line and "component_balanced" not in line:
                 try:
                     value = float(line.split(":")[-1].strip())
                     metrics["cvs_acc"] = value
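
A stricter alternative, which is not part of this commit, would anchor the match on the metric key instead of excluding one known prefix. The sketch below assumes the key appears alone at the start of the line (optionally indented) and is followed by a plain decimal number; `parse_raw_accuracy` is a hypothetical helper name:

```python
import re
from typing import Optional

# Matches lines whose key is exactly "accuracy", e.g. "accuracy: 0.9136".
# Keys that merely end in "accuracy" (such as component_balanced_accuracy)
# fail the match because the pattern is anchored at the start of the line.
# IGNORECASE keeps the case-insensitive behavior of the pre-commit check.
_ACCURACY_RE = re.compile(
    r"^\s*accuracy\s*:\s*([0-9]*\.?[0-9]+)\s*$",
    re.IGNORECASE,
)


def parse_raw_accuracy(line: str) -> Optional[float]:
    """Return the raw accuracy value from a line, or None if it does not match."""
    match = _ACCURACY_RE.match(line)
    if match is None:
        return None
    return float(match.group(1))


print(parse_raw_accuracy("accuracy: 0.9136"))
print(parse_raw_accuracy("component_balanced_accuracy: 0.8956"))
```

Because the pattern only accepts a numeric value, the function returns `None` for unparsable lines and needs no bare `except:` around the `float` conversion. The trade-off is that a line with extra text before the key would be skipped, so this only fits if the assumed log format holds.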