aurigin/Hackathon_Truth_Vs_Machine

Nicolas Wagner committed on Nov 26, 2025

Commit dcb04e7 (1 parent: a2556f7)

textual update

Files changed (4)

src/about.py +24 -70
src/populate.py +15 -2
src/submission/submit_csv.py +3 -11
src/submission/validate_csv.py +5 -5

src/about.py CHANGED

@@ -1,7 +1,7 @@
 TITLE = """<h1 id="space-title">Truth vs. Machine Hackathon Leaderboard</h1>"""
 INTRODUCTION_TEXT = """
-Welcome to the Truth vs. Machine Hackathon Leaderboard! This leaderboard tracks teams competing in an audio deepfake detection challenge. Teams submit predictions on audio samples to determine whether they are real or fake, and the leaderboard displays the best performance metrics for each team.
 """
 LLM_BENCHMARKS_TEXT = """
@@ -9,12 +9,12 @@ LLM_BENCHMARKS_TEXT = """
 ### 1. Register Your Team
 - Go to the "Register Team" tab
-- Enter your team name and number of teammates
-- **Save your token immediately** - you'll need it to submit predictions
-- You won't be able to see your token again after registration
-### 2. Explore the Data
-Check out this [Exploratory Notebook](https://colab.research.google.com/drive/16O_P901xLdjkka8Xi4CfysF6h8l8q28H?usp=sharing) to understand the dataset and get started with your analysis.
 ### 3. Prepare Your Predictions
 Create a CSV file with two columns:
@@ -24,80 +24,34 @@ Create a CSV file with two columns:
 Example CSV format:
 ```csv
 id,label
-550e8400-e29b-41d4-a716-446655440000,0.0
-550e8400-e29b-41d4-a716-446655440001,1.0
-550e8400-e29b-41d4-a716-446655440002,0.0
-550e8400-e29b-41d4-a716-446655440003,1.0
 ```
 ### 4. Submit Your Predictions
 - Go to the "Submit Predictions" tab
-- Enter your team token
-- Upload your CSV file
-- Your submission will be automatically evaluated
-### 5. Evaluation Metrics
-Your predictions are evaluated on:
-- **Accuracy**: Percentage of correct predictions
-- **F1 Score**: Harmonic mean of precision and recall
-- **Precision**: True positives / (True positives + False positives)
-- **Recall**: True positives / (True positives + False negatives)
-### 6. Leaderboard Updates
-- Only your **best** scores are displayed on the leaderboard
-- A submission is accepted only if it improves your accuracy or F1 score
-- The leaderboard is sorted by best F1 score (primary metric)
-- If F1 score is tied, earlier submission date is used as a tiebreaker
-- **Rate Limit**: You can submit once every 15 minutes
-## 🏆 Prize Distribution & Evaluation Criteria
 Prizes are awarded based on the **F1 Score** metric:
-- **1st Prize**: Team with the highest F1 score
-- **2nd Prize**: Team with the second highest F1 score
-- **Tiebreaker**: In case of equal F1 scores, the team that submitted their winning score **earlier** will be ranked higher
-The final rankings will be determined at the end of the hackathon based on each team's best F1 score.
-## Important Notes
-- True labels are kept private and not accessible to participants
-- You can submit once every **15 minutes** - plan your submissions carefully
-- Only your best scores count on the leaderboard
-- Make sure your CSV file format is correct before submitting
-- **All IDs from the test set must be present in your submission**
-"""
-EVALUATION_QUEUE_TEXT = """
-## Submission Guidelines
-### CSV File Requirements
-- Must contain exactly two columns: `id` and `label`
-- `id` must be UUID strings matching the test set exactly
-- `label` must be exactly `0.0` (real) or `1.0` (fake)
-- No missing values allowed
-- **All IDs from the test set must be included** in your submission
-- No unknown IDs are allowed (only IDs from the test set)
-### Label Format
-Accepted formats for labels:
-- **Only**: `0.0` (real) or `1.0` (fake)
-- Any other format will be rejected
-### Scoring
-- Submissions are evaluated immediately upon upload
-- Scores are computed using accuracy, F1 score, precision, and recall
-- Only submissions that improve your best accuracy or F1 score are accepted
-- Rejected submissions are logged but don't update the leaderboard
-- **Rate Limit**: Teams can submit once every 15 minutes
-## 🏆 Prize Distribution & Evaluation Criteria
-Prizes are awarded based on the **F1 Score** metric:
-- **1st Prize**: Team with the highest F1 score
-- **2nd Prize**: Team with the second highest F1 score
-- **Tiebreaker**: In case of equal F1 scores, the team that submitted their winning score **earlier** will be ranked higher
-The final rankings will be determined at the end of the hackathon based on each team's best F1 score.
 """

 TITLE = """<h1 id="space-title">Truth vs. Machine Hackathon Leaderboard</h1>"""
 INTRODUCTION_TEXT = """
+Welcome to the Truth vs. Machine Hackathon Leaderboard! This leaderboard tracks teams competing in an audio deepfake detection challenge. Teams submit predictions on audio samples to determine whether they are real or fake, and the leaderboard displays the submission with the best F1 scores for each team.
 """
 LLM_BENCHMARKS_TEXT = """
 ### 1. Register Your Team
 - Go to the "Register Team" tab
+- Enter your team name and the total number of teammates
+- **Save your token** - you'll need it to submit predictions and you won't be able to see your token again after registration
+### 2. Exploratory Notebook
+To get you started quickly, we have prepared an [Exploratory Notebook](https://colab.research.google.com/drive/16O_P901xLdjkka8Xi4CfysF6h8l8q28H?usp=sharing)
+Feel free to use your computer instead of Google Colab to run the notebook
 ### 3. Prepare Your Predictions
 Create a CSV file with two columns:
 Example CSV format:
 ```csv
 id,label
+f7e3a2c1,0.0
+8b1c4d2e,1.0
+7f5b9e8a,0.0
+c2fa163b,1.0
 ```
+- True labels are kept private and not accessible to participants
 ### 4. Submit Your Predictions
 - Go to the "Submit Predictions" tab
+- Enter your team token, upload your CSV file and submit
+- Your submission will be automatically evaluated - There is a **rate limit** of 1 valid submission per 15 minutes per team
+- Only your **best** scores, selected based on F1 score, are displayed on the leaderboard
+## 🏆 Prize Distribution
 Prizes are awarded based on the **F1 Score** metric:
+- 🥇 **1st Prize**: 75 CHF digitec giftcard per team member
+- 🥈 **2nd Prize**: 20 CHF digitec giftcard per team member
+- 🥉 **3rd Prize**: 20 CHF digitec giftcard per team member
+The final rankings will be set at the end of the hackathon, any submissions after the deadline won't count towards the prizes.
+**Tiebreaker**: In case of equal F1 scores, the team that submitted their winning score **earlier** will be ranked higher
+We will also award a **Creative Prize** to the team that submits the most creative solution:
+- 🎨 **Creative Prize**: 20 CHF digitec giftcard per team member
+To select the teams, we sadly do not have the time to evaluate each solution, so we will ask only the 8 teams with the highest F1 scores to present.
 """
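
Reviewer sketch (not part of the commit): the instructions above ask participants for a two-column `id,label` CSV. One way to produce such a file with the standard library might look like the following. The helper name `build_submission_csv` and the sample IDs are hypothetical; the label meaning (0.0 for real, 1.0 for fake) is taken from the removed guideline text.

```python
import csv
import io


def build_submission_csv(predictions: dict[str, float]) -> str:
    """Return CSV text with an `id,label` header and one row per prediction."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "label"])
    for sample_id, label in predictions.items():
        # Labels are written as 0.0 (real) or 1.0 (fake).
        writer.writerow([sample_id, f"{float(label):.1f}"])
    return buffer.getvalue()


# Hypothetical predictions keyed by sample ID.
example = build_submission_csv({"f7e3a2c1": 0, "8b1c4d2e": 1})
print(example)
```

Writing labels with one decimal place keeps the file valid under both the old validator (`0.0`/`1.0` only) and the relaxed one introduced in this commit.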

src/populate.py CHANGED

@@ -19,6 +19,15 @@ def get_leaderboard_df(results_path: str, cols: list) -> pd.DataFrame:
         by=[TeamColumn.best_f1.name, TeamColumn.best_submission_date.name],
         ascending=[False, True],
     )
     df = df[cols].round(decimals=4)
     return df
@@ -59,8 +68,12 @@ def get_submission_queue_df(save_path: str, cols: list) -> list[pd.DataFrame]:
         except Exception:
             continue
-    accepted_list = [s for s in all_submissions if s[SubmissionQueueColumn.status.name] == "ACCEPTED"]
-    rejected_list = [s for s in all_submissions if s[SubmissionQueueColumn.status.name] == "REJECTED"]
     df_accepted = (
         pd.DataFrame.from_records(accepted_list, columns=cols) if accepted_list else pd.DataFrame(columns=cols)

         by=[TeamColumn.best_f1.name, TeamColumn.best_submission_date.name],
         ascending=[False, True],
     )
+    team_name_col = TeamColumn.team_name.name
+    if team_name_col in df.columns and len(df) > 0:
+        medals = ["🥇", "🥈", "🥉"]
+        for idx in range(min(3, len(df))):
+            current_name = str(df.iloc[idx][team_name_col])
+            if not any(current_name.startswith(medal) for medal in medals):
+                df.iloc[idx, df.columns.get_loc(team_name_col)] = f"{medals[idx]} {current_name}"
     df = df[cols].round(decimals=4)
     return df
         except Exception:
             continue
+    accepted_list = [
+        s for s in all_submissions if s[SubmissionQueueColumn.status.name] in ["ACCEPTED", "ACCEPTED, BUT WORST"]
+    ]
+    rejected_list = [
+        s for s in all_submissions if s[SubmissionQueueColumn.status.name] not in ["ACCEPTED", "ACCEPTED, BUT WORST"]
+    ]
     df_accepted = (
         pd.DataFrame.from_records(accepted_list, columns=cols) if accepted_list else pd.DataFrame(columns=cols)
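
Reviewer sketch (not part of the commit): the two changes in this file can be illustrated without pandas. The version below works on plain lists and dicts, so the names `add_medals`, `split_by_status`, and the `"status"` key are assumptions standing in for the DataFrame columns and `SubmissionQueueColumn.status.name` used in the real code.

```python
MEDALS = ["🥇", "🥈", "🥉"]
ACCEPTED_STATUSES = ("ACCEPTED", "ACCEPTED, BUT WORST")


def add_medals(sorted_names: list[str]) -> list[str]:
    """Prefix the first three names of an already sorted list with medals."""
    result = list(sorted_names)
    for idx in range(min(len(MEDALS), len(result))):
        name = result[idx]
        # Same guard as the diff: a name that already starts with any medal is left alone.
        if not any(name.startswith(medal) for medal in MEDALS):
            result[idx] = f"{MEDALS[idx]} {name}"
    return result


def split_by_status(submissions: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split submissions into accepted (both accepted statuses) and everything else."""
    accepted = [s for s in submissions if s["status"] in ACCEPTED_STATUSES]
    rejected = [s for s in submissions if s["status"] not in ACCEPTED_STATUSES]
    return accepted, rejected


ranked = add_medals(["alpha", "beta", "gamma", "delta"])
accepted, rejected = split_by_status(
    [{"status": "ACCEPTED"}, {"status": "ACCEPTED, BUT WORST"}, {"status": "REJECTED"}]
)
```

One consequence of the guard worth noting: a name that already begins with a medal keeps that medal even if it no longer matches the row's rank.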

src/submission/submit_csv.py CHANGED

@@ -87,18 +87,10 @@ def should_update_scores(new_scores: dict, best_scores: dict | None) -> bool:
     if best_scores is None:
         return True
-    new_accuracy = new_scores.get("accuracy", 0.0)
     new_f1 = new_scores.get("f1", 0.0)
-    best_accuracy = best_scores.get("best_accuracy", 0.0)
     best_f1 = best_scores.get("best_f1", 0.0)
-    if new_accuracy > best_accuracy:
-        return True
-    if new_accuracy == best_accuracy and new_f1 > best_f1:
-        return True
-    return False
 def check_rate_limit(team_name: str) -> tuple[bool, str]:
@@ -173,10 +165,10 @@ def submit_csv(token: str, csv_content: str) -> tuple[bool, str]:
         status = "ACCEPTED"
         message = f"Submission accepted! Your scores: Accuracy={scores['accuracy']:.4f}, F1={scores['f1']:.4f}, Precision={scores['precision']:.4f}, Recall={scores['recall']:.4f}, TP={scores['tp']}, FP={scores['fp']}, FN={scores['fn']}, TN={scores['tn']}"
     else:
-        status = "REJECTED"
         best_acc = best_scores.get("best_accuracy", 0.0) if best_scores else 0.0
         best_f1 = best_scores.get("best_f1", 0.0) if best_scores else 0.0
-        message = f"Submission rejected. Your scores (Accuracy={scores['accuracy']:.4f}, F1={scores['f1']:.4f}) did not improve your best scores (Accuracy={best_acc:.4f}, F1={best_f1:.4f})."
     save_submission(team_name, token_hash, csv_content, scores, status)

     if best_scores is None:
         return True
     new_f1 = new_scores.get("f1", 0.0)
     best_f1 = best_scores.get("best_f1", 0.0)
+    return new_f1 > best_f1
 def check_rate_limit(team_name: str) -> tuple[bool, str]:
         status = "ACCEPTED"
         message = f"Submission accepted! Your scores: Accuracy={scores['accuracy']:.4f}, F1={scores['f1']:.4f}, Precision={scores['precision']:.4f}, Recall={scores['recall']:.4f}, TP={scores['tp']}, FP={scores['fp']}, FN={scores['fn']}, TN={scores['tn']}"
     else:
+        status = "ACCEPTED, BUT WORST"
         best_acc = best_scores.get("best_accuracy", 0.0) if best_scores else 0.0
         best_f1 = best_scores.get("best_f1", 0.0) if best_scores else 0.0
+        message = f"Submission accepted but did not improve your best score. Your scores (Accuracy={scores['accuracy']:.4f}, F1={scores['f1']:.4f}) vs. your best scores (Accuracy={best_acc:.4f}, F1={best_f1:.4f})."
     save_submission(team_name, token_hash, csv_content, scores, status)
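
Reviewer sketch (not part of the commit): after this change, `should_update_scores` appears to depend on F1 alone. The standalone copy below reproduces the new-side logic from the hunk above so the behaviour can be checked in isolation; the score dictionaries in the usage lines are made-up values.

```python
from typing import Optional


def should_update_scores(new_scores: dict, best_scores: Optional[dict]) -> bool:
    """Return True when the new submission should replace the team's best scores."""
    if best_scores is None:
        # First submission for the team: always becomes the best.
        return True
    new_f1 = new_scores.get("f1", 0.0)
    best_f1 = best_scores.get("best_f1", 0.0)
    # Strict comparison: an equal F1 does not replace the earlier best.
    return new_f1 > best_f1


first = should_update_scores({"f1": 0.50}, None)
improved = should_update_scores({"f1": 0.80}, {"best_f1": 0.70})
tied = should_update_scores({"f1": 0.70}, {"best_f1": 0.70})
better_accuracy_only = should_update_scores(
    {"accuracy": 0.99, "f1": 0.60}, {"best_accuracy": 0.90, "best_f1": 0.70}
)
```

Because the comparison is strict, a tied score leaves the earlier best in place, which seems consistent with the tiebreaker rule that favours the earlier submission. Accuracy no longer influences the decision.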

src/submission/validate_csv.py CHANGED

@@ -8,13 +8,13 @@ def normalize_label(label: any) -> float | None:
         return None
     if isinstance(label, (int, float)):
-        if label == 0.0 or label == 1.0:
             return float(label)
         return None
     if isinstance(label, str):
         label_stripped = label.strip()
-        if label_stripped in ["0.0", "1.0"]:
             return float(label_stripped)
         return None
@@ -39,14 +39,14 @@ def validate_csv(csv_content: str, true_labels: dict[str, float]) -> tuple[bool,
     if df.empty:
         return False, "CSV is empty", None
-    df["id"] = df["id"].astype(str).str.strip()
     if df["id"].isna().any():
         return False, "id column contains missing values", None
     if df["label"].isna().any():
         return False, "label column contains missing values", None
     normalized_labels = []
     invalid_labels = []
@@ -55,7 +55,7 @@ def validate_csv(csv_content: str, true_labels: dict[str, float]) -> tuple[bool,
         label = normalize_label(row["label"])
         if label is None:
-            invalid_labels.append(f"Row {idx + 1}: invalid label value '{row['label']}' (must be 0.0 or 1.0)")
         else:
             normalized_labels.append(label)

         return None
     if isinstance(label, (int, float)):
+        if label in [0, 1, 0.0, 1.0]:
             return float(label)
         return None
     if isinstance(label, str):
         label_stripped = label.strip()
+        if label_stripped in ["0", "1", "0.0", "1.0"]:
             return float(label_stripped)
         return None
     if df.empty:
         return False, "CSV is empty", None
     if df["id"].isna().any():
         return False, "id column contains missing values", None
     if df["label"].isna().any():
         return False, "label column contains missing values", None
+    df["id"] = df["id"].astype(str).str.strip()
     normalized_labels = []
     invalid_labels = []
         label = normalize_label(row["label"])
         if label is None:
+            invalid_labels.append(f"Row {idx + 1}: invalid label value '{row['label']}' (must be 0, 1, 0.0, or 1.0)")
         else:
             normalized_labels.append(label)
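
Reviewer sketch (not part of the commit): the relaxed label check can be exercised on its own. The copy below follows the new-side lines of `normalize_label`; the opening `None` check and the final fallback `return None` are assumptions, since those lines are not visible in the hunk.

```python
from typing import Any, Optional


def normalize_label(label: Any) -> Optional[float]:
    """Map an accepted label to 0.0 or 1.0, or return None if it is invalid."""
    # Assumption: the elided lines reject missing values; only None is checked here.
    if label is None:
        return None
    if isinstance(label, (int, float)):
        if label in [0, 1, 0.0, 1.0]:
            return float(label)
        return None
    if isinstance(label, str):
        label_stripped = label.strip()
        if label_stripped in ["0", "1", "0.0", "1.0"]:
            return float(label_stripped)
        return None
    # Assumption: any other type is treated as invalid.
    return None


accepted_inputs = [0, 1, 0.0, 1.0, "0", "1", " 0.0 ", "1.0"]
normalized = [normalize_label(value) for value in accepted_inputs]
```

Two details follow from how Python compares values: string matching is exact after stripping, so `"1.00"` is still rejected, and because `bool` is a subclass of `int`, `True` and `False` pass the numeric branch.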