kevinkyi/Homework2_Finetuning

@@ -1,98 +1,71 @@
 ---
 library_name: transformers
-pipeline_tag: text-classification
-license: mit
 tags:
-  - distilbert
-  - sentiment
-  - football
-  - fine-tuning
-model_name: DistilBERT Football Sentiment (Positive vs Negative)
-language:
-  - en
 ---
-# DistilBERT Football Sentiment — Positive vs Negative
-## Purpose
-Fine-tune a compact transformer (DistilBERT) to classify short football-related comments as **positive (1)** or **negative (0)**. This supports a course assignment on text modeling and evaluation.
-## Dataset
-- **Source:** `james-kramer/football_news` on Hugging Face.
-- **Schema:** `text` (string), `label` (0/1).
-- **Task:** Binary sentiment classification (`0=negative`, `1=positive`).
-- **Splits:** Stratified **80/10/10** (train/val/test) created in this notebook.
-- **Cleaning:** Strip text, drop empty/NA rows.
-## Preprocessing
-- **Tokenizer:** `distilbert-base-uncased` (uncased), `max_length=256`, truncation.
-- **Label mapping:** `{0: "negative", 1: "positive"}`.
-## Training Setup
-- **Base model:** `distilbert-base-uncased`
-- **Epochs:** 5
-- **Batch size:** 16
-- **Learning rate:** 3e-05
-- **Weight decay:** 0.01
-- **Warmup ratio:** 0.1
-- **Early stopping:** patience = 2 (monitor F1 on validation)
-- **Seed:** 42
-- **Hardware:** Google Colab (GPU)
-## Metrics (Held-out Test)
-```json
-{
-  "eval_loss": 0.0029852271545678377,
-  "eval_accuracy": 1.0,
-  "eval_precision": 1.0,
-  "eval_recall": 1.0,
-  "eval_f1": 1.0,
-  "eval_runtime": 0.3123,
-  "eval_samples_per_second": 352.273,
-  "eval_steps_per_second": 22.417,
-  "epoch": 4.0
-}
-```
-## Confusion Matrix & Errors
-The Colab notebook includes a confusion matrix for validation and test, plus a short error analysis with example misclassifications and hypotheses (e.g., injury news phrased neutrally but labeled negative).
-|           | Pred 0 | Pred 1 |
-|-----------|-------:|-------:|
-| **True 0**|   55   |   0    |
-| **True 1**|   0    |   55   |
-## Brief Error Analysis (Concrete Examples & Hypotheses)
-Below are several misclassified examples with likely causes and fixes:
-1. **Text:** "<paste misclassified sentence #1>"
-   **True:** 0 (negative) • **Pred:** 1 (positive)
-   **Why it failed (hypothesis):** "<e.g., neutral phrasing with positive words outweighed injury cue>"
-   **Potential fix:** "<e.g., add more injury/neutral-negative examples; reweight class; augment with negation patterns>"
-2. **Text:** "<paste misclassified sentence #2>"
-   **True:** 1 (positive) • **Pred:** 0 (negative)
-   **Why it failed (hypothesis):** "<e.g., sarcasm or mixed sentiment>"
-   **Potential fix:** "<e.g., include sarcastic examples; leverage larger model or polarity lexicon features>"
-3. **Text:** "<paste misclassified sentence #3>"
-   **True:** <0/1> • **Pred:** <1/0>
-   **Why it failed (hypothesis):** "<e.g., domain shift, team/league slang>"
-   **Potential fix:** "<e.g., add domain-specific samples; modest LR warmup or longer training>"
-## Limitations & Ethics
-- Dataset size and labeling style can lead to unstable metrics; neutral/ambiguous tone is hard.
-- Sports injury and team-management news may bias wording and labels.
-- For coursework only; not for production or sensitive decisions.
-## Reproducibility
-- Python: 3.12
-- Transformers: >=4.41
-- Datasets: >=2.19
-- Seed: 42
-## License
-- Code & weights: MIT (adjust per course guidelines)
-- Dataset: see the original dataset's license/terms
-## AI Assistance Disclosure
-- GenAI tools assisted with notebook structure and documentation; modeling choices and evaluation were implemented and verified by the author.
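
The held-out test metrics in the removed card can be recomputed from its confusion matrix. The following is a minimal sketch in plain Python, assuming rows are true labels, columns are predictions, and label 1 is the positive class. The helper name `binary_metrics` and the one-false-positive counts in the second call are illustrative assumptions, not values taken from the notebook.

```python
def binary_metrics(tn, fp, fn, tp):
    """Return accuracy, precision, recall and F1 for the positive class (label 1)."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Counts read from the confusion matrix in the card: 55 true negatives, 55 true positives.
test_metrics = binary_metrics(tn=55, fp=0, fn=0, tp=55)

# Hypothetical counts: a single false positive on a balanced 110-example split.
one_fp_metrics = binary_metrics(tn=54, fp=1, fn=0, tp=55)

for name, metrics in [("test", test_metrics), ("one false positive", one_fp_metrics)]:
    print(name, {key: round(value, 4) for key, value in metrics.items()})
```

The first call reproduces the reported 1.0 scores. The second rounds to accuracy 0.9909, precision 0.9821, recall 1.0 and F1 0.9910, which is consistent with (though not proof of) the first-epoch row in the training results table coming from one false positive on a 110-example validation split.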

 ---
 library_name: transformers
+license: apache-2.0
+base_model: distilbert-base-uncased
 tags:
+- generated_from_trainer
+metrics:
+- accuracy
+- precision
+- recall
+- f1
+model-index:
+- name: Homework2_Finetuning
+  results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# Homework2_Finetuning
+This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the `james-kramer/football_news` dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.0030
+- Accuracy: 1.0
+- Precision: 1.0
+- Recall: 1.0
+- F1: 1.0
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 16
+- eval_batch_size: 16
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 5
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1     |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
+| 0.1158        | 1.0   | 55   | 0.0315          | 0.9909   | 0.9821    | 1.0    | 0.9910 |
+| 0.0043        | 2.0   | 110  | 0.0083          | 1.0      | 1.0       | 1.0    | 1.0    |
+| 0.002         | 3.0   | 165  | 0.0017          | 1.0      | 1.0       | 1.0    | 1.0    |
+| 0.0014        | 4.0   | 220  | 0.0012          | 1.0      | 1.0       | 1.0    | 1.0    |
+### Framework versions
+- Transformers 4.45.2
+- Pytorch 2.8.0+cu126
+- Datasets 2.21.0
+- Tokenizers 0.20.3
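
The hyperparameters listed in the new card (`learning_rate: 3e-05`, `lr_scheduler_type: linear`, `lr_scheduler_warmup_ratio: 0.1`, `num_epochs: 5`) together with the 55 steps per epoch shown in the training results table imply a planned schedule of 275 optimizer steps. The sketch below illustrates how a linear warmup and decay schedule of that shape behaves. It is a plain-Python approximation, not the Trainer's own code: the function name `linear_warmup_lr` is hypothetical, and rounding the warmup length up with `math.ceil` is an assumption about how the ratio is converted to steps.

```python
import math


def linear_warmup_lr(step, base_lr=3e-5, total_steps=275, warmup_ratio=0.1):
    """Learning rate at a given optimizer step for linear warmup then linear decay.

    total_steps=275 assumes 5 epochs x 55 steps per epoch, as read from the card.
    """
    # Assumption: the warmup ratio is converted to a step count by rounding up.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr over the warmup steps.
        scale = step / max(1, warmup_steps)
    else:
        # Decay linearly from base_lr down to 0 at the final planned step.
        remaining = total_steps - step
        scale = max(0.0, remaining / max(1, total_steps - warmup_steps))
    return base_lr * scale


# Learning rate at the start, mid-warmup, peak, the last logged step, and the planned end.
for step in (0, 14, 28, 220, 275):
    print(step, linear_warmup_lr(step))
```

Because the results table ends at step 220 (epoch 4.0), the run appears to have stopped before the schedule reached zero, so the final learning rate actually used would have been above zero under these assumptions.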