---
language: en
datasets: ag_news
tags:
- text-classification
- topic-classification
- ag-news
- distilbert
- transformers
- pytorch
license: apache-2.0
model-index:
- name: DistilBERT AG News Classifier
  results:
  - task:
      name: Topic Classification
      type: text-classification
    dataset:
      name: AG News
      type: ag_news
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.81
---

# 📰 DistilBERT Fine-Tuned on AG News with and without Label Smoothing

This repository provides two fine-tuned [DistilBERT](https://huggingface.co/distilbert-base-uncased) models for **topic classification** on the [AG News](https://huggingface.co/datasets/ag_news) dataset:

- ✅ `model_no_smoothing`: fine-tuned **without label smoothing**
- 🧪 `model_label_smoothing`: fine-tuned **with label smoothing** (`smoothing=0.1`)

Both models use the same tokenizer (`distilbert-base-uncased`) and were trained using PyTorch and the Hugging Face `Trainer`.

---
37
+
38
+ ## 🧠 Model Details
39
+
40
+ | Model Name | Label Smoothing | Validation Loss | Epochs | Learning Rate |
41
+ |------------------------|-----------------|------------------|--------|----------------|
42
+ | `model_no_smoothing` | ❌ No | 0.1792 | 1 | 2e-5 |
43
+ | `model_label_smoothing`| βœ… Yes (0.1) | 0.5413 | 1 | 2e-5 |
44
+
45
+ - Base model: `distilbert-base-uncased`
46
+ - Task: 4-class topic classification
47
+ - Dataset: AG News (train: 120k, test: 7.6k)
48
+
49
+ ---
50
+

## 📦 Repository Structure

```
/
├── model_no_smoothing/      # Model A - no smoothing
├── model_label_smoothing/   # Model B - label smoothing
├── tokenizer/               # Tokenizer files (shared)
└── README.md
```

---

## 🧪 How to Use

### Load Model A (No Smoothing)

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

repo_id = "Koushim/distilbert-agnews"
# The models and the shared tokenizer live in subfolders of this repo,
# so pass `subfolder` to from_pretrained rather than appending it to the repo id.
tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="tokenizer")
model = AutoModelForSequenceClassification.from_pretrained(repo_id, subfolder="model_no_smoothing")

inputs = tokenizer("Breaking news in the tech world!", return_tensors="pt")
outputs = model(**inputs)
pred = outputs.logits.argmax(dim=1).item()
```

### Load Model B (Label Smoothing)

```python
tokenizer = AutoTokenizer.from_pretrained("Koushim/distilbert-agnews", subfolder="tokenizer")
model = AutoModelForSequenceClassification.from_pretrained("Koushim/distilbert-agnews", subfolder="model_label_smoothing")
```

---

## 🏷️ Class Labels

0. World
1. Sports
2. Business
3. Sci/Tech

---
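The class ids above map directly onto the model's four output logits. A minimal sketch of turning a prediction into a topic name, using a hypothetical `predicted_topic` helper (not part of this repo):

```python
# Hypothetical helper: map AG News class ids (as listed in this card) to topic names.
ID2LABEL = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}

def predicted_topic(logits):
    """Return the topic name for the highest-scoring class."""
    pred = max(range(len(logits)), key=lambda i: logits[i])
    return ID2LABEL[pred]

# Example: a logits row where class 1 scores highest
print(predicted_topic([0.2, 3.1, -0.5, 1.0]))  # Sports
```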

## ⚙️ Training Configuration

* Framework: PyTorch + 🤗 Transformers
* Optimizer: AdamW
* Batch size: 16 (train/eval)
* Epochs: 1
* Learning rate: 2e-5
* Max sequence length: 256
* Loss: cross-entropy (custom implementation for label smoothing)

---
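For intuition only (this is not the training code from this repo): one common formulation of label smoothing keeps `1 - smoothing` probability mass on the true class and spreads the rest uniformly over the other classes before taking cross-entropy, which is what makes the smoothed model's validation loss larger even at similar accuracy:

```python
import math

def softmax(logits):
    # Numerically stable softmax over a single row of logits
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_smoothed_ce(logits, target, smoothing=0.1):
    """Cross-entropy against a smoothed target distribution:
    the true class keeps 1 - smoothing of the probability mass,
    and the remainder is split uniformly over the other classes."""
    n = len(logits)
    probs = softmax(logits)
    target_dist = [smoothing / (n - 1)] * n
    target_dist[target] = 1.0 - smoothing
    return -sum(t * math.log(p) for t, p in zip(target_dist, probs))
```

With `smoothing=0.0` this reduces to ordinary cross-entropy, and for a confidently correct prediction the smoothed loss is strictly larger than the unsmoothed one.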

## 📄 License

Apache 2.0

---

## ✍️ Author

* Hugging Face: [Koushim](https://huggingface.co/Koushim)
* Trained with `transformers.Trainer`