oliviermills committed on
Commit 6f900e2 · verified · 1 Parent(s): 0cc7d15

Training complete - F1: 0.8754

Files changed (4)
  1. README.md +173 -37
  2. config_setfit.json +2 -2
  3. model.safetensors +1 -1
  4. model_head.pkl +1 -1
README.md CHANGED
@@ -1,65 +1,201 @@
  ---
- license: mit
- library_name: setfit
  tags:
  - setfit
  - sentence-transformers
  - text-classification
- - multi-label
- - water-conflict
+ - generated_from_setfit_trainer
+ widget:
+ - text: Gaddafi cuts of water to Libya's capital
+ - text: Grenade blast in water tank leaves 40 families without water in Potrerito,
+ Valle del Cauca, Colombia
+ - text: Silvan Dam construction site attacked
+ - text: in the afternoon, US forces destroy (likely through airstrikes) 2 suspected
+ Houthi patrol boats in an unidentified area in the South Red Sea while Houthi
+ media reported 3 air raids on As Salif coastal district (coded to As Salif Port)
+ (Al Hudaydah). Casaulties unknown.
+ - text: a group of Fulani men clashed with and killed a suspected Fulani bull thief
+ in the Goure Kele district of Sakabansi (Nikki, Borgou). He was found dead in
+ his house after being struck with a machete during the clash by one of the members
+ of the group, who then fled.
  metrics:
- - f1
  - accuracy
- language:
- - en
+ pipeline_tag: text-classification
+ library_name: setfit
+ inference: false
+ base_model: BAAI/bge-small-en-v1.5
  ---
 
- # Water Conflict Multi-Label Classifier
+ # SetFit with BAAI/bge-small-en-v1.5
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) as the Sentence Transformer embedding model. A OneVsRestClassifier instance is used for classification.
+
+ The model has been trained using an efficient few-shot learning technique that involves:
 
- This model classifies news headlines about water-related conflicts into three categories:
- - **Trigger**: Water resource as a conflict trigger
- - **Casualty**: Water infrastructure as a casualty/target
- - **Weapon**: Water used as a weapon/tool
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
 
  ## Model Details
 
- - **Base Model**: BAAI/bge-small-en-v1.5
- - **Architecture**: SetFit with One-vs-Rest multi-label strategy
- - **Training Approach**: Few-shot learning optimized (SetFit reaches peak performance with small samples)
- - **Training Data**: 510 examples (sampled from ~5,000 labeled headlines)
- - **Performance**: F1 (micro) = 0.8319, Accuracy = 0.8333
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)
+ - **Classification head:** a OneVsRestClassifier instance
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Classes:** 3 classes
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ## Uses
 
- ## Usage
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
 
  ```python
  from setfit import SetFitModel
 
+ # Download from the 🤗 Hub
  model = SetFitModel.from_pretrained("baobabtech/water-conflict-classifier")
+ # Run inference
+ preds = model("Silvan Dam construction site attacked")
+ ```
 
- headlines = [
- "Taliban attack workers at the Kajaki Dam in Afghanistan",
- "New water treatment plant opens in California"
- ]
+ <!--
+ ### Downstream Use
 
- predictions = model.predict(headlines)
- print(predictions)
- ```
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
 
- ## Training Metrics
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
 
- - Accuracy (exact match): 0.8333
- - F1 (micro): 0.8319
- - F1 (macro): 0.6755
- - Hamming Loss: 0.0704
+ <!--
+ ## Bias, Risks and Limitations
 
- ## Label Distribution
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
 
- | Label | F1 Score | Support |
- |-------|----------|---------|
- | Trigger | 0.8837 | 21 |
- | Casualty | 0.8571 | 30 |
- | Weapon | 0.2857 | 5 |
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:--------|:----|
+ | Word count | 4 | 25.9533 | 236 |
+
+ ### Training Hyperparameters
+ - batch_size: (32, 32)
+ - num_epochs: (1, 1)
+ - max_steps: -1
+ - sampling_strategy: undersampling
+ - body_learning_rate: (2e-05, 1e-05)
+ - head_learning_rate: 0.01
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - l2_weight: 0.01
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: True
+
+ ### Training Results
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:------:|:----:|:-------------:|:---------------:|
+ | 0.0007 | 1 | 0.2168 | - |
+ | 0.0339 | 50 | 0.2108 | - |
+ | 0.0679 | 100 | 0.1126 | - |
+ | 0.1018 | 150 | 0.0719 | - |
+ | 0.1358 | 200 | 0.0616 | - |
+ | 0.1697 | 250 | 0.0518 | - |
+ | 0.2037 | 300 | 0.0454 | - |
+ | 0.2376 | 350 | 0.0393 | - |
+ | 0.2716 | 400 | 0.0324 | - |
+ | 0.3055 | 450 | 0.0265 | - |
+ | 0.3394 | 500 | 0.0279 | - |
+ | 0.3734 | 550 | 0.0231 | - |
+ | 0.4073 | 600 | 0.0231 | - |
+ | 0.4413 | 650 | 0.0228 | - |
+ | 0.4752 | 700 | 0.0272 | - |
+ | 0.5092 | 750 | 0.0216 | - |
+ | 0.5431 | 800 | 0.0186 | - |
+ | 0.5771 | 850 | 0.0195 | - |
+ | 0.6110 | 900 | 0.0174 | - |
+ | 0.6449 | 950 | 0.0163 | - |
+ | 0.6789 | 1000 | 0.0174 | - |
+ | 0.7128 | 1050 | 0.0148 | - |
+ | 0.7468 | 1100 | 0.0167 | - |
+ | 0.7807 | 1150 | 0.0158 | - |
+ | 0.8147 | 1200 | 0.0146 | - |
+ | 0.8486 | 1250 | 0.0146 | - |
+ | 0.8826 | 1300 | 0.0145 | - |
+ | 0.9165 | 1350 | 0.0138 | - |
+ | 0.9504 | 1400 | 0.0142 | - |
+ | 0.9844 | 1450 | 0.013 | - |
+ | 1.0 | 1473 | - | 0.0577 |
+
+ ### Framework Versions
+ - Python: 3.12.12
+ - SetFit: 1.1.3
+ - Sentence Transformers: 5.1.2
+ - Transformers: 4.57.3
+ - PyTorch: 2.9.1+cu128
+ - Datasets: 4.4.1
+ - Tokenizers: 0.22.1
 
  ## Citation
 
- Based on ACLED (Armed Conflict Location & Event Data Project) data.
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+ doi = {10.48550/ARXIV.2209.11055},
+ url = {https://arxiv.org/abs/2209.11055},
+ author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+ keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+ title = {Efficient Few-Shot Learning Without Prompts},
+ publisher = {arXiv},
+ year = {2022},
+ copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
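Reviewer note on the README change: the new inference snippet returns raw predictions and the prose no longer names the three categories. The sketch below shows one way a caller might turn a prediction back into label names. It assumes the OneVsRestClassifier head yields one 0/1 flag per label, ordered as in `config_setfit.json` (Trigger, Casualty, Weapon). `decode_labels` and `classify` are hypothetical helpers and not part of SetFit.

```python
from typing import Iterable, List, Sequence

# Label order copied from config_setfit.json in this commit.
LABELS = ["Trigger", "Casualty", "Weapon"]


def decode_labels(row: Iterable[int], labels: Sequence[str] = LABELS) -> List[str]:
    """Map one 0/1 indicator row to label names (hypothetical helper)."""
    flags = [int(v) for v in row]
    if len(flags) != len(labels):
        raise ValueError(f"expected {len(labels)} flags, got {len(flags)}")
    return [name for name, flag in zip(labels, flags) if flag]


def classify(headlines: List[str]) -> List[List[str]]:
    """Download the model and label each headline.

    Needs network access and `pip install setfit`; not called at import time.
    """
    from setfit import SetFitModel  # lazy import so decode_labels works offline

    model = SetFitModel.from_pretrained("baobabtech/water-conflict-classifier")
    preds = model.predict(headlines)  # assumed: one indicator row per headline
    return [decode_labels(row) for row in preds]
```

Calling `classify(["Silvan Dam construction site attacked"])` would download the model from the Hub. An all-zero row decodes to an empty list, meaning none of the three categories was assigned.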
config_setfit.json CHANGED
@@ -1,8 +1,8 @@
  {
- "normalize_embeddings": false,
  "labels": [
  "Trigger",
  "Casualty",
  "Weapon"
- ]
+ ],
+ "normalize_embeddings": false
  }
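Reviewer note on the config change: the `config_setfit.json` diff appears to reorder keys without changing any value. A minimal check, with both versions transcribed by hand from the diff above (whitespace is an assumption), parses each one and compares the results:

```python
import json

# Old and new file contents, transcribed from the diff.
OLD_CONFIG = """
{
  "normalize_embeddings": false,
  "labels": ["Trigger", "Casualty", "Weapon"]
}
"""

NEW_CONFIG = """
{
  "labels": ["Trigger", "Casualty", "Weapon"],
  "normalize_embeddings": false
}
"""

old = json.loads(OLD_CONFIG)
new = json.loads(NEW_CONFIG)

# Dict equality ignores key order; list equality still checks label order.
same_content = old == new
key_order_changed = list(old) != list(new)
print(same_content, key_order_changed)
```

Both flags are true here, so the label order that downstream decoding relies on is unchanged by this commit.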
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:95c79c45ab3f36cf71006e48a80c98a9711d197508e41f5ee17dfb35fe8c5757
+ oid sha256:810e83a30f42979ba8a25d2e797843dd802456fc79565ebc7fec264d993a23b7
  size 133462128
model_head.pkl CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:732b2b4465f09f482ba088de92615478585bd94873cbdeb65e2cc00e65cef30f
+ oid sha256:bbf961500280819b966a72f6006acf90bb4de6ba9ea63df1282ebacab1309ae0
  size 11236
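Reviewer note on the two binary files: `model.safetensors` and `model_head.pkl` are stored as Git LFS pointers, so the diff only shows a new `oid` with an unchanged `size`. The sketch below shows how a downloaded file could be checked against such a pointer. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names. It relies on the pointer's `oid` being the SHA-256 of the file contents and `size` being its length in bytes.

```python
import hashlib
from typing import Dict, Tuple

# New pointer for model_head.pkl, copied from the diff above.
NEW_HEAD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:bbf961500280819b966a72f6006acf90bb4de6ba9ea63df1282ebacab1309ae0
size 11236
"""


def parse_lfs_pointer(text: str) -> Tuple[str, int]:
    """Return (sha256 hex digest, size in bytes) from a Git LFS pointer file."""
    fields: Dict[str, str] = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo!r}")
    return digest, int(fields["size"])


def matches_pointer(data: bytes, pointer_text: str) -> bool:
    """Check that raw file bytes have the size and hash the pointer records."""
    digest, size = parse_lfs_pointer(pointer_text)
    return len(data) == size and hashlib.sha256(data).hexdigest() == digest
```

For example, `matches_pointer(open("model_head.pkl", "rb").read(), NEW_HEAD_POINTER)` would confirm that a local copy is the head uploaded in this commit and not the previous one.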