oliviermills committed on
Commit fadb552 · verified · 1 Parent(s): 06d1a90

Training complete - F1: 0.8688

Files changed (3)
  1. README.md +314 -232
  2. model.safetensors +1 -1
  3. model_head.pkl +1 -1
README.md CHANGED
@@ -1,268 +1,350 @@
1
  ---
 
2
  license: cc-by-nc-4.0
3
- library_name: setfit
4
  tags:
5
  - setfit
6
  - sentence-transformers
7
  - text-classification
8
- - multi-label
9
- - water-conflict
10
  metrics:
11
- - f1
12
  - accuracy
13
- language:
14
- - en
15
- widget:
16
- - text: "Military attack workers at the Kajaki Dam in Afghanistan"
17
- - text: "Violent protests erupt over dam construction in Sudan"
18
- - text: "New water treatment plant opens in California"
19
- - text: "Armed groups cut off water supply to villages in Syria"
20
- - text: "Government announces new irrigation subsidies"
21
  ---
22
 
23
- # Water Conflict Multi-Label Classifier
24
 
25
- ## 🔬 Experimental Research
26
 
27
- > This experimental research draws on Pacific Institute's [Water Conflict Chronology](https://www.worldwater.org/water-conflict/), which tracks water-related conflicts spanning over 4,500 years of human history. The work is conducted independently and is not affiliated with Pacific Institute.
28
 
29
- This model is designed to assist researchers in classifying water-related conflict events at scale using tiny/small models that can classify 100s of headlines per second.
 
30
 
31
- The Pacific Institute maintains the world's most comprehensive open-source record of water-related conflicts, documenting over 2,700 events across 4,500 years of history. This is not a commercial product and is not intended for commercial use.
32
 
33
- ## 📋 Model Description
34
 
35
- This SetFit-based model classifies news headlines about water-related conflicts into three categories:
36
 
37
- - **Trigger**: Water resource as a conflict trigger
38
- - **Casualty**: Water infrastructure as a casualty/target
39
- - **Weapon**: Water used as a weapon/tool
40
 
41
- These categories align with the Pacific Institute's Water Conflict Chronology framework for understanding how water intersects with security and conflict.
42
 
43
- ## 🏗️ Model Details
44
 
45
- - **Base Model**: [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)
46
- - **Architecture**: SetFit with One-vs-Rest multi-label strategy
47
- - **Training Approach**: Few-shot learning optimized (SetFit reaches peak performance with small samples)
48
- - **Training samples**: 1200 examples
49
- - **Test samples**: 519 (held-out, never seen during training)
50
- - **Training time**: ~2-5 minutes on A10G GPU
51
- - **Model size**: 33M Parameters, ~133MB
52
- - **Inference speed**: ~5-10ms per headline on CPU
53
 
54
- ## 💻 Usage
 
 
55
 
56
- ### Quick Start
57
 
58
  ```python
59
  from setfit import SetFitModel
60
 
61
- # Load the trained model from HF Hub
62
  model = SetFitModel.from_pretrained("baobabtech/water-conflict-classifier")
63
-
64
- # Predict on headlines
65
- headlines = [
66
- "Military attack workers at the Kajaki Dam in Afghanistan",
67
- "New water treatment plant opens in California"
68
- ]
69
-
70
- predictions = model.predict(headlines)
71
- print(predictions)
72
- # Output: [[1, 1, 0], [0, 0, 0]]
73
- # Format: [Trigger, Casualty, Weapon]
74
- ```
75
-
76
- ### Interpreting Results
77
-
78
- The model returns a list of binary predictions for each label:
79
-
80
- ```python
81
- label_names = ['Trigger', 'Casualty', 'Weapon']
82
-
83
- for headline, pred in zip(headlines, predictions):
84
- labels = [label_names[i] for i, val in enumerate(pred) if val == 1]
85
- print(f"Headline: {headline}")
86
- print(f"Labels: {', '.join(labels) if labels else 'None'}")
87
- print()
88
  ```
89
 
90
- ### Batch Processing
91
-
92
- ```python
93
- import pandas as pd
94
-
95
- # Load your data
96
- df = pd.read_csv("your_headlines.csv")
97
-
98
- # Predict in batches
99
- predictions = model.predict(df['headline'].tolist())
100
-
101
- # Add predictions to dataframe
102
- df['trigger'] = [p[0] for p in predictions]
103
- df['casualty'] = [p[1] for p in predictions]
104
- df['weapon'] = [p[2] for p in predictions]
105
- ```
106
-
107
- ### Example Outputs
108
-
109
- | Headline | Trigger | Casualty | Weapon |
110
- |----------|---------|----------|--------|
111
- | "Armed groups blow up water pipeline in Iraq" | | | ✓ |
112
- | "New water treatment plant opens in California" | ✗ | ✗ | ✗ |
113
- | "Protests erupt over dam construction in Ethiopia" | ✓ | ✗ | ✗ |
114
-
115
- ## 📈 Evaluation Results
116
-
117
- Evaluated on a held-out test set of 519 samples (30% of total data, stratified by label combinations).
118
-
119
- ### Overall Performance
120
-
121
- | Metric | Score |
122
- |--------|-------|
123
- | Exact Match Accuracy | 0.8170 |
124
- | Hamming Loss | 0.0829 |
125
- | F1 (micro) | 0.8623 |
126
- | F1 (macro) | 0.8142 |
127
- | F1 (samples) | 0.7048 |
128
-
129
- ### Per-Label Performance
130
-
131
- | Label | Precision | Recall | F1 | Support |
132
- |-------|-----------|--------|-----|---------|
133
- | Trigger | 0.8889 | 0.8736 | 0.8812 | 174 |
134
- | Casualty | 0.8908 | 0.9099 | 0.9002 | 233 |
135
- | Weapon | 0.5797 | 0.7692 | 0.6612 | 52 |
136
-
137
- ### Training Details
138
-
139
- - **Training samples**: 1200 examples
140
- - **Test samples**: 519 examples (held-out before sampling)
141
- - **Base model**: BAAI/bge-small-en-v1.5 (33M params)
142
- - **Batch size**: 64
143
- - **Epochs**: 1
144
- - **Iterations**: 20 (contrastive pair generation)
145
- - **Sampling strategy**: undersampling (balances positive/negative pairs)
146
- - **Training Dataset**: [baobabtech/water-conflict-training-data](https://huggingface.co/datasets/baobabtech/water-conflict-training-data) (version: d2.0)
147
-
148
-
149
- ### 📈 Experiment Tracking
150
-
151
- All training runs are automatically tracked in a public dataset for experiment comparison:
152
-
153
- - **Evals Dataset**: [baobabtech/water-conflict-classifier-evals](https://huggingface.co/datasets/baobabtech/water-conflict-classifier-evals)
154
- - **Tracked Metrics**: F1 scores, accuracy, per-label performance, and all hyperparameters
155
- - **Compare Experiments**: View how different configurations (sample size, epochs, batch size) affect performance
156
- - **Reproducibility**: Full training configs logged for each version
157
-
158
- You can explore past experiments and compare model performance across versions using the evals dataset.
159
-
160
-
161
- ## 📊 Data Sources
162
-
163
- ### Positive Examples (Water Conflict Headlines)
164
- Pacific Institute (2025). *Water Conflict Chronology*. Pacific Institute, Oakland, CA.
165
- https://www.worldwater.org/water-conflict/
166
-
167
- ### Negative Examples (Non-Water Conflict Headlines)
168
- Armed Conflict Location & Event Data Project (ACLED).
169
- https://acleddata.com/
170
-
171
- **Note:** Training negatives include synthetic "hard negatives" - peaceful water-related news (e.g., "New desalination plant opens", "Water conservation conference") to prevent false positives on non-conflict water topics.
172
-
173
- ## 🌍 About This Project
174
-
175
- This model is part of independent experimental research drawing on the Pacific Institute's Water Conflict Chronology. The Pacific Institute maintains the world's most comprehensive open-source record of water-related conflicts, documenting over 2,700 events across 4,500 years of history.
176
-
177
- **Project Links:**
178
- - Pacific Institute Water Conflict Chronology: https://www.worldwater.org/water-conflict/
179
- - Python Package (PyPI): https://pypi.org/project/water-conflict-classifier/
180
- - Source Code: https://github.com/baobabtech/waterconflict
181
- - Model Hub: https://huggingface.co/{model_repo}
182
-
183
-
184
- ## 🌱 Frugal AI: Training with Limited Data
185
-
186
- This classifier demonstrates an intentional approach to building AI systems with **limited data** using [SetFit](https://huggingface.co/docs/setfit/en/index) - a framework for few-shot learning with sentence transformers. Rather than defaulting to massive language models (GPT, Claude, or 100B+ parameter models) for simple classification tasks, we fine-tune small, efficient models (e.g., BAAI/bge-small-en-v1.5 with ~33M parameters) on a focused dataset.
187
-
188
- **Why this matters:** The industry has normalized using trillion-parameter models to classify headlines, answer simple questions, or categorize text - tasks that don't require world knowledge, reasoning, or generative capabilities. This is computationally wasteful and environmentally costly. A properly fine-tuned small model can achieve comparable or better accuracy while using a fraction of the compute resources.
189
-
190
- **Our approach:**
191
- - Train on ~600 examples (few-shot learning with SetFit)
192
- - Deploy small parameter models (e.g., ~33M params) vs. 100B-1T parameter alternatives
193
- - Achieve specialized task performance without the overhead of general-purpose LLMs
194
- - Reduce inference costs and latency by orders of magnitude
195
-
196
- This is not about avoiding large models altogether - they're invaluable for complex reasoning tasks. But for targeted classification problems with labeled data, fine-tuning remains the professional, responsible choice.
197
-
198
-
199
- ### 🏋🏽‍♀️ Training Your Own Model
200
-
201
- You can train your own version using the [published package](https://pypi.org/project/water-conflict-classifier/).
202
-
203
- **Package includes:**
204
- - Data preprocessing utilities
205
- - Training logic (SetFit multi-label)
206
- - Evaluation metrics
207
- - Model card generation
208
-
209
- **Source code:** https://github.com/baobabtech/waterconflict/tree/main/classifier
210
- **PyPI:** https://pypi.org/project/water-conflict-classifier/
211
-
212
- ```bash
213
- # Install package
214
- pip install water-conflict-classifier
215
-
216
- # Or install from source for development
217
- git clone https://github.com/baobabtech/waterconflict.git
218
- cd waterconflict/classifier
219
- pip install -e .
220
-
221
- # Train locally
222
- python train_setfit_headline_classifier.py
223
  ```
224
 
225
- For cloud training on HuggingFace Jobs infrastructure, see the scripts folder in the repository.
226
-
227
- ## 📜 License
228
-
229
- Copyright © 2025 Baobab Tech
230
-
231
- This work is licensed under the [Creative Commons Attribution-NonCommercial 4.0 International License](http://creativecommons.org/licenses/by-nc/4.0/).
232
 
233
- **You are free to:**
234
- - **Share** — copy and redistribute the material in any medium or format
235
- - **Adapt** — remix, transform, and build upon the material
236
 
237
- **Under the following terms:**
238
- - **Attribution** You must give appropriate credit to Baobab Tech, provide a link to the license, and indicate if changes were made
239
- - **NonCommercial** — You may not use the material for commercial purposes
240
 
 
 
241
 
242
- ## 📝 Citation
243
-
244
- If you use this model in your work, please cite:
245
-
246
- ```bibtex
247
- @misc{{waterconflict2025,
248
- title={{Water Conflict Multi-Label Classifier}},
249
- author={{Independent Experimental Research Drawing on Pacific Institute Water Conflict Chronology}},
250
- year={{2025}},
251
- howpublished={{\url{{https://huggingface.co/{model_repo}}}}},
252
- note={{Training data from Pacific Institute Water Conflict Chronology and ACLED}}
253
- }}
254
- ```
255
-
256
- Please also cite the Pacific Institute's Water Conflict Chronology:
257
-
258
- ```bibtex
259
- @misc{{pacificinstitute2025,
260
- title={{Water Conflict Chronology}},
261
- author={{Pacific Institute}},
262
- year={{2025}},
263
- address={{Oakland, CA}},
264
- url={{https://www.worldwater.org/water-conflict/}},
265
- note={{Accessed: [access date]}}
266
- }}
267
- ```
268
 
 
 
 
1
  ---
2
+ language: en
3
  license: cc-by-nc-4.0
 
4
  tags:
5
  - setfit
6
  - sentence-transformers
7
  - text-classification
8
+ - generated_from_setfit_trainer
9
+ widget:
10
+ - text: Israeli forces destroy water pump in Nablus, West Bank, cutting water supply
11
+ to over 20,000 Palestinians in multiple villages
12
+ - text: Chinese man killed for speaking out against displacement of communities by
13
+ the Three Gorges Dam
14
+ - text: Protests over water cuts turn violent in Tunisia
15
+ - text: National leader Dilma Ferreira Silva, working for policy reform to support
16
+ people affected by dams, is murdered in Brazil
17
+ - text: Water reservoir sustains minor damages from bombing in Colombia
18
  metrics:
 
19
  - accuracy
20
+ pipeline_tag: text-classification
21
+ library_name: setfit
22
+ inference: false
23
+ base_model: BAAI/bge-small-en-v1.5
 
 
 
 
24
  ---
25
 
26
+ # SetFit with BAAI/bge-small-en-v1.5
27
 
28
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) as the Sentence Transformer embedding model. A OneVsRestClassifier instance is used for classification.
29
 
30
+ The model has been trained using an efficient few-shot learning technique that involves:
31
 
32
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
33
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
34
 
35
+ ## Model Details
36
 
37
+ ### Model Description
38
+ - **Model Type:** SetFit
39
+ - **Sentence Transformer body:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5)
40
+ - **Classification head:** a OneVsRestClassifier instance
41
+ - **Maximum Sequence Length:** 512 tokens
42
+ - **Number of Classes:** 3 classes
43
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
44
+ - **Language:** en
45
+ - **License:** cc-by-nc-4.0
46
 
47
+ ### Model Sources
48
 
49
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
50
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
51
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
52
 
53
+ ## Uses
54
 
55
+ ### Direct Use for Inference
56
 
57
+ First install the SetFit library:
58
 
59
+ ```bash
60
+ pip install setfit
61
+ ```
62
 
63
+ Then you can load this model and run inference.
64
 
65
  ```python
66
  from setfit import SetFitModel
67
 
68
+ # Download from the 🤗 Hub
69
  model = SetFitModel.from_pretrained("baobabtech/water-conflict-classifier")
70
+ # Run inference
71
+ preds = model("Protests over water cuts turn violent in Tunisia")
72
  ```
73
 
74
+ <!--
75
+ ### Downstream Use
76
+
77
+ *List how someone could finetune this model on their own dataset.*
78
+ -->
79
+
80
+ <!--
81
+ ### Out-of-Scope Use
82
+
83
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
84
+ -->
85
+
86
+ <!--
87
+ ## Bias, Risks and Limitations
88
+
89
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
90
+ -->
91
+
92
+ <!--
93
+ ### Recommendations
94
+
95
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
96
+ -->
97
+
98
+ ## Training Details
99
+
100
+ ### Training Set Metrics
101
+ | Training set | Min | Median | Max |
102
+ |:-------------|:----|:--------|:----|
103
+ | Word count | 3 | 16.3692 | 154 |
104
+
105
+ ### Training Hyperparameters
106
+ - batch_size: (16, 16)
107
+ - num_epochs: (3, 3)
108
+ - max_steps: -1
109
+ - sampling_strategy: oversampling
110
+ - num_iterations: 20
111
+ - body_learning_rate: (2e-05, 2e-05)
112
+ - head_learning_rate: 0.01
113
+ - loss: CosineSimilarityLoss
114
+ - distance_metric: cosine_distance
115
+ - margin: 0.25
116
+ - end_to_end: False
117
+ - use_amp: False
118
+ - warmup_proportion: 0.1
119
+ - l2_weight: 0.01
120
+ - seed: 42
121
+ - eval_max_steps: -1
122
+ - load_best_model_at_end: True
123
+
124
+ ### Training Results
125
+ | Epoch | Step | Training Loss | Validation Loss |
126
+ |:------:|:----:|:-------------:|:---------------:|
127
+ | 0.0003 | 1 | 0.2242 | - |
128
+ | 0.0167 | 50 | 0.2401 | - |
129
+ | 0.0333 | 100 | 0.2341 | - |
130
+ | 0.05 | 150 | 0.2292 | - |
131
+ | 0.0667 | 200 | 0.2193 | - |
132
+ | 0.0833 | 250 | 0.2031 | - |
133
+ | 0.1 | 300 | 0.1983 | - |
134
+ | 0.1167 | 350 | 0.1857 | - |
135
+ | 0.1333 | 400 | 0.1665 | - |
136
+ | 0.15 | 450 | 0.1548 | - |
137
+ | 0.1667 | 500 | 0.1352 | - |
138
+ | 0.1833 | 550 | 0.1306 | - |
139
+ | 0.2 | 600 | 0.1197 | - |
140
+ | 0.2167 | 650 | 0.1156 | - |
141
+ | 0.2333 | 700 | 0.1025 | - |
142
+ | 0.25 | 750 | 0.0934 | - |
143
+ | 0.2667 | 800 | 0.1008 | - |
144
+ | 0.2833 | 850 | 0.0905 | - |
145
+ | 0.3 | 900 | 0.0855 | - |
146
+ | 0.3167 | 950 | 0.0903 | - |
147
+ | 0.3333 | 1000 | 0.071 | - |
148
+ | 0.35 | 1050 | 0.0751 | - |
149
+ | 0.3667 | 1100 | 0.0715 | - |
150
+ | 0.3833 | 1150 | 0.0688 | - |
151
+ | 0.4 | 1200 | 0.0701 | - |
152
+ | 0.4167 | 1250 | 0.0676 | - |
153
+ | 0.4333 | 1300 | 0.0637 | - |
154
+ | 0.45 | 1350 | 0.0563 | - |
155
+ | 0.4667 | 1400 | 0.0567 | - |
156
+ | 0.4833 | 1450 | 0.0551 | - |
157
+ | 0.5 | 1500 | 0.0539 | - |
158
+ | 0.5167 | 1550 | 0.0489 | - |
159
+ | 0.5333 | 1600 | 0.0528 | - |
160
+ | 0.55 | 1650 | 0.0444 | - |
161
+ | 0.5667 | 1700 | 0.0497 | - |
162
+ | 0.5833 | 1750 | 0.0464 | - |
163
+ | 0.6 | 1800 | 0.0453 | - |
164
+ | 0.6167 | 1850 | 0.036 | - |
165
+ | 0.6333 | 1900 | 0.0468 | - |
166
+ | 0.65 | 1950 | 0.0428 | - |
167
+ | 0.6667 | 2000 | 0.0509 | - |
168
+ | 0.6833 | 2050 | 0.0388 | - |
169
+ | 0.7 | 2100 | 0.0386 | - |
170
+ | 0.7167 | 2150 | 0.0434 | - |
171
+ | 0.7333 | 2200 | 0.0447 | - |
172
+ | 0.75 | 2250 | 0.0372 | - |
173
+ | 0.7667 | 2300 | 0.0434 | - |
174
+ | 0.7833 | 2350 | 0.0366 | - |
175
+ | 0.8 | 2400 | 0.0355 | - |
176
+ | 0.8167 | 2450 | 0.04 | - |
177
+ | 0.8333 | 2500 | 0.0352 | - |
178
+ | 0.85 | 2550 | 0.0391 | - |
179
+ | 0.8667 | 2600 | 0.0393 | - |
180
+ | 0.8833 | 2650 | 0.0343 | - |
181
+ | 0.9 | 2700 | 0.0343 | - |
182
+ | 0.9167 | 2750 | 0.0356 | - |
183
+ | 0.9333 | 2800 | 0.0315 | - |
184
+ | 0.95 | 2850 | 0.0351 | - |
185
+ | 0.9667 | 2900 | 0.0387 | - |
186
+ | 0.9833 | 2950 | 0.0349 | - |
187
+ | 1.0 | 3000 | 0.0321 | 0.0947 |
188
+ | 1.0167 | 3050 | 0.0298 | - |
189
+ | 1.0333 | 3100 | 0.0332 | - |
190
+ | 1.05 | 3150 | 0.0292 | - |
191
+ | 1.0667 | 3200 | 0.0307 | - |
192
+ | 1.0833 | 3250 | 0.0334 | - |
193
+ | 1.1 | 3300 | 0.0334 | - |
194
+ | 1.1167 | 3350 | 0.032 | - |
195
+ | 1.1333 | 3400 | 0.0285 | - |
196
+ | 1.15 | 3450 | 0.0324 | - |
197
+ | 1.1667 | 3500 | 0.0324 | - |
198
+ | 1.1833 | 3550 | 0.0326 | - |
199
+ | 1.2 | 3600 | 0.0306 | - |
200
+ | 1.2167 | 3650 | 0.0344 | - |
201
+ | 1.2333 | 3700 | 0.0282 | - |
202
+ | 1.25 | 3750 | 0.0344 | - |
203
+ | 1.2667 | 3800 | 0.029 | - |
204
+ | 1.2833 | 3850 | 0.0309 | - |
205
+ | 1.3 | 3900 | 0.0306 | - |
206
+ | 1.3167 | 3950 | 0.0351 | - |
207
+ | 1.3333 | 4000 | 0.0288 | - |
208
+ | 1.35 | 4050 | 0.0265 | - |
209
+ | 1.3667 | 4100 | 0.0283 | - |
210
+ | 1.3833 | 4150 | 0.0285 | - |
211
+ | 1.4 | 4200 | 0.0287 | - |
212
+ | 1.4167 | 4250 | 0.0264 | - |
213
+ | 1.4333 | 4300 | 0.0271 | - |
214
+ | 1.45 | 4350 | 0.0269 | - |
215
+ | 1.4667 | 4400 | 0.0298 | - |
216
+ | 1.4833 | 4450 | 0.0257 | - |
217
+ | 1.5 | 4500 | 0.0273 | - |
218
+ | 1.5167 | 4550 | 0.0297 | - |
219
+ | 1.5333 | 4600 | 0.0261 | - |
220
+ | 1.55 | 4650 | 0.027 | - |
221
+ | 1.5667 | 4700 | 0.0279 | - |
222
+ | 1.5833 | 4750 | 0.0281 | - |
223
+ | 1.6 | 4800 | 0.0269 | - |
224
+ | 1.6167 | 4850 | 0.0279 | - |
225
+ | 1.6333 | 4900 | 0.0271 | - |
226
+ | 1.65 | 4950 | 0.0283 | - |
227
+ | 1.6667 | 5000 | 0.0247 | - |
228
+ | 1.6833 | 5050 | 0.0293 | - |
229
+ | 1.7 | 5100 | 0.0273 | - |
230
+ | 1.7167 | 5150 | 0.027 | - |
231
+ | 1.7333 | 5200 | 0.0258 | - |
232
+ | 1.75 | 5250 | 0.0232 | - |
233
+ | 1.7667 | 5300 | 0.028 | - |
234
+ | 1.7833 | 5350 | 0.0274 | - |
235
+ | 1.8 | 5400 | 0.029 | - |
236
+ | 1.8167 | 5450 | 0.025 | - |
237
+ | 1.8333 | 5500 | 0.0284 | - |
238
+ | 1.85 | 5550 | 0.0272 | - |
239
+ | 1.8667 | 5600 | 0.0241 | - |
240
+ | 1.8833 | 5650 | 0.0275 | - |
241
+ | 1.9 | 5700 | 0.0243 | - |
242
+ | 1.9167 | 5750 | 0.0255 | - |
243
+ | 1.9333 | 5800 | 0.0274 | - |
244
+ | 1.95 | 5850 | 0.0245 | - |
245
+ | 1.9667 | 5900 | 0.0277 | - |
246
+ | 1.9833 | 5950 | 0.0249 | - |
247
+ | 2.0 | 6000 | 0.0259 | 0.0980 |
248
+ | 2.0167 | 6050 | 0.0265 | - |
249
+ | 2.0333 | 6100 | 0.0268 | - |
250
+ | 2.05 | 6150 | 0.0252 | - |
251
+ | 2.0667 | 6200 | 0.0255 | - |
252
+ | 2.0833 | 6250 | 0.0242 | - |
253
+ | 2.1 | 6300 | 0.0255 | - |
254
+ | 2.1167 | 6350 | 0.0251 | - |
255
+ | 2.1333 | 6400 | 0.0238 | - |
256
+ | 2.15 | 6450 | 0.024 | - |
257
+ | 2.1667 | 6500 | 0.0231 | - |
258
+ | 2.1833 | 6550 | 0.0233 | - |
259
+ | 2.2 | 6600 | 0.023 | - |
260
+ | 2.2167 | 6650 | 0.0237 | - |
261
+ | 2.2333 | 6700 | 0.0245 | - |
262
+ | 2.25 | 6750 | 0.0224 | - |
263
+ | 2.2667 | 6800 | 0.0251 | - |
264
+ | 2.2833 | 6850 | 0.0246 | - |
265
+ | 2.3 | 6900 | 0.0248 | - |
266
+ | 2.3167 | 6950 | 0.0232 | - |
267
+ | 2.3333 | 7000 | 0.0252 | - |
268
+ | 2.35 | 7050 | 0.0247 | - |
269
+ | 2.3667 | 7100 | 0.0262 | - |
270
+ | 2.3833 | 7150 | 0.0222 | - |
271
+ | 2.4 | 7200 | 0.0234 | - |
272
+ | 2.4167 | 7250 | 0.0227 | - |
273
+ | 2.4333 | 7300 | 0.0206 | - |
274
+ | 2.45 | 7350 | 0.0246 | - |
275
+ | 2.4667 | 7400 | 0.0233 | - |
276
+ | 2.4833 | 7450 | 0.0237 | - |
277
+ | 2.5 | 7500 | 0.0245 | - |
278
+ | 2.5167 | 7550 | 0.0238 | - |
279
+ | 2.5333 | 7600 | 0.0218 | - |
280
+ | 2.55 | 7650 | 0.0245 | - |
281
+ | 2.5667 | 7700 | 0.024 | - |
282
+ | 2.5833 | 7750 | 0.0248 | - |
283
+ | 2.6 | 7800 | 0.0216 | - |
284
+ | 2.6167 | 7850 | 0.0223 | - |
285
+ | 2.6333 | 7900 | 0.0257 | - |
286
+ | 2.65 | 7950 | 0.0199 | - |
287
+ | 2.6667 | 8000 | 0.0262 | - |
288
+ | 2.6833 | 8050 | 0.0211 | - |
289
+ | 2.7 | 8100 | 0.0213 | - |
290
+ | 2.7167 | 8150 | 0.0221 | - |
291
+ | 2.7333 | 8200 | 0.0251 | - |
292
+ | 2.75 | 8250 | 0.0234 | - |
293
+ | 2.7667 | 8300 | 0.0249 | - |
294
+ | 2.7833 | 8350 | 0.0233 | - |
295
+ | 2.8 | 8400 | 0.0237 | - |
296
+ | 2.8167 | 8450 | 0.0221 | - |
297
+ | 2.8333 | 8500 | 0.0238 | - |
298
+ | 2.85 | 8550 | 0.0211 | - |
299
+ | 2.8667 | 8600 | 0.0238 | - |
300
+ | 2.8833 | 8650 | 0.0258 | - |
301
+ | 2.9 | 8700 | 0.0216 | - |
302
+ | 2.9167 | 8750 | 0.0233 | - |
303
+ | 2.9333 | 8800 | 0.0239 | - |
304
+ | 2.95 | 8850 | 0.0246 | - |
305
+ | 2.9667 | 8900 | 0.021 | - |
306
+ | 2.9833 | 8950 | 0.0241 | - |
307
+ | 3.0 | 9000 | 0.0281 | 0.0972 |
308
+
309
+ ### Framework Versions
310
+ - Python: 3.12.12
311
+ - SetFit: 1.1.3
312
+ - Sentence Transformers: 5.1.2
313
+ - Transformers: 4.57.3
314
+ - PyTorch: 2.9.1+cu128
315
+ - Datasets: 4.4.1
316
+ - Tokenizers: 0.22.1
317
+
318
+ ## Citation
319
+
320
+ ### BibTeX
321
+ ```bibtex
322
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
323
+ doi = {10.48550/ARXIV.2209.11055},
324
+ url = {https://arxiv.org/abs/2209.11055},
325
+ author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
326
+ keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
327
+ title = {Efficient Few-Shot Learning Without Prompts},
328
+ publisher = {arXiv},
329
+ year = {2022},
330
+ copyright = {Creative Commons Attribution 4.0 International}
331
+ }
332
  ```
333
 
334
+ <!--
335
+ ## Glossary
336
 
337
+ *Clearly define terms in order to be accessible across audiences.*
338
+ -->
 
339
 
340
+ <!--
341
+ ## Model Card Authors
 
342
 
343
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
344
+ -->
345
 
346
+ <!--
347
+ ## Model Card Contact
 
348
 
349
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
350
+ -->
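The step counts in the Training Results table of the new README are consistent with its listed hyperparameters if the training set holds 1,200 examples. That figure comes from the removed README and is an assumption here, since the new card does not state it. The sketch below also assumes that, when `num_iterations` is set, SetFit draws one positive and one negative pair per sample per iteration. `contrastive_steps_per_epoch` is a hypothetical helper for this check, not part of SetFit.

```python
def contrastive_steps_per_epoch(num_samples: int, num_iterations: int, batch_size: int) -> int:
    """Estimate optimizer steps per epoch for the contrastive phase.

    Assumption: each iteration yields one positive and one negative
    pair per training sample, so pairs = 2 * num_iterations * num_samples.
    """
    num_pairs = 2 * num_iterations * num_samples
    # Ceiling division: a final partial batch still counts as a step.
    return -(-num_pairs // batch_size)


# num_samples=1200 is taken from the removed README (assumed unchanged);
# num_iterations=20 and batch_size=16 come from the new hyperparameter list.
steps = contrastive_steps_per_epoch(num_samples=1200, num_iterations=20, batch_size=16)
print(steps)          # 3000
print(3 * steps)      # 9000 total steps for num_epochs = 3
```

Under these assumptions each epoch has 3,000 steps, which lines up with the validation-loss rows logged at steps 3000, 6000, and 9000, and with the first logged epoch fraction (50 / 3000 ≈ 0.0167).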
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:8c1cdc842637b6b647a74cbfc7f6e83b785dc8a75b3712a320a5aecdf96d8811
3
  size 133462128
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c642f66e9d56bc3552806d084c11dcca579303572fce1cd0ad65683128931f35
3
  size 133462128
model_head.pkl CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:f938a7cbb3e88596a74a15b98e1406fbad601fb4c92a9edec735db3053597267
3
  size 11236
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:515942220a35f3b6fb5889eac10212fcaccfc2951b7d701080cbe7339e211d6e
3
  size 11236
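Both pointer diffs above change only the `oid` while the `size` stays the same, so file size alone cannot tell the old and new weights apart. A minimal sketch of checking a downloaded file against its Git LFS pointer follows; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and any local file path passed to them is an assumption about where the file was saved.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and hash."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# New pointer for model_head.pkl, copied from the diff above.
pointer = parse_lfs_pointer(
    """version https://git-lfs.github.com/spec/v1
oid sha256:515942220a35f3b6fb5889eac10212fcaccfc2951b7d701080cbe7339e211d6e
size 11236"""
)
print(pointer["algo"], pointer["size"])
```

With a local copy of the file, `matches_pointer("model_head.pkl", pointer)` would confirm that the download corresponds to this commit rather than its parent.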