rwillh11 and Alonadoli committed on
Commit 8fbec2c · verified · 1 Parent(s): 172c4b3

Update README.md (#1)


- Update README.md (e5f5d970b7f3b078f95addc82d88ef304f6479c1)


Co-authored-by: Alona Dolinsky <Alonadoli@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +224 -120
README.md CHANGED
---
library_name: transformers
tags:
- policy-detection
- political-science
- multilingual
- nli
- deberta
- group-appeals
language:
- en
- de
- nl
- da
- es
- fr
- it
- sv
base_model: MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7
---

# Model Card for mDeBERTa Policy Detection

A multilingual policy detection model fine-tuned for detecting policy mentions directed towards specific groups in political text.

## Model Details

### Model Description

This model is a fine-tuned mDeBERTa-v3-base that performs policy classification using Natural Language Inference (NLI) to determine whether political text contains specific policy proposals directed towards target groups.

- **Developed by:** Will Horne, Alona O. Dolinsky and Lena Maria Huber
- **Model type:** Sequence Classification (NLI-based policy detection)
- **Language(s) (NLP):** English, German (multilingual)
- **Finetuned from model:** MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7

### Model Sources

- **Repository:** rwillh11/mdeberta_NLI_policy_noContext
- **Base Model:** [MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7)
 
 
 
## Uses

### Direct Use

The model is designed for researchers analyzing whether political discourse contains policy proposals directed towards specific groups. It takes a political text and a target group as input and classifies whether the text contains a policy directed towards that group (policy/no policy).

Note that the model *does not* categorize the texts into designated policy areas (e.g., healthcare, education) but rather identifies the presence of any policy directed at the specified group.

### Downstream Use

This model can be integrated into larger political text analysis pipelines for:
- Political manifesto analysis
- Policy proposal detection in political communication
- Comparative political research across countries and languages
- Group-targeted policy analysis

### Out-of-Scope Use

This model should not be used for:
- General policy detection (not group-specific)
- Categorization of policies into specific policy areas
- Real-time social media monitoring without human oversight
- Making decisions about individuals or groups
- Content moderation without additional validation
## Bias, Risks, and Limitations

### Technical Limitations
- Trained specifically on political manifesto text; performance may vary on other text types
- Focal sentences without context may lack nuance present in full paragraphs
- Limited to two policy categories (policy, no policy)

### Bias Considerations
- Training data consists of political manifestos from specific countries and time periods
- May reflect biases present in political discourse of training data
- Policy detection may vary across different political contexts and group types

### Recommendations

Users should be aware that this model:
- Is designed for research purposes in political science
- Should be validated on specific domains before deployment
- May require human oversight for sensitive applications
- May perform differently across different types of groups and political contexts
## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "rwillh11/mdeberta_NLI_policy_noContext"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example usage
text = "We will increase funding for schools to better support students."
target_group = "students"

# Create hypotheses for each policy class
hypotheses = {
    "policy": f"The text contains a policy directed towards {target_group}.",
    "no policy": f"The text does not contain a policy directed towards {target_group}."
}

# Get predictions for each hypothesis
results = {}
for policy_class, hypothesis in hypotheses.items():
    inputs = tokenizer(text, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    entailment_prob = probs[0][0].item()  # Probability of entailment
    results[policy_class] = entailment_prob

# Select the policy class with the highest entailment probability
predicted_class = max(results, key=results.get)
print(f"Predicted policy classification for '{target_group}': {predicted_class}")
```
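For larger corpora, the per-text loop above is typically wrapped in a helper that maps (text, group) pairs to labels. The sketch below does exactly that with a stubbed scoring function so it runs without downloading the model; `classify_batch` and the stub are hypothetical illustrations, not part of the released code, and in practice `score` would call the model as in the quickstart above:

```python
def classify_batch(pairs, score):
    """Classify each (text, group) pair using a callable
    score(text, hypothesis) -> entailment probability."""
    out = []
    for text, group in pairs:
        hyps = {
            "policy": f"The text contains a policy directed towards {group}.",
            "no policy": f"The text does not contain a policy directed towards {group}.",
        }
        # Pick the hypothesis the scorer finds most entailed by the text
        out.append(max(hyps, key=lambda k: score(text, hyps[k])))
    return out

# Stub scorer for illustration: always favors the "contains a policy" hypothesis
stub = lambda text, hyp: 0.9 if "not" not in hyp else 0.1
print(classify_batch([("We will fund schools.", "students")], stub))  # ['policy']
```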
## Training Details

### Training Data

The model was trained on political manifesto data containing:
- **Languages:** English and German
- **Text Type:** Political manifesto sentences (focal sentences without context)
- **Labels:** Two-class policy classification (policy, no policy)
- **Groups:** Various political target groups (citizens, specific demographics, professions, etc.)
- **Original dataset:** 7,546 text-group pairs
  - **English:** 4,066 text-group pairs
  - **German:** 3,480 text-group pairs
- **Training Size:** ~6,037 original texts (80% split)
- **Test Size:** ~1,509 original texts (20% split)
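As a quick consistency check on the reported counts, the split sizes follow from applying the 80/20 split to the 7,546 pairs and rounding to the nearest whole text:

```python
# Counts as reported in this card
total = 4066 + 3480  # English + German text-group pairs
assert total == 7546

# 80/20 split, rounded to the nearest whole text
train, test = round(total * 0.8), round(total * 0.2)
print(train, test)  # 6037 1509
```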
137
 
138
  ### Training Procedure
139
 
140
+ #### Preprocessing
141
+ - Texts tokenized using mDeBERTa tokenizer with max length 512
142
+ - NLI format: premise (political text) + hypothesis (policy towards group)
143
+ - Each text paired with both true and false hypotheses for binary classification
 
 
#### Training Hyperparameters
- **Training regime:** Mixed precision training
- **Optimizer:** AdamW with weight decay
- **Learning rate:** Optimized via Optuna (range: 1e-5 to 4e-5)
- **Weight decay:** Optimized via Optuna (range: 0.01 to 0.3)
- **Warmup ratio:** Optimized via Optuna (range: 0.0 to 0.1)
- **Epochs:** 10 per trial
- **Batch size:** 16 (train and eval)
- **Trials:** 20 total
- **Metric for selection:** F1 Macro
- **Seed:** 42 (deterministic training)

#### Training Infrastructure
- **Hardware:** CUDA-enabled GPU
- **Framework:** Transformers, PyTorch
- **Hyperparameter optimization:** Optuna
- **Deterministic training:** All random seeds fixed
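The Optuna search ranges above can be sketched as a sampling space. This sketch substitutes the standard library's `random` for Optuna's `trial.suggest_float` so it runs standalone; it is illustrative only, not the authors' training code:

```python
import random

# Search ranges as reported in this card (sampling here is illustrative only)
def sample_trial(rng):
    return {
        "learning_rate": rng.uniform(1e-5, 4e-5),
        "weight_decay": rng.uniform(0.01, 0.3),
        "warmup_ratio": rng.uniform(0.0, 0.1),
    }

rng = random.Random(42)  # fixed seed, echoing the card's deterministic setup
trials = [sample_trial(rng) for _ in range(20)]  # 20 trials, 10 epochs each
assert all(1e-5 <= t["learning_rate"] <= 4e-5 for t in trials)
```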
 
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
- 20% holdout from the original dataset
- Multilingual political manifesto sentences

#### Factors
The model was evaluated across:
- **Languages:** English and German text
- **Additional validation on held-out sets:** English, German, Dutch, Danish, Spanish, French, Italian, Swedish
- **Policy classes:** Policy, no policy
- **Group types:** Various socio-demographic groups

#### Metrics
Primary metrics used for evaluation:
- **F1 Macro:** Primary optimization metric (treats all classes equally)
- **Balanced Accuracy:** Accounts for class imbalance
- **Precision/Recall (Macro):** Detailed performance measures
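To make "treats all classes equally" concrete, macro F1 averages the per-class F1 scores regardless of class frequency. A small hand computation on toy labels (illustrative only, not taken from the card's evaluation):

```python
def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Average per-class F1 scores, weighting each class equally."""
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall) if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Toy labels: 1 = policy, 0 = no policy
print(round(macro_f1([1, 1, 0, 0, 1], [1, 0, 0, 0, 1]), 3))  # 0.8
```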
### Results

**Best Model Performance (Trial 10, Epoch 9):**
- **Accuracy:** 0.872
- **Balanced Accuracy:** 0.874
- **Precision:** 0.869
- **Recall:** 0.874
- **F1 Macro:** 0.870

Additional validation on held-out sets returned the following metrics (all languages except English use texts translated from English):

| Language | Accuracy | Precision | Recall | F1 Macro |
|----------|----------|-----------|--------|----------|
| English  | 0.860    | 0.864     | 0.860  | 0.859    |
| German   | 0.814    | 0.814     | 0.814  | 0.814    |
| Dutch    | 0.847    | 0.847     | 0.847  | 0.847    |
| Danish   | 0.837    | 0.837     | 0.837  | 0.836    |
| Spanish  | 0.860    | 0.861     | 0.860  | 0.860    |
| French   | 0.837    | 0.837     | 0.837  | 0.837    |
| Italian  | 0.845    | 0.847     | 0.845  | 0.845    |
| Swedish  | 0.863    | 0.864     | 0.863  | 0.863    |

The model demonstrates strong performance across policy categories, with deterministic results confirmed through multiple prediction runs.

## Model Examination

The model uses Natural Language Inference to transform policy detection into a binary entailment task:
- For each text-group pair, it generates two hypotheses (policy/no policy)
- It selects the hypothesis with the highest entailment probability
- This approach leverages pre-trained NLI capabilities for policy classification
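The selection step described above reduces to an argmax over per-hypothesis entailment probabilities. A minimal sketch, with made-up probabilities for illustration:

```python
def pick_label(entailment_probs):
    """Return the hypothesis label with the highest entailment probability."""
    return max(entailment_probs, key=entailment_probs.get)

# Hypothetical entailment probabilities for one text-group pair
probs = {"policy": 0.91, "no policy": 0.07}
print(pick_label(probs))  # policy
```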
## Environmental Impact

Training involved hyperparameter optimization with 20 trials, each training for 10 epochs.

- **Hardware Type:** CUDA-enabled GPU
- **Hours used:** Estimated 10-15 hours (including hyperparameter search)
- **Cloud Provider:** Google Colab
- **Compute Region:** Variable
- **Carbon Emitted:** Not precisely measured

## Technical Specifications

### Model Architecture and Objective
- **Base Architecture:** mDeBERTa-v3-base (278M parameters)
- **Task:** Natural Language Inference for policy detection
- **Input:** Text pair (political sentence + policy hypothesis)
- **Output:** Binary classification (entailment/non-entailment)
- **Objective:** Cross-entropy loss with F1 Macro optimization

### Compute Infrastructure

#### Hardware
- GPU-accelerated training (CUDA)
- Mixed precision training support

#### Software
- Transformers library
- PyTorch framework
- Optuna for hyperparameter optimization
- scikit-learn for metrics

## Citation

If you use this model in your research, please cite:

**BibTeX:**
```bibtex
@misc{mdeberta_policy_nocontext,
  title={mDeBERTa Policy Detection Model for Political Group Appeals},
  author={Will Horne and Alona O. Dolinsky and Lena Maria Huber},
  year={2024},
  url={https://huggingface.co/rwillh11/mdeberta_NLI_policy_noContext}
}
```

## Model Card Authors

Research team studying group appeals in political discourse.

## Model Card Contact

For questions about this model, please open an issue in the repository or contact the research team through appropriate academic channels.