---
base_model: sentence-transformers/all-mpnet-base-v2
language:
- en
license: apache-2.0
tags:
- economic-attributes
- mention-classification
- mpnet-base-v2
- setfit
- multi-label-classification
model-index:
- name: all-mpnet-base-v2_economic-attributes-classifier
  results:
  - task:
      type: multi-label-classification
      name: Multi-label classification
    metrics:
    - type: _tba_
      value: -1.0
    dataset:
      type: custom
      name: custom human-labeled multi-label annotation dataset
---
# Group mention economic attributes classifier
A multi-label classifier for detecting **economic attribute** categories referred to in a social group mention, trained with `setfit` based on the light-weight [`sentence-transformers/all-mpnet-base-v2`](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) sentence embedding model.
The economic attributes classified are:
| attribute | definition |
|:------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| class membership | People described with their membership in or belonging to a social class such as the upper class, the middle class, lower class, or the working class. |
| employment status | People described or categorized by their employment status such as employers, employees, self-employed, or unemployed people. |
| education level | People described with or categorized by their education level such as students, apprentices, higher education, tertiary education, vocational training or graduates. |
| income/wealth/economic status | People defined or categorized by their income, wealth, or economic status such as high/medium/low income groups, rich/poor people, homeowners/tenants/homeless. |
| occupation/profession | People referred to with or categorized according to their occupation or profession such as teachers, farmers, public servants, or police officers. |
| ecology of group | People categorized by their relation to the ecology of society such as carbon emitters, coal miners, green employers, green workers, sustainable farmers, or those working in the fossil fuel sector. |
## Model Details
### Model Description
Group mention economic attributes classifier
- **Developed by:** Hauke Licht
- **Model type:** mpnet
- **Language(s) (NLP):** English (`en`)
- **License:** apache-2.0
- **Finetuned from model:** sentence-transformers/all-mpnet-base-v2
- **Funded by:** The *Deutsche Forschungsgemeinschaft* (DFG, German Research Foundation) under Germany's Excellence Strategy – EXC 2126/1 – 390838866
### Model Sources
- **Repository:** _tba_
- **Paper:** _tba_
- **Demo:** [More Information Needed]
## Uses
### Bias, Risks, and Limitations
- Evaluation on held-out data shows that the classifier is not error-free and misclassifies some mentions.
- The model has been finetuned only on human-annotated social group mentions in sentences sampled from party manifestos of European parties (mostly far-right and Green parties). Applying the classifier in other domains can lead to higher error rates.
- The data used to finetune the model come from human annotators. Human annotators can be biased, and factors like gender and social background can affect their annotation judgments. This may lead to bias in the detection of specific social groups.
#### Recommendations
- Users who want to apply the model outside its training data domain should evaluate its performance on the target data.
- Users who want to apply the model outside its training data domain should continue finetuning this model on labeled data from the target domain.
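As a rough illustration of the first recommendation, the sketch below computes per-label and macro-averaged F1 scores from binary indicator rows. The function name `per_label_f1` and the toy labels and predictions are made up for illustration; in practice `y_true` would come from your own annotations of the target data and `y_pred` from the classifier.

```python
from typing import Dict, List, Sequence


def per_label_f1(
    y_true: Sequence[Sequence[int]],
    y_pred: Sequence[Sequence[int]],
    label_names: List[str],
) -> Dict[str, float]:
    """Compute the F1 score of each label from binary indicator rows."""
    scores = {}
    for j, name in enumerate(label_names):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Convention used here: a label with no true or predicted positives scores 0.0
        scores[name] = 2 * tp / denom if denom else 0.0
    return scores


# Toy data (made up for illustration): 4 mentions, 2 labels
label_names = ["class membership", "employment status"]
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_pred = [[1, 0], [0, 0], [1, 1], [1, 0]]

f1_by_label = per_label_f1(y_true, y_pred, label_names)
macro_f1 = sum(f1_by_label.values()) / len(f1_by_label)
print(f1_by_label)
print(round(macro_f1, 3))
```

Reporting F1 per label rather than a single aggregate makes it easier to see which attribute categories degrade most in a new domain.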
### How to Get Started with the Model
Use the code below to get started with the model.
## Usage
You can use the model with the [`setfit` Python library](https://github.com/huggingface/setfit) (>=1.1.0).
*Note:* For compatibility, it is recommended to use transformers version >=4.5.5,<=5.0.0 and sentence-transformers version >=4.0.1,<=5.1.0.
### Classification
```python
import torch
from setfit import SetFitModel

model_name = "haukelicht/all-mpnet-base-v2_economic-attributes-classifier"
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"

# Load the classifier and move it to the selected device
classifier = SetFitModel.from_pretrained(model_name)
classifier.to(device)

# Example mentions
mentions = ["working class people", "highly-educated professionals", "people without a stable job"]

# Get predictions (one row of 0/1 indicators per mention)
with torch.no_grad():
    predictions = classifier.predict(mentions)
print(predictions)

# Map predictions to labels
predicted_labels = [
    [classifier.id2label[l] for l, p in enumerate(pred) if p == 1]
    for pred in predictions
]
print(predicted_labels)
```
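If you prefer to choose your own decision threshold instead of relying on the 0/1 predictions, a minimal sketch is shown below. It assumes you have a matrix with one probability per label per mention (for example from `classifier.predict_proba(mentions)`, converted to nested lists). The helper `labels_above_threshold`, the label order in `id2label`, and the probabilities are hypothetical and only serve as illustration.

```python
from typing import Dict, List, Sequence


def labels_above_threshold(
    probs: Sequence[Sequence[float]],
    id2label: Dict[int, str],
    threshold: float = 0.5,
) -> List[List[str]]:
    """Return, for each mention, the labels whose probability reaches the threshold."""
    return [
        [id2label[i] for i, p in enumerate(row) if p >= threshold]
        for row in probs
    ]


# Hypothetical label order and made-up probabilities, for illustration only
id2label = {0: "class membership", 1: "employment status", 2: "education level"}
probs = [
    [0.91, 0.12, 0.05],
    [0.08, 0.40, 0.77],
    [0.10, 0.85, 0.03],
]

default_labels = labels_above_threshold(probs, id2label)
lenient_labels = labels_above_threshold(probs, id2label, threshold=0.3)
print(default_labels)
print(lenient_labels)
```

A lower threshold trades precision for recall, so it is worth tuning it on a labeled development set from your target domain.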
### Mention embedding
```python
import torch
from sentence_transformers import SentenceTransformer

model_name = "haukelicht/all-mpnet-base-v2_economic-attributes-classifier"
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"

# Load the sentence transformer component of the pre-trained classifier
model = SentenceTransformer(model_name, device=device)

# Example mentions
mentions = ["working class people", "highly-educated professionals", "people without a stable job"]

# Compute mention embeddings
with torch.no_grad():
    embeddings = model.encode(mentions)
```
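One common way to use such embeddings is to compare mentions by cosine similarity. The sketch below is dependency-free and uses made-up three-dimensional vectors in place of real embeddings; the function `cosine_similarity` is an illustrative helper, not part of `sentence_transformers`.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up vectors standing in for rows of `embeddings`
toy_embeddings = [
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 0.9],
    [0.0, 1.0, 0.0],
]

sim_close = cosine_similarity(toy_embeddings[0], toy_embeddings[1])
sim_far = cosine_similarity(toy_embeddings[0], toy_embeddings[2])
print(sim_close > sim_far)
```

With real embeddings you would pass rows of the `embeddings` array in the same way.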
## Training Details
### Training Data
The train, dev, and test splits used for model finetuning and evaluation will be made available on GitHub upon publication of the associated research paper.
### Training Procedure
#### Training Hyperparameters
- num epochs: (1, 4)
- train batch sizes: (16, 4)
- body train max steps: 100
- head learning rate: 0.030
- L2 weight: 0.015
- warmup proportion: 0.10
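For readers who want to reuse these settings, the sketch below collects the values from the list above in a plain dictionary. The key names and the reading of the tuples as (embedding body, classifier head) are assumptions modeled on the conventions of `setfit`'s `TrainingArguments`; this card does not confirm that the author configured training this way. The helper `split_phases` is hypothetical.

```python
from typing import Any, Dict

# Values copied from the list above. Key names and the (body, head) reading
# of the tuples are ASSUMPTIONS modeled on setfit's TrainingArguments.
hyperparameters: Dict[str, Any] = {
    "num_epochs": (1, 4),
    "batch_size": (16, 4),
    "max_steps": 100,
    "head_learning_rate": 0.030,
    "l2_weight": 0.015,
    "warmup_proportion": 0.10,
}


def split_phases(value: Any) -> Dict[str, Any]:
    """Split a (body, head) tuple; reuse a scalar for both phases."""
    if isinstance(value, tuple):
        body, head = value
        return {"body": body, "head": head}
    return {"body": value, "head": value}


epochs = split_phases(hyperparameters["num_epochs"])
batch_sizes = split_phases(hyperparameters["batch_size"])
print(epochs)
print(batch_sizes)
```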
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
The train, dev, and test splits used for model finetuning and evaluation will be made available on GitHub upon publication of the associated research paper.
## Citation
**BibTeX:**
[More Information Needed]
**APA:**
[More Information Needed]
## Model Card Contact
hauke.licht@uibk.ac.at