---
license: afl-3.0
widget:
- text: >-
    To ask the Secretary of State for Energy and Climate Change what estimate he
    has made of the proportion of carbon dioxide emissions arising in the UK
    attributable to burning.
  example_title: English (UK House of Commons Question)
- text: >-
    To ask the Scottish Government what action it is taking to ensure that women
    who are prescribed sodium valproate are (a) adequately counselled regarding
    the risks of taking the drug while pregnant and (b) supported to plan their
    pregnancies in order to minimise the risk of foetal abnormalities.
  example_title: English (Scottish Parliamentary Question)
tags:
- CAP
- politics
- issues
- agenda
- multilingual
- science
- comparative agendas project
---

Multilingual BERT base (uncased) model trained to predict [CAP issue codes](https://www.comparativeagendas.net/pages/master-codebook).

The model was trained on 120,000 assorted political documents, mostly from the [Comparative Agendas Project](https://www.comparativeagendas.net/).

# Countries:

- Italy
- Sweden
- France
- Switzerland
- Poland
- Netherlands
- Germany
- Denmark
- Spain
- UK
- Austria
- Ireland

# Labels used in training

- Model labels -> CAP labels:
  - `{0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0, 4: 5.0, 5: 6.0, 6: 7.0, 7: 8.0, 8: 9.0, 9: 10.0, 10: 12.0, 11: 13.0, 12: 14.0, 13: 15.0, 14: 16.0, 15: 17.0, 16: 18.0, 17: 19.0, 18: 20.0, 19: 23.0}`

- Model labels -> CAP issues:
  - `{0: 'macroeconomics', 1: 'civil_rights', 2: 'healthcare', 3: 'agriculture', 4: 'labour', 5: 'education', 6: 'environment', 7: 'energy', 8: 'immigration', 9: 'transportation', 10: 'law_crime', 11: 'social_welfare', 12: 'housing', 13: 'domestic_commerce', 14: 'defense', 15: 'technology', 16: 'foreign_trade', 17: 'international_affairs', 18: 'government_operations', 19: 'culture'}`

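
The two mappings above can be combined to turn a predicted class into a CAP code and issue name. The following is a minimal sketch, not part of the released model: `decode_label` is a hypothetical helper, and it assumes the pipeline reports classes as `LABEL_<index>` strings (the default `transformers` naming when no custom label names are configured). The CAP codes are written as integers here, although the mapping above lists them as floats.

```python
from typing import Tuple

# Model class index -> CAP major topic code (copied from the mapping above, as ints).
MODEL_TO_CAP_CODE = {
    0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9, 9: 10,
    10: 12, 11: 13, 12: 14, 13: 15, 14: 16, 15: 17, 16: 18, 17: 19,
    18: 20, 19: 23,
}

# Model class index -> CAP issue name (copied from the mapping above).
MODEL_TO_CAP_ISSUE = {
    0: 'macroeconomics', 1: 'civil_rights', 2: 'healthcare', 3: 'agriculture',
    4: 'labour', 5: 'education', 6: 'environment', 7: 'energy',
    8: 'immigration', 9: 'transportation', 10: 'law_crime',
    11: 'social_welfare', 12: 'housing', 13: 'domestic_commerce',
    14: 'defense', 15: 'technology', 16: 'foreign_trade',
    17: 'international_affairs', 18: 'government_operations', 19: 'culture',
}


def decode_label(label: str) -> Tuple[int, str]:
    """Convert a label such as 'LABEL_7' (assumed format) to (CAP code, issue)."""
    index = int(label.rsplit("_", 1)[-1])  # also accepts a bare index like "7"
    return MODEL_TO_CAP_CODE[index], MODEL_TO_CAP_ISSUE[index]


print(decode_label("LABEL_7"))
```

If the model's configuration defines its own label names, the parsing step would need to change, but the two dictionaries would still apply.
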
# Validation

| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| 0 | 0.72 | 0.83 | 0.77 | 211 |
| 1 | 0.82 | 0.77 | 0.79 | 242 |
| 2 | 0.82 | 0.86 | 0.84 | 251 |
| 3 | 0.92 | 0.89 | 0.90 | 228 |
| 4 | 0.81 | 0.85 | 0.83 | 220 |
| 5 | 0.90 | 0.93 | 0.91 | 244 |
| 6 | 0.87 | 0.87 | 0.87 | 230 |
| 7 | 0.92 | 0.88 | 0.90 | 251 |
| 8 | 0.94 | 0.90 | 0.92 | 237 |
| 9 | 0.87 | 0.88 | 0.87 | 263 |
| 10 | 0.70 | 0.88 | 0.78 | 189 |
| 11 | 0.90 | 0.81 | 0.85 | 248 |
| 12 | 0.87 | 0.90 | 0.88 | 222 |
| 13 | 0.76 | 0.72 | 0.74 | 255 |
| 14 | 0.84 | 0.84 | 0.84 | 241 |
| 15 | 0.92 | 0.79 | 0.85 | 276 |
| 16 | 0.95 | 0.90 | 0.92 | 258 |
| 17 | 0.71 | 0.82 | 0.76 | 200 |
| 18 | 0.77 | 0.73 | 0.75 | 215 |
| 19 | 0.92 | 0.91 | 0.92 | 239 |
| Accuracy | 0.85 | | | |
| Macro Avg | 0.85 | 0.85 | 0.85 | 4720 |
| Weighted Avg | 0.85 | 0.85 | 0.85 | 4720 |

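
The macro and weighted averages in the validation table relate to the per-class rows in a simple way: the macro average is the unweighted mean over classes, while the weighted average weights each class by its support. A minimal sketch using the rounded F1 scores and supports from the table follows. Because the inputs are rounded to two decimals, the results only approximate the reported 0.85.

```python
# (F1-score, support) for classes 0..19, copied from the validation table.
ROWS = [
    (0.77, 211), (0.79, 242), (0.84, 251), (0.90, 228), (0.83, 220),
    (0.91, 244), (0.87, 230), (0.90, 251), (0.92, 237), (0.87, 263),
    (0.78, 189), (0.85, 248), (0.88, 222), (0.74, 255), (0.84, 241),
    (0.85, 276), (0.92, 258), (0.76, 200), (0.75, 215), (0.92, 239),
]

# Total number of validation examples across all classes.
total_support = sum(support for _, support in ROWS)

# Macro average: every class counts equally.
macro_f1 = sum(f1 for f1, _ in ROWS) / len(ROWS)

# Weighted average: each class counts in proportion to its support.
weighted_f1 = sum(f1 * support for f1, support in ROWS) / total_support

print(f"support={total_support} macro_f1={macro_f1:.3f} weighted_f1={weighted_f1:.3f}")
```

The two averages come out close to each other because the validation classes are roughly balanced (supports range from 189 to 276).
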
```python
from transformers import AutoModelForSequenceClassification
from transformers import TextClassificationPipeline, AutoTokenizer

# Load the fine-tuned classifier and its tokenizer from the Hugging Face Hub
mp = 'z-dickson/CAP_multilingual'
model = AutoModelForSequenceClassification.from_pretrained(mp)
tokenizer = AutoTokenizer.from_pretrained(mp)

# device=0 runs on the first GPU; use device=-1 to run on CPU
classifier = TextClassificationPipeline(tokenizer=tokenizer, model=model, device=0)

# Adjacent string literals are joined, so the text contains no stray backslashes or newlines
classifier(
    "To ask the Secretary of State for Energy and Climate "
    "Change what estimate he has made of the proportion of carbon "
    "dioxide emissions arising in the UK attributable to burning."
)
```
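
The usage example passes `device=0`, which assumes a CUDA GPU is present. On a machine without one, a fallback along the following lines may help. This is a sketch under the assumption that PyTorch is the backend; `pick_device` is a hypothetical helper and not part of `transformers`.

```python
def pick_device() -> int:
    """Return 0 (first GPU) when CUDA is usable, otherwise -1 (CPU)."""
    try:
        import torch  # assumed backend; imported lazily so the helper works without it
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


device = pick_device()
print(device)
```

The returned value can then be passed as the `device` argument when constructing `TextClassificationPipeline`.
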