rwillh11 and Alonadoli committed on
Commit 8fbec2c · verified · 1 Parent(s): 172c4b3

Update README.md (#1)


- Update README.md (e5f5d970b7f3b078f95addc82d88ef304f6479c1)


Co-authored-by: Alona Dolinsky <Alonadoli@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +224 -120
README.md CHANGED
---
library_name: transformers
tags:
- policy-detection
- political-science
- multilingual
- nli
- deberta
- group-appeals
language:
- en
- de
- nl
- da
- es
- fr
- it
- sv
base_model: MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7
---

# Model Card for mDeBERTa Policy Detection

A multilingual policy detection model fine-tuned for detecting policy mentions directed towards specific groups in political text.

## Model Details

### Model Description

This model is a fine-tuned mDeBERTa-v3-base that performs policy classification using Natural Language Inference (NLI) to determine whether political text contains specific policy proposals directed towards target groups.

- **Developed by:** Will Horne, Alona O. Dolinsky and Lena Maria Huber
- **Model type:** Sequence Classification (NLI-based policy detection)
- **Language(s) (NLP):** English, German (multilingual)
- **Finetuned from model:** MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7

### Model Sources

- **Repository:** rwillh11/mdeberta_NLI_policy_noContext
- **Base Model:** [MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7)
 
 
 
## Uses

### Direct Use

The model is designed for researchers analyzing whether political discourse contains policy proposals directed towards specific groups. It takes a political text and a target group as input and classifies whether the text contains a policy directed towards that group (policy/no policy).

Note that the model *does not* categorize the texts into designated policy areas (e.g., healthcare, education) but rather identifies the presence of any policy directed at the specified group.

### Downstream Use

This model can be integrated into larger political text analysis pipelines for:
- Political manifesto analysis
- Policy proposal detection in political communication
- Comparative political research across countries and languages
- Group-targeted policy analysis

### Out-of-Scope Use

This model should not be used for:
- General policy detection (not group-specific)
- Categorization of policies into specific policy areas
- Real-time social media monitoring without human oversight
- Making decisions about individuals or groups
- Content moderation without additional validation
## Bias, Risks, and Limitations

### Technical Limitations
- Trained specifically on political manifesto text; performance may vary on other text types
- Focal sentences without context may lack nuance present in full paragraphs
- Limited to two policy categories (policy, no policy)

### Bias Considerations
- Training data consists of political manifestos from specific countries and time periods
- May reflect biases present in political discourse of training data
- Policy detection may vary across different political contexts and group types

### Recommendations

Users should be aware that this model:
- Is designed for research purposes in political science
- Should be validated on specific domains before deployment
- May require human oversight for sensitive applications
- May perform differently across different types of groups and political contexts
## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "rwillh11/mdeberta_NLI_policy_noContext"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example usage
text = "We will increase funding for schools to better support students."
target_group = "students"

# Create hypotheses for each policy class
hypotheses = {
    "policy": f"The text contains a policy directed towards {target_group}.",
    "no policy": f"The text does not contain a policy directed towards {target_group}."
}

# Get predictions for each hypothesis
results = {}
for policy_class, hypothesis in hypotheses.items():
    inputs = tokenizer(text, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    entailment_prob = probs[0][0].item()  # Probability of entailment
    results[policy_class] = entailment_prob

# Select the policy class with the highest entailment probability
predicted_class = max(results, key=results.get)
print(f"Predicted policy classification for '{target_group}': {predicted_class}")
```
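For larger corpora, the per-text loop above is typically wrapped in a helper that maps (text, group) pairs to labels. The sketch below does exactly that with a stubbed scoring function so it runs without downloading the model; `classify_batch` and the stub are hypothetical illustrations, not part of the released code, and in practice `score` would call the model as in the quickstart above:

```python
def classify_batch(pairs, score):
    """Classify each (text, group) pair using a callable
    score(text, hypothesis) -> entailment probability."""
    out = []
    for text, group in pairs:
        hyps = {
            "policy": f"The text contains a policy directed towards {group}.",
            "no policy": f"The text does not contain a policy directed towards {group}.",
        }
        # Pick the hypothesis the scorer finds most entailed by the text
        out.append(max(hyps, key=lambda k: score(text, hyps[k])))
    return out

# Stub scorer for illustration: always favors the "contains a policy" hypothesis
stub = lambda text, hyp: 0.9 if "not" not in hyp else 0.1
print(classify_batch([("We will fund schools.", "students")], stub))  # ['policy']
```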
## Training Details

### Training Data

The model was trained on political manifesto data containing:
- **Languages:** English and German
- **Text Type:** Political manifesto sentences (focal sentences without context)
- **Labels:** Two-class policy classification (policy, no policy)
- **Groups:** Various political target groups (citizens, specific demographics, professions, etc.)
- **Original dataset:** 7,546 text-group pairs
  - **English:** 4,066 text-group pairs
  - **German:** 3,480 text-group pairs
- **Training Size:** ~6,037 original texts (80% split)
- **Test Size:** ~1,509 original texts (20% split)
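As a quick consistency check on the reported counts, the split sizes follow from applying the 80/20 split to the 7,546 pairs and rounding to the nearest whole text:

```python
# Counts as reported in this card
total = 4066 + 3480  # English + German text-group pairs
assert total == 7546

# 80/20 split, rounded to the nearest whole text
train, test = round(total * 0.8), round(total * 0.2)
print(train, test)  # 6037 1509
```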
137
 
138
  ### Training Procedure
139
 
140
+ #### Preprocessing
141
+ - Texts tokenized using mDeBERTa tokenizer with max length 512
142
+ - NLI format: premise (political text) + hypothesis (policy towards group)
143
+ - Each text paired with both true and false hypotheses for binary classification
 
 
#### Training Hyperparameters
- **Training regime:** Mixed precision training
- **Optimizer:** AdamW with weight decay
- **Learning rate:** Optimized via Optuna (range: 1e-5 to 4e-5)
- **Weight decay:** Optimized via Optuna (range: 0.01 to 0.3)
- **Warmup ratio:** Optimized via Optuna (range: 0.0 to 0.1)
- **Epochs:** 10 per trial
- **Batch size:** 16 (train and eval)
- **Trials:** 20 total
- **Metric for selection:** F1 Macro
- **Seed:** 42 (deterministic training)

#### Training Infrastructure
- **Hardware:** CUDA-enabled GPU
- **Framework:** Transformers, PyTorch
- **Hyperparameter optimization:** Optuna
- **Deterministic training:** All random seeds fixed
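The Optuna search ranges above can be sketched as a sampling space. This sketch substitutes the standard library's `random` for Optuna's `trial.suggest_float` so it runs standalone; it is illustrative only, not the authors' training code:

```python
import random

# Search ranges as reported in this card (sampling here is illustrative only)
def sample_trial(rng):
    return {
        "learning_rate": rng.uniform(1e-5, 4e-5),
        "weight_decay": rng.uniform(0.01, 0.3),
        "warmup_ratio": rng.uniform(0.0, 0.1),
    }

rng = random.Random(42)  # fixed seed, echoing the card's deterministic setup
trials = [sample_trial(rng) for _ in range(20)]  # 20 trials, 10 epochs each
assert all(1e-5 <= t["learning_rate"] <= 4e-5 for t in trials)
```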
 
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
- 20% holdout from the original dataset
- Multilingual political manifesto sentences

#### Factors
The model was evaluated across:
- **Languages:** English and German text
- **Additional validation on held-out sets:** English, German, Dutch, Danish, Spanish, French, Italian, Swedish
- **Policy classes:** Policy, no policy
- **Group types:** Various socio-demographic groups

#### Metrics
Primary metrics used for evaluation:
- **F1 Macro:** Primary optimization metric (treats all classes equally)
- **Balanced Accuracy:** Accounts for class imbalance
- **Precision/Recall (Macro):** Detailed performance measures
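To make "treats all classes equally" concrete, macro F1 averages the per-class F1 scores regardless of class frequency. A small hand computation on toy labels (illustrative only, not taken from the card's evaluation):

```python
def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Average per-class F1 scores, weighting each class equally."""
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall) if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Toy labels: 1 = policy, 0 = no policy
print(round(macro_f1([1, 1, 0, 0, 1], [1, 0, 0, 0, 1]), 3))  # 0.8
```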
### Results

**Best Model Performance (Trial 10, Epoch 9):**
- **Accuracy:** 0.872
- **Balanced Accuracy:** 0.874
- **Precision:** 0.869
- **Recall:** 0.874
- **F1 Macro:** 0.870

Additional validation on held-out sets returned the following metrics (all languages except English use texts translated from English):

| Language | Accuracy | Precision | Recall | F1 Macro |
|----------|----------|-----------|--------|----------|
| English  | 0.860    | 0.864     | 0.860  | 0.859    |
| German   | 0.814    | 0.814     | 0.814  | 0.814    |
| Dutch    | 0.847    | 0.847     | 0.847  | 0.847    |
| Danish   | 0.837    | 0.837     | 0.837  | 0.836    |
| Spanish  | 0.860    | 0.861     | 0.860  | 0.860    |
| French   | 0.837    | 0.837     | 0.837  | 0.837    |
| Italian  | 0.845    | 0.847     | 0.845  | 0.845    |
| Swedish  | 0.863    | 0.864     | 0.863  | 0.863    |

The model demonstrates strong performance across policy categories, with deterministic results confirmed through multiple prediction runs.

## Model Examination

The model uses Natural Language Inference to transform policy detection into a binary entailment task:
- For each text-group pair, it generates two hypotheses (policy/no policy)
- It selects the hypothesis with the highest entailment probability
- This approach leverages pre-trained NLI capabilities for policy classification
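The selection step described above reduces to an argmax over per-hypothesis entailment probabilities. A minimal sketch, with made-up probabilities for illustration:

```python
def pick_label(entailment_probs):
    """Return the hypothesis label with the highest entailment probability."""
    return max(entailment_probs, key=entailment_probs.get)

# Hypothetical entailment probabilities for one text-group pair
probs = {"policy": 0.91, "no policy": 0.07}
print(pick_label(probs))  # policy
```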
## Environmental Impact

Training involved hyperparameter optimization with 20 trials, each training for 10 epochs.

- **Hardware Type:** CUDA-enabled GPU
- **Hours used:** Estimated 10-15 hours (including hyperparameter search)
- **Cloud Provider:** Google Colab
- **Compute Region:** Variable
- **Carbon Emitted:** Not precisely measured

## Technical Specifications

### Model Architecture and Objective
- **Base Architecture:** mDeBERTa-v3-base (278M parameters)
- **Task:** Natural Language Inference for policy detection
- **Input:** Text pair (political sentence + policy hypothesis)
- **Output:** Binary classification (entailment/non-entailment)
- **Objective:** Cross-entropy loss with F1 Macro optimization

### Compute Infrastructure

#### Hardware
- GPU-accelerated training (CUDA)
- Mixed precision training support

#### Software
- Transformers library
- PyTorch framework
- Optuna for hyperparameter optimization
- scikit-learn for metrics

## Citation

If you use this model in your research, please cite:

**BibTeX:**
```bibtex
@misc{mdeberta_policy_nocontext,
  title={mDeBERTa Policy Detection Model for Political Group Appeals},
  author={Will Horne and Alona O. Dolinsky and Lena Maria Huber},
  year={2024},
  url={https://huggingface.co/rwillh11/mdeberta_NLI_policy_noContext}
}
```

## Model Card Authors

Research team studying group appeals in political discourse.

## Model Card Contact

For questions about this model, please open an issue in the repository or contact the research team through appropriate academic channels.