baglecake
/

ces-phase3a-lora

@@ -18,40 +18,40 @@ pipeline_tag: text-generation
 # CES Phase 3A LoRA: Leader Affect + Policy Positions
-This is the **recommended** model for predicting political ideology from demographics, leader thermometers, and wedge issues.
-## Performance
-| Model | Variables | r |
-|-------|-----------|---|
-| **Phase 3A (this model)** | Demographics + Leader Ratings + Wedge Issues | **0.560** |
-| Phase 3B | Same + Party ID | 0.574 |
-**Partisan Delta = 0.014** (essentially zero)
 ## Key Finding: "The Null Result of the Label"
-Adding party identification provides almost no improvement (+1.4%) over leader affect and policy positions alone.
 **What this means:**
 - Party identity is **redundant** — it's already encoded in how people feel about leaders and their policy positions
 - Canadian ideology is **substantive, not tribal** — people's "team" reflects their actual views
-- **Phase 3A is the preferred model** — predicts ideology without "cheating" by asking party affiliation
-## Variables
-### Demographics
-Age, gender, province, education, employment, religion, marital status, urban/rural, born in Canada
-### Leader Thermometers (0-100 ratings)
-- Justin Trudeau
-- Erin O'Toole
-- Jagmeet Singh
-### Wedge Issues
-- Carbon tax support
-- Energy sector/pipelines
-- Medical assistance in dying
 ## Usage
@@ -65,22 +65,61 @@ base_model = AutoModelForCausalLM.from_pretrained(
 )
 model = PeftModel.from_pretrained(base_model, "baglecake/ces-phase3a-lora")
 tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
 ```
 ## Training Details
 - **Base model**: meta-llama/Meta-Llama-3.1-8B-Instruct (4-bit quantized via Unsloth)
-- **Training data**: ~14,450 examples from CES 2021
 - **LoRA rank**: 32
 - **LoRA alpha**: 64
 - **Epochs**: 3
 - **Hardware**: NVIDIA A100 40GB (Colab Pro)
 ## Limitations
-1. **Narrow task**: Model only outputs ideology numbers (0-10).
 2. **Canadian-specific**: Trained on CES 2021 under Trudeau government.
-3. **Leader-specific**: Uses 2021 leader names.
 ## Citation

 # CES Phase 3A LoRA: Leader Affect + Policy Positions
+A LoRA adapter for Llama 3.1 8B Instruct that predicts political ideology from demographics, leader thermometer ratings, and wedge issue positions. This is the **recommended** model in the Phase 3 series.
+## Model Description
+This model was trained on the Canadian Election Study (CES) 2021 to predict self-reported ideology (0-10 left-right scale) from:
+- **Demographics**: Age, gender, province, education, employment, religion, marital status, urban/rural, born in Canada
+- **Leader Thermometers**: Ratings (0-100) of Justin Trudeau, Erin O'Toole, and Jagmeet Singh
+- **Wedge Issues**: Positions on carbon tax, energy/pipelines, and medical assistance in dying (MAID)
+- **Government Satisfaction**: Overall satisfaction with federal government
+## Performance
+| Model | Inputs | Correlation (r) |
+|-------|--------|-----------------|
+| Base Llama 8B | Demographics only | 0.03 |
+| GPT-4o-mini | Demographics only | 0.285 |
+| Phase 1 | Demographics only | 0.213 |
+| Phase 2 | + Gov satisfaction, economy, immigration | 0.428 |
+| **Phase 3A (this model)** | **+ Leader thermometers + wedge issues** | **0.560** |
+| Phase 3B | + Party ID | 0.574 |
 ## Key Finding: "The Null Result of the Label"
+We trained two versions of Phase 3:
+- **Phase 3A** (this model): Uses leader ratings and policy positions, but NOT party identification
+- **Phase 3B**: Adds party identification ("I usually think of myself as a Liberal/Conservative...")
+**Result**: Adding party ID only improves correlation by 0.014 (from 0.560 to 0.574).
 **What this means:**
 - Party identity is **redundant** — it's already encoded in how people feel about leaders and their policy positions
 - Canadian ideology is **substantive, not tribal** — people's "team" reflects their actual views
+- **Phase 3A is preferred** — predicts ideology without "cheating" by asking party affiliation
 ## Usage
 )
 model = PeftModel.from_pretrained(base_model, "baglecake/ces-phase3a-lora")
 tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
+# Example prompt
+system = """You are a 45-year-old man from Ontario, Canada. You live in a suburb of a large city. Your highest level of education is a bachelor's degree. You are currently employed full-time. You are married. You have children. You are Catholic. You were born in Canada.
+Political Profile:
+Leader Ratings: Justin Trudeau: 25/100, Erin O'Toole: 70/100, Jagmeet Singh: 30/100.
+Views: Strongly disagrees that the federal government should continue the carbon tax; strongly agrees that the government should do more to help the energy sector/pipelines.
+Overall Satisfaction: Is not at all satisfied with the federal government.
+Answer survey questions as this person would, based on their background and detailed political profile."""
+user = "On a scale from 0 to 10, where 0 means left/liberal and 10 means right/conservative, where would you place yourself politically? Just give the number."
+# Format as Llama chat and generate
 ```
+## Steerability
+The model is steerable — changing leader ratings and policy positions shifts predicted ideology:
+| Profile | Trudeau | O'Toole | Carbon Tax | Predicted |
+|---------|---------|---------|------------|-----------|
+| Liberal | 85/100 | 15/100 | Strongly agree | 3 (left) |
+| Conservative | 10/100 | 90/100 | Strongly disagree | 8 (right) |
+| Moderate | 50/100 | 55/100 | Neutral | 6 (center) |
+**5-point ideology swing** from profile changes alone, holding demographics constant.
 ## Training Details
 - **Base model**: meta-llama/Meta-Llama-3.1-8B-Instruct (4-bit quantized via Unsloth)
+- **Training data**: 14,452 examples from CES 2021
 - **LoRA rank**: 32
 - **LoRA alpha**: 64
+- **Target modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
 - **Epochs**: 3
 - **Hardware**: NVIDIA A100 40GB (Colab Pro)
+## Implications
+This model is ideal for:
+- Simulating political discourse with leader-specific affect
+- Agent-based models where leader ratings drive polarization
+- Studying how policy positions (not just party labels) shape ideology
+Not suitable for:
+- General political conversation (model only outputs 0-10 numbers)
+- Elections with different leaders (trained on 2021 Trudeau/O'Toole/Singh)
+- Predicting specific budget or policy preferences
 ## Limitations
+1. **Narrow task**: Model only outputs ideology numbers (0-10). Not suitable for general political conversation.
 2. **Canadian-specific**: Trained on CES 2021 under Trudeau government.
+3. **Leader-specific**: Uses 2021 leader names (Trudeau, O'Toole, Singh). Would need adaptation for different elections.
 ## Citation