---
license: mit
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
tags:
- llama
- lora
- political-science
- survey-replication
- canadian-election-study
- peft
- unsloth
datasets:
- custom
language:
- en
pipeline_tag: text-generation
---
# CES Phase 2 LoRA: Psychographic Ideology Prediction
A LoRA adapter for Llama 3.1 8B Instruct that predicts political ideology from demographics + psychographic attitudes.
## Model Description
This model was trained on the Canadian Election Study (CES) 2021 to predict self-reported ideology (0-10 left-right scale) from:
- **Demographics**: Age, gender, province, education, employment, religion, etc.
- **Psychographics**: Federal government satisfaction, economic retrospective, immigration views
## Performance
| Model | Ideology Correlation (r) |
|-------|-------------------------|
| Base Llama 8B | 0.03 |
| GPT-4o-mini | 0.285 |
| Phase 1 (demographics only) | 0.213 |
| **This model (demographics + psychographics)** | **0.428** |
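The correlations above are Pearson r between the model's predicted placement and the respondent's self-reported placement on the 0-10 scale. A minimal sketch of that evaluation metric (toy numbers for illustration, not actual CES data):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy example: model-predicted vs. self-reported ideology (0-10 scale)
predicted = [2, 5, 7, 4, 8, 3]
self_report = [1, 6, 8, 5, 7, 2]
print(round(pearson_r(predicted, self_report), 3))
```

In the actual evaluation, `predicted` would hold the numbers parsed from the model's generations and `self_report` the corresponding CES responses.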
## Usage
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "baglecake/ces-phase2-lora")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")

# Example prompt
system = """You are a 45-year-old man from Ontario, Canada. You live in a suburb of a large city. Your highest level of education is a bachelor's degree. You are currently employed full-time. You are married. You have children. You are Catholic and religion is somewhat important to you. You were born in Canada.
This person is not at all satisfied with the federal government, thinks the economy has gotten worse over the past year, and thinks Canada should admit fewer immigrants.
Answer survey questions as this person would, based on their background, experiences, and views. Give direct, concise answers."""
user = "On a scale from 0 to 10, where 0 means left/liberal and 10 means right/conservative, where would you place yourself politically? Just give the number."

# Format as a Llama chat conversation and generate
messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": user},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=8, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
## Training Details
- **Base model**: meta-llama/Meta-Llama-3.1-8B-Instruct (4-bit quantized via Unsloth)
- **Training data**: 14,456 examples from CES 2021
- **LoRA rank**: 32
- **LoRA alpha**: 64
- **Target modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Epochs**: 3
- **Hardware**: NVIDIA H100 80GB
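The LoRA hyperparameters above map to a `peft` configuration along these lines (a sketch; learning rate, scheduler, and other trainer settings are not documented in this card and are omitted):

```python
from peft import LoraConfig

# LoRA hyperparameters as listed above; remaining trainer settings
# are not documented here and are left out.
lora_config = LoraConfig(
    r=32,              # LoRA rank
    lora_alpha=64,     # LoRA alpha (scaling factor)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```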
## Steerability
The model is steerable: changing attitudes while holding demographics constant shifts the predicted ideology.
| Attitude Config | Predicted Ideology |
|-----------------|-------------------|
| Satisfied + Economy better + More immigrants | 2 (left) |
| Dissatisfied + Economy worse + Fewer immigrants | 6 (center-right) |
**4-point ideology swing** from attitude changes alone, holding demographics constant.
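This test can be reproduced by holding the demographic sentences fixed and swapping only the attitude sentences in the system prompt. A sketch of that prompt construction (the exact wording is an assumption modeled on the example in Usage):

```python
# Fixed demographic block (abbreviated; assumed wording based on the Usage example)
DEMOGRAPHICS = (
    "You are a 45-year-old man from Ontario, Canada. "
    "Your highest level of education is a bachelor's degree."
)

# Attitude configurations: only these sentences change between conditions
ATTITUDES = {
    "left_config": (
        "This person is very satisfied with the federal government, "
        "thinks the economy has gotten better over the past year, "
        "and thinks Canada should admit more immigrants."
    ),
    "right_config": (
        "This person is not at all satisfied with the federal government, "
        "thinks the economy has gotten worse over the past year, "
        "and thinks Canada should admit fewer immigrants."
    ),
}

def build_system_prompt(attitude_key):
    """Same demographics, different attitudes: isolates attitude-driven shift."""
    return (
        f"{DEMOGRAPHICS}\n{ATTITUDES[attitude_key]}\n"
        "Answer survey questions as this person would. "
        "Give direct, concise answers."
    )

for key in ATTITUDES:
    print(key, "->", build_system_prompt(key)[:60], "...")
```

Comparing the generated 0-10 placements across the two configurations gives the attitude-driven swing reported above.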
## Generalization to Unseen Questions
We tested the model on CES questions it was **never trained on**:
| Question Type | Example | Correlation (r) |
|--------------|---------|-----------------|
| **High-salience (Identity)** | COVID satisfaction | **0.60** |
| **High-salience (Identity)** | Carbon tax position | **0.49** |
| Low-salience (Policy) | Defence spending | 0.12 |
| Low-salience (Policy) | Environment spending | -0.12 |
### Key Finding
The model learned **political identity**, not policy platforms:
- **Carbon Tax** (r=0.49) vs **Environment Spending** (r=-0.12) — both are "about the environment" but carbon tax is a tribal identity marker while spending is a technocratic detail
- The 3 psychographic variables compress the "culture war" aspects of Canadian politics
- Model excels at identity/affect prediction, struggles with budget details
## Temporal Generalization
We tested the model on older CES surveys to measure temporal transfer:
| Election | Prime Minister | Correlation | Retention |
|----------|---------------|-------------|-----------|
| **2021** (training) | Trudeau (Liberal) | r = 0.428 | — |
| **2019** (same PM) | Trudeau (Liberal) | r = 0.353 | 82% |
| **2015** (different PM) | Harper (Conservative) | r = 0.206 | 49% |
**Key Finding**: The model is *government-specific*, not time-specific:
- **High transfer under same PM**: "Dissatisfied with Trudeau" maintains consistent left-right valence across 2019-2021
- **Poor transfer across PMs**: "Dissatisfied with Harper" has *opposite* valence (Liberal-leaning in 2015) from "dissatisfied with Trudeau" (Conservative-leaning in 2021)
This confirms that the psychographic compression captures incumbent-relative affect, not arbitrary noise.
### Implications
This model is ideal for:
- Simulating political discourse and polarization
- Agent-based models of partisan sorting
- Studying affective political identity
Not suitable for:
- Predicting specific policy preferences
- Budget allocation modeling
## Citation
```bibtex
@software{ces-phase2-lora,
title = {CES Phase 2 LoRA: Psychographic Ideology Prediction},
author = {Coburn, Del},
year = {2025},
url = {https://huggingface.co/baglecake/ces-phase2-lora}
}
```
## Part of émile-GCE
This model is part of the [émile-GCE](https://github.com/delcoburn/emile-gce) project for Generative Computational Ethnography.