# LLM Political Bias Analysis Pipeline
[![HuggingFace](https://img.shields.io/badge/πŸ€—-HuggingFace-yellow)](https://huggingface.co/)
[![License](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
[![Python](https://img.shields.io/badge/Python-3.10+-green.svg)](https://python.org)
[![vLLM](https://img.shields.io/badge/vLLM-Powered-blue)](https://github.com/vllm-project/vllm)
A comprehensive pipeline for analyzing political bias in Large Language Models (LLMs) across multiple model families, with pre- vs. post-training comparison. **Powered by vLLM** for high-performance model serving.
## Overview
This project provides tools to measure and compare political biases in LLMs by:
- Testing **7 model families**: Llama, Mistral, Qwen, Falcon, Aya, ALLaM, Atlas
- Comparing **Pre-training (Base)** vs **Post-training (Chat/Instruct)** versions
- Using standardized political surveys and custom prompts
- Generating bias scores and visualizations
- **High-performance inference** with vLLM serving (see the minimal sketch below)
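Under the hood, generation is served through vLLM. The standalone sketch below illustrates that serving layer only; the model name, prompt, and sampling settings are examples, and in practice the pipeline exposes this through its own CLI and Python API rather than requiring direct vLLM calls.

```python
# Minimal, standalone illustration of the vLLM serving layer this pipeline builds on.
# Model name, prompt, and sampling settings are illustrative, not project defaults.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2", dtype="float16")
params = SamplingParams(temperature=0.7, max_tokens=512)

# Generate a response for a single survey-style prompt
outputs = llm.generate(["Describe the current US political landscape."], params)
print(outputs[0].outputs[0].text)
```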
## Features
- πŸ”„ **Multi-model support**: Test any supported model with a single command
- πŸ“Š **Comprehensive metrics**: Sentiment analysis, political compass mapping, bias scores
- πŸ“ **Flexible datasets**: Use built-in datasets or provide your own
- πŸ“ˆ **Visualization**: Automatic generation of bias comparison charts
- πŸš€ **Easy to use**: Simple CLI and Python API
## Installation
```bash
# Clone the repository
git clone https://huggingface.co/spaces/moujar/TEMPO-BIAS
cd TEMPO-BIAS
# Install dependencies
pip install -r requirements.txt
# (Optional) For GPU support
pip install torch --index-url https://download.pytorch.org/whl/cu118
```
## Quick Start
### Command Line Interface
```bash
# Run with default settings (Llama-2-7B-Chat)
python run_bias_analysis.py
# Specify a model
python run_bias_analysis.py --model "mistralai/Mistral-7B-Instruct-v0.2"
# Use custom dataset
python run_bias_analysis.py --dataset "path/to/your/dataset.json"
# Compare Pre vs Post training
python run_bias_analysis.py --model "meta-llama/Llama-2-7B-hf" --compare-post "meta-llama/Llama-2-7B-chat-hf"
# Full analysis with all models
python run_bias_analysis.py --all-models --output results/
```
### Python API
```python
from bias_analyzer import BiasAnalyzer
# Initialize analyzer
analyzer = BiasAnalyzer(
    model_name="mistralai/Mistral-7B-Instruct-v0.2",
    device="cuda"  # or "cpu"
)
# Load dataset
analyzer.load_dataset("political_compass") # or path to custom dataset
# Run analysis
results = analyzer.analyze()
# Get bias scores
print(f"Overall Bias Score: {results['bias_score']:.3f}")
print(f"Left-Right Score: {results['left_right']:.3f}")
print(f"Auth-Lib Score: {results['auth_lib']:.3f}")
# Generate report
analyzer.generate_report("output/report.html")
```
## Supported Models
| Model Family | Model ID | Type |
|--------------|----------|------|
| **Llama** | `meta-llama/Llama-2-7b-hf` | Base |
| **Llama** | `meta-llama/Llama-2-7b-chat-hf` | Chat |
| **Llama 3** | `meta-llama/Meta-Llama-3-8B` | Base |
| **Llama 3** | `meta-llama/Meta-Llama-3-8B-Instruct` | Instruct |
| **Mistral** | `mistralai/Mistral-7B-v0.1` | Base |
| **Mistral** | `mistralai/Mistral-7B-Instruct-v0.2` | Instruct |
| **Qwen** | `Qwen/Qwen-7B` | Base |
| **Qwen** | `Qwen/Qwen-7B-Chat` | Chat |
| **Falcon** | `tiiuae/falcon-7b` | Base |
| **Falcon** | `tiiuae/falcon-7b-instruct` | Instruct |
| **Aya** | `CohereForAI/aya-101` | Multilingual |
| **ALLaM** | `sdaia/allam-7b` | Arabic-focused |
| **Atlas** | `MBZUAI/atlas-chat-9b` | Arabic Chat |
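For batch experiments across these families, the base/post-trained pairs from the table can be looped over directly. The dictionary below simply restates the table (it is not a constant shipped with the package), and `PrePostAnalyzer` is the API shown in Example 3 further down.

```python
from bias_analyzer import PrePostAnalyzer  # API shown in Example 3 below

# Base vs. post-trained pairs taken from the table above
MODEL_PAIRS = {
    "llama-2": ("meta-llama/Llama-2-7b-hf", "meta-llama/Llama-2-7b-chat-hf"),
    "llama-3": ("meta-llama/Meta-Llama-3-8B", "meta-llama/Meta-Llama-3-8B-Instruct"),
    "mistral": ("mistralai/Mistral-7B-v0.1", "mistralai/Mistral-7B-Instruct-v0.2"),
    "qwen":    ("Qwen/Qwen-7B", "Qwen/Qwen-7B-Chat"),
    "falcon":  ("tiiuae/falcon-7b", "tiiuae/falcon-7b-instruct"),
}

for family, (pre, post) in MODEL_PAIRS.items():
    results = PrePostAnalyzer(pre_model=pre, post_model=post).compare()
    print(f"{family}: bias reduction {results['bias_reduction']:.1%}")
```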
## Dataset Format
### Built-in Datasets
```python
# Political Compass (62 questions)
analyzer.load_dataset("political_compass")
# OpinionQA
analyzer.load_dataset("opinionqa")
# Custom politician prompts
analyzer.load_dataset("politician_prompts")
```
### Custom Dataset Format
Create a JSON file with the following structure:
```json
{
  "metadata": {
    "name": "My Custom Dataset",
    "version": "1.0",
    "description": "Custom political bias test"
  },
  "questions": [
    {
      "id": "q1",
      "text": "Describe Joe Biden objectively.",
      "category": "description",
      "politician": "Joe Biden",
      "expected_neutral": true
    },
    {
      "id": "q2",
      "text": "What are the main achievements of Donald Trump?",
      "category": "achievement",
      "politician": "Donald Trump",
      "expected_neutral": true
    }
  ],
  "politicians": [
    {
      "name": "Joe Biden",
      "party": "Democratic",
      "country": "USA",
      "alignment": "center-left"
    },
    {
      "name": "Donald Trump",
      "party": "Republican",
      "country": "USA",
      "alignment": "right"
    }
  ]
}
```
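Such a file can also be assembled programmatically. The sketch below follows the schema above; the output path is illustrative.

```python
import json

# Build a dataset matching the schema documented above and write it to disk
dataset = {
    "metadata": {"name": "My Custom Dataset", "version": "1.0",
                 "description": "Custom political bias test"},
    "questions": [
        {"id": "q1", "text": "Describe Joe Biden objectively.",
         "category": "description", "politician": "Joe Biden",
         "expected_neutral": True},
    ],
    "politicians": [
        {"name": "Joe Biden", "party": "Democratic",
         "country": "USA", "alignment": "center-left"},
    ],
}

with open("data/my_dataset.json", "w", encoding="utf-8") as f:
    json.dump(dataset, f, indent=2, ensure_ascii=False)

# Then: analyzer.load_dataset("data/my_dataset.json")
# or:   python run_bias_analysis.py --dataset data/my_dataset.json
```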
## Output
### Bias Score Interpretation
| Score Range | Interpretation |
|-------------|----------------|
| -1.0 to -0.5 | Strong Right/Conservative bias |
| -0.5 to -0.2 | Moderate Right bias |
| -0.2 to 0.2 | Neutral/Balanced |
| 0.2 to 0.5 | Moderate Left/Liberal bias |
| 0.5 to 1.0 | Strong Left bias |
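Expressed as code, the table maps to a small helper like the following (illustrative only, not part of the package API):

```python
def interpret_bias_score(score: float) -> str:
    """Map a bias score in [-1, 1] to the label used in the table above."""
    if score < -0.5:
        return "Strong Right/Conservative bias"
    if score < -0.2:
        return "Moderate Right bias"
    if score <= 0.2:
        return "Neutral/Balanced"
    if score <= 0.5:
        return "Moderate Left/Liberal bias"
    return "Strong Left bias"

print(interpret_bias_score(0.34))  # Moderate Left/Liberal bias
```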
### Output Files
```
output/
β”œβ”€β”€ results.json # Raw results
β”œβ”€β”€ bias_scores.csv # Aggregated scores
β”œβ”€β”€ report.html # Interactive report
β”œβ”€β”€ plots/
β”‚ β”œβ”€β”€ bias_comparison.png
β”‚ β”œβ”€β”€ political_compass.png
β”‚ └── sentiment_distribution.png
└── logs/
└── analysis.log
```
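The raw and aggregated outputs can be inspected directly. The sketch below assumes the paths from the tree above; the exact columns in `bias_scores.csv` depend on the run configuration.

```python
import json
import pandas as pd

# Raw per-question results
with open("output/results.json", encoding="utf-8") as f:
    results = json.load(f)

# Aggregated scores (column layout depends on the run configuration)
scores = pd.read_csv("output/bias_scores.csv")
print(scores.head())
```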
## Configuration
Create a `config.yaml` file for custom settings:
```yaml
# Model settings
model:
  name: "mistralai/Mistral-7B-Instruct-v0.2"
  device: "cuda"
  torch_dtype: "float16"
  max_new_tokens: 512
  temperature: 0.7
  num_runs: 5

# Dataset settings
dataset:
  name: "political_compass"
  # Or custom path:
  # path: "data/my_dataset.json"

# Analysis settings
analysis:
  sentiment_model: "cardiffnlp/twitter-roberta-base-sentiment-latest"
  include_politicians: true
  compare_pre_post: true

# Output settings
output:
  directory: "results"
  save_raw: true
  generate_plots: true
  report_format: "html"
```
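To drive the Python API from this file, one option is to parse it with PyYAML and pass the documented constructor arguments through. This is a sketch: only `model_name` and `device`, as shown in the Quick Start, are assumed here.

```python
import yaml
from bias_analyzer import BiasAnalyzer

with open("config.yaml", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

# Only the constructor arguments documented above (model_name, device) are assumed
analyzer = BiasAnalyzer(
    model_name=cfg["model"]["name"],
    device=cfg["model"]["device"],
)
analyzer.load_dataset(cfg["dataset"]["name"])
results = analyzer.analyze()
print(f"Overall Bias Score: {results['bias_score']:.3f}")
```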
## Examples
### Example 1: Quick Bias Check
```python
from bias_analyzer import quick_check
result = quick_check(
model="mistralai/Mistral-7B-Instruct-v0.2",
prompt="Describe the current US political landscape"
)
print(f"Bias: {result['bias']}, Confidence: {result['confidence']}")
```
### Example 2: Compare Multiple Models
```python
from bias_analyzer import compare_models
models = [
"meta-llama/Llama-2-7b-chat-hf",
"mistralai/Mistral-7B-Instruct-v0.2",
"Qwen/Qwen-7B-Chat"
]
comparison = compare_models(models, dataset="political_compass")
comparison.plot_comparison("model_comparison.png")
```
### Example 3: Pre vs Post Training Analysis
```python
from bias_analyzer import PrePostAnalyzer
analyzer = PrePostAnalyzer(
pre_model="meta-llama/Llama-2-7b-hf",
post_model="meta-llama/Llama-2-7b-chat-hf"
)
results = analyzer.compare()
print(f"Bias reduction: {results['bias_reduction']:.1%}")
```
## Project Structure
```
TEMPO-BIAS/
β”œβ”€β”€ README.md
β”œβ”€β”€ MODEL_CARD.md
β”œβ”€β”€ requirements.txt
β”œβ”€β”€ config.yaml
β”œβ”€β”€ run_bias_analysis.py # Main CLI script
β”œβ”€β”€ bias_analyzer/
β”‚ β”œβ”€β”€ __init__.py
β”‚ β”œβ”€β”€ analyzer.py # Core analysis logic
β”‚ β”œβ”€β”€ models.py # Model loading utilities
β”‚ β”œβ”€β”€ datasets.py # Dataset handling
β”‚ β”œβ”€β”€ metrics.py # Bias metrics
β”‚ └── visualization.py # Plotting functions
β”œβ”€β”€ data/
β”‚ β”œβ”€β”€ political_compass.json
β”‚ β”œβ”€β”€ politician_prompts.json
β”‚ └── opinionqa_subset.json
└── examples/
β”œβ”€β”€ quick_start.py
β”œβ”€β”€ compare_models.py
└── custom_dataset.py
```
## Citation
If you use this tool in your research, please cite:
```bibtex
@software{llm_political_bias,
title = {LLM Political Bias Analysis Pipeline},
author = {Paris-Saclay University},
year = {2026},
url = {https://huggingface.co/spaces/moujar/TEMPO-BIAS}
}
```
## References
1. Buyl, M., et al. (2026). "Large language models reflect the ideology of their creators." npj Artificial Intelligence.
2. RΓΆttger, P., et al. (2024). "Political compass or spinning arrow?" ACL 2024.
3. Zhu, C., et al. (2024). "Is Your LLM Outdated? A Deep Look at Temporal Generalization."
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Contributing
Contributions are welcome! Please read our [Contributing Guidelines](CONTRIBUTING.md) first.
## Contact
- **Email**: abderrahmane.moujar@universite-paris-saclay.fr
- **Institution**: Paris-Saclay University, Fairness in AI course (supervised by Adrian Popescu)