Redesign app to focus on experiment methodology
Major restructure prioritizing the AI experiment over voting results:
Tab 1: The Experiment - explains agent architecture, system prompts, process
Tab 2: System Prompt Explorer - view any country's system prompt
Tab 3: The Resolution - shows the motion text
Tab 4: Case Study Gaza Ceasefire - voting results (moved to secondary position)
Tab 5: Agent Response Inspector - compare prompt → response
Tab 6: All Responses - complete data table
Key improvements:
- Emphasizes this is an AI research experiment, not prediction
- Shows system prompts are generic templates
- Explains how AI infers positions from training data
- Clear disclaimers about limitations
- Links to open source code and prompts
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
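
For orientation, the results file the app reads appears to carry a `votes` list (each entry with `country`, `country_slug`, `vote`, `statement`) alongside `vote_summary` and `total_votes`. The sketch below is an assumption-laden illustration rather than code from this commit: `summarize_votes`, `VOTE_OPTIONS`, and the placeholder countries are hypothetical. It recounts the tally directly from `votes` so the displayed percentages do not depend on a precomputed summary.

```python
from collections import Counter

# The three vote values the app displays; assumed to be lowercase in the JSON.
VOTE_OPTIONS = ("yes", "no", "abstain")


def summarize_votes(data):
    """Recount votes from data['votes'] and return (counts, percentage shares)."""
    counts = Counter(v["vote"].lower() for v in data["votes"])
    summary = {k: counts[k] for k in VOTE_OPTIONS}  # Counter yields 0 for absent keys
    total = sum(summary.values())
    shares = {k: (100.0 * summary[k] / total if total else 0.0) for k in VOTE_OPTIONS}
    return summary, shares


# Hypothetical sample data shaped like the fields the diff reads.
sample = {
    "votes": [
        {"country": "Country A", "country_slug": "country-a", "vote": "yes", "statement": "..."},
        {"country": "Country B", "country_slug": "country-b", "vote": "yes", "statement": "..."},
        {"country": "Country C", "country_slug": "country-c", "vote": "no", "statement": "..."},
        {"country": "Country D", "country_slug": "country-d", "vote": "abstain", "statement": "..."},
    ]
}

summary, shares = summarize_votes(sample)
print(summary, shares)
```

Recounting from the raw list also guards against an empty or inconsistent `total_votes`, which would otherwise cause a division error in the f-string percentages.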
|
@@ -4,16 +4,32 @@ import pandas as pd
|
|
| 4 |
import plotly.graph_objects as go
|
| 5 |
from pathlib import Path
|
| 6 |
|
| 7 |
-
# Load
|
| 8 |
def load_data():
|
| 9 |
data_path = Path("tasks/reactions/01_gaza_ceasefire_resolution_latest.json")
|
| 10 |
with open(data_path, 'r') as f:
|
| 11 |
return json.load(f)
|
| 12 |
|
| 13 |
def create_vote_summary_chart(data):
|
| 14 |
-
"""Create a pie chart showing vote distribution"""
|
| 15 |
vote_summary = data['vote_summary']
|
| 16 |
-
|
| 17 |
fig = go.Figure(data=[go.Pie(
|
| 18 |
labels=['Yes', 'No', 'Abstain'],
|
| 19 |
values=[vote_summary['yes'], vote_summary['no'], vote_summary['abstain']],
|
|
@@ -21,182 +37,263 @@ def create_vote_summary_chart(data):
|
|
| 21 |
textinfo='label+value+percent',
|
| 22 |
textfont_size=16
|
| 23 |
)])
|
| 24 |
-
|
| 25 |
fig.update_layout(
|
| 26 |
-
title=f"
|
| 27 |
-
height=
|
| 28 |
showlegend=True
|
| 29 |
)
|
| 30 |
-
|
| 31 |
return fig
|
| 32 |
|
| 33 |
-
def
|
| 34 |
-
"""
|
| 35 |
-
votes = data['votes']
|
| 36 |
-
df = pd.DataFrame([
|
| 37 |
-
{
|
| 38 |
-
'Country': v['country'],
|
| 39 |
-
'Vote': v['vote'].upper(),
|
| 40 |
-
'Statement': v['statement'][:200] + '...' if len(v['statement']) > 200 else v['statement']
|
| 41 |
-
}
|
| 42 |
-
for v in votes
|
| 43 |
-
])
|
| 44 |
-
return df
|
| 45 |
-
|
| 46 |
-
def create_regional_breakdown(data):
|
| 47 |
-
"""Create regional vote breakdown (simplified grouping)"""
|
| 48 |
-
# Simplified regional classification
|
| 49 |
-
regions = {
|
| 50 |
-
'Middle East & North Africa': ['Afghanistan', 'Algeria', 'Bahrain', 'Egypt', 'Iran', 'Iraq', 'Israel',
|
| 51 |
-
'Jordan', 'Kuwait', 'Lebanon', 'Libya', 'Morocco', 'Oman', 'Palestine',
|
| 52 |
-
'Qatar', 'Saudi Arabia', 'Syria', 'Tunisia', 'United Arab Emirates', 'Yemen'],
|
| 53 |
-
'Europe': ['Albania', 'Andorra', 'Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czechia',
|
| 54 |
-
'Denmark', 'Estonia', 'Finland', 'France', 'Germany', 'Greece', 'Hungary', 'Iceland',
|
| 55 |
-
'Ireland', 'Italy', 'Latvia', 'Lithuania', 'Luxembourg', 'Malta', 'Monaco', 'Montenegro',
|
| 56 |
-
'Netherlands', 'North Macedonia', 'Norway', 'Poland', 'Portugal', 'Romania', 'San Marino',
|
| 57 |
-
'Serbia', 'Slovakia', 'Slovenia', 'Spain', 'Sweden', 'Switzerland', 'United Kingdom'],
|
| 58 |
-
'Asia-Pacific': ['Australia', 'Bangladesh', 'Bhutan', 'Brunei', 'Cambodia', 'China', 'Fiji', 'India',
|
| 59 |
-
'Indonesia', 'Japan', 'Kiribati', 'Laos', 'Malaysia', 'Maldives', 'Marshall Islands',
|
| 60 |
-
'Micronesia', 'Mongolia', 'Myanmar', 'Nauru', 'Nepal', 'New Zealand', 'Pakistan',
|
| 61 |
-
'Palau', 'Papua New Guinea', 'Philippines', 'Samoa', 'Singapore', 'Solomon Islands',
|
| 62 |
-
'South Korea', 'Sri Lanka', 'Thailand', 'Timor-Leste', 'Tonga', 'Tuvalu', 'Vanuatu', 'Vietnam'],
|
| 63 |
-
'Africa': ['Angola', 'Benin', 'Botswana', 'Burkina Faso', 'Burundi', 'Cameroon', 'Cape Verde',
|
| 64 |
-
'Central African Republic', 'Chad', 'Comoros', 'Congo', 'Côte d\'Ivoire', 'Djibouti',
|
| 65 |
-
'Equatorial Guinea', 'Eritrea', 'Eswatini', 'Ethiopia', 'Gabon', 'Gambia', 'Ghana',
|
| 66 |
-
'Guinea', 'Guinea-Bissau', 'Kenya', 'Lesotho', 'Liberia', 'Madagascar', 'Malawi',
|
| 67 |
-
'Mali', 'Mauritania', 'Mauritius', 'Mozambique', 'Namibia', 'Niger', 'Nigeria',
|
| 68 |
-
'Rwanda', 'São Tomé and Príncipe', 'Senegal', 'Seychelles', 'Sierra Leone', 'Somalia',
|
| 69 |
-
'South Africa', 'South Sudan', 'Sudan', 'Tanzania', 'Togo', 'Uganda', 'Zambia', 'Zimbabwe'],
|
| 70 |
-
'Americas': ['Antigua And Barbuda', 'Argentina', 'Bahamas', 'Barbados', 'Belize', 'Bolivia', 'Brazil',
|
| 71 |
-
'Canada', 'Chile', 'Colombia', 'Costa Rica', 'Cuba', 'Dominica', 'Dominican Republic',
|
| 72 |
-
'Ecuador', 'El Salvador', 'Grenada', 'Guatemala', 'Guyana', 'Haiti', 'Honduras', 'Jamaica',
|
| 73 |
-
'Mexico', 'Nicaragua', 'Panama', 'Paraguay', 'Peru', 'Saint Kitts And Nevis', 'Saint Lucia',
|
| 74 |
-
'Saint Vincent And The Grenadines', 'Suriname', 'Trinidad And Tobago', 'United States', 'Uruguay', 'Venezuela'],
|
| 75 |
-
'Eastern Europe & Central Asia': ['Armenia', 'Azerbaijan', 'Belarus', 'Georgia', 'Kazakhstan', 'Kyrgyzstan',
|
| 76 |
-
'Moldova', 'Russia', 'Tajikistan', 'Turkmenistan', 'Ukraine', 'Uzbekistan']
|
| 77 |
-
}
|
| 78 |
-
|
| 79 |
-
regional_votes = {region: {'yes': 0, 'no': 0, 'abstain': 0} for region in regions}
|
| 80 |
-
|
| 81 |
-
for vote in data['votes']:
|
| 82 |
-
country = vote['country']
|
| 83 |
-
vote_type = vote['vote']
|
| 84 |
-
|
| 85 |
-
for region, countries in regions.items():
|
| 86 |
-
if country in countries:
|
| 87 |
-
regional_votes[region][vote_type] += 1
|
| 88 |
-
break
|
| 89 |
-
|
| 90 |
-
# Create stacked bar chart
|
| 91 |
-
regions_list = list(regional_votes.keys())
|
| 92 |
-
yes_votes = [regional_votes[r]['yes'] for r in regions_list]
|
| 93 |
-
no_votes = [regional_votes[r]['no'] for r in regions_list]
|
| 94 |
-
abstain_votes = [regional_votes[r]['abstain'] for r in regions_list]
|
| 95 |
-
|
| 96 |
-
fig = go.Figure(data=[
|
| 97 |
-
go.Bar(name='Yes', x=regions_list, y=yes_votes, marker_color='#2ecc71'),
|
| 98 |
-
go.Bar(name='No', x=regions_list, y=no_votes, marker_color='#e74c3c'),
|
| 99 |
-
go.Bar(name='Abstain', x=regions_list, y=abstain_votes, marker_color='#f39c12')
|
| 100 |
-
])
|
| 101 |
-
|
| 102 |
-
fig.update_layout(
|
| 103 |
-
barmode='stack',
|
| 104 |
-
title='Regional Voting Breakdown',
|
| 105 |
-
xaxis_title='Region',
|
| 106 |
-
yaxis_title='Number of Countries',
|
| 107 |
-
height=500,
|
| 108 |
-
xaxis={'tickangle': -45}
|
| 109 |
-
)
|
| 110 |
-
|
| 111 |
-
return fig
|
| 112 |
-
|
| 113 |
-
def get_country_details(country_name, data):
|
| 114 |
-
"""Get detailed voting info for a specific country"""
|
| 115 |
if not country_name:
|
| 116 |
-
return "Select a country to see
|
| 117 |
|
| 118 |
for vote in data['votes']:
|
| 119 |
if vote['country'].lower() == country_name.lower():
|
| 120 |
-
|
| 121 |
-
|
|
|
|
| 122 |
|
| 123 |
-
|
| 124 |
-
|
| 125 |
-
**Statement:**
|
| 126 |
{vote['statement']}
|
| 127 |
"""
|
| 128 |
-
|
|
|
|
| 129 |
|
| 130 |
# Load data
|
| 131 |
data = load_data()
|
| 132 |
country_names = sorted([v['country'] for v in data['votes']])
|
|
|
|
| 133 |
|
| 134 |
# Create Gradio interface
|
| 135 |
-
with gr.Blocks(title="
|
| 136 |
-
gr.Markdown("""
|
| 137 |
-
# 🇺🇳 UN AI Agent Simulation: Gaza Ceasefire Resolution
|
| 138 |
|
| 139 |
-
|
| 140 |
-
|
| 141 |
|
| 142 |
-
|
| 143 |
|
| 144 |
-
|
| 145 |
-
|
| 146 |
-
|
| 147 |
-
- Total Countries: 195
|
| 148 |
""")
|
| 149 |
|
| 150 |
-
with gr.Tab("
|
| 151 |
-
gr.Markdown("
|
| 152 |
-
|
| 153 |
|
| 154 |
-
|
| 155 |
-
|
| 156 |
-
|
| 157 |
-
|
| 158 |
-
- **
|
| 159 |
""")
|
| 160 |
|
| 161 |
-
with gr.Tab("
|
| 162 |
-
gr.Markdown("
|
| 163 |
-
|
| 164 |
|
| 165 |
-
with gr.Tab("
|
| 166 |
-
gr.Markdown("
|
| 167 |
-
|
| 168 |
choices=country_names,
|
| 169 |
-
label="Select Country",
|
| 170 |
-
value=
|
| 171 |
)
|
| 172 |
-
country_details = gr.Markdown(value=get_country_details(country_names[0], data))
|
| 173 |
|
| 174 |
-
|
| 175 |
-
|
| 176 |
-
|
| 177 |
-
|
| 178 |
)
|
| 179 |
|
| 180 |
-
with gr.Tab("
|
| 181 |
-
gr.Markdown("### Complete voting record with statements")
|
| 182 |
-
|
| 183 |
-
|
| 184 |
height=600,
|
| 185 |
-
interactive=False
|
|
|
|
| 186 |
)
|
| 187 |
|
| 188 |
gr.Markdown("""
|
| 189 |
---
|
| 190 |
-
|
| 191 |
|
| 192 |
-
|
| 193 |
-
|
| 194 |
-
|
|
|
|
| 195 |
|
| 196 |
-
|
| 197 |
-
|
| 198 |
|
| 199 |
-
|
| 200 |
""")
|
| 201 |
|
| 202 |
if __name__ == "__main__":
|
|
|
|
| 4 |
import plotly.graph_objects as go
|
| 5 |
from pathlib import Path
|
| 6 |
|
| 7 |
+
# Load data
|
| 8 |
def load_data():
|
| 9 |
data_path = Path("tasks/reactions/01_gaza_ceasefire_resolution_latest.json")
|
| 10 |
with open(data_path, 'r') as f:
|
| 11 |
return json.load(f)
|
| 12 |
|
| 13 |
+
def load_system_prompt(country_slug):
|
| 14 |
+
"""Load the system prompt for a specific country"""
|
| 15 |
+
try:
|
| 16 |
+
prompt_path = Path(f"agents/representatives/{country_slug}/system-prompt.md")
|
| 17 |
+
with open(prompt_path, 'r') as f:
|
| 18 |
+
return f.read()
|
| 19 |
+
except OSError:
|
| 20 |
+
return "System prompt not found for this country."
|
| 21 |
+
|
| 22 |
+
def load_motion():
|
| 23 |
+
"""Load the ceasefire resolution text"""
|
| 24 |
+
try:
|
| 25 |
+
with open("tasks/motions/01_gaza_ceasefire_resolution.md", 'r') as f:
|
| 26 |
+
return f.read()
|
| 27 |
+
except OSError:
|
| 28 |
+
return "Motion text not found."
|
| 29 |
+
|
| 30 |
+
# Visualization functions
|
| 31 |
def create_vote_summary_chart(data):
|
|
|
|
| 32 |
vote_summary = data['vote_summary']
|
|
|
|
| 33 |
fig = go.Figure(data=[go.Pie(
|
| 34 |
labels=['Yes', 'No', 'Abstain'],
|
| 35 |
values=[vote_summary['yes'], vote_summary['no'], vote_summary['abstain']],
|
|
|
|
| 37 |
textinfo='label+value+percent',
|
| 38 |
textfont_size=16
|
| 39 |
)])
|
|
|
|
| 40 |
fig.update_layout(
|
| 41 |
+
title=f"Voting Results (Total: {data['total_votes']} countries)",
|
| 42 |
+
height=400,
|
| 43 |
showlegend=True
|
| 44 |
)
|
|
|
|
| 45 |
return fig
|
| 46 |
|
| 47 |
+
def get_country_response(country_name, data):
|
| 48 |
+
"""Get the full response for a specific country"""
|
| 49 |
if not country_name:
|
| 50 |
+
return "Select a country to see their full response", ""
|
| 51 |
|
| 52 |
for vote in data['votes']:
|
| 53 |
if vote['country'].lower() == country_name.lower():
|
| 54 |
+
vote_emoji = "✅" if vote['vote'] == 'yes' else "❌" if vote['vote'] == 'no' else "⚪"
|
| 55 |
+
response = f"""
|
| 56 |
+
## {vote_emoji} Vote: {vote['vote'].upper()}
|
| 57 |
|
| 58 |
+
### Diplomatic Statement:
|
|
|
|
|
|
|
| 59 |
{vote['statement']}
|
| 60 |
"""
|
| 61 |
+
return response, vote['country_slug']
|
| 62 |
+
return "Country not found", ""
|
| 63 |
|
| 64 |
# Load data
|
| 65 |
data = load_data()
|
| 66 |
country_names = sorted([v['country'] for v in data['votes']])
|
| 67 |
+
motion_text = load_motion()
|
| 68 |
|
| 69 |
# Create Gradio interface
|
| 70 |
+
with gr.Blocks(title="AI Agent UN Experiment", theme=gr.themes.Soft()) as demo:
|
|
|
|
|
|
|
| 71 |
|
| 72 |
+
gr.Markdown("""
|
| 73 |
+
# 🤖 AI Agent United Nations Experiment
|
| 74 |
|
| 75 |
+
## Simulating International Diplomacy with Large Language Models
|
| 76 |
|
| 77 |
+
This is an experimental research project that explores how AI can model international diplomatic behavior.
|
| 78 |
+
Each of the 195 simulated states (UN members and observers such as Palestine) is represented by an AI agent with a unique system prompt defining their
|
| 79 |
+
foreign policy positions, national interests, and diplomatic style.
|
|
|
|
| 80 |
""")
|
| 81 |
|
| 82 |
+
with gr.Tab("🔬 The Experiment"):
|
| 83 |
+
gr.Markdown("""
|
| 84 |
+
## How It Works
|
| 85 |
|
| 86 |
+
### 1. Agent Architecture
|
| 87 |
+
Each country is represented by an AI agent powered by **Claude 3.5 Sonnet** (claude-3-5-sonnet-20241022).
|
| 88 |
+
Every agent receives a unique system prompt that defines:
|
| 89 |
+
|
| 90 |
+
- **National Identity**: The country they represent and their role
|
| 91 |
+
- **Core Responsibilities**: How to advocate for their country's interests
|
| 92 |
+
- **Behavioral Guidelines**: Diplomatic style and historical context
|
| 93 |
+
- **Key Considerations**: Security, economic, and strategic factors
|
| 94 |
+
- **Decision Framework**: How to analyze and respond to resolutions
|
| 95 |
+
|
| 96 |
+
### 2. The System Prompts
|
| 97 |
+
|
| 98 |
+
The system prompts are **generic templates** - they do NOT contain country-specific foreign policy positions.
|
| 99 |
+
Instead, they instruct the AI to:
|
| 100 |
+
- Draw upon the country's historical positions (from the model's training data)
|
| 101 |
+
- Consider national security and economic interests
|
| 102 |
+
- Maintain appropriate diplomatic tone
|
| 103 |
+
- Think strategically about alliances and precedents
|
| 104 |
+
|
| 105 |
+
This means the AI agent must infer each country's likely position based on what it has learned
|
| 106 |
+
during training about that country's foreign policy, voting patterns, and geopolitical context.
|
| 107 |
+
|
| 108 |
+
### 3. The Process
|
| 109 |
+
|
| 110 |
+
1. **Input**: Each agent receives the same UN resolution text
|
| 111 |
+
2. **Processing**: The agent analyzes how the resolution affects their country's interests
|
| 112 |
+
3. **Output**: The agent produces a structured JSON response containing:
|
| 113 |
+
- A vote: YES, NO, or ABSTAIN
|
| 114 |
+
- A diplomatic statement explaining their position
|
| 115 |
+
|
| 116 |
+
### 4. What This Tests
|
| 117 |
+
|
| 118 |
+
This experiment explores:
|
| 119 |
+
- How well LLMs understand different countries' foreign policy positions
|
| 120 |
+
- Whether AI can model complex geopolitical decision-making
|
| 121 |
+
- The diversity of perspectives in international relations
|
| 122 |
+
- Multi-agent AI systems in realistic scenarios
|
| 123 |
+
|
| 124 |
+
### 5. Important Limitations
|
| 125 |
+
|
| 126 |
+
⚠️ **This is a simulation, not prediction:**
|
| 127 |
+
- The AI agents' positions are based on historical patterns in training data
|
| 128 |
+
- They do NOT represent actual government policies or intentions
|
| 129 |
+
- They should NOT be considered authoritative or predictive
|
| 130 |
+
- Real diplomacy involves classified information, domestic politics, and human judgment
|
| 131 |
""")
|
| 132 |
|
| 133 |
+
with gr.Tab("System Prompt Explorer"):
|
| 134 |
+
gr.Markdown("""
|
| 135 |
+
## Explore the Agent System Prompts
|
| 136 |
+
|
| 137 |
+
Select any country to view the exact system prompt their AI agent received.
|
| 138 |
+
Notice how the prompts are **identical in structure** - the only differences are:
|
| 139 |
+
- The country name
|
| 140 |
+
- Whether they're a P5 member (for veto power context)
|
| 141 |
+
|
| 142 |
+
The AI must infer everything else from its training data about each country.
|
| 143 |
+
""")
|
| 144 |
+
|
| 145 |
+
with gr.Row():
|
| 146 |
+
with gr.Column(scale=1):
|
| 147 |
+
country_selector = gr.Dropdown(
|
| 148 |
+
choices=country_names,
|
| 149 |
+
label="Select Country",
|
| 150 |
+
value="United States"
|
| 151 |
+
)
|
| 152 |
+
gr.Markdown("""
|
| 153 |
+
### Try comparing:
|
| 154 |
+
- **P5 members**: United States, China, Russia, United Kingdom, France
|
| 155 |
+
- **Regional powers**: Brazil, India, South Africa, Nigeria
|
| 156 |
+
- **Small states**: Palau, Tuvalu, Monaco
|
| 157 |
+
- **Key stakeholders**: Israel, Palestine, Egypt, Iran
|
| 158 |
+
""")
|
| 159 |
+
|
| 160 |
+
with gr.Column(scale=2):
|
| 161 |
+
system_prompt_display = gr.Markdown(
|
| 162 |
+
value=load_system_prompt("united-states"),
|
| 163 |
+
label="System Prompt"
|
| 164 |
+
)
|
| 165 |
+
|
| 166 |
+
country_selector.change(
|
| 167 |
+
fn=lambda country: load_system_prompt(data['votes'][[v['country'] for v in data['votes']].index(country)]['country_slug']),
|
| 168 |
+
inputs=country_selector,
|
| 169 |
+
outputs=system_prompt_display
|
| 170 |
+
)
|
| 171 |
|
| 172 |
+
with gr.Tab("The Resolution"):
|
| 173 |
+
gr.Markdown("""
|
| 174 |
+
## The Motion Presented to All Agents
|
| 175 |
+
|
| 176 |
+
Every AI agent received this exact same resolution text and was asked to vote on it.
|
| 177 |
+
|
| 178 |
+
**Resolution**: Support for Ceasefire Agreement in Gaza and Commitment to Lasting Peace
|
| 179 |
+
""")
|
| 180 |
+
|
| 181 |
+
gr.Markdown(motion_text)
|
| 182 |
+
|
| 183 |
+
with gr.Tab("🗳️ Case Study: Gaza Ceasefire"):
|
| 184 |
+
gr.Markdown("""
|
| 185 |
+
## Simulation Results
|
| 186 |
+
|
| 187 |
+
This tab shows the results when all 195 AI country agents voted on the ceasefire resolution.
|
| 188 |
+
This is ONE example of the experiment in action.
|
| 189 |
+
""")
|
| 190 |
+
|
| 191 |
+
with gr.Row():
|
| 192 |
+
with gr.Column():
|
| 193 |
+
vote_chart = gr.Plot(value=create_vote_summary_chart(data))
|
| 194 |
+
|
| 195 |
+
gr.Markdown(f"""
|
| 196 |
+
### Results Summary
|
| 197 |
+
- **Yes votes:** {data['vote_summary']['yes']} ({data['vote_summary']['yes']/data['total_votes']*100:.1f}%)
|
| 198 |
+
- **No votes:** {data['vote_summary']['no']} ({data['vote_summary']['no']/data['total_votes']*100:.1f}%)
|
| 199 |
+
- **Abstentions:** {data['vote_summary']['abstain']} ({data['vote_summary']['abstain']/data['total_votes']*100:.1f}%)
|
| 200 |
+
|
| 201 |
+
**Model**: {data['model']}
|
| 202 |
+
**Date**: {data['timestamp'][:10]}
|
| 203 |
+
""")
|
| 204 |
+
|
| 205 |
+
with gr.Tab("Agent Response Inspector"):
|
| 206 |
+
gr.Markdown("""
|
| 207 |
+
## Compare System Prompt → Agent Response
|
| 208 |
+
|
| 209 |
+
Select a country to see:
|
| 210 |
+
1. The system prompt they received
|
| 211 |
+
2. The vote and statement they produced
|
| 212 |
+
|
| 213 |
+
This shows how the generic prompt + the model's knowledge → specific diplomatic position
|
| 214 |
+
""")
|
| 215 |
+
|
| 216 |
+
country_inspector = gr.Dropdown(
|
| 217 |
choices=country_names,
|
| 218 |
+
label="Select Country to Inspect",
|
| 219 |
+
value="United States"
|
| 220 |
)
|
|
|
|
| 221 |
|
| 222 |
+
with gr.Row():
|
| 223 |
+
with gr.Column():
|
| 224 |
+
gr.Markdown("### System Prompt Received")
|
| 225 |
+
inspector_prompt = gr.Markdown(value=load_system_prompt("united-states"))
|
| 226 |
+
|
| 227 |
+
with gr.Column():
|
| 228 |
+
gr.Markdown("### Agent's Response")
|
| 229 |
+
inspector_response = gr.Markdown(value=get_country_response("United States", data)[0])
|
| 230 |
+
|
| 231 |
+
def update_inspector(country):
|
| 232 |
+
response, slug = get_country_response(country, data)
|
| 233 |
+
prompt = load_system_prompt(slug) if slug else "Country not found"
|
| 234 |
+
return prompt, response
|
| 235 |
+
|
| 236 |
+
country_inspector.change(
|
| 237 |
+
fn=update_inspector,
|
| 238 |
+
inputs=country_inspector,
|
| 239 |
+
outputs=[inspector_prompt, inspector_response]
|
| 240 |
)
|
| 241 |
|
| 242 |
+
with gr.Tab("All Responses"):
|
| 243 |
+
gr.Markdown("### Complete voting record with all diplomatic statements")
|
| 244 |
+
|
| 245 |
+
votes_data = pd.DataFrame([
|
| 246 |
+
{
|
| 247 |
+
'Country': v['country'],
|
| 248 |
+
'Vote': v['vote'].upper(),
|
| 249 |
+
'Statement': v['statement']
|
| 250 |
+
}
|
| 251 |
+
for v in data['votes']
|
| 252 |
+
])
|
| 253 |
+
|
| 254 |
+
gr.Dataframe(
|
| 255 |
+
value=votes_data,
|
| 256 |
height=600,
|
| 257 |
+
interactive=False,
|
| 258 |
+
column_widths=["15%", "10%", "75%"]
|
| 259 |
)
|
| 260 |
|
| 261 |
gr.Markdown("""
|
| 262 |
---
|
| 263 |
+
## About This Project
|
| 264 |
+
|
| 265 |
+
**AI Agent UN** is an experimental research project exploring multi-agent AI systems in international relations contexts.
|
| 266 |
+
|
| 267 |
+
### Key Points
|
| 268 |
|
| 269 |
+
✅ **What this is:**
|
| 270 |
+
- An AI experiment in modeling diplomatic behavior
|
| 271 |
+
- A research tool for understanding LLM capabilities
|
| 272 |
+
- An educational demonstration of international relations complexity
|
| 273 |
|
| 274 |
+
⚠️ **What this is NOT:**
|
| 275 |
+
- A prediction of actual government positions
|
| 276 |
+
- An authoritative source on foreign policy
|
| 277 |
+
- A replacement for real diplomatic analysis
|
| 278 |
+
|
| 279 |
+
### Open Source
|
| 280 |
+
|
| 281 |
+
This project is open source. All system prompts, code, and simulation results are available on GitHub.
|
| 282 |
+
|
| 283 |
+
- [GitHub Repository](https://github.com/danielrosehill/AI-Agent-UN)
|
| 284 |
+
- [Documentation](https://github.com/danielrosehill/AI-Agent-UN/blob/main/README.md)
|
| 285 |
+
- 🤖 [Agent Prompts](https://github.com/danielrosehill/AI-Agent-UN/tree/main/agents/representatives)
|
| 286 |
+
|
| 287 |
+
### Technical Details
|
| 288 |
+
|
| 289 |
+
- **Model**: Claude 3.5 Sonnet (claude-3-5-sonnet-20241022)
|
| 290 |
+
- **Countries**: 195 states (UN members and observers)
|
| 291 |
+
- **Output Format**: Structured JSON (vote + statement)
|
| 292 |
+
- **System Prompts**: Generic templates (no country-specific policies hardcoded)
|
| 293 |
+
|
| 294 |
+
---
|
| 295 |
|
| 296 |
+
*Built with [Gradio](https://gradio.app) | Powered by [Anthropic Claude](https://anthropic.com/claude)*
|
| 297 |
""")
|
| 298 |
|
| 299 |
if __name__ == "__main__":
|
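
The dropdown callbacks in this diff map a country name to its `country_slug` and then read `system-prompt.md` from that slug's directory. As a hedged sketch of one way to make that lookup cheaper and more forgiving, the helpers below build a name-to-slug dictionary once and read the prompt with an explicit encoding and a narrow exception. `build_slug_index` and `load_prompt` are hypothetical names, the `root` parameter is an assumption added so the example can run against a temporary directory, and the placeholder country is illustrative only.

```python
import tempfile
from pathlib import Path

NOT_FOUND = "System prompt not found for this country."


def build_slug_index(votes):
    """Map lowercase country names to slugs so each lookup is a single dict access."""
    return {v["country"].lower(): v["country_slug"] for v in votes}


def load_prompt(root, slug, missing=NOT_FOUND):
    """Read <root>/<slug>/system-prompt.md, returning a fallback message if unavailable."""
    if not slug:
        return missing
    path = Path(root) / slug / "system-prompt.md"
    try:
        return path.read_text(encoding="utf-8")
    except OSError:  # covers missing files and permission errors, nothing broader
        return missing


# Hypothetical data and a throwaway directory standing in for agents/representatives.
votes = [{"country": "Country A", "country_slug": "country-a"}]
index = build_slug_index(votes)

with tempfile.TemporaryDirectory() as root:
    prompt_dir = Path(root) / "country-a"
    prompt_dir.mkdir()
    (prompt_dir / "system-prompt.md").write_text("You represent Country A.", encoding="utf-8")

    found = load_prompt(root, index.get("country a", ""))
    absent = load_prompt(root, index.get("atlantis", ""))

print(found)
print(absent)
```

Compared with searching a list on every dropdown change, the dictionary returns an empty slug for unknown names instead of raising `ValueError`, so the callback can fall back to the not-found message.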