Spaces: danielrosehill/Agent-UN

danielrosehill Claude committed on Oct 9, 2025

Commit 617208b
1 Parent(s): c266ed1

Redesign app to focus on experiment methodology

Major restructure prioritizing the AI experiment over voting results:

Tab 1: The Experiment - explains agent architecture, system prompts, process
Tab 2: System Prompt Explorer - view any country's system prompt
Tab 3: The Resolution - shows the motion text
Tab 4: Case Study Gaza Ceasefire - voting results (moved to secondary position)
Tab 5: Agent Response Inspector - compare prompt → response
Tab 6: All Responses - complete data table
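
Tabs 2 and 5 both need to turn a country's display name into its system prompt. The following is a minimal, stdlib-only sketch of that lookup, not the code in this commit. It assumes the `country` and `country_slug` keys and the `agents/representatives/<slug>/system-prompt.md` layout that appear in the diff; the helper names `build_slug_index` and `prompt_for_country` and the `root` parameter are hypothetical.

```python
from pathlib import Path

# Assumed layout, mirroring the path used in app.py.
PROMPT_ROOT = Path("agents/representatives")


def build_slug_index(votes):
    """Map lower-cased country names to their `country_slug` values."""
    return {v["country"].lower(): v["country_slug"] for v in votes}


def load_system_prompt(country_slug, root=PROMPT_ROOT):
    """Return the prompt text, or a fallback message if the file is missing."""
    prompt_path = Path(root) / country_slug / "system-prompt.md"
    try:
        return prompt_path.read_text(encoding="utf-8")
    except OSError:
        # Catch only file-system errors so unrelated bugs still surface.
        return "System prompt not found for this country."


def prompt_for_country(country_name, slug_index, root=PROMPT_ROOT):
    """Resolve a display name to a slug, then load that country's prompt."""
    slug = slug_index.get(country_name.lower())
    if slug is None:
        return "Country not found"
    return load_system_prompt(slug, root)
```

Building the index once means a dropdown callback can do a dictionary lookup instead of scanning the vote list on every change, and an unknown name returns a message rather than raising.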

Key improvements:
- Emphasizes that this is an AI research experiment, not a prediction
- Shows system prompts are generic templates
- Explains how AI infers positions from training data
- Clear disclaimers about limitations
- Links to open source code and prompts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
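
For orientation, the results file that app.py reads appears to have the shape sketched here. The key names (`model`, `timestamp`, `total_votes`, `vote_summary`, `votes`, `country`, `country_slug`, `vote`, `statement`) are taken from the diff; every value in the sample is made up, and `summarize` and `check_consistency` are hypothetical helpers, not part of the commit.

```python
import json
from collections import Counter

# Illustrative sample only: keys mirror those read in app.py, values are invented.
SAMPLE = json.loads("""
{
  "model": "example-model",
  "timestamp": "2025-10-09T00:00:00",
  "total_votes": 3,
  "vote_summary": {"yes": 2, "no": 0, "abstain": 1},
  "votes": [
    {"country": "Exampleland", "country_slug": "exampleland",
     "vote": "yes", "statement": "Placeholder statement."},
    {"country": "Samplestan", "country_slug": "samplestan",
     "vote": "yes", "statement": "Placeholder statement."},
    {"country": "Testonia", "country_slug": "testonia",
     "vote": "abstain", "statement": "Placeholder statement."}
  ]
}
""")


def summarize(data):
    """Recount votes from the per-country records: {vote: (count, percent)}."""
    counts = Counter(v["vote"] for v in data["votes"])
    total = len(data["votes"])
    return {
        kind: (counts[kind], round(100 * counts[kind] / total, 1) if total else 0.0)
        for kind in ("yes", "no", "abstain")
    }


def check_consistency(data):
    """Return True when the stored summary agrees with a recount of `votes`."""
    recount = summarize(data)
    return data["total_votes"] == len(data["votes"]) and all(
        data["vote_summary"][kind] == recount[kind][0] for kind in recount
    )
```

A check like this could run once after loading, so a stale `vote_summary` is caught before the percentages are rendered.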

Files changed (1)

app.py +236 -139

@@ -4,16 +4,32 @@ import pandas as pd
 import plotly.graph_objects as go
 from pathlib import Path
-# Load the latest ceasefire resolution results
 def load_data():
     data_path = Path("tasks/reactions/01_gaza_ceasefire_resolution_latest.json")
     with open(data_path, 'r') as f:
         return json.load(f)
 def create_vote_summary_chart(data):
-    """Create a pie chart showing vote distribution"""
     vote_summary = data['vote_summary']
     fig = go.Figure(data=[go.Pie(
         labels=['Yes', 'No', 'Abstain'],
         values=[vote_summary['yes'], vote_summary['no'], vote_summary['abstain']],
@@ -21,182 +37,263 @@ def create_vote_summary_chart(data):
         textinfo='label+value+percent',
         textfont_size=16
     )])
     fig.update_layout(
-        title=f"Gaza Ceasefire Resolution Voting Results (Total: {data['total_votes']} countries)",
-        height=500,
         showlegend=True
     )
     return fig
-def create_votes_table(data):
-    """Create a detailed table of all votes"""
-    votes = data['votes']
-    df = pd.DataFrame([
-        {
-            'Country': v['country'],
-            'Vote': v['vote'].upper(),
-            'Statement': v['statement'][:200] + '...' if len(v['statement']) > 200 else v['statement']
-        }
-        for v in votes
-    ])
-    return df
-def create_regional_breakdown(data):
-    """Create regional vote breakdown (simplified grouping)"""
-    # Simplified regional classification
-    regions = {
-        'Middle East & North Africa': ['Afghanistan', 'Algeria', 'Bahrain', 'Egypt', 'Iran', 'Iraq', 'Israel',
-                                        'Jordan', 'Kuwait', 'Lebanon', 'Libya', 'Morocco', 'Oman', 'Palestine',
-                                        'Qatar', 'Saudi Arabia', 'Syria', 'Tunisia', 'United Arab Emirates', 'Yemen'],
-        'Europe': ['Albania', 'Andorra', 'Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czechia',
-                   'Denmark', 'Estonia', 'Finland', 'France', 'Germany', 'Greece', 'Hungary', 'Iceland',
-                   'Ireland', 'Italy', 'Latvia', 'Lithuania', 'Luxembourg', 'Malta', 'Monaco', 'Montenegro',
-                   'Netherlands', 'North Macedonia', 'Norway', 'Poland', 'Portugal', 'Romania', 'San Marino',
-                   'Serbia', 'Slovakia', 'Slovenia', 'Spain', 'Sweden', 'Switzerland', 'United Kingdom'],
-        'Asia-Pacific': ['Australia', 'Bangladesh', 'Bhutan', 'Brunei', 'Cambodia', 'China', 'Fiji', 'India',
-                        'Indonesia', 'Japan', 'Kiribati', 'Laos', 'Malaysia', 'Maldives', 'Marshall Islands',
-                        'Micronesia', 'Mongolia', 'Myanmar', 'Nauru', 'Nepal', 'New Zealand', 'Pakistan',
-                        'Palau', 'Papua New Guinea', 'Philippines', 'Samoa', 'Singapore', 'Solomon Islands',
-                        'South Korea', 'Sri Lanka', 'Thailand', 'Timor-Leste', 'Tonga', 'Tuvalu', 'Vanuatu', 'Vietnam'],
-        'Africa': ['Angola', 'Benin', 'Botswana', 'Burkina Faso', 'Burundi', 'Cameroon', 'Cape Verde',
-                   'Central African Republic', 'Chad', 'Comoros', 'Congo', 'Côte d\'Ivoire', 'Djibouti',
-                   'Equatorial Guinea', 'Eritrea', 'Eswatini', 'Ethiopia', 'Gabon', 'Gambia', 'Ghana',
-                   'Guinea', 'Guinea-Bissau', 'Kenya', 'Lesotho', 'Liberia', 'Madagascar', 'Malawi',
-                   'Mali', 'Mauritania', 'Mauritius', 'Mozambique', 'Namibia', 'Niger', 'Nigeria',
-                   'Rwanda', 'São Tomé and Príncipe', 'Senegal', 'Seychelles', 'Sierra Leone', 'Somalia',
-                   'South Africa', 'South Sudan', 'Sudan', 'Tanzania', 'Togo', 'Uganda', 'Zambia', 'Zimbabwe'],
-        'Americas': ['Antigua And Barbuda', 'Argentina', 'Bahamas', 'Barbados', 'Belize', 'Bolivia', 'Brazil',
-                    'Canada', 'Chile', 'Colombia', 'Costa Rica', 'Cuba', 'Dominica', 'Dominican Republic',
-                    'Ecuador', 'El Salvador', 'Grenada', 'Guatemala', 'Guyana', 'Haiti', 'Honduras', 'Jamaica',
-                    'Mexico', 'Nicaragua', 'Panama', 'Paraguay', 'Peru', 'Saint Kitts And Nevis', 'Saint Lucia',
-                    'Saint Vincent And The Grenadines', 'Suriname', 'Trinidad And Tobago', 'United States', 'Uruguay', 'Venezuela'],
-        'Eastern Europe & Central Asia': ['Armenia', 'Azerbaijan', 'Belarus', 'Georgia', 'Kazakhstan', 'Kyrgyzstan',
-                                          'Moldova', 'Russia', 'Tajikistan', 'Turkmenistan', 'Ukraine', 'Uzbekistan']
-    }
-    regional_votes = {region: {'yes': 0, 'no': 0, 'abstain': 0} for region in regions}
-    for vote in data['votes']:
-        country = vote['country']
-        vote_type = vote['vote']
-        for region, countries in regions.items():
-            if country in countries:
-                regional_votes[region][vote_type] += 1
-                break
-    # Create stacked bar chart
-    regions_list = list(regional_votes.keys())
-    yes_votes = [regional_votes[r]['yes'] for r in regions_list]
-    no_votes = [regional_votes[r]['no'] for r in regions_list]
-    abstain_votes = [regional_votes[r]['abstain'] for r in regions_list]
-    fig = go.Figure(data=[
-        go.Bar(name='Yes', x=regions_list, y=yes_votes, marker_color='#2ecc71'),
-        go.Bar(name='No', x=regions_list, y=no_votes, marker_color='#e74c3c'),
-        go.Bar(name='Abstain', x=regions_list, y=abstain_votes, marker_color='#f39c12')
-    ])
-    fig.update_layout(
-        barmode='stack',
-        title='Regional Voting Breakdown',
-        xaxis_title='Region',
-        yaxis_title='Number of Countries',
-        height=500,
-        xaxis={'tickangle': -45}
-    )
-    return fig
-def get_country_details(country_name, data):
-    """Get detailed voting info for a specific country"""
     if not country_name:
-        return "Select a country to see details"
     for vote in data['votes']:
         if vote['country'].lower() == country_name.lower():
-            return f"""
-**Country:** {vote['country']}
-**Vote:** {vote['vote'].upper()}
-**Statement:**
 {vote['statement']}
             """
-    return "Country not found"
 # Load data
 data = load_data()
 country_names = sorted([v['country'] for v in data['votes']])
 # Create Gradio interface
-with gr.Blocks(title="UN AI Agent Simulation - Gaza Ceasefire Resolution", theme=gr.themes.Soft()) as demo:
-    gr.Markdown("""
-    # 🇺🇳 UN AI Agent Simulation: Gaza Ceasefire Resolution
-    An experimental Model United Nations simulation where AI agents representing 195 countries vote on a ceasefire resolution.
-    Each agent embodies the foreign policy positions, diplomatic style, and national interests of their country.
-    **Motion:** Support for Ceasefire Agreement in Gaza and Commitment to Lasting Peace
-    **Simulation Details:**
-    - Model: Claude 3.5 Sonnet
-    - Date: October 9, 2025
-    - Total Countries: 195
     """)
-    with gr.Tab("📊 Vote Summary"):
-        gr.Markdown("### Overall Voting Results")
-        vote_chart = gr.Plot(value=create_vote_summary_chart(data))
-        gr.Markdown(f"""
-        ### Key Statistics
-        - **Yes votes:** {data['vote_summary']['yes']} ({data['vote_summary']['yes']/data['total_votes']*100:.1f}%)
-        - **No votes:** {data['vote_summary']['no']} ({data['vote_summary']['no']/data['total_votes']*100:.1f}%)
-        - **Abstentions:** {data['vote_summary']['abstain']} ({data['vote_summary']['abstain']/data['total_votes']*100:.1f}%)
         """)
-    with gr.Tab("🌍 Regional Analysis"):
-        gr.Markdown("### How different regions voted")
-        regional_chart = gr.Plot(value=create_regional_breakdown(data))
-    with gr.Tab("🔍 Country Details"):
-        gr.Markdown("### Search for a specific country's vote and statement")
-        country_dropdown = gr.Dropdown(
             choices=country_names,
-            label="Select Country",
-            value=country_names[0]
         )
-        country_details = gr.Markdown(value=get_country_details(country_names[0], data))
-        country_dropdown.change(
-            fn=lambda x: get_country_details(x, data),
-            inputs=country_dropdown,
-            outputs=country_details
         )
-    with gr.Tab("📋 All Votes"):
-        gr.Markdown("### Complete voting record with statements")
-        votes_table = gr.Dataframe(
-            value=create_votes_table(data),
             height=600,
-            interactive=False
         )
     gr.Markdown("""
     ---
-    ### About This Simulation
-    This is an AI experiment exploring how large language models can simulate international diplomatic interactions.
-    Each country is represented by an AI agent with detailed system prompts defining their foreign policy positions,
-    historical stances, and diplomatic style.
-    **⚠️ Disclaimer:** This is a simulation for research and educational purposes. The AI agents' positions do not
-    represent actual government policies or diplomatic stances.
-    📂 [View on GitHub](https://github.com/yourusername/AI-Agent-UN)
     """)
 if __name__ == "__main__":

 import plotly.graph_objects as go
 from pathlib import Path
+# Load data
 def load_data():
     data_path = Path("tasks/reactions/01_gaza_ceasefire_resolution_latest.json")
     with open(data_path, 'r') as f:
         return json.load(f)
+def load_system_prompt(country_slug):
+    """Load the system prompt for a specific country"""
+    try:
+        prompt_path = Path(f"agents/representatives/{country_slug}/system-prompt.md")
+        with open(prompt_path, 'r') as f:
+            return f.read()
+    except:
+        return "System prompt not found for this country."
+def load_motion():
+    """Load the ceasefire resolution text"""
+    try:
+        with open("tasks/motions/01_gaza_ceasefire_resolution.md", 'r') as f:
+            return f.read()
+    except:
+        return "Motion text not found."
+# Visualization functions
 def create_vote_summary_chart(data):
     vote_summary = data['vote_summary']
     fig = go.Figure(data=[go.Pie(
         labels=['Yes', 'No', 'Abstain'],
         values=[vote_summary['yes'], vote_summary['no'], vote_summary['abstain']],
         textinfo='label+value+percent',
         textfont_size=16
     )])
     fig.update_layout(
+        title=f"Voting Results (Total: {data['total_votes']} countries)",
+        height=400,
         showlegend=True
     )
     return fig
+def get_country_response(country_name, data):
+    """Get the full response for a specific country"""
     if not country_name:
+        return "Select a country to see their full response", ""
     for vote in data['votes']:
         if vote['country'].lower() == country_name.lower():
+            vote_emoji = "✅" if vote['vote'] == 'yes' else "❌" if vote['vote'] == 'no' else "⚪"
+            response = f"""
+## {vote_emoji} Vote: {vote['vote'].upper()}
+### Diplomatic Statement:
 {vote['statement']}
             """
+            return response, vote['country_slug']
+    return "Country not found", ""
 # Load data
 data = load_data()
 country_names = sorted([v['country'] for v in data['votes']])
+motion_text = load_motion()
 # Create Gradio interface
+with gr.Blocks(title="AI Agent UN Experiment", theme=gr.themes.Soft()) as demo:
+    gr.Markdown("""
+    # 🤖 AI Agent United Nations Experiment
+    ## Simulating International Diplomacy with Large Language Models
+    This is an experimental research project that explores how AI can model international diplomatic behavior.
+    Each of the 195 UN member states is represented by an AI agent with a unique system prompt defining their
+    foreign policy positions, national interests, and diplomatic style.
     """)
+    with gr.Tab("🔬 The Experiment"):
+        gr.Markdown("""
+        ## How It Works
+        ### 1. Agent Architecture
+        Each country is represented by an AI agent powered by **Claude 3.5 Sonnet** (claude-3-5-sonnet-20241022).
+        Every agent receives a unique system prompt that defines:
+        - **National Identity**: The country they represent and their role
+        - **Core Responsibilities**: How to advocate for their country's interests
+        - **Behavioral Guidelines**: Diplomatic style and historical context
+        - **Key Considerations**: Security, economic, and strategic factors
+        - **Decision Framework**: How to analyze and respond to resolutions
+        ### 2. The System Prompts
+        The system prompts are **generic templates** - they do NOT contain country-specific foreign policy positions.
+        Instead, they instruct the AI to:
+        - Draw upon the country's historical positions (from the model's training data)
+        - Consider national security and economic interests
+        - Maintain appropriate diplomatic tone
+        - Think strategically about alliances and precedents
+        This means the AI agent must infer each country's likely position based on what it has learned
+        during training about that country's foreign policy, voting patterns, and geopolitical context.
+        ### 3. The Process
+        1. **Input**: Each agent receives the same UN resolution text
+        2. **Processing**: The agent analyzes how the resolution affects their country's interests
+        3. **Output**: The agent produces a structured JSON response containing:
+           - A vote: YES, NO, or ABSTAIN
+           - A diplomatic statement explaining their position
+        ### 4. What This Tests
+        This experiment explores:
+        - How well LLMs understand different countries' foreign policy positions
+        - Whether AI can model complex geopolitical decision-making
+        - The diversity of perspectives in international relations
+        - Multi-agent AI systems in realistic scenarios
+        ### 5. Important Limitations
+        ⚠️ **This is a simulation, not prediction:**
+        - The AI agents' positions are based on historical patterns in training data
+        - They do NOT represent actual government policies or intentions
+        - They should NOT be considered authoritative or predictive
+        - Real diplomacy involves classified information, domestic politics, and human judgment
         """)
+    with gr.Tab("📋 System Prompt Explorer"):
+        gr.Markdown("""
+        ## Explore the Agent System Prompts
+        Select any country to view the exact system prompt their AI agent received.
+        Notice how the prompts are **identical in structure** - the only differences are:
+        - The country name
+        - Whether they're a P5 member (for veto power context)
+        The AI must infer everything else from its training data about each country.
+        """)
+        with gr.Row():
+            with gr.Column(scale=1):
+                country_selector = gr.Dropdown(
+                    choices=country_names,
+                    label="Select Country",
+                    value="United States"
+                )
+                gr.Markdown("""
+                ### Try comparing:
+                - **P5 members**: United States, China, Russia, United Kingdom, France
+                - **Regional powers**: Brazil, India, South Africa, Nigeria
+                - **Small states**: Palau, Tuvalu, Monaco
+                - **Key stakeholders**: Israel, Palestine, Egypt, Iran
+                """)
+            with gr.Column(scale=2):
+                system_prompt_display = gr.Markdown(
+                    value=load_system_prompt("united-states"),
+                    label="System Prompt"
+                )
+        country_selector.change(
+            fn=lambda country: load_system_prompt(data['votes'][[v['country'] for v in data['votes']].index(country)]['country_slug']),
+            inputs=country_selector,
+            outputs=system_prompt_display
+        )
+    with gr.Tab("📜 The Resolution"):
+        gr.Markdown("""
+        ## The Motion Presented to All Agents
+        Every AI agent received this exact same resolution text and was asked to vote on it.
+        **Resolution**: Support for Ceasefire Agreement in Gaza and Commitment to Lasting Peace
+        """)
+        gr.Markdown(motion_text)
+    with gr.Tab("🗳️ Case Study: Gaza Ceasefire"):
+        gr.Markdown("""
+        ## Simulation Results
+        This tab shows the results when all 195 AI country agents voted on the ceasefire resolution.
+        This is ONE example of the experiment in action.
+        """)
+        with gr.Row():
+            with gr.Column():
+                vote_chart = gr.Plot(value=create_vote_summary_chart(data))
+                gr.Markdown(f"""
+                ### Results Summary
+                - **Yes votes:** {data['vote_summary']['yes']} ({data['vote_summary']['yes']/data['total_votes']*100:.1f}%)
+                - **No votes:** {data['vote_summary']['no']} ({data['vote_summary']['no']/data['total_votes']*100:.1f}%)
+                - **Abstentions:** {data['vote_summary']['abstain']} ({data['vote_summary']['abstain']/data['total_votes']*100:.1f}%)
+                **Model**: {data['model']}
+                **Date**: {data['timestamp'][:10]}
+                """)
+    with gr.Tab("🔍 Agent Response Inspector"):
+        gr.Markdown("""
+        ## Compare System Prompt → Agent Response
+        Select a country to see:
+        1. The system prompt they received
+        2. The vote and statement they produced
+        This shows how the generic prompt + the model's knowledge → specific diplomatic position
+        """)
+        country_inspector = gr.Dropdown(
             choices=country_names,
+            label="Select Country to Inspect",
+            value="United States"
         )
+        with gr.Row():
+            with gr.Column():
+                gr.Markdown("### System Prompt Received")
+                inspector_prompt = gr.Markdown(value=load_system_prompt("united-states"))
+            with gr.Column():
+                gr.Markdown("### Agent's Response")
+                inspector_response = gr.Markdown(value=get_country_response("United States", data)[0])
+        def update_inspector(country):
+            response, slug = get_country_response(country, data)
+            prompt = load_system_prompt(slug) if slug else "Country not found"
+            return prompt, response
+        country_inspector.change(
+            fn=update_inspector,
+            inputs=country_inspector,
+            outputs=[inspector_prompt, inspector_response]
         )
+    with gr.Tab("📊 All Responses"):
+        gr.Markdown("### Complete voting record with all diplomatic statements")
+        votes_data = pd.DataFrame([
+            {
+                'Country': v['country'],
+                'Vote': v['vote'].upper(),
+                'Statement': v['statement']
+            }
+            for v in data['votes']
+        ])
+        gr.Dataframe(
+            value=votes_data,
             height=600,
+            interactive=False,
+            column_widths=["15%", "10%", "75%"]
         )
     gr.Markdown("""
     ---
+    ## About This Project
+    **AI Agent UN** is an experimental research project exploring multi-agent AI systems in international relations contexts.
+    ### Key Points
+    ✅ **What this is:**
+    - An AI experiment in modeling diplomatic behavior
+    - A research tool for understanding LLM capabilities
+    - An educational demonstration of international relations complexity
+    ⚠️ **What this is NOT:**
+    - A prediction of actual government positions
+    - An authoritative source on foreign policy
+    - A replacement for real diplomatic analysis
+    ### Open Source
+    This project is open source. All system prompts, code, and simulation results are available on GitHub.
+    - 📂 [GitHub Repository](https://github.com/danielrosehill/AI-Agent-UN)
+    - 📖 [Documentation](https://github.com/danielrosehill/AI-Agent-UN/blob/main/README.md)
+    - 🤖 [Agent Prompts](https://github.com/danielrosehill/AI-Agent-UN/tree/main/agents/representatives)
+    ### Technical Details
+    - **Model**: Claude 3.5 Sonnet (claude-3-5-sonnet-20241022)
+    - **Countries**: 195 UN member states
+    - **Output Format**: Structured JSON (vote + statement)
+    - **System Prompts**: Generic templates (no country-specific policies hardcoded)
+    ---
+    *Built with [Gradio](https://gradio.app) | Powered by [Anthropic Claude](https://anthropic.com/claude)*
     """)
 if __name__ == "__main__":