Patricksturg committed on
Commit
27b9e9b
·
verified ·
1 Parent(s): bad4b27

Upload 5 files

Browse files
Files changed (5) hide show
  1. README.md +123 -7
  2. dashboard.py +726 -0
  3. dashboard_backend.py +871 -0
  4. ess_uk_with_backstories.csv +0 -0
  5. requirements.txt +3 -0
README.md CHANGED
@@ -1,13 +1,129 @@
 ---
-title: Silicon Sampling Dashboard
-emoji: 🏢
+title: COGbot Silicon Sampling Dashboard
+emoji: 🤖
 colorFrom: blue
-colorTo: yellow
-sdk: gradio
-sdk_version: 6.0.1
-app_file: app.py
+colorTo: purple
+sdk: streamlit
+sdk_version: 1.28.0
+app_file: dashboard.py
 pinned: false
 license: mit
 ---
 
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# 🤖 COGbot Dashboard - Silicon Sampling
+
+Generate synthetic survey responses using AI-powered persona simulation.
+
+## 🚀 Quick Start
+
+1. **Choose your AI model** (Claude or ChatGPT)
+2. **Enter your API key** (get one from the links in the sidebar)
+3. **Write your survey question**
+4. **Generate responses** from 2,204 ESS personas
+5. **Download results** as CSV
+
+## 💡 What is Silicon Sampling?
+
+Silicon sampling uses AI to generate synthetic survey responses based on real demographic personas. Each persona is built from European Social Survey (ESS) data and includes:
+- Age, gender, education, occupation
+- Political ideology, religious attendance
+- Income, household composition
+- Regional and ethnic background
+
+## ✨ Features
+
+### Response Generation Mode
+- Generate synthetic survey responses
+- Multiple formats: Scale (0-10), Scale (1-5), Multiple Choice, Yes/No, Open Text
+- Statistical summaries (mean, median, std dev)
+- Automated thematic analysis for open text
+- Download as CSV
+
+### Question Testing Mode
+- Test draft survey questions for clarity
+- Identify ambiguous wording
+- Get improvement suggestions
+- Validate questions before real fielding
+
+## 💰 Cost
+
+This tool requires your own API key from either:
+- **Claude** (Anthropic): ~$0.015 per 50 responses [Get key →](https://console.anthropic.com/settings/keys)
+- **ChatGPT** (OpenAI): ~$0.01 per 50 responses [Get key →](https://platform.openai.com/api-keys)
+
+**Example costs:**
+- 50 responses: ~$0.01-0.015
+- 100 responses: ~$0.02-0.03
+- 500 responses: ~$0.10-0.15
+
+Your API key is only used for your session and is never stored.
+
+## 🎯 Use Cases
+
+- **Pilot Testing**: Test survey instruments before fielding
+- **Question Refinement**: Identify problematic wording
+- **Hypothesis Generation**: Explore potential response patterns
+- **Survey Methods Teaching**: Demonstrate questionnaire design
+- **Methodological Research**: Study survey question effects
+
+## 📊 Sample Data
+
+Based on **European Social Survey Round 9 UK data (2018)**:
+- 2,204 respondents
+- Representative UK demographics
+- Rich persona backstories
+
+## 🔒 Privacy & Security
+
+- API keys are never logged or stored
+- Used only for your current session
+- Data sent only to your chosen AI provider
+- No retention after session ends
+
+## 📚 How It Works
+
+1. **Persona Loading**: Each respondent has a detailed backstory
+2. **AI Prompting**: Backstory becomes the AI's "persona"
+3. **Question Answering**: AI responds as that persona would
+4. **Aggregation**: Responses collected and analyzed
+
+## 🎓 Citation
+
+Based on European Social Survey Round 9 UK data (2018).
+
+ESS Round 9: European Social Survey Round 9 Data (2018). Data file edition 3.1. Sikt - Norwegian Agency for Shared Services in Education and Research, Norway – Data Archive and distributor of ESS data for ESS ERIC. doi:10.21338/NSD-ESS9-2018.
+
+## 📖 Documentation
+
+- [Full Documentation](https://github.com/PatrickSturgis/Silicon_samples)
+- [Methodology Paper](https://github.com/PatrickSturgis/Silicon_samples)
+- [GitHub Repository](https://github.com/PatrickSturgis/Silicon_samples)
+
+## ⚠️ Important Notes
+
+- Synthetic responses are for research/testing purposes only
+- Should complement, not replace, real survey data
+- Best used for question development and pilot testing
+- Response quality depends on persona detail and AI model
+
+## 🛠️ Technical Details
+
+- Built with Streamlit
+- Supports Claude 3.5 Sonnet and GPT-4o-mini
+- Processes 50 responses in ~1-2 minutes
+- CSV export with all demographic variables
+
+## 📧 Contact & Support
+
+- **GitHub Issues**: [Report bugs or request features](https://github.com/PatrickSturgis/Silicon_samples/issues)
+- **Research Inquiries**: Via GitHub
+- **Educational Use**: Free for academic purposes
+
+## 📄 License
+
+MIT License - Free for research and educational use.
+
+---
+
+**Developed by**: Patrick Sturgis, LSE Department of Methodology
+**Powered by**: Anthropic Claude & OpenAI GPT
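
The persona-prompting flow in "How It Works" above can be sketched in a few lines. The system-prompt wording matches the prompt preview shown in dashboard.py; `build_prompts` itself is a hypothetical helper for illustration, not a function from the repository.

```python
# Hypothetical sketch of per-respondent prompt assembly. The system template
# text mirrors the preview in dashboard.py; the helper name is illustrative.
SYSTEM_TEMPLATE = (
    "Adopt the following persona and answer only based on it.\n"
    "Do not invent details beyond the provided attributes.\n\n"
    "{backstory}"
)

def build_prompts(backstory, question, instructions):
    """Return (system_prompt, user_prompt) for one synthetic respondent."""
    system_prompt = SYSTEM_TEMPLATE.format(backstory=backstory)
    user_prompt = f"{question}\n\n{instructions}"
    return system_prompt, user_prompt

system, user = build_prompts(
    backstory="You are a 34-year-old teacher living in Leeds.",
    question="How satisfied are you with your life these days?",
    instructions="Respond with a single integer from 0 to 10. Only output the number.",
)
```

Each backstory yields one such prompt pair, and the answers are then aggregated into the CSV export.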
dashboard.py ADDED
@@ -0,0 +1,726 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Silicon Sampling Dashboard
4
+
5
+ Interactive web interface for generating synthetic survey responses.
6
+ Users can input custom questions and get silicon sample data without coding.
7
+
8
+ Usage:
9
+ streamlit run dashboard.py
10
+ """
11
+
12
+ import streamlit as st
13
+ import pandas as pd
14
+ from pathlib import Path
15
+ import json
16
+ from datetime import datetime
17
+ import os
18
+ from dashboard_backend import SiliconSampler, WinstonSampler, HuggingFaceSampler, OpenAISampler, AnthropicSampler
19
+
20
+ # Check deployment mode (set PUBLIC_DEPLOYMENT=true for HuggingFace/public hosting)
21
+ IS_PUBLIC = os.getenv('PUBLIC_DEPLOYMENT', 'false').lower() == 'true'
22
+
23
+ # Page configuration
24
+ st.set_page_config(
25
+ page_title="COGbot Dashboard",
26
+ page_icon="🤖",
27
+ layout="wide"
28
+ )
29
+
30
+ # Initialize session state
31
+ if 'results' not in st.session_state:
32
+ st.session_state.results = None
33
+ if 'processing' not in st.session_state:
34
+ st.session_state.processing = False
35
+ if 'mode' not in st.session_state:
36
+ st.session_state.mode = "Response Generation"
37
+ if 'question_text' not in st.session_state:
38
+ st.session_state.question_text = ""
39
+ if 'response_options_text' not in st.session_state:
40
+ st.session_state.response_options_text = ""
41
+
42
+ # Title and description
43
+ st.title("🤖 COGbot Dashboard")
44
+ st.markdown("""
45
+ Generate synthetic survey responses using LLM-based persona simulation.
46
+ Enter your question and response format - we'll handle the rest.
47
+ """)
48
+
49
+ # Sidebar - Logo and Configuration
50
+ # Display LSE logo at top of sidebar
51
+ logo_path = "LSE_logo.jpg"
52
+ if Path(logo_path).exists():
53
+ st.sidebar.image(logo_path, width=180)
54
+ st.sidebar.markdown("---")
55
+
56
+ st.sidebar.header("⚙️ Configuration")
57
+
58
+ # Data source
59
+ data_source = st.sidebar.radio(
60
+ "Data Source",
61
+ ["Default ESS UK (1,286 respondents)", "Upload CSV"]
62
+ )
63
+
64
+ if data_source == "Upload CSV":
65
+ uploaded_file = st.sidebar.file_uploader(
66
+ "Upload backstories CSV",
67
+ type=['csv'],
68
+ help="CSV must have 'backstory' column"
69
+ )
70
+ if uploaded_file:
71
+ df_backstories = pd.read_csv(uploaded_file)
72
+ else:
73
+ df_backstories = None
74
+ else:
75
+ # Load default ESS data
76
+ default_path = Path("ess_uk_with_backstories.csv")
77
+ if default_path.exists():
78
+ df_backstories = pd.read_csv(default_path)
79
+ else:
80
+ df_backstories = None
81
+ st.sidebar.warning("⚠️ Default file not found: ess_uk_with_backstories.csv")
82
+
83
+ # Show data info
84
+ if df_backstories is not None:
85
+ st.sidebar.success(f"✅ Loaded {len(df_backstories):,} respondents")
86
+
87
+ # Sample size
88
+ max_size = len(df_backstories)
89
+ sample_size = st.sidebar.slider(
90
+ "Sample Size",
91
+ min_value=10,
92
+ max_value=max_size,
93
+ value=min(50, max_size),
94
+ step=10,
95
+ help="Start with small sample for testing"
96
+ )
97
+ else:
98
+ sample_size = 0
99
+
100
+ # Model settings
101
+ st.sidebar.subheader("Model Settings")
102
+
103
+ # Choose model options based on deployment mode
104
+ if IS_PUBLIC:
105
+ # Public deployment: Only show API-based models
106
+ model_options = ["Claude (Claude 3.5 Sonnet)", "ChatGPT (GPT-4o-mini)"]
107
+ st.sidebar.info("""
108
+ 💡 **About API Keys**
109
+
110
+ This tool uses AI models via API. You'll need to provide your own API key:
111
+ - **Claude**: ~$0.015 per 50 responses (recommended for quality)
112
+ - **ChatGPT**: ~$0.01 per 50 responses (faster, good quality)
113
+
114
+ Your API key is used only for your session and is never stored.
115
+ """)
116
+ else:
117
+ # Local deployment: Show all options including local models
118
+ model_options = ["Claude (Claude 3.5 Sonnet)", "ChatGPT (GPT-4o-mini)", "Local (SmolLM2-1.7B)", "Winston (Qwen2.5-7B)"]
119
+
120
+ model_option = st.sidebar.selectbox(
121
+ "Model",
122
+ model_options,
123
+ help="Choose your AI model. API models require your own API key."
124
+ )
125
+
126
+ # API key inputs based on selected model
127
+ openai_api_key = None
128
+ anthropic_api_key = None
129
+
130
+ if "Claude" in model_option:
131
+ anthropic_api_key = st.sidebar.text_input(
132
+ "Anthropic API Key",
133
+ type="password",
134
+ help="Get your API key from https://console.anthropic.com/settings/keys"
135
+ )
136
+ if not anthropic_api_key:
137
+ st.sidebar.warning("⚠️ API key required for Claude")
138
+ else:
139
+ st.sidebar.success("✅ API key provided")
140
+ st.sidebar.markdown("[Get API key →](https://console.anthropic.com/settings/keys)")
141
+
142
+ elif "ChatGPT" in model_option:
143
+ openai_api_key = st.sidebar.text_input(
144
+ "OpenAI API Key",
145
+ type="password",
146
+ help="Get your API key from https://platform.openai.com/api-keys"
147
+ )
148
+ if not openai_api_key:
149
+ st.sidebar.warning("⚠️ API key required for ChatGPT")
150
+ else:
151
+ st.sidebar.success("✅ API key provided")
152
+ st.sidebar.markdown("[Get API key →](https://platform.openai.com/api-keys)")
153
+
154
+ temperature = st.sidebar.slider(
155
+ "Temperature",
156
+ min_value=0.0,
157
+ max_value=1.0,
158
+ value=0.7,
159
+ step=0.1,
160
+ help="Higher = more creative, Lower = more consistent"
161
+ )
162
+
163
+ # Main panel - Question configuration
164
+ st.header("📋 Step 1: Configure Question")
165
+
166
+ # Mode selection: Response Generation vs Question Testing
167
+ mode = st.radio(
168
+ "Mode",
169
+ ["Response Generation", "Question Testing"],
170
+ help="Response Generation: Get synthetic survey responses. Question Testing: Get feedback on question quality."
171
+ )
172
+
173
+ col1, col2 = st.columns([2, 1])
174
+
175
+ with col1:
176
+ question_text = st.text_area(
177
+ "Survey Question",
178
+ height=150,
179
+ placeholder="Enter your survey question here...",
180
+ help="The question your synthetic respondents will answer" if mode == "Response Generation" else "The draft question you want to test for clarity and quality"
181
+ )
182
+
183
+ with col2:
184
+ if mode == "Response Generation":
185
+ response_format = st.selectbox(
186
+ "Response Format",
187
+ ["Scale (0-10)", "Scale (1-5)", "Multiple Choice", "Yes/No", "Open Text"]
188
+ )
189
+ else: # Question Testing mode
190
+ response_format = "Open Text"
191
+ st.info("📝 Question Testing uses open text responses to gather feedback on question quality.")
192
+
193
+ # Configure prompt based on mode
194
+ # Initialize variables that will be used in preview
195
+ mc_options = ""
196
+ response_options_text = ""
197
+
198
+ if mode == "Question Testing":
199
+ # Question Testing mode: Create critique prompt
200
+ st.subheader("Response Options/Instructions")
201
+ response_options_text = st.text_area(
202
+ "Response Options (if applicable)",
203
+ height=100,
204
+ placeholder="e.g., Scale from 0-10 where 0=Not at all, 10=Extremely, or Multiple choice options A, B, C, D",
205
+ help="Include any response options or scales that are part of the question being tested"
206
+ )
207
+
208
+ # Build the testing prompt
209
+ instructions = f"""Please provide feedback on this survey question. Comment on:
210
+
211
+ 1. Are there any parts of the question that are ambiguous or unclear?
212
+ 2. Are there any parts that are difficult to understand?
213
+ 3. Did you have any problems thinking about how to answer?
214
+ 4. Are the response options (if provided) appropriate and complete?
215
+
216
+ Provide your feedback in 2-3 sentences, being specific about any issues you identify."""
217
+
218
+ # Automatically enable thematic coding for Question Testing
219
+ enable_thematic_coding = True
220
+ st.info("🔍 Thematic analysis will automatically run to identify common issues in the question.")
221
+
222
+ else:
223
+ # Response Generation mode: Original behavior
224
+ # Scale anchor labels (if scale selected)
225
+ if "Scale" in response_format:
226
+ st.subheader("Scale Labels")
227
+
228
+ if "0-10" in response_format:
229
+ # 10-point scale: just endpoints
230
+ col_low, col_high = st.columns(2)
231
+ with col_low:
232
+ low_label = st.text_input(
233
+ "0 means",
234
+ value="Not at all",
235
+ help="What does the lowest value mean?"
236
+ )
237
+ with col_high:
238
+ high_label = st.text_input(
239
+ "10 means",
240
+ value="Extremely",
241
+ help="What does the highest value mean?"
242
+ )
243
+ instructions = f"Respond with a single integer from 0 to 10, where 0 means '{low_label}' and 10 means '{high_label}'. Only output the number."
244
+
245
+ else: # 1-5 scale: label all 5 points
246
+ label_1 = st.text_input("1 means", value="Strongly disagree")
247
+ label_2 = st.text_input("2 means", value="Disagree")
248
+ label_3 = st.text_input("3 means", value="Neither agree nor disagree")
249
+ label_4 = st.text_input("4 means", value="Agree")
250
+ label_5 = st.text_input("5 means", value="Strongly agree")
251
+
252
+ instructions = f"""Respond with a single integer from 1 to 5 based on these labels:
253
+ 1 = {label_1}
254
+ 2 = {label_2}
255
+ 3 = {label_3}
256
+ 4 = {label_4}
257
+ 5 = {label_5}
258
+
259
+ Only output the number."""
260
+ else:
261
+ # Non-scale formats
262
+ format_instructions = {
263
+ "Multiple Choice": "Choose one option and respond with only the letter (A, B, C, or D).",
264
+ "Yes/No": "Respond with only 'Yes' or 'No'.",
265
+ "Open Text": "Provide a brief 1-2 sentence response based on your persona."
266
+ }
267
+ instructions = format_instructions.get(response_format, "")
268
+
269
+ # Allow editing instructions
270
+ instructions = st.text_area(
271
+ "Instructions to Model",
272
+ value=instructions,
273
+ height=100,
274
+ help="How the model should format its response"
275
+ )
276
+
277
+ # Multiple choice options (if selected)
278
+ if response_format == "Multiple Choice":
279
+ st.subheader("Response Options")
280
+ col1, col2, col3, col4 = st.columns(4)
281
+ with col1:
282
+ option_a = st.text_input("Option A", "Strongly agree")
283
+ with col2:
284
+ option_b = st.text_input("Option B", "Agree")
285
+ with col3:
286
+ option_c = st.text_input("Option C", "Disagree")
287
+ with col4:
288
+ option_d = st.text_input("Option D", "Strongly disagree")
289
+
290
+ mc_options = f"\nA. {option_a}\nB. {option_b}\nC. {option_c}\nD. {option_d}"
291
+ else:
292
+ mc_options = ""
293
+
294
+ # Thematic coding option (if open text selected)
295
+ enable_thematic_coding = False
296
+ if response_format == "Open Text":
297
+ st.subheader("Thematic Coding")
298
+ enable_thematic_coding = st.checkbox(
299
+ "Perform automated thematic analysis after generating responses",
300
+ value=False,
301
+ help="Uses LLM to identify themes, counts, and percentages in open text responses. Runs automatically after response generation."
302
+ )
303
+
304
+ # Preview full prompt
305
+ with st.expander("🔍 Preview Full Prompt"):
306
+ st.markdown("**System Prompt:**")
307
+ st.code("""Adopt the following persona and answer only based on it.
308
+ Do not invent details beyond the provided attributes.
309
+
310
+ [Backstory will be inserted here for each respondent]""")
311
+
312
+ st.markdown("**User Prompt:**")
313
+ if mode == "Question Testing":
314
+ # Include response options in the question display for testing
315
+ full_question = f"Question: {question_text}\n"
316
+ if response_options_text.strip():
317
+ full_question += f"\nResponse Options: {response_options_text}\n"
318
+ full_question += f"\n{instructions}"
319
+ else:
320
+ full_question = question_text + mc_options + "\n\n" + instructions
321
+ st.code(full_question)
322
+
323
+ # Generate button
324
+ if mode == "Question Testing":
325
+ st.header("🧪 Step 2: Test Question")
326
+ button_text = "🧪 Test Question with Synthetic Respondents"
327
+ else:
328
+ st.header("🚀 Step 2: Generate Responses")
329
+ button_text = "🎯 Generate Responses"
330
+
331
+ can_generate = (
332
+ df_backstories is not None
333
+ and question_text.strip() != ""
334
+ and not st.session_state.processing
335
+ and (not ("Claude" in model_option) or anthropic_api_key) # Require API key for Claude
336
+ and (not ("ChatGPT" in model_option) or openai_api_key) # Require API key for ChatGPT
337
+ )
338
+
339
+ if st.button(
340
+ button_text,
341
+ disabled=not can_generate,
342
+ type="primary",
343
+ use_container_width=True
344
+ ):
345
+ st.session_state.processing = True
346
+ st.session_state.results = None
347
+ st.session_state.mode = mode # Store mode for results display
348
+ st.session_state.question_text = question_text # Store for thematic analysis
349
+ if mode == "Question Testing":
350
+ st.session_state.response_options_text = response_options_text # Store for improved version
351
+
352
+ # Prepare configuration
353
+ config = {
354
+ "question": full_question,
355
+ "temperature": temperature,
356
+ "sample_size": sample_size
357
+ }
358
+
359
+ # Create sampler based on model selection
360
+ if "Claude" in model_option:
361
+ config["model_type"] = "anthropic"
362
+ config["anthropic_api_key"] = anthropic_api_key
363
+ sampler = AnthropicSampler(config)
364
+ elif "ChatGPT" in model_option:
365
+ config["model_type"] = "openai"
366
+ config["openai_api_key"] = openai_api_key
367
+ sampler = OpenAISampler(config)
368
+ elif "Winston" in model_option:
369
+ config["model_type"] = "winston"
370
+ sampler = WinstonSampler(config)
371
+ else: # Local
372
+ config["model_type"] = "local"
373
+ sampler = SiliconSampler(config)
374
+
375
+ # Progress bar
376
+ progress_bar = st.progress(0)
377
+ status_text = st.empty()
378
+
379
+ # Sample backstories (random sample)
380
+ df_sample = df_backstories.sample(n=sample_size, random_state=42).copy()
381
+
382
+ # Process
383
+ try:
384
+ results = sampler.generate_responses(
385
+ df_sample,
386
+ progress_callback=lambda i, total: (
387
+ progress_bar.progress(i / total),
388
+ status_text.text(f"Processing: {i}/{total} respondents ({100*i/total:.1f}%)")
389
+ )
390
+ )
391
+
392
+ st.session_state.results = results
393
+ st.session_state.processing = False
394
+ st.success(f"✅ Generated {len(results)} responses!")
395
+ st.rerun()
396
+
397
+ except Exception as e:
398
+ st.error(f"❌ Error: {str(e)}")
399
+ st.session_state.processing = False
400
+
401
+ # Show results
402
+ if st.session_state.results is not None:
403
+ st.header("📊 Step 3: Results")
404
+
405
+ results_df = st.session_state.results
406
+
407
+ # Summary stats
408
+ col1, col2, col3 = st.columns(3)
409
+ with col1:
410
+ st.metric("Total Responses", len(results_df))
411
+ with col2:
412
+ valid_responses = results_df['response'].notna().sum()
413
+ st.metric("Valid Responses", valid_responses)
414
+ with col3:
415
+ completion_rate = 100 * valid_responses / len(results_df)
416
+ st.metric("Completion Rate", f"{completion_rate:.1f}%")
417
+
418
+ # Preview
419
+ st.subheader("Preview (First 10 rows)")
420
+ st.dataframe(results_df.head(10), use_container_width=True)
421
+
422
+ # Download
423
+ st.subheader("Download Results")
424
+
425
+ timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
426
+ filename = f"silicon_sample_{timestamp}.csv"
427
+
428
+ csv = results_df.to_csv(index=False)
429
+ st.download_button(
430
+ label="📥 Download CSV",
431
+ data=csv,
432
+ file_name=filename,
433
+ mime="text/csv",
434
+ use_container_width=True
435
+ )
436
+
437
+ # Response distribution and statistics
438
+ if response_format in ["Scale (0-10)", "Scale (1-5)", "Yes/No", "Multiple Choice"]:
439
+ st.subheader(f"Response Distribution: {question_text}")
440
+ try:
441
+ # For numeric formats, convert to numbers
442
+ if response_format.startswith("Scale"):
443
+ numeric_responses = pd.to_numeric(results_df['response'], errors='coerce')
444
+ valid_responses = numeric_responses.dropna()
445
+ elif response_format == "Yes/No":
446
+ # For Yes/No, show frequency distribution
447
+ valid_responses = results_df['response'].dropna()
448
+ elif response_format == "Multiple Choice":
449
+ # For Multiple Choice, show frequency distribution
450
+ valid_responses = results_df['response'].dropna()
451
+
452
+ if len(valid_responses) > 0:
453
+ # Show statistics for numeric scales
454
+ if response_format.startswith("Scale"):
455
+ col1, col2, col3, col4, col5 = st.columns(5)
456
+
457
+ with col1:
458
+ st.metric("Mean", f"{valid_responses.mean():.2f}")
459
+ with col2:
460
+ st.metric("Median", f"{valid_responses.median():.2f}")
461
+ with col3:
462
+ st.metric("Std Dev", f"{valid_responses.std():.2f}")
463
+ with col4:
464
+ mode_val = valid_responses.mode()
465
+ mode_display = f"{mode_val.iloc[0]:.0f}" if len(mode_val) > 0 else "N/A"
466
+ st.metric("Mode", mode_display)
467
+ with col5:
468
+ st.metric("Valid N", f"{len(valid_responses)}")
469
+
470
+ # Distribution chart
471
+ st.bar_chart(pd.to_numeric(results_df['response'], errors='coerce').value_counts().sort_index())
472
+
473
+ # Show frequency counts for categorical
474
+ else:
475
+ value_counts = valid_responses.value_counts()
476
+
477
+ # Display as metrics
478
+ cols = st.columns(min(len(value_counts), 5))
479
+ for idx, (value, count) in enumerate(value_counts.items()):
480
+ if idx < 5: # Limit to 5 columns
481
+ with cols[idx]:
482
+ pct = 100 * count / len(valid_responses)
483
+ st.metric(f"{value}", f"{count} ({pct:.1f}%)")
484
+
485
+ # Also show total N
486
+ st.metric("Total Valid N", f"{len(valid_responses)}")
487
+
488
+ # Distribution chart
489
+ st.bar_chart(value_counts)
490
+ else:
491
+ st.info("No valid responses to analyze")
492
+ except Exception as e:
493
+ st.info(f"Could not generate statistics: {str(e)}")
494
+
495
+ # Thematic coding for open text responses
496
+ elif response_format == "Open Text" and enable_thematic_coding:
497
+ # Get the stored mode and question text
498
+ stored_mode = st.session_state.get('mode', 'Response Generation')
499
+ stored_question = st.session_state.get('question_text', question_text)
500
+
501
+ # Different heading based on mode
502
+ if stored_mode == "Question Testing":
503
+ st.subheader(f"Question Testing Results: {stored_question}")
504
+ else:
505
+ st.subheader(f"Thematic Analysis: {stored_question}")
506
+
507
+ # Get valid text responses
508
+ valid_responses = results_df['response'].dropna()
509
+ valid_responses = valid_responses[valid_responses.str.strip() != ""]
510
+
511
+ if len(valid_responses) > 0:
512
+ st.info(f"Analyzing {len(valid_responses)} open text responses...")
513
+
514
+ # Automatically run thematic coding
515
+ if True: # Changed from button to automatic
516
+ with st.spinner("Analyzing themes with LLM..."):
517
+ try:
518
+ # Prepare responses for analysis
519
+ responses_text = "\n\n".join([f"Response {i+1}: {resp}" for i, resp in enumerate(valid_responses)])
520
+
521
+ # Create thematic analysis prompt - different for Question Testing
522
+ if stored_mode == "Question Testing":
523
+ coding_prompt = f"""You are a survey methodology expert analyzing feedback from respondents who tested a draft survey question.
524
+
525
+ Question being tested: "{stored_question}"
526
+
527
+ Here is the feedback from respondents:
528
+
529
+ {responses_text}
530
+
531
+ Task:
532
+ 1. Identify the main issues and concerns raised about the question (aim for 4-8 distinct issues)
533
+ 2. For each issue, provide:
534
+ - Issue name (2-4 words, e.g., "Ambiguous wording", "Unclear scale", "Missing context")
535
+ - Brief description (1 sentence explaining the specific problem)
536
+ - Count of how many respondents mentioned this issue
537
+ - Percentage of total respondents
538
+
539
+ Format your response as:
540
+ ISSUE: [Name]
541
+ DESCRIPTION: [Description]
542
+ COUNT: [Number]
543
+ PERCENTAGE: [Percentage]
544
+
545
+ [Repeat for each issue]
546
+
547
+ After listing all issues, provide a brief summary of the most critical problems that should be addressed."""
548
+ else:
549
+ coding_prompt = f"""You are a qualitative researcher conducting thematic analysis on open-ended survey responses.
550
+
551
+ Question asked: "{stored_question}"
552
+
553
+ Here are all the responses:
554
+
555
+ {responses_text}
556
+
557
+ Task:
558
+ 1. Identify the main themes present in these responses (aim for 4-8 themes)
559
+ 2. For each theme, provide:
560
+ - Theme name (2-4 words)
561
+ - Brief description (1 sentence)
562
+ - Count of how many responses express this theme
563
+ - Percentage of total responses
564
+
565
+ Format your response as:
566
+ THEME: [Name]
567
+ DESCRIPTION: [Description]
568
+ COUNT: [Number]
569
+ PERCENTAGE: [Percentage]
570
+
571
+ [Repeat for each theme]"""
572
+
573
+ # Send to LLM for coding
574
+ if "Claude" in model_option:
575
+ # Use Anthropic sampler
576
+ from dashboard_backend import AnthropicSampler
577
+ temp_config = {
578
+ "temperature": 0.3, # Lower temp for more consistent coding
579
+ "model_type": "anthropic",
580
+ "anthropic_api_key": anthropic_api_key
581
+ }
582
+ temp_sampler = AnthropicSampler(temp_config)
583
+
584
+ st.info("Sending to Claude for analysis...")
585
+
586
+ # Query Anthropic
587
+ analysis_result = temp_sampler.query_single(
588
+ "You are a qualitative research expert analyzing survey responses.",
589
+ coding_prompt
590
+ )
591
+
592
+ elif "ChatGPT" in model_option:
593
+ # Use OpenAI sampler
594
+ from dashboard_backend import OpenAISampler
595
+ temp_config = {
596
+ "temperature": 0.3, # Lower temp for more consistent coding
597
+ "model_type": "openai",
598
+ "openai_api_key": openai_api_key
599
+ }
600
+ temp_sampler = OpenAISampler(temp_config)
601
+
602
+ st.info("Sending to ChatGPT for analysis...")
603
+
604
+ # Query OpenAI
605
+ analysis_result = temp_sampler.query_single(
606
+ "You are a qualitative research expert analyzing survey responses.",
607
+ coding_prompt
608
+ )
609
+
610
+ elif "Winston" in model_option:
611
+ # Use Winston sampler with single query method
612
+ from dashboard_backend import WinstonSampler
613
+ temp_config = {
614
+ "temperature": 0.3, # Lower temp for more consistent coding
615
+ "model_type": "winston"
616
+ }
617
+ temp_sampler = WinstonSampler(temp_config)
618
+
619
+ st.info("Sending to Winston for analysis... This may take 1-2 minutes (includes model loading time).")
620
+
621
+ # Query Winston
622
+ analysis_result = temp_sampler.query_single(
623
+ "You are a qualitative research expert analyzing survey responses.",
624
+ coding_prompt
625
+ )
626
+
627
+ else:
628
+ # Use local model
629
+ from dashboard_backend import SiliconSampler
630
+ temp_config = {
631
+ "question": coding_prompt,
632
+ "temperature": 0.3,
633
+ "model_type": "local"
634
+ }
635
+ temp_sampler = SiliconSampler(temp_config)
636
+ temp_sampler._initialize_local_model()
637
+
638
+ # Query with analysis prompt
639
+ analysis_result = temp_sampler.query_llm(
640
+ "You are a qualitative research expert analyzing survey responses.",
641
+ coding_prompt
642
+ )
643
+
644
+ # Display results
645
+ st.markdown("### Thematic Coding Results")
646
+ st.text_area("Analysis", analysis_result, height=400)
647
+
648
+ # For Question Testing mode, offer to suggest improved wording
649
+ if stored_mode == "Question Testing":
650
+ st.markdown("---")
651
+ st.markdown("### Suggest Improved Question Wording")
652
+
653
+ if st.button("✨ Generate Improved Question", type="secondary"):
654
+ with st.spinner("Generating improved question wording..."):
655
+ try:
656
+ # Get response options if they exist
657
+ stored_options = st.session_state.get('response_options_text', '')
658
+
659
+ # Create improvement prompt
660
+ # Build the options section separately to avoid f-string backslash issue
661
+ options_section = f"\nOriginal Response Options: {stored_options}\n" if stored_options else ""
662
+ improved_options_section = "\n\nIMPROVED RESPONSE OPTIONS:\n[Your improved options]\n" if stored_options else ""
663
+
664
+ improvement_prompt = f"""You are a survey methodology expert. Based on the feedback analysis below, suggest an improved version of the survey question that addresses the identified issues.
665
+
666
+ Original Question: "{stored_question}"{options_section}
667
+
668
+ Issues Identified:
669
+ {analysis_result}
670
+
671
+ Task:
672
+ 1. Provide an improved version of the question that addresses the main issues
673
+ 2. If response options were provided, suggest improved response options as well
674
+ 3. Explain what changes you made and why they address the identified problems
675
+
676
+ Format your response as:
677
+
678
+ IMPROVED QUESTION:
679
+ [Your improved question text]{improved_options_section}
680
+
681
+ CHANGES MADE:
682
+ [Brief explanation of what you changed and why]"""
683
+
684
+ # Send to same model that was used for analysis
+ # API-backed samplers (Claude, ChatGPT, Winston) expose query_single;
+ # the local sampler falls back to query_llm with the same arguments
+ if any(name in model_option for name in ("Claude", "ChatGPT", "Winston")):
+ improvement_result = temp_sampler.query_single(
+ "You are a survey methodology expert specializing in question wording and design.",
+ improvement_prompt
+ )
+ else:
+ improvement_result = temp_sampler.query_llm(
+ "You are a survey methodology expert specializing in question wording and design.",
+ improvement_prompt
+ )
+
+ # Display improved version
+ st.markdown("### Improved Question Suggestion")
+ st.text_area("Suggested Improvements", improvement_result, height=300)
+
+ st.info("💡 Review the suggested improvements and adapt them as needed for your research context.")
+
+ except Exception as e:
+ st.error(f"Error generating improved question: {str(e)}")
+
+ except Exception as e:
+ st.error(f"Error during thematic analysis: {str(e)}")
+ else:
+ st.info("No valid open text responses to analyze")
+
+ # Footer
+ st.sidebar.markdown("---")
+ st.sidebar.markdown("""
+ **Need Help?**
+ - [Documentation](WINSTON_README.md)
+ - [GitHub](https://github.com/PatrickSturgis/Silicon_samples)
+ """)
dashboard_backend.py ADDED
@@ -0,0 +1,871 @@
+ #!/usr/bin/env python3
+ """
+ Dashboard Backend - Silicon Sampling Processing
+
+ Handles LLM querying and response generation for the dashboard.
+ Supports both local (lightweight) and Winston (production) modes.
+ """
+
+ import pandas as pd
+ from typing import Callable, Optional
+ import time
+ import os
+
+ # torch is only needed for the local on-device model and is not listed in
+ # requirements.txt, so tolerate its absence in API-only deployments
+ try:
+     import torch
+ except ImportError:
+     torch = None
+
+ # Set HuggingFace cache to a writable location (macOS-style cache path)
+ os.environ['HF_HOME'] = os.path.expanduser('~/Library/Caches/huggingface')
+ os.environ['TRANSFORMERS_CACHE'] = os.path.expanduser('~/Library/Caches/huggingface')
+
+ class SiliconSampler:
+ """
+ Silicon sampling backend for dashboard
+
+ Supports:
+ - Local mode: Quick testing with small models
+ - Winston mode: Production quality with Qwen2.5 (future)
+ """
+
+ def __init__(self, config: dict):
+ """
+ Initialize sampler
+
+ Args:
+ config: Dictionary with:
+ - question: Survey question text
+ - temperature: Sampling temperature
+ - sample_size: Number of respondents
+ - model_type: "local" or "winston"
+ """
+ self.config = config
+ self.llm = None
+ self.model = None
+ self.tokenizer = None
+ self.device = None
+ self.model_loaded = False
+
+ # Don't load model in __init__ - load lazily on first use
+
+ def _initialize_local_model(self):
+ """Initialize lightweight local model for testing"""
+ if torch is None:
+ raise ImportError(
+ "Local mode requires torch and transformers, "
+ "which are not listed in requirements.txt"
+ )
+ try:
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ # Use SmolLM2-1.7B-Instruct for better quality
+ model_name = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
+
+ print(f"Loading model: {model_name}")
+
+ self.tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ self.device = "cuda" if torch.cuda.is_available() else "cpu"
+
+ self.model = AutoModelForCausalLM.from_pretrained(
+ model_name,
+ torch_dtype=torch.float32, # Use float32 for CPU compatibility
+ low_cpu_mem_usage=True
+ )
+
+ self.model = self.model.to(self.device)
+
+ print(f"✅ Model loaded on {self.device}")
+
+ except Exception as e:
+ print(f"Error loading model: {e}")
+ raise
+
+ def query_llm(self, backstory: str, question: str) -> str:
+ """
+ Query LLM with backstory and question
+
+ Args:
+ backstory: Persona backstory text
+ question: Survey question
+
+ Returns:
+ Model response
+ """
+ # Lazy load model on first query
+ if not self.model_loaded and self.config['model_type'] == 'local':
+ self._initialize_local_model()
+ self.model_loaded = True
+
+ messages = [
+ {
+ "role": "system",
+ "content": (
+ "Adopt the following persona and answer only based on it. "
+ "Do not invent details beyond the provided attributes.\n\n"
+ f"{backstory}"
+ )
+ },
+ {
+ "role": "user",
+ "content": question
+ }
+ ]
+
+ # Format using chat template (same as working job assessment code)
+ formatted_prompt = self.tokenizer.apply_chat_template(
+ messages,
+ tokenize=False,
+ add_generation_prompt=True
+ )
+
+ # Tokenize
+ inputs = self.tokenizer(
+ formatted_prompt,
+ return_tensors="pt",
+ truncation=True,
+ max_length=2048
+ ).to(self.device)
+
+ # Generate (matching working parameters)
+ with torch.no_grad():
+ outputs = self.model.generate(
+ **inputs,
+ max_new_tokens=100,
+ temperature=self.config['temperature'],
+ top_p=1.0,
+ do_sample=self.config['temperature'] > 0,
+ pad_token_id=self.tokenizer.pad_token_id,
+ eos_token_id=self.tokenizer.eos_token_id
+ )
+
+ # Decode
+ generated_tokens = outputs[0][inputs['input_ids'].shape[1]:]
+ response = self.tokenizer.decode(generated_tokens, skip_special_tokens=True).strip()
+
+ return response
+
+ def generate_responses(
+ self,
+ df: pd.DataFrame,
+ progress_callback: Optional[Callable[[int, int], None]] = None
+ ) -> pd.DataFrame:
+ """
+ Generate responses for all backstories in DataFrame
+
+ Args:
+ df: DataFrame with 'backstory' column
+ progress_callback: Optional function(current, total) for progress updates
+
+ Returns:
+ DataFrame with original columns plus 'response' column
+ """
+ if 'backstory' not in df.columns:
+ raise ValueError("DataFrame must have 'backstory' column")
+
+ results = df.copy()
+ results['response'] = ""
+
+ question = self.config['question']
+ total = len(df)
+
+ for i, (idx, row) in enumerate(df.iterrows()):
+ backstory = row['backstory']
+
+ # Skip empty backstories
+ if pd.isna(backstory) or str(backstory).strip() == "":
+ results.loc[idx, 'response'] = "[EMPTY]"
+ continue
+
+ try:
+ # Query LLM
+ response = self.query_llm(str(backstory), question)
+ results.loc[idx, 'response'] = response
+
+ except Exception as e:
+ results.loc[idx, 'response'] = f"[ERROR: {str(e)[:50]}]"
+
+ # Progress callback
+ if progress_callback:
+ progress_callback(i + 1, total)
+
+ # Small delay to prevent overheating on CPU
+ if self.device == "cpu":
+ time.sleep(0.1)
+
+ return results
+
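Every sampler reuses the same two-message persona format that `SiliconSampler.query_llm` builds: a system turn carrying the persona instruction plus backstory, and a user turn carrying the survey question. As a self-contained sketch (the helper name is illustrative, not part of the module):

```python
def build_persona_messages(backstory: str, question: str) -> list:
    # System turn: persona instruction + backstory; user turn: the question.
    # (AnthropicSampler passes the system string separately, with the same content.)
    system = (
        "Adopt the following persona and answer only based on it. "
        "Do not invent details beyond the provided attributes.\n\n"
        f"{backstory}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

msgs = build_persona_messages("I am a 34-year-old engineer from Bristol.",
                              "How satisfied are you with your job?")
print([m["role"] for m in msgs])  # → ['system', 'user']
```

Keeping the persona in the system role means the question text stays clean and the same pair of strings can feed either a chat-template tokenizer or a chat-completions API.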
+
+ class HuggingFaceSampler:
+ """
+ Hugging Face Inference API sampler
+
+ Uses HF's free Inference API to access larger models without local compute.
+ Requires HF_TOKEN environment variable or passed in config.
+ """
+
+ def __init__(self, config: dict):
+ self.config = config
+ self.api_token = config.get('hf_token') or os.getenv('HF_TOKEN')
+ # Use Meta's Llama 3.2 which is freely accessible via Inference API
+ self.model_name = config.get('hf_model', 'meta-llama/Llama-3.2-3B-Instruct')
+
+ if not self.api_token:
+ raise ValueError(
+ "Hugging Face API token required. Set HF_TOKEN environment variable "
+ "or pass 'hf_token' in config. Get token from: https://huggingface.co/settings/tokens"
+ )
+
+ def query_llm(self, backstory: str, question: str) -> str:
+ """Query HF Inference API using direct HTTP requests"""
+ import requests
+
+ # Format the prompt for the model
+ prompt = f"""<|begin_of_text|><|start_header_id|>system<|end_header_id|>
+
+ Adopt the following persona and answer only based on it. Do not invent details beyond the provided attributes.
+
+ {backstory}<|eot_id|><|start_header_id|>user<|end_header_id|>
+
+ {question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
+
+ """
+
+ # Use the serverless inference API endpoint
+ api_url = f"https://api-inference.huggingface.co/models/{self.model_name}"
+
+ headers = {
+ "Authorization": f"Bearer {self.api_token}",
+ "Content-Type": "application/json"
+ }
+
+ payload = {
+ "inputs": prompt,
+ "parameters": {
+ "max_new_tokens": 100,
+ "temperature": self.config['temperature'],
+ "return_full_text": False
+ }
+ }
+
+ try:
+ response = requests.post(api_url, headers=headers, json=payload, timeout=30)
+
+ if response.status_code == 200:
+ result = response.json()
+ if isinstance(result, list) and len(result) > 0:
+ return result[0].get('generated_text', '').strip()
+ else:
+ return str(result).strip()
+ else:
+ return f"[API_ERROR: {response.status_code} - {response.text[:100]}]"
+
+ except Exception as e:
+ return f"[API_ERROR: {str(e)[:100]}]"
+
+ def generate_responses(
+ self,
+ df: pd.DataFrame,
+ progress_callback: Optional[Callable[[int, int], None]] = None
+ ) -> pd.DataFrame:
+ """Generate responses using HF Inference API"""
+
+ if 'backstory' not in df.columns:
+ raise ValueError("DataFrame must have 'backstory' column")
+
+ results = df.copy()
+ results['response'] = ""
+
+ question = self.config['question']
+ total = len(df)
+
+ for i, (idx, row) in enumerate(df.iterrows()):
+ backstory = row['backstory']
+
+ if pd.isna(backstory) or str(backstory).strip() == "":
+ results.loc[idx, 'response'] = "[EMPTY]"
+ continue
+
+ try:
+ response = self.query_llm(str(backstory), question)
+ results.loc[idx, 'response'] = response
+
+ except Exception as e:
+ results.loc[idx, 'response'] = f"[ERROR: {str(e)[:50]}]"
+
+ if progress_callback:
+ progress_callback(i + 1, total)
+
+ # Small delay to avoid rate limiting
+ time.sleep(0.5)
+
+ return results
+
+
+ class OpenAISampler:
+ """
+ OpenAI API sampler (ChatGPT)
+
+ Uses OpenAI's API to access GPT models.
+ Requires OPENAI_API_KEY environment variable or passed in config.
+ """
+
+ def __init__(self, config: dict):
+ self.config = config
+ self.api_key = config.get('openai_api_key') or os.getenv('OPENAI_API_KEY')
+ # Use GPT-4o-mini by default (fast and cost-effective)
+ self.model_name = config.get('openai_model', 'gpt-4o-mini')
+
+ if not self.api_key:
+ raise ValueError(
+ "OpenAI API key required. Set OPENAI_API_KEY environment variable "
+ "or pass 'openai_api_key' in config. Get key from: https://platform.openai.com/api-keys"
+ )
+
+ def query_llm(self, backstory: str, question: str) -> str:
+ """Query OpenAI API"""
+ import requests
+
+ api_url = "https://api.openai.com/v1/chat/completions"
+
+ headers = {
+ "Authorization": f"Bearer {self.api_key}",
+ "Content-Type": "application/json"
+ }
+
+ messages = [
+ {
+ "role": "system",
+ "content": (
+ "Adopt the following persona and answer only based on it. "
+ "Do not invent details beyond the provided attributes.\n\n"
+ f"{backstory}"
+ )
+ },
+ {
+ "role": "user",
+ "content": question
+ }
+ ]
+
+ payload = {
+ "model": self.model_name,
+ "messages": messages,
+ "temperature": self.config['temperature'],
+ "max_tokens": 150
+ }
+
+ try:
+ response = requests.post(api_url, headers=headers, json=payload, timeout=30)
+
+ if response.status_code == 200:
+ result = response.json()
+ return result['choices'][0]['message']['content'].strip()
+ else:
+ return f"[API_ERROR: {response.status_code} - {response.text[:100]}]"
+
+ except Exception as e:
+ return f"[API_ERROR: {str(e)[:100]}]"
+
+ def query_single(self, backstory: str, question: str) -> str:
+ """
+ Query OpenAI with a single request (e.g., for thematic analysis)
+
+ Args:
+ backstory: System prompt / context
+ question: Query text
+
+ Returns:
+ LLM response text
+ """
+ # For OpenAI, we can just use the regular query_llm method
+ # but with higher max_tokens for longer analysis
+ import requests
+
+ api_url = "https://api.openai.com/v1/chat/completions"
+
+ headers = {
+ "Authorization": f"Bearer {self.api_key}",
+ "Content-Type": "application/json"
+ }
+
+ messages = [
+ {
+ "role": "system",
+ "content": backstory
+ },
+ {
+ "role": "user",
+ "content": question
+ }
+ ]
+
+ payload = {
+ "model": self.model_name,
+ "messages": messages,
+ "temperature": self.config.get('temperature', 0.3),
+ "max_tokens": 1000 # More tokens for thematic analysis
+ }
+
+ try:
+ response = requests.post(api_url, headers=headers, json=payload, timeout=60)
+
+ if response.status_code == 200:
+ result = response.json()
+ return result['choices'][0]['message']['content'].strip()
+ else:
+ raise Exception(f"API returned {response.status_code}: {response.text[:200]}")
+
+ except Exception as e:
+ raise Exception(f"OpenAI API error: {str(e)}")
+
+ def generate_responses(
+ self,
+ df: pd.DataFrame,
+ progress_callback: Optional[Callable[[int, int], None]] = None
+ ) -> pd.DataFrame:
+ """Generate responses using OpenAI API"""
+
+ if 'backstory' not in df.columns:
+ raise ValueError("DataFrame must have 'backstory' column")
+
+ results = df.copy()
+ results['response'] = ""
+
+ question = self.config['question']
+ total = len(df)
+
+ for i, (idx, row) in enumerate(df.iterrows()):
+ backstory = row['backstory']
+
+ if pd.isna(backstory) or str(backstory).strip() == "":
+ results.loc[idx, 'response'] = "[EMPTY]"
+ continue
+
+ try:
+ response = self.query_llm(str(backstory), question)
+ results.loc[idx, 'response'] = response
+
+ except Exception as e:
+ results.loc[idx, 'response'] = f"[ERROR: {str(e)[:50]}]"
+
+ if progress_callback:
+ progress_callback(i + 1, total)
+
+ # Small delay to avoid rate limiting
+ time.sleep(0.2)
+
+ return results
+
+
+ class AnthropicSampler:
+ """
+ Anthropic API sampler (Claude)
+
+ Uses Anthropic's API to access Claude models.
+ Requires ANTHROPIC_API_KEY environment variable or passed in config.
+ """
+
+ def __init__(self, config: dict):
+ self.config = config
+ self.api_key = config.get('anthropic_api_key') or os.getenv('ANTHROPIC_API_KEY')
+ # Use Claude 3.5 Sonnet by default (best balance of quality and cost)
+ self.model_name = config.get('anthropic_model', 'claude-3-5-sonnet-20241022')
+
+ if not self.api_key:
+ raise ValueError(
+ "Anthropic API key required. Set ANTHROPIC_API_KEY environment variable "
+ "or pass 'anthropic_api_key' in config. Get key from: https://console.anthropic.com/settings/keys"
+ )
+
+ def query_llm(self, backstory: str, question: str) -> str:
+ """Query Anthropic API"""
+ import requests
+
+ api_url = "https://api.anthropic.com/v1/messages"
+
+ headers = {
+ "x-api-key": self.api_key,
+ "anthropic-version": "2023-06-01",
+ "Content-Type": "application/json"
+ }
+
+ payload = {
+ "model": self.model_name,
+ "max_tokens": 150,
+ "temperature": self.config['temperature'],
+ "system": (
+ "Adopt the following persona and answer only based on it. "
+ "Do not invent details beyond the provided attributes.\n\n"
+ f"{backstory}"
+ ),
+ "messages": [
+ {
+ "role": "user",
+ "content": question
+ }
+ ]
+ }
+
+ try:
+ response = requests.post(api_url, headers=headers, json=payload, timeout=30)
+
+ if response.status_code == 200:
+ result = response.json()
+ return result['content'][0]['text'].strip()
+ else:
+ return f"[API_ERROR: {response.status_code} - {response.text[:100]}]"
+
+ except Exception as e:
+ return f"[API_ERROR: {str(e)[:100]}]"
+
+ def query_single(self, backstory: str, question: str) -> str:
+ """
+ Query Anthropic with a single request (e.g., for thematic analysis)
+
+ Args:
+ backstory: System prompt / context
+ question: Query text
+
+ Returns:
+ LLM response text
+ """
+ import requests
+
+ api_url = "https://api.anthropic.com/v1/messages"
+
+ headers = {
+ "x-api-key": self.api_key,
+ "anthropic-version": "2023-06-01",
+ "Content-Type": "application/json"
+ }
+
+ payload = {
+ "model": self.model_name,
+ "max_tokens": 1000, # More tokens for thematic analysis
+ "temperature": self.config.get('temperature', 0.3),
+ "system": backstory,
+ "messages": [
+ {
+ "role": "user",
+ "content": question
+ }
+ ]
+ }
+
+ try:
+ response = requests.post(api_url, headers=headers, json=payload, timeout=60)
+
+ if response.status_code == 200:
+ result = response.json()
+ return result['content'][0]['text'].strip()
+ else:
+ raise Exception(f"API returned {response.status_code}: {response.text[:200]}")
+
+ except Exception as e:
+ raise Exception(f"Anthropic API error: {str(e)}")
+
+ def generate_responses(
+ self,
+ df: pd.DataFrame,
+ progress_callback: Optional[Callable[[int, int], None]] = None
+ ) -> pd.DataFrame:
+ """Generate responses using Anthropic API"""
+
+ if 'backstory' not in df.columns:
+ raise ValueError("DataFrame must have 'backstory' column")
+
+ results = df.copy()
+ results['response'] = ""
+
+ question = self.config['question']
+ total = len(df)
+
+ for i, (idx, row) in enumerate(df.iterrows()):
+ backstory = row['backstory']
+
+ if pd.isna(backstory) or str(backstory).strip() == "":
+ results.loc[idx, 'response'] = "[EMPTY]"
+ continue
+
+ try:
+ response = self.query_llm(str(backstory), question)
+ results.loc[idx, 'response'] = response
+
+ except Exception as e:
+ results.loc[idx, 'response'] = f"[ERROR: {str(e)[:50]}]"
+
+ if progress_callback:
+ progress_callback(i + 1, total)
+
+ # Small delay to avoid rate limiting
+ time.sleep(0.2)
+
+ return results
+
+
+ class WinstonSampler:
+ """
+ Winston GPU server sampler using SSH commands
+
+ Requires:
+ - SSH key authentication to Winston (no password prompts)
+ - Winston files already set up (see WINSTON_README.md)
+ """
+
+ def __init__(self, config: dict):
+ self.config = config
+ self.winston_host = "sturgis@158.143.14.43"
+ self.winston_dir = "/home/sturgis/silicon_samples"
+
+ def query_single(self, backstory: str, question: str) -> str:
+ """
+ Query Winston with a single request (e.g., for thematic analysis)
+
+ Args:
+ backstory: System prompt / context
+ question: Query text
+
+ Returns:
+ LLM response text
+ """
+ import subprocess
+ import tempfile
+ from pathlib import Path
+
+ # Create single-row dataframe
+ df = pd.DataFrame({"backstory": [backstory]})
+
+ # Create temp files
+ temp_dir = Path(tempfile.mkdtemp())
+ local_input = temp_dir / "query_input.csv"
+ local_output = temp_dir / "query_output.csv"
+
+ df.to_csv(local_input, index=False)
+
+ remote_input = f"{self.winston_dir}/temp_query_input.csv"
+ remote_output = f"{self.winston_dir}/temp_query_output.csv"
+
+ try:
+ # Upload
+ subprocess.run(
+ ["scp", str(local_input), f"{self.winston_host}:{remote_input}"],
+ check=True,
+ capture_output=True
+ )
+
+ # Update config with question
+ # Use JSON to safely pass the question text
+ import json as json_lib
+ temp_val = self.config.get('temperature', 0.3)
+
+ # Create Python script that uses json.dumps to handle escaping
+ config_update_script = f"""
+ import json
+ with open('{self.winston_dir}/config_winston_silicon.json') as f:
+ config = json.load(f)
+ config['question'] = {json_lib.dumps(question)}
+ config['processing']['temperature'] = {temp_val}
+ config['processing']['max_tokens'] = 500
+ with open('{self.winston_dir}/config_winston_silicon.json', 'w') as f:
+ json.dump(config, f, indent=2)
+ """
+
+ # Write script to temp file, upload, execute, then delete
+ local_script = temp_dir / "update_config.py"
+ with open(local_script, 'w') as f:
+ f.write(config_update_script)
+
+ remote_script = f"{self.winston_dir}/temp_update_config.py"
+
+ subprocess.run(
+ ["scp", str(local_script), f"{self.winston_host}:{remote_script}"],
+ check=True,
+ capture_output=True
+ )
+
+ subprocess.run(
+ ["ssh", self.winston_host, f"python3 {remote_script}"],
+ check=True,
+ capture_output=True
+ )
+
+ subprocess.run(
+ ["ssh", self.winston_host, f"rm {remote_script}"],
+ capture_output=True
+ )
+
+ # Run processing
+ cmd = (
+ f"cd {self.winston_dir} && "
+ f"bash -c 'source ~/miniconda3/bin/activate soc_env && "
+ f"python3 process_silicon_winston_simple.py {remote_input} {remote_output}'"
+ )
+
+ result = subprocess.run(
+ ["ssh", self.winston_host, cmd],
+ capture_output=True,
+ text=True,
+ timeout=120
+ )
+
+ if result.returncode != 0:
+ raise Exception(f"Winston query failed: {result.stderr}")
+
+ # Download result
+ subprocess.run(
+ ["scp", f"{self.winston_host}:{remote_output}", str(local_output)],
+ check=True,
+ capture_output=True
+ )
+
+ # Read response
+ results_df = pd.read_csv(local_output)
+
+ if 'LLM_response' in results_df.columns:
+ response = results_df['LLM_response'].iloc[0]
+ elif 'response' in results_df.columns:
+ response = results_df['response'].iloc[0]
+ else:
+ response = "[No response column found]"
+
+ # Cleanup remote
+ subprocess.run(
+ ["ssh", self.winston_host, f"rm -f {remote_input} {remote_output}"],
+ capture_output=True
+ )
+
+ return response
+
+ except Exception as e:
+ raise Exception(f"Winston query error: {str(e)}")
+ finally:
+ # Cleanup local files
+ local_input.unlink(missing_ok=True)
+ local_output.unlink(missing_ok=True)
+ if 'local_script' in locals():
+ local_script.unlink(missing_ok=True)
+ # Remove temp directory (rmdir only works if empty)
+ try:
+ temp_dir.rmdir()
+ except OSError:
+ # If directory not empty, use shutil
+ import shutil
+ shutil.rmtree(temp_dir, ignore_errors=True)
+
+ def generate_responses(
+ self,
+ df: pd.DataFrame,
+ progress_callback: Optional[Callable[[int, int], None]] = None
+ ) -> pd.DataFrame:
+ """
+ Generate responses using Winston GPU server
+
+ This is a synchronous operation that:
+ 1. Uploads sample data to Winston
+ 2. Runs processing script directly (not via Slurm)
+ 3. Downloads results
+
+ Args:
+ df: DataFrame with 'backstory' column
+ progress_callback: Optional function(current, total) for progress updates
+
+ Returns:
+ DataFrame with original columns plus 'response' column
+ """
+ import subprocess
+ import tempfile
+ from pathlib import Path
+
+ if 'backstory' not in df.columns:
+ raise ValueError("DataFrame must have 'backstory' column")
+
+ # Create temp files
+ temp_dir = Path(tempfile.mkdtemp())
+ local_input = temp_dir / "input.csv"
+ local_output = temp_dir / "output.csv"
+
+ # Save input data
+ df.to_csv(local_input, index=False)
+
+ # Remote paths
+ remote_input = f"{self.winston_dir}/temp_dashboard_input.csv"
+ remote_output = f"{self.winston_dir}/temp_dashboard_output.csv"
+
+ try:
+ # Step 1: Upload input file
+ print("📤 Uploading data to Winston...")
+ subprocess.run(
+ ["scp", str(local_input), f"{self.winston_host}:{remote_input}"],
+ check=True,
+ capture_output=True
+ )
+
+ # Step 2: Create question config on Winston
+ question_text = self.config['question']
+ temp_val = self.config['temperature']
+
+ # Update config remotely with our question
+ # Note: question text containing quotes or triple-quotes can break this
+ # inline `python3 -c` update; query_single sidesteps that by uploading
+ # a json.dumps-escaped script file instead
+ config_update = f"""
+ import json
+ with open('{self.winston_dir}/config_winston_silicon.json') as f:
+ config = json.load(f)
+ config['question'] = '''{question_text}'''
+ config['processing']['temperature'] = {temp_val}
+ config['processing']['max_tokens'] = 100
+ with open('{self.winston_dir}/config_winston_silicon.json', 'w') as f:
+ json.dump(config, f, indent=2)
+ """
+
+ subprocess.run(
+ ["ssh", self.winston_host, f"python3 -c \"{config_update}\""],
+ check=True,
+ capture_output=True
+ )
+
+ # Step 3: Run processing on Winston
+ print("🚀 Processing on Winston with Qwen2.5...")
+ print(" This may take several minutes...")
+
+ cmd = (
+ f"cd {self.winston_dir} && "
+ f"bash -c 'source ~/miniconda3/bin/activate soc_env && "
+ f"python3 process_silicon_winston_simple.py {remote_input} {remote_output}'"
+ )
+
+ result = subprocess.run(
+ ["ssh", self.winston_host, cmd],
+ capture_output=True,
+ text=True
+ )
+
+ if result.returncode != 0:
+ raise Exception(f"Winston processing failed: {result.stderr}")
+
+ # Show progress (we can't get real-time updates, so just show completion)
+ if progress_callback:
+ progress_callback(len(df), len(df))
+
+ # Step 4: Download results
+ print("📥 Downloading results...")
+ subprocess.run(
+ ["scp", f"{self.winston_host}:{remote_output}", str(local_output)],
+ check=True,
+ capture_output=True
+ )
+
+ # Step 5: Load and process results
+ results_df = pd.read_csv(local_output)
+
+ # Rename LLM_response column to response for consistency with dashboard
+ if 'LLM_response' in results_df.columns:
+ results_df['response'] = results_df['LLM_response']
+ results_df = results_df.drop(columns=['LLM_response'])
+
+ # Clean up remote files
+ subprocess.run(
+ ["ssh", self.winston_host, f"rm -f {remote_input} {remote_output}"],
+ capture_output=True
+ )
+
+ return results_df
+
+ except subprocess.CalledProcessError as e:
+ raise Exception(f"SSH/SCP command failed: {e.stderr if hasattr(e, 'stderr') else str(e)}")
+ finally:
+ # Clean up local temp files and directory (rmtree tolerates leftovers)
+ local_input.unlink(missing_ok=True)
+ local_output.unlink(missing_ok=True)
+ import shutil
+ shutil.rmtree(temp_dir, ignore_errors=True)
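All four sampler classes share the same per-row loop: validate the `backstory` column, mark blank rows `[EMPTY]`, trap per-row errors as `[ERROR: ...]`, and report progress. A self-contained sketch of that shared pattern, with a stub standing in for the real API call:

```python
import pandas as pd

def generate_responses(df, query_fn, progress_callback=None):
    # Mirrors the loop shared by the sampler classes: validate input, skip
    # blank backstories, and never let one failing row abort the whole batch.
    if 'backstory' not in df.columns:
        raise ValueError("DataFrame must have 'backstory' column")
    results = df.copy()
    results['response'] = ""
    total = len(df)
    for i, (idx, row) in enumerate(df.iterrows()):
        backstory = row['backstory']
        if pd.isna(backstory) or str(backstory).strip() == "":
            results.loc[idx, 'response'] = "[EMPTY]"
            continue
        try:
            results.loc[idx, 'response'] = query_fn(str(backstory))
        except Exception as e:
            results.loc[idx, 'response'] = f"[ERROR: {str(e)[:50]}]"
        if progress_callback:
            progress_callback(i + 1, total)
    return results

df = pd.DataFrame({"backstory": ["I am a 52-year-old nurse from Leeds.", ""]})
out = generate_responses(df, lambda b: "Somewhat satisfied")
print(out['response'].tolist())  # → ['Somewhat satisfied', '[EMPTY]']
```

Because errors become sentinel strings rather than exceptions, a long batch run always yields a complete results CSV that can be filtered afterwards.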
ess_uk_with_backstories.csv ADDED
The diff for this file is too large to render. See raw diff
 
requirements.txt ADDED
@@ -0,0 +1,3 @@
+ streamlit>=1.28.0
+ pandas>=2.0.0
+ requests>=2.31.0
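These three pinned packages cover the API-backed samplers; local mode additionally needs `torch` and `transformers`, which are not listed here. A quick self-contained check of the shape the samplers expect from the persona file (tiny stand-in data, not the real `ess_uk_with_backstories.csv`):

```python
import pandas as pd

# Tiny stand-in for ess_uk_with_backstories.csv; the only hard requirement
# the samplers impose is the presence of a 'backstory' column.
personas = pd.DataFrame({
    "idno": [1, 2, 3],
    "backstory": ["I am a retired teacher from Kent.", None, "I am a 29-year-old nurse."],
})
usable = personas.dropna(subset=["backstory"])
sample = usable.sample(n=2, random_state=42)
print(len(usable), len(sample))  # → 2 2
```

Dropping null backstories before sampling avoids paying API calls for rows the samplers would only tag `[EMPTY]` anyway.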