# Data Logging Module

This module tracks all user interactions with the HicXAI loan assistant and saves data to a private GitHub repository.

## Setup

1. **Create GitHub Personal Access Token**:
   - Go to GitHub Settings → Developer settings → Personal access tokens
   - Create token with `repo` scope
   - Add to your `.env` file as `GITHUB_DATA_TOKEN`

2. **Private Repository**:
   - Data is saved to: `https://github.com/ksauka/hicxai-data-private`
   - Ensure the token has access to this repository

## Data Collected

### User Identification
- Prolific ID (from query param `pid` or `PROLIFIC_PID`)
- Condition (1-6, from query param `cond`)
- Session ID (unique per session)
- Timestamps (start, end, duration)

### Application Data
- All 12 loan application fields (age, education, occupation, etc.)
- Final prediction (approved/denied)
- Prediction probability

### Interactions
- Every user message (typed or clicked)
- Every assistant response
- Input method (typed vs button click)
- Current field being collected
- Conversation state

### Behavior Metrics
- Total messages sent
- Typed vs clicked responses
- Help button clicks
- Explanation requests
- Progress checks
- Fields changed/corrected

### Feedback
- Rating (1-5 stars)
- Ease of use
- Explanation clarity
- Would recommend
- Free-text comments

## File Structure

Data is saved to:
```
sessions/
  YYYY-MM-DD/
    {prolific_id}_{condition}_{timestamp}.json
```

## Example Data

```json
{
  "session_id": "abc123",
  "prolific_id": "TEST123",
  "condition": 2,
  "ab_version": "control",
  "timestamps": {
    "session_start": "2025-11-28T10:30:00",
    "session_end": "2025-11-28T10:33:45",
    "duration_seconds": 225
  },
  "application_data": {
    "age": 35,
    "education": "Bachelors",
    ...
    "prediction": ">50K",
    "prediction_probability": 0.73
  },
  "interactions": [
    {
      "timestamp": "2025-11-28T10:30:15",
      "type": "user_message",
      "field": "age",
      "input_method": "typed",
      "content": "35"
    },
    ...
  ],
  "behavior_metrics": {
    "total_messages": 15,
    "typed_responses": 8,
    "clicked_responses": 7,
    ...
  },
  "feedback": {
    "rating": 4,
    ...
  }
}
```

## Fallback

If GitHub save fails (missing token, network error, etc.), data is saved locally to:
```
data/sessions/{date}_{prolific_id}_{condition}_{timestamp}.json
```

## Privacy

- Data is saved to a **private** repository
- Only accessible with the GitHub token
- No personally identifiable information beyond Prolific ID
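
## Example: Path Layout and Fallback (Sketch)

The file naming described under File Structure and the local fallback described under Fallback could be sketched roughly as follows. This is an illustration, not the module's actual code: the function names `remote_path`, `local_path`, and `save_session`, the injected `upload` callable, and the `YYYYMMDD_HHMMSS` timestamp format are all assumptions. The `{date}` part of the fallback filename is assumed to be `YYYY-MM-DD`.

```python
import json
import os
from datetime import datetime
from pathlib import Path
from typing import Callable

STAMP_FORMAT = "%Y%m%d_%H%M%S"  # assumed; the README does not specify the timestamp format


def remote_path(prolific_id: str, condition: int, when: datetime) -> str:
    """Repository path: sessions/YYYY-MM-DD/{prolific_id}_{condition}_{timestamp}.json"""
    stamp = when.strftime(STAMP_FORMAT)
    return f"sessions/{when:%Y-%m-%d}/{prolific_id}_{condition}_{stamp}.json"


def local_path(prolific_id: str, condition: int, when: datetime, root: Path) -> Path:
    """Fallback path: data/sessions/{date}_{prolific_id}_{condition}_{timestamp}.json"""
    stamp = when.strftime(STAMP_FORMAT)
    name = f"{when:%Y-%m-%d}_{prolific_id}_{condition}_{stamp}.json"
    return root / "data" / "sessions" / name


def save_session(
    session: dict,
    upload: Callable[[str, str, str], None],  # hypothetical uploader: (path, payload, token)
    when: datetime,
    root: Path = Path("."),
) -> str:
    """Try the GitHub upload first; on any failure, write the JSON locally instead."""
    payload = json.dumps(session, indent=2)
    pid, cond = session["prolific_id"], session["condition"]
    try:
        token = os.environ.get("GITHUB_DATA_TOKEN")
        if not token:
            raise RuntimeError("GITHUB_DATA_TOKEN is not set")
        path = remote_path(pid, cond, when)
        upload(path, payload, token)
        return path
    except Exception:
        # Missing token, network error, etc.: keep the data on local disk.
        target = local_path(pid, cond, when, root)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(payload, encoding="utf-8")
        return str(target)
```

Passing the uploader in as a callable keeps the sketch independent of any particular GitHub client, and makes the fallback easy to exercise by supplying an uploader that raises.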