# Data Logging Module

This module tracks all user interactions with the HicXAI loan assistant and saves data to a private GitHub repository.

## Setup

1. **Create a GitHub Personal Access Token**:
   - Go to GitHub Settings → Developer settings → Personal access tokens
   - Create a token with the `repo` scope
   - Add it to your `.env` file as `GITHUB_DATA_TOKEN`

2. **Private Repository**:
   - Data is saved to: `https://github.com/ksauka/hicxai-data-private`
   - Ensure the token has access to this repository
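
As a minimal sketch of how the token might be read at startup, the helper below looks up `GITHUB_DATA_TOKEN` in the process environment. The function name `load_github_token` is hypothetical, and the sketch assumes the `.env` file has already been loaded into the environment (for example by a package such as python-dotenv); it is not necessarily how this module does it.

```python
import os


def load_github_token(env=None):
    """Return the GitHub token, or None if it is unset or blank.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    token = env.get("GITHUB_DATA_TOKEN", "").strip()
    return token or None


if __name__ == "__main__":
    if load_github_token() is None:
        print("GITHUB_DATA_TOKEN is not set; sessions would be saved locally.")
    else:
        print("GitHub token found.")
```

Returning `None` instead of raising lets the caller decide whether to use the local fallback described later in this document.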

## Data Collected

### User Identification
- Prolific ID (from query param `pid` or `PROLIFIC_PID`)
- Condition (1-6, from query param `cond`)
- Session ID (unique per session)
- Timestamps (start, end, duration)
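
The identification fields above could be extracted from a session URL roughly as follows. This is a sketch using only the standard library: the function name `parse_participant`, the choice to prefer `pid` over `PROLIFIC_PID` when both are present, and the decision to treat out-of-range conditions as missing are all assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlparse


def parse_participant(url):
    """Return (prolific_id, condition) parsed from the URL query string.

    Either value is None when it is missing or invalid.
    """
    params = parse_qs(urlparse(url).query)

    def first(name):
        values = params.get(name)
        return values[0] if values else None

    # Assumption: `pid` takes precedence over `PROLIFIC_PID`.
    prolific_id = first("pid") or first("PROLIFIC_PID")

    raw_cond = first("cond")
    condition = int(raw_cond) if raw_cond and raw_cond.isdigit() else None
    # Conditions are documented as 1-6; anything else is treated as missing.
    if condition is not None and not 1 <= condition <= 6:
        condition = None

    return prolific_id, condition
```

For example, `parse_participant("https://example.com/?pid=TEST123&cond=2")` yields the ID `"TEST123"` and condition `2`.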

### Application Data
- All 12 loan application fields (age, education, occupation, etc.)
- Final prediction (approved/denied)
- Prediction probability

### Interactions
- Every user message (typed or clicked)
- Every assistant response
- Input method (typed vs button click)
- Current field being collected
- Conversation state

### Behavior Metrics
- Total messages sent
- Typed vs clicked responses
- Help button clicks
- Explanation requests
- Progress checks
- Fields changed/corrected
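
Some of these counters can be derived from the interaction log itself. The sketch below assumes interaction records shaped like the ones in the example data later in this document (`type` and `input_method` keys), and it assumes that button clicks are recorded with `input_method` set to `"clicked"`, which the example does not show. The function name `summarize_interactions` is hypothetical.

```python
from collections import Counter


def summarize_interactions(interactions):
    """Count user messages overall and by input method."""
    user_messages = [i for i in interactions if i.get("type") == "user_message"]
    methods = Counter(i.get("input_method") for i in user_messages)
    return {
        "total_messages": len(user_messages),
        "typed_responses": methods["typed"],
        # Assumption: clicks are logged with input_method == "clicked".
        "clicked_responses": methods["clicked"],
    }
```

Deriving the counts from the log, rather than incrementing separate counters, keeps the metrics consistent with the recorded interactions.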

### Feedback
- Rating (1-5 stars)
- Ease of use
- Explanation clarity
- Would recommend
- Free-text comments

## File Structure

Data is saved to:
```
sessions/
  YYYY-MM-DD/
    {prolific_id}_{condition}_{timestamp}.json
```
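
A path following this layout could be assembled as in the sketch below. The function name `session_path` is hypothetical, and the compact `YYYYMMDDTHHMMSS` form used for the `{timestamp}` part is an assumption, since the exact timestamp format is not specified here.

```python
from datetime import datetime


def session_path(prolific_id, condition, when):
    """Build the repository path for one session file."""
    date_dir = when.strftime("%Y-%m-%d")
    # Assumption: the timestamp is written without separators that
    # could be awkward in file names.
    stamp = when.strftime("%Y%m%dT%H%M%S")
    return f"sessions/{date_dir}/{prolific_id}_{condition}_{stamp}.json"


if __name__ == "__main__":
    print(session_path("TEST123", 2, datetime(2025, 11, 28, 10, 30, 0)))
```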

## Example Data

```json
{
  "session_id": "abc123",
  "prolific_id": "TEST123",
  "condition": 2,
  "ab_version": "control",
  "timestamps": {
    "session_start": "2025-11-28T10:30:00",
    "session_end": "2025-11-28T10:33:45",
    "duration_seconds": 225
  },
  "application_data": {
    "age": 35,
    "education": "Bachelors",
    ...
    "prediction": ">50K",
    "prediction_probability": 0.73
  },
  "interactions": [
    {
      "timestamp": "2025-11-28T10:30:15",
      "type": "user_message",
      "field": "age",
      "input_method": "typed",
      "content": "35"
    },
    ...
  ],
  "behavior_metrics": {
    "total_messages": 15,
    "typed_responses": 8,
    "clicked_responses": 7,
    ...
  },
  "feedback": {
    "rating": 4,
    ...
  }
}
```

## Fallback

If the GitHub save fails (missing token, network error, etc.), the session data is saved locally instead:
```
data/sessions/{date}_{prolific_id}_{condition}_{timestamp}.json
```
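
The fallback behaviour can be sketched as a try/except around the upload. Here `upload` is a hypothetical callable standing in for whatever performs the GitHub save, and `save_with_fallback` is an illustrative name; neither is taken from the module's actual code.

```python
import json
from pathlib import Path


def save_with_fallback(record, filename, upload, local_dir="data/sessions"):
    """Try the remote upload first; write a local JSON file if it fails.

    Returns "github" on success, otherwise the local path as a string.
    """
    try:
        upload(filename, record)
        return "github"
    except Exception:
        # Deliberately broad: any failure (missing token, network error)
        # should still leave a copy of the session on disk.
        path = Path(local_dir) / filename
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(record, indent=2), encoding="utf-8")
        return str(path)
```

Returning where the data ended up makes it possible to log or display which storage path was used for a given session.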

## Privacy

- Data is saved to a **private** repository
- The repository is accessible only with the GitHub token
- No personally identifiable information is collected beyond the Prolific ID