from datetime import datetime

def analyze_user_activity(user_logs: list[dict], current_date: str) -> dict:
    """
    Analyzes user logs to calculate engagement metrics.

    user_logs: list of dicts with 'user_id', 'timestamp' (YYYY-MM-DD HH:MM:SS), 'action', 'duration_seconds'
    current_date: str 'YYYY-MM-DD'

    Returns dict with metrics:
    - 'active_users_last_7_days': int
    - 'most_common_action': str
    - 'average_session_duration': float
    - 'user_engagement_scores': dict mapping user_id to score (total duration / days active)
    """
    if not user_logs:
        return {
            'active_users_last_7_days': 0,
            'most_common_action': None,
            'average_session_duration': 0.0,
            'user_engagement_scores': {}
        }

    # current_date is date-only ('YYYY-MM-DD'), so it parses to midnight at the start of that day.
    curr_date_obj = datetime.strptime(current_date, "%Y-%m-%d")
    
    active_users = set()
    action_counts = {}
    total_duration = 0
    user_durations = {}
    user_active_days = {}

    for log in user_logs:
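        log_date = datetime.strptime(log['timestamp'], "%Y-%m-%d %H:%M:%S")

        # Count the user as active if the log falls within the 7 calendar days
        # ending on current_date. Comparing dates (not datetimes) keeps logs made
        # later on current_date inside the window; future-dated logs are excluded.
        days_ago = (curr_date_obj.date() - log_date.date()).days
        if 0 <= days_ago < 7:
            active_users.add(log['user_id'])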

        # Count actions
        action = log['action']
        action_counts[action] = action_counts.get(action, 0) + 1

        # Accumulate duration, treating a missing or None 'duration_seconds' as 0
        duration = log.get('duration_seconds') or 0
        total_duration += duration
        
        user_id = log['user_id']
        user_durations[user_id] = user_durations.get(user_id, 0) + duration
        
        # Track unique active days per user
        log_day = log['timestamp'].split(' ')[0]
        if user_id not in user_active_days:
            user_active_days[user_id] = set()
        user_active_days[user_id].add(log_day)

    # Most common action: the key with the highest count (ties go to the action seen first)
    most_common_action = max(action_counts, key=action_counts.get) if action_counts else None

    # user_logs is non-empty here (see the early return above), so this cannot divide by zero
    avg_duration = total_duration / len(user_logs)
    
    engagement_scores = {}
    for uid in user_durations:
        # Engagement score = total duration / number of distinct active days.
        # Every uid in user_durations has at least one active day, so the divisor is never zero.
        engagement_scores[uid] = user_durations[uid] / len(user_active_days[uid])

    return {
        'active_users_last_7_days': len(active_users),
        'most_common_action': most_common_action,
        'average_session_duration': avg_duration,
        'user_engagement_scores': engagement_scores
    }