DivYonko committed on
Commit
bcca570
·
2 Parent(s): a4612d4 58cbb3a

fix: remove unsupported width='stretch' from buttons

Files changed (7)
  1. .gitattributes +38 -0
  2. CHANGELOG.md +177 -0
  3. Dockerfile +23 -0
  4. README.md +20 -0
  5. app.py +25 -23
  6. requirements.txt +6 -0
  7. src/streamlit_app.py +40 -0
.gitattributes CHANGED
@@ -1,4 +1,42 @@
 
1
  *.safetensors filter=lfs diff=lfs merge=lfs -text
2
  *.bin filter=lfs diff=lfs merge=lfs -text
3
  *.pt filter=lfs diff=lfs merge=lfs -text
4
  *.pth filter=lfs diff=lfs merge=lfs -text
1
+ <<<<<<< HEAD
2
  *.safetensors filter=lfs diff=lfs merge=lfs -text
3
  *.bin filter=lfs diff=lfs merge=lfs -text
4
  *.pt filter=lfs diff=lfs merge=lfs -text
5
  *.pth filter=lfs diff=lfs merge=lfs -text
6
+ =======
7
+ *.7z filter=lfs diff=lfs merge=lfs -text
8
+ *.arrow filter=lfs diff=lfs merge=lfs -text
9
+ *.bin filter=lfs diff=lfs merge=lfs -text
10
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
11
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
12
+ *.ftz filter=lfs diff=lfs merge=lfs -text
13
+ *.gz filter=lfs diff=lfs merge=lfs -text
14
+ *.h5 filter=lfs diff=lfs merge=lfs -text
15
+ *.joblib filter=lfs diff=lfs merge=lfs -text
16
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
17
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
18
+ *.model filter=lfs diff=lfs merge=lfs -text
19
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
20
+ *.npy filter=lfs diff=lfs merge=lfs -text
21
+ *.npz filter=lfs diff=lfs merge=lfs -text
22
+ *.onnx filter=lfs diff=lfs merge=lfs -text
23
+ *.ot filter=lfs diff=lfs merge=lfs -text
24
+ *.parquet filter=lfs diff=lfs merge=lfs -text
25
+ *.pb filter=lfs diff=lfs merge=lfs -text
26
+ *.pickle filter=lfs diff=lfs merge=lfs -text
27
+ *.pkl filter=lfs diff=lfs merge=lfs -text
28
+ *.pt filter=lfs diff=lfs merge=lfs -text
29
+ *.pth filter=lfs diff=lfs merge=lfs -text
30
+ *.rar filter=lfs diff=lfs merge=lfs -text
31
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
32
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
34
+ *.tar filter=lfs diff=lfs merge=lfs -text
35
+ *.tflite filter=lfs diff=lfs merge=lfs -text
36
+ *.tgz filter=lfs diff=lfs merge=lfs -text
37
+ *.wasm filter=lfs diff=lfs merge=lfs -text
38
+ *.xz filter=lfs diff=lfs merge=lfs -text
39
+ *.zip filter=lfs diff=lfs merge=lfs -text
40
+ *.zst filter=lfs diff=lfs merge=lfs -text
41
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
42
+ >>>>>>> 58cbb3ad16a724133db9fe31bffce8783a85648a
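Review note: the added lines in this hunk include `<<<<<<< HEAD`, `=======`, and `>>>>>>>` markers, which suggests the merge was committed with its conflicts unresolved. The same markers appear in the Dockerfile, README.md, and requirements.txt hunks. Below is a minimal Python sketch of a check that flags such lines before committing. `find_conflict_markers` is a hypothetical helper name, not something in this repository, and a bare `=======` line can also be a legitimate Markdown heading underline, so hits need a human look.

```python
import re

# Lines that start with one of Git's three conflict markers.
# "=======" must be the whole line; the other two are followed by a label.
_MARKER = re.compile(r"^(<{7} |={7}$|>{7} )")


def find_conflict_markers(text):
    """Return (line_number, line) pairs for lines that look like conflict markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if _MARKER.match(line):
            hits.append((number, line))
    return hits


sample = "<<<<<<< HEAD\n*.pt filter=lfs\n=======\n*.zip filter=lfs\n>>>>>>> 58cbb3a\n"
for number, line in find_conflict_markers(sample):
    print(number, line)
```

Running a check like this over the staged files would likely have flagged all four affected files in this commit.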
CHANGELOG.md ADDED
@@ -0,0 +1,177 @@
1
+ # LivePulse — Development Changelog
2
+ **Date:** April 14, 2026
3
+ **Session summary:** Dashboard UX upgrades, multi-stream comparison, analytics features, performance optimizations, and bug fixes.
4
+
5
+ ---
6
+
7
+ ## Files Modified
8
+
9
+ | File | Original Lines | Final Lines | Change |
10
+ |------|---------------|-------------|--------|
11
+ | `frontend/streamlit_app.py` | ~540 | 1354 | +814 |
12
+ | `backend/scraper.py` | 115 | 135 | +20 |
13
+ | `requirements.txt` | 22 | 35 | +13 |
14
+
15
+ ---
16
+
17
+ ## 1. Dashboard UX Upgrades (`frontend/streamlit_app.py`)
18
+
19
+ ### 1.1 Sentiment Heatmap Over Time
20
+ - Added `build_heatmap_data()` — buckets all messages into 1-minute intervals and counts Positive / Neutral / Negative per bucket
21
+ - Rendered as a stacked bar chart (Plotly) showing mood volume over the full stream lifetime
22
+ - Includes "View data" toggle and CSV export
23
+
24
+ ### 1.2 Sentiment Velocity
25
+ - Added `compute_velocity()` — compares positive ratio of last 20 messages vs previous 20
26
+ - Displayed as a 5th stat card alongside cumulative counts
27
+ - Three states: ↑ Rising (green), → Stable (yellow), ↓ Falling (red)
28
+ - Shows delta percentage shift
29
+
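Review note: the velocity rule in section 1.2 can be sketched roughly as follows. This is an illustration under assumptions, not the code from `frontend/streamlit_app.py`: it takes a plain list of labels (section 4.3 says the real function accepts a JSON string), the `stable_band` dead zone is invented because the changelog does not state one, and treating a short history as "Stable" is also a guess.

```python
def compute_velocity(sentiments, window=20, stable_band=0.05):
    """Compare the positive ratio of the last `window` labels with the `window` before it.

    `stable_band` is an assumed dead zone; the changelog does not state one.
    Returns (state, delta) where delta is the shift in positive ratio.
    """
    if len(sentiments) < 2 * window:
        return "Stable", 0.0  # assumption: too little history counts as stable

    recent = sentiments[-window:]
    previous = sentiments[-2 * window:-window]

    def positive_ratio(labels):
        return sum(1 for label in labels if label == "Positive") / len(labels)

    delta = positive_ratio(recent) - positive_ratio(previous)
    if delta > stable_band:
        return "Rising", delta
    if delta < -stable_band:
        return "Falling", delta
    return "Stable", delta


history = ["Negative"] * 20 + ["Positive"] * 20
print(compute_velocity(history))
```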
30
+ ### 1.3 Notification / Alert System
31
+ - **Negative spike alert** — pulsing red banner when negative % in rolling window exceeds configurable threshold (default 40%)
32
+ - **Spam surge alert** — separate orange banner when spam topic % exceeds configurable threshold (default 30%)
33
+ - Both alerts are dismissable with a ✕ button and re-arm automatically when new messages arrive
34
+ - Alert window size and thresholds configurable from sidebar sliders
35
+
36
+ ### 1.4 Pinned Messages
37
+ - Every message in the live feed has a 📍 pin button
38
+ - Pinned messages appear in a dedicated "Pinned Messages" section above the feed with gold highlight styling
39
+ - Individual unpin buttons per message
40
+ - Sidebar shows pin count and a "Clear pins" button
41
+ - Pin state persists across auto-refreshes via `st.session_state`
42
+
43
+ ### 1.5 Multi-Stream Comparison (fully rebuilt)
44
+ - Sidebar now manages up to **5 independent stream slots** (A–E), each with its own color, video ID field, Redis key field, and Start/Stop buttons
45
+ - **+ Add stream / - Remove last** buttons to dynamically add/remove slots
46
+ - Comparison section appears automatically when 2+ streams have data — no toggle needed
47
+ - Renders sentiment bar charts in rows of 3
48
+ - Overlay line chart shows rolling positive % for all active streams on the same axis
49
+ - Fixed Streamlit widget re-render bug: widget keys used as single source of truth instead of `value=` overrides
50
+
51
+ ---
52
+
53
+ ## 2. Analytics & Insights Features (`frontend/streamlit_app.py`)
54
+
55
+ ### 2.1 Engagement Score
56
+ - `compute_engagement()` — composite 0–100 score from:
57
+ - Message rate (msgs/min) — 40% weight
58
+ - Positive ratio — 40% weight
59
+ - Question density — 20% weight
60
+ - Displayed as a large score card with a fill bar and grade (🔥 High / ⚡ Medium / 💤 Low)
61
+ - Three supporting metric tiles: Msgs/min, Positive ratio, Question density
62
+
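Review note: a sketch of how the 40/40/20 weighting in section 2.1 might combine into a 0-100 score. The weights come from the changelog. The `rate_cap` normalisation constant and the clamping are assumptions added for illustration, and the grade thresholds are left out because the changelog does not give them.

```python
def compute_engagement(msgs_per_min, positive_ratio, question_density, rate_cap=30.0):
    """Composite 0-100 score using the 40/40/20 weights from the changelog.

    `rate_cap` (the msgs/min that earns full marks for rate) is an assumed
    normalisation constant; the two ratios are expected in the range 0..1.
    """
    def clamp(value):
        return max(0.0, min(1.0, value))

    rate_part = clamp(msgs_per_min / rate_cap)
    score = 100 * (
        0.4 * rate_part
        + 0.4 * clamp(positive_ratio)
        + 0.2 * clamp(question_density)
    )
    return round(score, 1)


print(compute_engagement(msgs_per_min=15, positive_ratio=0.5, question_density=0.5))
```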
63
+ ### 2.2 Top Contributors Leaderboard
64
+ - `compute_top_contributors()` — ranks authors by message count, tracks per-author sentiment breakdown
65
+ - Left panel: ranked list with 🥇🥈🥉 medals, progress bar, colored sentiment dots per author
66
+ - Right panel: stacked horizontal bar chart showing sentiment % for top 5 authors
67
+ - CSV export of full leaderboard
68
+
69
+ ### 2.3 Word Cloud
70
+ - `compute_word_freq()` — extracts top 60 words after removing stopwords (English + common Hinglish filler words)
71
+ - Filterable by sentiment (All / Positive / Neutral / Negative) and topic
72
+ - Renders word cloud image via `wordcloud` library using `wc.to_array()` directly (no matplotlib pipeline)
73
+ - Top-20 frequency bar chart shown below the cloud
74
+ - Falls back to bar chart only if `wordcloud` not installed
75
+
76
+ ### 2.4 Spam Rate Alert
77
+ - `check_spam_alert()` — monitors spam topic ratio in rolling window
78
+ - Separate dismissable banner distinct from the negative sentiment alert
79
+ - Configurable threshold and window from sidebar
80
+
81
+ ---
82
+
83
+ ## 3. Backend: Multi-Stream Scraper (`backend/scraper.py`)
84
+
85
+ ### Changes
86
+ - Added `argparse` CLI interface with two arguments:
87
+ - `--video_id` — YouTube video ID to scrape (defaults to `config.py` value)
88
+ - `--redis_key` — Redis list key to write messages to (defaults to `chat_messages`)
89
+ - `run()` function now accepts `video_id` and `redis_key` as parameters instead of reading globals
90
+ - Redis connection moved inside `run()` so each scraper instance is fully independent
91
+ - Each stream writes to its own Redis key, enabling true parallel multi-stream operation
92
+
93
+ **Usage:**
94
+ ```bash
95
+ # Stream A (default)
96
+ python -m backend.scraper --video_id ABC123 --redis_key chat_messages
97
+
98
+ # Stream B
99
+ python -m backend.scraper --video_id XYZ789 --redis_key chat_messages_b
100
+
101
+ # Stream C
102
+ python -m backend.scraper --video_id DEF456 --redis_key chat_messages_c
103
+ ```
104
+
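Review note: the CLI described in section 3 could look roughly like this with the standard-library `argparse`. The two flag names and the `chat_messages` default are taken from the changelog. `DEFAULT_VIDEO_ID` is a placeholder for the value the changelog says lives in `config.py`, and `run()` here is only a stand-in for the real scraper loop.

```python
import argparse

# Placeholder: the changelog says the real default comes from config.py.
DEFAULT_VIDEO_ID = "ABC123"


def build_parser():
    parser = argparse.ArgumentParser(description="Scrape one live chat into one Redis list")
    parser.add_argument("--video_id", default=DEFAULT_VIDEO_ID, help="YouTube video ID to scrape")
    parser.add_argument("--redis_key", default="chat_messages", help="Redis list key to write to")
    return parser


def run(video_id, redis_key):
    # Stand-in for the scraper loop. In the real module the Redis connection
    # would be opened here, inside run(), so each process stays independent.
    return f"scraping {video_id} into {redis_key}"


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["--video_id", "XYZ789", "--redis_key", "chat_messages_b"])
print(run(args.video_id, args.redis_key))
```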
105
+ ---
106
+
107
+ ## 4. Performance Optimizations (`frontend/streamlit_app.py`)
108
+
109
+ ### 4.1 Redis Read Deduplication
110
+ - `load_stream_data("chat_messages")` called **once** per refresh cycle
111
+ - Windowed slice (`data = all_data[-msg_limit:]`) derived in-memory instead of a second Redis read
112
+ - Multi-stream comparison reuses cached data instead of calling `load_stream_data` twice per stream
113
+
114
+ ### 4.2 `st.cache_data` on Heavy Functions
115
+ | Function | TTL | Benefit |
116
+ |----------|-----|---------|
117
+ | `load_stream_data()` | 5s | Prevents redundant Redis reads within same refresh |
118
+ | `compute_velocity()` | 10s | Skips recompute if data unchanged |
119
+ | `build_heatmap_data()` | 10s | Skips full groupby on every refresh |
120
+ | `compute_engagement()` | 10s | Skips recompute if data unchanged |
121
+ | `compute_top_contributors()` | 10s | Skips recompute if data unchanged |
122
+ | `compute_word_freq()` | 10s | Skips word counting on every refresh |
123
+
124
+ ### 4.3 Cache-Compatible Function Signatures
125
+ - `compute_velocity()` and `build_heatmap_data()` refactored to accept JSON strings instead of DataFrames — `st.cache_data` requires hashable arguments and DataFrames are not hashable
126
+
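Review note: the idea in section 4.3 of passing a serialized string so the argument can serve as a cache key can be illustrated without Streamlit. In this sketch `functools.lru_cache` stands in for `st.cache_data` (it is a different cache with no TTL), and the `sentiment` field name is an assumption about the message format.

```python
import json
from collections import Counter
from functools import lru_cache


@lru_cache(maxsize=32)
def count_sentiments(messages_json):
    """Count sentiment labels; the JSON string argument doubles as the cache key."""
    messages = json.loads(messages_json)
    return dict(Counter(m["sentiment"] for m in messages))


messages = [{"sentiment": "Positive"}, {"sentiment": "Negative"}, {"sentiment": "Positive"}]
payload = json.dumps(messages, sort_keys=True)  # a str is hashable; a list of dicts is not

first = count_sentiments(payload)
second = count_sentiments(payload)  # identical string, so this call is served from the cache
print(first, count_sentiments.cache_info().hits)
```

One trade-off to keep in mind: serializing on every refresh has its own cost, so this only pays off when the cached computation is heavier than the `json.dumps` call.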
127
+ ### 4.4 DataFrame Construction
128
+ - `all_df` built once from `all_data`, `df` sliced from it — no duplicate parsing
129
+
130
+ ---
131
+
132
+ ## 5. Bug Fixes
133
+
134
+ ### 5.1 Multi-Stream Widget Re-render Bug
135
+ - **Problem:** `st.text_input(value=stream["video_id"])` was resetting the field to the old value on every Streamlit rerun, so video IDs typed for Stream B/C were wiped before the Start button handler could read them
136
+ - **Fix:** Widget keys (`vid_0`, `rkey_0`, etc.) initialized once via `st.session_state[key] = ...` and used as the sole source of truth. `value=` parameter removed entirely.
137
+
138
+ ### 5.2 Active Stream Detection
139
+ - **Problem:** `r.exists(key)` returns an integer (0 or 1), not a bool, and returns 1 for any existing key including empty lists
140
+ - **Fix:** Changed to `r.llen(key) > 0` which correctly checks for actual message data
141
+
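Review note: the fix in section 5.2 boils down to checking list length rather than key existence. The sketch below uses a tiny in-memory stand-in so it runs without a Redis server. `FakeRedis` is hypothetical and only mimics the `exists` and `llen` calls named in the changelog, and the empty-list case follows the changelog's description rather than verified server behaviour.

```python
def stream_has_messages(client, key):
    """True only when the list at `key` holds at least one message."""
    return client.llen(key) > 0


class FakeRedis:
    """In-memory stand-in exposing the two calls discussed above (not the real client)."""

    def __init__(self, lists):
        self._lists = lists

    def exists(self, key):
        return 1 if key in self._lists else 0

    def llen(self, key):
        return len(self._lists.get(key, []))


client = FakeRedis({"chat_messages": ["hi", "hello"], "chat_messages_b": []})
for key in ("chat_messages", "chat_messages_b", "missing"):
    print(key, client.exists(key), stream_has_messages(client, key))
```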
142
+ ### 5.3 WordCloud Crash
143
+ - **Problem:** `background_color="transparent"` is not a valid PIL color specifier, causing `ValueError: unknown color specifier: 'transparent'`
144
+ - **Fix:** Changed to `background_color="white"` and render via `wc.to_array()` directly — removes the matplotlib pipeline entirely
145
+
146
+ ### 5.4 Streamlit Deprecation Warning
147
+ - **Problem:** `use_container_width=True/False` deprecated, removed after 2025-12-31
148
+ - **Fix:** All 21 occurrences replaced with `width='stretch'` / `width='content'`
149
+
150
+ ---
151
+
152
+ ## 6. Dependencies Added (`requirements.txt`)
153
+
154
+ ```
155
+ matplotlib
156
+ wordcloud
157
+ ```
158
+
159
+ ---
160
+
161
+ ## Architecture Overview (Post-Session)
162
+
163
+ ```
164
+ Redis
165
+ ├── chat_messages ← Stream A scraper writes here
166
+ ├── chat_messages_b ← Stream B scraper writes here
167
+ ├── chat_messages_c ← Stream C scraper writes here
168
+ ├── chat_messages_d ← Stream D scraper writes here
169
+ ├── chat_messages_e ← Stream E scraper writes here
170
+ └── video_title ← Stream A title for page header
171
+
172
+ backend/scraper.py ← One process per stream, --video_id + --redis_key args
173
+ backend/main.py ← FastAPI REST API (reads from chat_messages)
174
+ frontend/streamlit_app.py ← Dashboard (reads from all active Redis keys)
175
+ ml/sentiment_model.py ← 3-model ensemble (MuRIL + XLM-R + Multilingual)
176
+ ml/topic_model.py ← Keyword fast-path + BART zero-shot fallback
177
+ ```
Dockerfile CHANGED
@@ -1,3 +1,4 @@
 
1
  FROM python:3.11-slim
2
 
3
  WORKDIR /app
@@ -8,3 +9,25 @@ COPY . .
8
 
9
  EXPOSE 7860
10
  CMD ["streamlit", "run", "app.py", "--server.port", "7860", "--server.address", "0.0.0.0"]
1
+ <<<<<<< HEAD
2
  FROM python:3.11-slim
3
 
4
  WORKDIR /app
 
9
 
10
  EXPOSE 7860
11
  CMD ["streamlit", "run", "app.py", "--server.port", "7860", "--server.address", "0.0.0.0"]
12
+ =======
13
+ FROM python:3.13.5-slim
14
+
15
+ WORKDIR /app
16
+
17
+ RUN apt-get update && apt-get install -y \
18
+ build-essential \
19
+ curl \
20
+ git \
21
+ && rm -rf /var/lib/apt/lists/*
22
+
23
+ COPY requirements.txt ./
24
+ COPY src/ ./src/
25
+
26
+ RUN pip3 install -r requirements.txt
27
+
28
+ EXPOSE 8501
29
+
30
+ HEALTHCHECK CMD curl --fail http://localhost:8501/_stcore/health
31
+
32
+ ENTRYPOINT ["streamlit", "run", "src/streamlit_app.py", "--server.port=8501", "--server.address=0.0.0.0"]
33
+ >>>>>>> 58cbb3ad16a724133db9fe31bffce8783a85648a
README.md CHANGED
@@ -1,5 +1,6 @@
1
  ---
2
  title: LivePulse
 
3
  emoji: 📡
4
  colorFrom: purple
5
  colorTo: indigo
@@ -31,3 +32,22 @@ Real-time Hinglish sentiment and topic analysis for YouTube live streams.
31
  1. Paste a YouTube live video ID or URL in the **Stream Control** section in the sidebar
32
  2. Click **▶ Start** — the scraper launches in the background
33
  3. The dashboard auto-refreshes and shows live sentiment + topic data
1
  ---
2
  title: LivePulse
3
+ <<<<<<< HEAD
4
  emoji: 📡
5
  colorFrom: purple
6
  colorTo: indigo
 
32
  1. Paste a YouTube live video ID or URL in the **Stream Control** section in the sidebar
33
  2. Click **▶ Start** — the scraper launches in the background
34
  3. The dashboard auto-refreshes and shows live sentiment + topic data
35
+ =======
36
+ emoji: 🚀
37
+ colorFrom: red
38
+ colorTo: red
39
+ sdk: docker
40
+ app_port: 8501
41
+ tags:
42
+ - streamlit
43
+ pinned: false
44
+ short_description: YoutubeLive Comment Analysis
45
+ ---
46
+
47
+ # Welcome to Streamlit!
48
+
49
+ Edit `/src/streamlit_app.py` to customize this app to your heart's desire. :heart:
50
+
51
+ If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community
52
+ forums](https://discuss.streamlit.io).
53
+ >>>>>>> 58cbb3ad16a724133db9fe31bffce8783a85648a
app.py CHANGED
@@ -1,4 +1,4 @@
1
- # -*- coding: utf-8 -*-
2
  """
3
  app.py — Hugging Face Spaces adaptation of frontend/streamlit_app.py
4
  All features identical; infrastructure layer uses in-memory deque store
@@ -684,7 +684,7 @@ with st.sidebar:
684
 
685
  sc1, sc2 = st.columns(2)
686
  with sc1:
687
- if st.button("▶ Start", key=f"start_{idx}", width='stretch'):
688
  vid = extract_video_id(st.session_state[vid_skey])
689
  rkey = st.session_state[rkey_skey].strip() or f"chat_messages_{label.lower()}"
690
  if vid:
@@ -703,7 +703,7 @@ with st.sidebar:
703
  else:
704
  st.error("Invalid video ID or URL")
705
  with sc2:
706
- if st.button("⏹ Stop", key=f"stop_{idx}", width='stretch'):
707
  if is_scraper_running(idx):
708
  stop_scraper(idx)
709
  st.session_state.streams[idx]["proc"] = None
@@ -722,7 +722,7 @@ with st.sidebar:
722
  add_col, rem_col = st.columns(2)
723
  with add_col:
724
  if len(st.session_state.streams) < MAX_STREAMS:
725
- if st.button("+ Add stream", width='stretch'):
726
  n = len(st.session_state.streams)
727
  st.session_state.streams.append({
728
  "video_id": "",
@@ -733,7 +733,7 @@ with st.sidebar:
733
  st.rerun()
734
  with rem_col:
735
  if len(st.session_state.streams) > 1:
736
- if st.button("- Remove last", width='stretch'):
737
  removed = st.session_state.streams.pop()
738
  removed_idx = len(st.session_state.streams)
739
  stop_scraper(removed_idx)
@@ -745,14 +745,14 @@ with st.sidebar:
745
  st.markdown('<p style="font-size:0.68rem;font-weight:700;color:var(--accent);text-transform:uppercase;letter-spacing:0.1em;margin-bottom:8px;">Pinned Messages</p>', unsafe_allow_html=True)
746
  pin_count = len(st.session_state.pinned_messages)
747
  st.markdown(f'<div style="font-size:0.78rem;color:var(--text-3);">{pin_count} message{"s" if pin_count != 1 else ""} pinned</div>', unsafe_allow_html=True)
748
- if pin_count > 0 and st.button("🗑 Clear pins", width='stretch'):
749
  st.session_state.pinned_messages = []
750
  st.rerun()
751
  st.divider()
752
 
753
  # ── Danger Zone ──
754
  st.markdown('<p style="font-size:0.68rem;font-weight:700;color:#ef4444;text-transform:uppercase;letter-spacing:0.1em;margin-bottom:8px;">Danger Zone</p>', unsafe_allow_html=True)
755
- if st.button("🗑 Clear all data", width='stretch'):
756
  for s in st.session_state.streams:
757
  store_delete(s["redis_key"])
758
  st.session_state.pinned_messages = []
@@ -952,7 +952,7 @@ with col_l:
952
  hovertemplate="<b>%{x}</b><br>Count: %{y}<extra></extra>",
953
  ))
954
  fig_bar.update_layout(**plotly_layout(260))
955
- st.plotly_chart(fig_bar, width='stretch', config={"displayModeBar": False})
956
  bar_hdr, bar_dl = st.columns([1, 1])
957
  with bar_hdr:
958
  show_bar_data = st.checkbox("View data", key="show_bar")
@@ -960,7 +960,7 @@ with col_l:
960
  bar_df = pd.DataFrame({"Sentiment": ["Positive", "Neutral", "Negative"], "Count": [pos, neu, neg]})
961
  csv_download(bar_df, "Download CSV", "sentiment_distribution.csv")
962
  if show_bar_data:
963
- st.dataframe(bar_df, width='stretch', hide_index=True)
964
  st.markdown('</div>', unsafe_allow_html=True)
965
 
966
  with col_r:
@@ -979,7 +979,7 @@ with col_r:
979
  "showlegend": True,
980
  "legend": dict(orientation="h", y=-0.08, font=dict(size=11))}
981
  )
982
- st.plotly_chart(fig_pie, width='stretch', config={"displayModeBar": False})
983
  pie_hdr, pie_dl = st.columns([1, 1])
984
  with pie_hdr:
985
  show_pie_data = st.checkbox("View data", key="show_pie")
@@ -991,7 +991,7 @@ with col_r:
991
  })
992
  csv_download(pie_df, "Download CSV", "sentiment_breakdown.csv")
993
  if show_pie_data:
994
- st.dataframe(pie_df, width='stretch', hide_index=True)
995
  st.markdown('</div>', unsafe_allow_html=True)
996
 
997
  # ── Confidence trend ──────────────────────────────────────────
@@ -1012,7 +1012,7 @@ if "confidence" in df.columns:
1012
  ))
1013
  fig_line.update_layout(**plotly_layout(180))
1014
  fig_line.update_yaxes(range=[0, 1])
1015
- st.plotly_chart(fig_line, width='stretch', config={"displayModeBar": False})
1016
  conf_hdr, conf_dl = st.columns([1, 1])
1017
  with conf_hdr:
1018
  show_conf_data = st.checkbox("View data", key="show_conf")
@@ -1021,7 +1021,7 @@ if "confidence" in df.columns:
1021
  conf_export.columns = ["message_index", "confidence"]
1022
  csv_download(conf_export, "Download CSV", "confidence_trend.csv")
1023
  if show_conf_data:
1024
- st.dataframe(conf_export, width='stretch', hide_index=True)
1025
  st.markdown('</div>', unsafe_allow_html=True)
1026
 
1027
 
@@ -1054,7 +1054,7 @@ if not heatmap_data.empty:
1054
  layout["legend"] = dict(orientation="h", y=1.08, font=dict(size=11))
1055
  layout["xaxis"]["tickformat"] = "%H:%M"
1056
  fig_heat.update_layout(**layout)
1057
- st.plotly_chart(fig_heat, width='stretch', config={"displayModeBar": False})
1058
 
1059
  heat_hdr, heat_dl = st.columns([1, 1])
1060
  with heat_hdr:
@@ -1062,7 +1062,7 @@ if not heatmap_data.empty:
1062
  with heat_dl:
1063
  csv_download(heatmap_data.rename(columns={"bucket": "time_bucket"}), "Download CSV", "sentiment_heatmap.csv")
1064
  if show_heat_data:
1065
- st.dataframe(heatmap_data.rename(columns={"bucket": "time_bucket"}), width='stretch', hide_index=True)
1066
  st.markdown('</div>', unsafe_allow_html=True)
1067
  else:
1068
  st.info("Not enough timestamped data for heatmap yet.")
@@ -1105,7 +1105,7 @@ fig_topic = go.Figure(go.Bar(
1105
  hovertemplate="<b>%{x}</b><br>Count: %{y}<extra></extra>",
1106
  ))
1107
  fig_topic.update_layout(**plotly_layout(250))
1108
- st.plotly_chart(fig_topic, width='stretch', config={"displayModeBar": False})
1109
  topic_hdr, topic_dl = st.columns([1, 1])
1110
  with topic_hdr:
1111
  show_topic_data = st.checkbox("View data", key="show_topic")
@@ -1113,7 +1113,7 @@ with topic_dl:
1113
  topic_df = pd.DataFrame({"Topic": TOPIC_LABELS, "Count": [topic_counts[l] for l in TOPIC_LABELS]})
1114
  csv_download(topic_df, "Download CSV", "topic_distribution.csv")
1115
  if show_topic_data:
1116
- st.dataframe(topic_df, width='stretch', hide_index=True)
1117
  st.markdown('</div>', unsafe_allow_html=True)
1118
 
1119
 
@@ -1208,7 +1208,7 @@ if contributors:
1208
  layout_lb["xaxis"]["range"] = [0, 100]
1209
  layout_lb["xaxis"]["ticksuffix"] = "%"
1210
  fig_lb.update_layout(**layout_lb)
1211
- st.plotly_chart(fig_lb, width='stretch', config={"displayModeBar": False})
1212
 
1213
  contrib_df = pd.DataFrame(contributors)
1214
  csv_download(contrib_df, "Download CSV", "top_contributors.csv")
@@ -1248,7 +1248,7 @@ if word_freq:
1248
  ).generate_from_frequencies(freq_dict)
1249
 
1250
  st.markdown('<div class="chart-wrap">', unsafe_allow_html=True)
1251
- st.image(wc.to_array(), width="stretch")
1252
  st.markdown('</div>', unsafe_allow_html=True)
1253
 
1254
  top20 = word_freq[:20]
@@ -1261,7 +1261,7 @@ if word_freq:
1261
  ))
1262
  layout_wf = plotly_layout(180)
1263
  fig_wf.update_layout(**layout_wf)
1264
- st.plotly_chart(fig_wf, width='stretch', config={"displayModeBar": False})
1265
 
1266
  except ImportError:
1267
  top20 = word_freq[:20]
@@ -1272,7 +1272,7 @@ if word_freq:
1272
  marker_line_width=0,
1273
  ))
1274
  fig_wf.update_layout(**plotly_layout(200))
1275
- st.plotly_chart(fig_wf, width='stretch', config={"displayModeBar": False})
1276
  else:
1277
  st.info("Not enough text data yet.")
1278
 
@@ -1328,7 +1328,7 @@ if len(active_streams) > 1:
1328
  f'Stream {slabel} — {stream["redis_key"]}</span>',
1329
  unsafe_allow_html=True
1330
  )
1331
- st.plotly_chart(fig, width='stretch', config={"displayModeBar": False})
1332
  st.markdown(
1333
  f'<div style="font-size:0.78rem;color:var(--text-3);margin-bottom:8px;">'
1334
  f'{t} msgs · <span style="color:#22c55e;">{p/t*100:.1f}% pos</span> · '
@@ -1363,7 +1363,7 @@ if len(active_streams) > 1:
1363
  layout_ov["legend"] = dict(orientation="h", y=1.1, font=dict(size=11))
1364
  layout_ov["yaxis"]["range"] = [0, 100]
1365
  fig_overlay.update_layout(**layout_ov)
1366
- st.plotly_chart(fig_overlay, width='stretch', config={"displayModeBar": False})
1367
  st.markdown('</div>', unsafe_allow_html=True)
1368
 
1369
  elif len(st.session_state.streams) > 1:
@@ -1483,3 +1483,5 @@ for i, (_, row) in enumerate(filtered.iloc[::-1].iterrows()):
1483
  if auto_refresh:
1484
  time.sleep(refresh_rate)
1485
  st.rerun()
1
+ # -*- coding: utf-8 -*-
2
  """
3
  app.py — Hugging Face Spaces adaptation of frontend/streamlit_app.py
4
  All features identical; infrastructure layer uses in-memory deque store
 
684
 
685
  sc1, sc2 = st.columns(2)
686
  with sc1:
687
+ if st.button("▶ Start", key=f"start_{idx}"):
688
  vid = extract_video_id(st.session_state[vid_skey])
689
  rkey = st.session_state[rkey_skey].strip() or f"chat_messages_{label.lower()}"
690
  if vid:
 
703
  else:
704
  st.error("Invalid video ID or URL")
705
  with sc2:
706
+ if st.button("⏹ Stop", key=f"stop_{idx}"):
707
  if is_scraper_running(idx):
708
  stop_scraper(idx)
709
  st.session_state.streams[idx]["proc"] = None
 
722
  add_col, rem_col = st.columns(2)
723
  with add_col:
724
  if len(st.session_state.streams) < MAX_STREAMS:
725
+ if st.button("+ Add stream"):
726
  n = len(st.session_state.streams)
727
  st.session_state.streams.append({
728
  "video_id": "",
 
733
  st.rerun()
734
  with rem_col:
735
  if len(st.session_state.streams) > 1:
736
+ if st.button("- Remove last"):
737
  removed = st.session_state.streams.pop()
738
  removed_idx = len(st.session_state.streams)
739
  stop_scraper(removed_idx)
 
745
  st.markdown('<p style="font-size:0.68rem;font-weight:700;color:var(--accent);text-transform:uppercase;letter-spacing:0.1em;margin-bottom:8px;">Pinned Messages</p>', unsafe_allow_html=True)
746
  pin_count = len(st.session_state.pinned_messages)
747
  st.markdown(f'<div style="font-size:0.78rem;color:var(--text-3);">{pin_count} message{"s" if pin_count != 1 else ""} pinned</div>', unsafe_allow_html=True)
748
+ if pin_count > 0 and st.button("🗑 Clear pins"):
749
  st.session_state.pinned_messages = []
750
  st.rerun()
751
  st.divider()
752
 
753
  # ── Danger Zone ──
754
  st.markdown('<p style="font-size:0.68rem;font-weight:700;color:#ef4444;text-transform:uppercase;letter-spacing:0.1em;margin-bottom:8px;">Danger Zone</p>', unsafe_allow_html=True)
755
+ if st.button("🗑 Clear all data"):
756
  for s in st.session_state.streams:
757
  store_delete(s["redis_key"])
758
  st.session_state.pinned_messages = []
 
952
  hovertemplate="<b>%{x}</b><br>Count: %{y}<extra></extra>",
953
  ))
954
  fig_bar.update_layout(**plotly_layout(260))
955
+ st.plotly_chart(fig_bar, config={"displayModeBar": False})
956
  bar_hdr, bar_dl = st.columns([1, 1])
957
  with bar_hdr:
958
  show_bar_data = st.checkbox("View data", key="show_bar")
 
960
  bar_df = pd.DataFrame({"Sentiment": ["Positive", "Neutral", "Negative"], "Count": [pos, neu, neg]})
961
  csv_download(bar_df, "Download CSV", "sentiment_distribution.csv")
962
  if show_bar_data:
963
+ st.dataframe(bar_df, hide_index=True)
964
  st.markdown('</div>', unsafe_allow_html=True)
965
 
966
  with col_r:
 
979
  "showlegend": True,
980
  "legend": dict(orientation="h", y=-0.08, font=dict(size=11))}
981
  )
982
+ st.plotly_chart(fig_pie, config={"displayModeBar": False})
983
  pie_hdr, pie_dl = st.columns([1, 1])
984
  with pie_hdr:
985
  show_pie_data = st.checkbox("View data", key="show_pie")
 
991
  })
992
  csv_download(pie_df, "Download CSV", "sentiment_breakdown.csv")
993
  if show_pie_data:
994
+ st.dataframe(pie_df, hide_index=True)
995
  st.markdown('</div>', unsafe_allow_html=True)
996
 
997
  # ── Confidence trend ──────────────────────────────────────────
 
1012
  ))
1013
  fig_line.update_layout(**plotly_layout(180))
1014
  fig_line.update_yaxes(range=[0, 1])
1015
+ st.plotly_chart(fig_line, config={"displayModeBar": False})
1016
  conf_hdr, conf_dl = st.columns([1, 1])
1017
  with conf_hdr:
1018
  show_conf_data = st.checkbox("View data", key="show_conf")
 
1021
  conf_export.columns = ["message_index", "confidence"]
1022
  csv_download(conf_export, "Download CSV", "confidence_trend.csv")
1023
  if show_conf_data:
1024
+ st.dataframe(conf_export, hide_index=True)
1025
  st.markdown('</div>', unsafe_allow_html=True)
1026
 
1027
 
 
1054
  layout["legend"] = dict(orientation="h", y=1.08, font=dict(size=11))
1055
  layout["xaxis"]["tickformat"] = "%H:%M"
1056
  fig_heat.update_layout(**layout)
1057
+ st.plotly_chart(fig_heat, config={"displayModeBar": False})
1058
 
1059
  heat_hdr, heat_dl = st.columns([1, 1])
1060
  with heat_hdr:
 
1062
  with heat_dl:
1063
  csv_download(heatmap_data.rename(columns={"bucket": "time_bucket"}), "Download CSV", "sentiment_heatmap.csv")
1064
  if show_heat_data:
1065
+ st.dataframe(heatmap_data.rename(columns={"bucket": "time_bucket"}), hide_index=True)
1066
  st.markdown('</div>', unsafe_allow_html=True)
1067
  else:
1068
  st.info("Not enough timestamped data for heatmap yet.")
 
1105
  hovertemplate="<b>%{x}</b><br>Count: %{y}<extra></extra>",
1106
  ))
1107
  fig_topic.update_layout(**plotly_layout(250))
1108
+ st.plotly_chart(fig_topic, config={"displayModeBar": False})
1109
  topic_hdr, topic_dl = st.columns([1, 1])
1110
  with topic_hdr:
1111
  show_topic_data = st.checkbox("View data", key="show_topic")
 
1113
  topic_df = pd.DataFrame({"Topic": TOPIC_LABELS, "Count": [topic_counts[l] for l in TOPIC_LABELS]})
1114
  csv_download(topic_df, "Download CSV", "topic_distribution.csv")
1115
  if show_topic_data:
1116
+ st.dataframe(topic_df, hide_index=True)
1117
  st.markdown('</div>', unsafe_allow_html=True)
1118
 
1119
 
 
1208
  layout_lb["xaxis"]["range"] = [0, 100]
1209
  layout_lb["xaxis"]["ticksuffix"] = "%"
1210
  fig_lb.update_layout(**layout_lb)
1211
+ st.plotly_chart(fig_lb, config={"displayModeBar": False})
1212
 
1213
  contrib_df = pd.DataFrame(contributors)
1214
  csv_download(contrib_df, "Download CSV", "top_contributors.csv")
 
1248
  ).generate_from_frequencies(freq_dict)
1249
 
1250
  st.markdown('<div class="chart-wrap">', unsafe_allow_html=True)
1251
+ st.image(wc.to_array())
1252
  st.markdown('</div>', unsafe_allow_html=True)
1253
 
1254
  top20 = word_freq[:20]
 
1261
  ))
1262
  layout_wf = plotly_layout(180)
1263
  fig_wf.update_layout(**layout_wf)
1264
+ st.plotly_chart(fig_wf, config={"displayModeBar": False})
1265
 
1266
  except ImportError:
1267
  top20 = word_freq[:20]
 
1272
  marker_line_width=0,
1273
  ))
1274
  fig_wf.update_layout(**plotly_layout(200))
1275
+ st.plotly_chart(fig_wf, config={"displayModeBar": False})
1276
  else:
1277
  st.info("Not enough text data yet.")
1278
 
 
1328
  f'Stream {slabel} — {stream["redis_key"]}</span>',
1329
  unsafe_allow_html=True
1330
  )
1331
+ st.plotly_chart(fig, config={"displayModeBar": False})
1332
  st.markdown(
1333
  f'<div style="font-size:0.78rem;color:var(--text-3);margin-bottom:8px;">'
1334
  f'{t} msgs · <span style="color:#22c55e;">{p/t*100:.1f}% pos</span> · '
 
1363
  layout_ov["legend"] = dict(orientation="h", y=1.1, font=dict(size=11))
1364
  layout_ov["yaxis"]["range"] = [0, 100]
1365
  fig_overlay.update_layout(**layout_ov)
1366
+ st.plotly_chart(fig_overlay, config={"displayModeBar": False})
1367
  st.markdown('</div>', unsafe_allow_html=True)
1368
 
1369
  elif len(st.session_state.streams) > 1:
 
1483
  if auto_refresh:
1484
  time.sleep(refresh_rate)
1485
  st.rerun()
1486
+
1487
+
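Review note: this diff drops `width='stretch'` from every call site, which also gives up the stretched layout on Streamlit versions that do accept the argument. An alternative, sketched below under assumptions, is a small wrapper that forwards a keyword only when the target callable declares it. `call_with_supported_kwargs` is a hypothetical helper, and the two `*_button` functions are dummies standing in for widget functions from different library versions. Whether a given Streamlit release exposes `width` on a given widget is not something this sketch verifies.

```python
import inspect


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call `func`, silently dropping keyword arguments its signature does not declare."""
    params = inspect.signature(func).parameters
    # A **kwargs catch-all means the callable accepts any keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(*args, **kwargs)
    supported = {name: value for name, value in kwargs.items() if name in params}
    return func(*args, **supported)


# Dummy widget functions standing in for two library versions.
def old_button(label, key=None):
    return (label, key)


def new_button(label, key=None, width="content"):
    return (label, key, width)


print(call_with_supported_kwargs(old_button, "Start", key="start_0", width="stretch"))
print(call_with_supported_kwargs(new_button, "Start", key="start_0", width="stretch"))
```

The wrapper keeps call sites uniform, at the cost of hiding misspelled keyword names, so it suits a short compatibility window better than permanent use.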
requirements.txt CHANGED
@@ -1,3 +1,4 @@
 
1
  # Core ML
2
  torch>=2.0.0
3
  transformers>=4.38.0
@@ -16,3 +17,8 @@ pandas>=2.0.0
16
  plotly>=5.18.0
17
  wordcloud>=1.9.3
18
  matplotlib>=3.8.0
1
+ <<<<<<< HEAD
2
  # Core ML
3
  torch>=2.0.0
4
  transformers>=4.38.0
 
17
  plotly>=5.18.0
18
  wordcloud>=1.9.3
19
  matplotlib>=3.8.0
20
+ =======
21
+ altair
22
+ pandas
23
+ streamlit
24
+ >>>>>>> 58cbb3ad16a724133db9fe31bffce8783a85648a
src/streamlit_app.py ADDED
@@ -0,0 +1,40 @@
1
+ import altair as alt
2
+ import numpy as np
3
+ import pandas as pd
4
+ import streamlit as st
5
+
6
+ """
7
+ # Welcome to Streamlit!
8
+
9
+ Edit `/streamlit_app.py` to customize this app to your heart's desire :heart:.
10
+ If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community
11
+ forums](https://discuss.streamlit.io).
12
+
13
+ In the meantime, below is an example of what you can do with just a few lines of code:
14
+ """
15
+
16
+ num_points = st.slider("Number of points in spiral", 1, 10000, 1100)
17
+ num_turns = st.slider("Number of turns in spiral", 1, 300, 31)
18
+
19
+ indices = np.linspace(0, 1, num_points)
20
+ theta = 2 * np.pi * num_turns * indices
21
+ radius = indices
22
+
23
+ x = radius * np.cos(theta)
24
+ y = radius * np.sin(theta)
25
+
26
+ df = pd.DataFrame({
27
+ "x": x,
28
+ "y": y,
29
+ "idx": indices,
30
+ "rand": np.random.randn(num_points),
31
+ })
32
+
33
+ st.altair_chart(alt.Chart(df, height=700, width=700)
34
+ .mark_point(filled=True)
35
+ .encode(
36
+ x=alt.X("x", axis=None),
37
+ y=alt.Y("y", axis=None),
38
+ color=alt.Color("idx", legend=None, scale=alt.Scale()),
39
+ size=alt.Size("rand", legend=None, scale=alt.Scale(range=[1, 150])),
40
+ ))