Sulitha committed
Commit 7a7d9e7 · 1 Parent(s): 880916c

Add counter, CSV logging, secret-gated ZIP download

Files changed (2)
  1. README.md +57 -22
  2. app.py +122 -101
README.md CHANGED
@@ -1,30 +1,65 @@
1
- ---
2
- title: Spell Recorder
3
- emoji: ✨
4
- colorFrom: indigo
5
- colorTo: purple
6
- colorBottom: purple
7
- sdk: gradio
8
- app_file: app.py
9
- pinned: false
10
- license: mit
11
- short_description: Collect microphone recordings for six spells
12
- ---
13
 
14
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
15
 
16
- ## Persistence of Recordings
17
 
18
- Recordings created via the UI are written at runtime into the `recordings/` folder inside the Space container. In addition, this app uploads each saved WAV file to MongoDB using GridFS (if configured).
19
 
20
- ### MongoDB configuration
21
 
22
- Set the following Space secrets:
23
- - `MONGO_URI`: your MongoDB connection string (supports `mongodb+srv://`)
24
- - `MONGO_DB` (optional): database name, default `spells`
25
- - `GRIDFS_BUCKET` (optional): GridFS bucket prefix, default `fs`
 
26
 
27
- On submit, each provided spell is saved locally and uploaded to your Mongo database with metadata: `spell`, `username`, `timestamp`, and original `filename`.
28
 
29
- If Mongo is not configured, files are still saved locally under `recordings/`.
30
 
1
+ # Spell Recorder (Gradio)
2
 
3
+ Collect microphone recordings for a small set of Harry Potter spells and save them to disk for training a classifier.
4
 
5
+ Spells collected:
6
+ - Lumos
7
+ - Nox
8
+ - Alohomora
9
+ - Wingardium Leviosa
10
+ - Accio
11
+ - Reparo
12
 
13
+ ## How it works
14
+ - Enter a username (used in filenames; sanitized to safe characters).
15
+ - Record with your microphone (preferred) or upload an audio file for any spell.
16
+ - Click Submit.
17
+ - The app will save any provided recordings to `recordings/` as 16 kHz mono WAVs named: `<spell>_<username>_<timestamp>.wav`.
18
+ - A live counter shows how many spells are selected (recorded/uploaded) before submitting.
19
+ - A CSV log is written to `recordings/log.csv` with columns: `timestamp_ms, session_id, username, spell, filename`.
20
+ - You can prepare a ZIP of this session's saved files if you enter the correct download key.
21
 
22
+ ## Run locally
23
 
24
+ Requirements (see `requirements.txt`):
25
+ - gradio
26
+ - numpy
27
+ - soundfile
28
+ - scipy
29
 
30
+ On Windows PowerShell:
31
 
32
+ ```powershell
33
+ python -m venv .venv
34
+ .\.venv\Scripts\Activate.ps1
35
+ pip install -r requirements.txt
36
+ python app.py
37
+ ```
38
 
39
+ Then open the printed local URL in your browser.
40
+
41
+ ## Deploy on Hugging Face Spaces
42
+ 1. Create a new Space (Gradio) in your account.
43
+ 2. Upload `app.py`, `requirements.txt`, and optionally `README.md`.
44
+ 3. Spaces will auto-build and run the app.
45
+ 4. Recordings will be saved inside the Space's `recordings/` directory. You can download them from the Space files tab or via `git lfs` if you commit them.
46
+
47
+ Notes:
48
+ - Microphone recording is enabled in the browser; no need to upload.
49
+ - If you need more durable storage or collaboration, consider pushing saved WAVs to a dataset repo programmatically.
50
+
51
+ ### Enable password-protected ZIP download
52
+
53
+ Set a Space secret so only people with the key can generate a ZIP of their session files:
54
+
55
+ - Go to your Space → Settings → Variables and secrets
56
+ - Add a secret named `ZIP_DOWNLOAD_KEY` with your chosen value
57
+ - In the app UI, paste that key into the "Download Key" field before clicking "Prepare ZIP"
58
+
59
+ ## Privacy and consent
60
+ - Only collect voices from people who consent to being recorded.
61
+ - Consider informing contributors how their audio will be used and stored.
62
+ - Do not collect sensitive information in the username.
63
+
64
+ ## Why 16 kHz mono?
65
+ Standardizing sample rate and channels simplifies downstream model training and reduces storage.
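
As a rough illustration of the storage claim in the README's last section, the sketch below compares the raw PCM data rate of 16 kHz mono audio with that of 44.1 kHz stereo audio. The 44.1 kHz stereo baseline and the 2-second clip length are assumptions chosen for illustration, not figures from this commit. The 16-bit sample width matches the `PCM_16` subtype that `app.py` passes to `sf.write`. The helper name `pcm_bytes_per_second` is hypothetical.

```python
def pcm_bytes_per_second(sample_rate_hz: int, channels: int, bits_per_sample: int = 16) -> int:
    """Raw PCM data rate in bytes per second, ignoring the small WAV header."""
    return sample_rate_hz * channels * (bits_per_sample // 8)


# Format written by the app: 16 kHz, mono, 16-bit PCM.
target_rate = pcm_bytes_per_second(16_000, channels=1)

# Assumed baseline for comparison: 44.1 kHz, stereo, 16-bit PCM.
baseline_rate = pcm_bytes_per_second(44_100, channels=2)

clip_seconds = 2  # assumed length of one spoken spell
print("16 kHz mono clip bytes:", target_rate * clip_seconds)
print("44.1 kHz stereo clip bytes:", baseline_rate * clip_seconds)
print("size ratio:", baseline_rate / target_rate)
```

Under these assumptions each clip is roughly 5.5 times smaller than the baseline, and every file shares the same shape, so a training pipeline does not need per-file resampling logic.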
app.py CHANGED
@@ -1,19 +1,23 @@
1
  import os
2
- import json
3
  import re
4
  import time
5
  import math
6
- from typing import List, Tuple, Optional, Sequence
7
  import numpy as np
8
  import gradio as gr
9
  import soundfile as sf
10
  from scipy.signal import resample_poly
11
- from pymongo import MongoClient
12
- import gridfs
13
 
14
  # Output directory for saved recordings
15
  OUT_DIR = "recordings"
16
  os.makedirs(OUT_DIR, exist_ok=True)
17
 
18
  # Fixed target sample rate for ML training
19
  TARGET_SR = 16000
@@ -61,9 +65,23 @@ def resample_to_target(audio: np.ndarray, sr: int, target_sr: int = TARGET_SR) -
61
  return resample_poly(audio, up=up, down=down)
62
 
63
 
64
- def save_one_from_path(filepath: Optional[str], spell: str, username: str) -> Optional[str]:
65
  """Load an audio file path (from mic/upload), process to 16k mono, and save.
66
- Returns saved file path or None if no audio provided.
67
  """
68
  if not filepath:
69
  return None
@@ -83,82 +101,7 @@ def save_one_from_path(filepath: Optional[str], spell: str, username: str) -> Op
83
  out_path = os.path.join(OUT_DIR, fname)
84
 
85
  sf.write(out_path, audio, TARGET_SR, subtype="PCM_16")
86
- return out_path
87
-
88
-
89
- def _parse_meta_from_filename(basename: str) -> Tuple[str, str, Optional[int]]:
90
- """Parse (spell_slug, username, timestamp) from `<spell_slug>_<username>_<ts>.wav`."""
91
- name = basename
92
- if name.endswith(".wav"):
93
- name = name[:-4]
94
- parts = name.split("_")
95
- if len(parts) < 3:
96
- return name, "", None
97
- try:
98
- ts = int(parts[-1])
99
- except Exception:
100
- ts = None
101
- username = parts[-2]
102
- spell_slug = "_".join(parts[:-2])
103
- return spell_slug, username, ts
104
-
105
-
106
- def upload_recordings_to_mongo(paths: Sequence[str]) -> Tuple[int, Optional[str]]:
107
- """Upload files to MongoDB using GridFS.
108
-
109
- Env configuration:
110
- - MONGO_URI: connection string (e.g., mongodb+srv://user:pass@cluster/)
111
- - MONGO_DB: database name (default: spells)
112
- - GRIDFS_BUCKET: GridFS bucket prefix (default: fs)
113
- Returns (uploaded_count, error_message).
114
- """
115
- if not paths:
116
- return 0, None
117
- if not (MongoClient and gridfs):
118
- return 0, "pymongo/gridfs not installed."
119
- uri = os.getenv("MONGO_URI")
120
- if not uri:
121
- return 0, "Missing MONGO_URI."
122
- db_name = os.getenv("MONGO_DB", "spells")
123
- bucket = os.getenv("GRIDFS_BUCKET", "fs")
124
-
125
- try:
126
- client = MongoClient(uri, serverSelectionTimeoutMS=5000)
127
- # quick connectivity check
128
- client.admin.command("ping")
129
- db = client[db_name]
130
- fs = gridfs.GridFS(db, collection=bucket)
131
- except Exception as e:
132
- return 0, f"Mongo connect error: {e}"
133
-
134
- uploaded = 0
135
- try:
136
- for p in paths:
137
- if not os.path.isfile(p):
138
- continue
139
- base = os.path.basename(p)
140
- spell_slug, username, ts = _parse_meta_from_filename(base)
141
- with open(p, "rb") as f:
142
- fs.put(
143
- f.read(),
144
- filename=base,
145
- contentType="audio/wav",
146
- metadata={
147
- "spell": spell_slug,
148
- "username": username,
149
- "timestamp": ts,
150
- "path": p,
151
- },
152
- )
153
- uploaded += 1
154
- except Exception as e:
155
- return uploaded, f"Mongo upload error: {e}"
156
- finally:
157
- try:
158
- client.close()
159
- except Exception:
160
- pass
161
- return uploaded, None
162
 
163
 
164
  def submit_recordings(
@@ -169,7 +112,9 @@ def submit_recordings(
169
  wingardium_path: Optional[str],
170
  accio_path: Optional[str],
171
  reparo_path: Optional[str],
172
- ) -> str:
 
 
173
  user = sanitize_username(username)
174
 
175
  pairs: List[Tuple[str, Optional[str]]] = [
@@ -182,41 +127,83 @@ def submit_recordings(
182
  ]
183
 
184
  saved = []
185
- saved_paths: List[str] = []
186
  skipped = []
 
 
187
  for spell, path in pairs:
188
  out = save_one_from_path(path, spell, user)
189
  if out:
190
- saved.append(f"{spell} -> {os.path.basename(out)}")
191
- saved_paths.append(out)
192
  else:
193
  skipped.append(spell)
194
 
195
- lines: List[str] = []
196
  if saved:
197
- lines.append("Saved recordings (local runtime):")
198
  lines += [f"- {s}" for s in saved]
199
  if skipped:
200
  lines.append("")
201
  lines.append("Missing (not provided):")
202
  lines += [f"- {s}" for s in skipped]
203
  if not lines:
204
- return "No audio captured. Please record at least one spell."
205
 
206
- mup, merr = upload_recordings_to_mongo(saved_paths)
207
- lines.append("")
208
- if merr:
209
- lines.append(f"Mongo upload attempted: {mup} succeeded, error: {merr}")
210
- else:
211
- lines.append(f"Mongo upload: {mup} file(s) stored in GridFS.")
212
 
213
- return "\n".join(lines)
214
  def build_ui() -> gr.Blocks:
215
  with gr.Blocks(title="Spell Recorder") as demo:
216
- gr.Markdown("""# Spell Recorder\nRecord any of the listed spells and press Submit. You can use your microphone directly (preferred) or upload a file.\n\nSpells to collect: Lumos, Nox, Alohomora, Wingardium Leviosa, Accio, Reparo.""")
217
 
218
  with gr.Row():
219
- username = gr.Textbox(label="Your Name (for filename)", placeholder="e.g., harry_p", autofocus=True)
220
 
221
  with gr.Row():
222
  with gr.Column():
@@ -228,16 +215,50 @@ def build_ui() -> gr.Blocks:
228
  accio = gr.Audio(label="Accio", sources=["microphone", "upload"], type="filepath")
229
  reparo = gr.Audio(label="Reparo", sources=["microphone", "upload"], type="filepath")
230
 
231
  submit = gr.Button("Submit")
232
  result = gr.Markdown()
233
 
234
  submit.click(
235
  fn=submit_recordings,
236
- inputs=[username, lumos, nox, alohomora, wingardium, accio, reparo],
237
- outputs=[result],
238
  )
239
 
240
- gr.Markdown("""Notes:\n- Files are saved locally in `recordings/` with `<spell>_<username>_<timestamp>.wav`.\n- Files are also uploaded to MongoDB (GridFS) automatically if MONGO_URI is configured.\n- 16 kHz mono WAV ensures consistent model training.\n- You can submit partial sets; only provided spells are saved.""")
241
 
242
  return demo
243
 
 
1
  import os
 
2
  import re
3
  import time
4
  import math
5
+ import csv
6
+ import uuid
7
+ import zipfile
8
+ from typing import List, Tuple, Optional
9
+
10
  import numpy as np
11
  import gradio as gr
12
  import soundfile as sf
13
  from scipy.signal import resample_poly
 
 
14
 
15
  # Output directory for saved recordings
16
  OUT_DIR = "recordings"
17
  os.makedirs(OUT_DIR, exist_ok=True)
18
+ ZIP_DIR = os.path.join(OUT_DIR, "zips")
19
+ os.makedirs(ZIP_DIR, exist_ok=True)
20
+ LOG_CSV = os.path.join(OUT_DIR, "log.csv")
21
 
22
  # Fixed target sample rate for ML training
23
  TARGET_SR = 16000
 
65
  return resample_poly(audio, up=up, down=down)
66
 
67
 
68
+ def ensure_log_header():
69
+ if not os.path.exists(LOG_CSV):
70
+ with open(LOG_CSV, mode="w", newline="", encoding="utf-8") as f:
71
+ writer = csv.writer(f)
72
+ writer.writerow(["timestamp_ms", "session_id", "username", "spell", "filename"]) # header
73
+
74
+
75
+ def log_row(timestamp_ms: int, session_id: str, username: str, spell: str, filename: str) -> None:
76
+ ensure_log_header()
77
+ with open(LOG_CSV, mode="a", newline="", encoding="utf-8") as f:
78
+ writer = csv.writer(f)
79
+ writer.writerow([timestamp_ms, session_id, username, spell, filename])
80
+
81
+
82
+ def save_one_from_path(filepath: Optional[str], spell: str, username: str) -> Optional[Tuple[str, int]]:
83
  """Load an audio file path (from mic/upload), process to 16k mono, and save.
84
+ Returns (saved file path, timestamp_ms) or None if no audio provided.
85
  """
86
  if not filepath:
87
  return None
 
101
  out_path = os.path.join(OUT_DIR, fname)
102
 
103
  sf.write(out_path, audio, TARGET_SR, subtype="PCM_16")
104
+ return out_path, ts
105
 
106
 
107
  def submit_recordings(
 
112
  wingardium_path: Optional[str],
113
  accio_path: Optional[str],
114
  reparo_path: Optional[str],
115
+ session_id: str,
116
+ session_files: List[str],
117
+ ) -> Tuple[str, List[str], int]:
118
  user = sanitize_username(username)
119
 
120
  pairs: List[Tuple[str, Optional[str]]] = [
 
127
  ]
128
 
129
  saved = []
 
130
  skipped = []
131
+ newly_saved_paths: List[str] = []
132
+
133
  for spell, path in pairs:
134
  out = save_one_from_path(path, spell, user)
135
  if out:
136
+ out_path, ts = out
137
+ saved.append(f"{spell} -> {os.path.basename(out_path)}")
138
+ newly_saved_paths.append(out_path)
139
+ # CSV log
140
+ log_row(ts, session_id, user, spell, os.path.basename(out_path))
141
  else:
142
  skipped.append(spell)
143
 
144
+ lines = []
145
  if saved:
146
+ lines.append("Saved recordings:")
147
  lines += [f"- {s}" for s in saved]
148
  if skipped:
149
  lines.append("")
150
  lines.append("Missing (not provided):")
151
  lines += [f"- {s}" for s in skipped]
152
  if not lines:
153
+ return "No audio captured. Please record at least one spell.", session_files, 0
154
+
155
+ # Update session files list
156
+ session_files = list(session_files or []) + newly_saved_paths
157
+ return "\n".join(lines), session_files, len(newly_saved_paths)
158
+
159
+
160
+ def count_selected(
161
+ lumos_path: Optional[str],
162
+ nox_path: Optional[str],
163
+ alohomora_path: Optional[str],
164
+ wingardium_path: Optional[str],
165
+ accio_path: Optional[str],
166
+ reparo_path: Optional[str],
167
+ ) -> str:
168
+ paths = [lumos_path, nox_path, alohomora_path, wingardium_path, accio_path, reparo_path]
169
+ n = sum(1 for p in paths if p)
170
+ return f"Selected: {n}/6"
171
+
172
+
173
+ def prepare_zip(download_key: str, session_files: List[str]) -> Tuple[Optional[str], str]:
174
+ expected = os.getenv("ZIP_DOWNLOAD_KEY", "")
175
+ if not expected:
176
+ return None, "Download disabled: ZIP_DOWNLOAD_KEY not set in environment."
177
+ if (download_key or "").strip() != expected:
178
+ return None, "Invalid key. Please enter the correct download key."
179
+ files = [p for p in (session_files or []) if p and os.path.exists(p)]
180
+ if not files:
181
+ return None, "No files in this session to zip. Submit recordings first."
182
+
183
+ session_id = uuid.uuid4().hex[:8]
184
+ ts = int(time.time() * 1000)
185
+ zip_path = os.path.join(ZIP_DIR, f"submissions_{session_id}_{ts}.zip")
186
+ with zipfile.ZipFile(zip_path, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
187
+ for f in files:
188
+ zf.write(f, arcname=os.path.basename(f))
189
+ return zip_path, f"Prepared ZIP with {len(files)} files."
190
 
191
 
 
192
  def build_ui() -> gr.Blocks:
193
  with gr.Blocks(title="Spell Recorder") as demo:
194
+ gr.Markdown("""
195
+ # Spell Recorder
196
+ Record any of the listed spells and press Submit. You can use your microphone directly (preferred) or upload a file.
197
+
198
+ Spells to collect: Lumos, Nox, Alohomora, Wingardium Leviosa, Accio, Reparo.
199
+ """)
200
+
201
+ # Per-session state
202
+ session_id = gr.State(uuid.uuid4().hex)
203
+ session_files = gr.State([]) # paths saved during this session
204
 
205
  with gr.Row():
206
+ username = gr.Textbox(label="Your Name (for filename)", placeholder="e.g., harry_p" , autofocus=True)
207
 
208
  with gr.Row():
209
  with gr.Column():
 
215
  accio = gr.Audio(label="Accio", sources=["microphone", "upload"], type="filepath")
216
  reparo = gr.Audio(label="Reparo", sources=["microphone", "upload"], type="filepath")
217
 
218
+ with gr.Row():
219
+ selected_counter = gr.Markdown(value="Selected: 0/6")
220
+
221
  submit = gr.Button("Submit")
222
  result = gr.Markdown()
223
+ submitted_count = gr.Number(label="New files saved this submit", value=0)
224
+
225
+ # Download section (password-gated)
226
+ with gr.Row():
227
+ download_key = gr.Textbox(label="Download Key", type="password", placeholder="Enter key to enable ZIP download")
228
+ with gr.Row():
229
+ zip_btn = gr.Button("Prepare ZIP of my session files")
230
+ zip_file = gr.File(label="Download ZIP", interactive=False)
231
+ zip_status = gr.Markdown()
232
 
233
  submit.click(
234
  fn=submit_recordings,
235
+ inputs=[username, lumos, nox, alohomora, wingardium, accio, reparo, session_id, session_files],
236
+ outputs=[result, session_files, submitted_count],
237
+ )
238
+
239
+ # Live counter updates when any audio input changes
240
+ for comp in [lumos, nox, alohomora, wingardium, accio, reparo]:
241
+ comp.change(
242
+ fn=count_selected,
243
+ inputs=[lumos, nox, alohomora, wingardium, accio, reparo],
244
+ outputs=[selected_counter],
245
+ )
246
+
247
+ # Prepare ZIP on demand (password protected)
248
+ zip_btn.click(
249
+ fn=prepare_zip,
250
+ inputs=[download_key, session_files],
251
+ outputs=[zip_file, zip_status],
252
  )
253
 
254
+ gr.Markdown("""
255
+ Notes:
256
+ - Files are saved in the app's `recordings/` folder using: `<spell>_<username>_<timestamp>.wav`.
257
+ - 16 kHz mono WAV is used to make model training consistent.
258
+ - You don't have to record all spells at once—submit whatever you have.
259
+ - A CSV log is kept at `recordings/log.csv` with username, spell, timestamp, filename.
260
+ - To enable ZIP download, set the secret env var `ZIP_DOWNLOAD_KEY` in your Space Settings → Variables & secrets.
261
+ """)
262
 
263
  return demo
264
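
`prepare_zip` in this commit checks the submitted key against `ZIP_DOWNLOAD_KEY` with a plain `!=`. A constant-time comparison is a common hardening step for secret checks, and the following is a minimal sketch of one using the standard library's `hmac.compare_digest`. The helper name `key_matches` is hypothetical and not part of this commit. Stripping whitespace and treating an unset key as a failed match mirror the existing behaviour of `prepare_zip`.

```python
import hmac
from typing import Optional


def key_matches(provided: Optional[str], expected: str) -> bool:
    """Return True only if the provided key equals the expected secret.

    An empty expected secret never matches, mirroring the
    "download disabled" branch in prepare_zip.
    """
    if not expected:
        return False
    candidate = (provided or "").strip()
    # Compare as bytes so non-ASCII keys are handled; compare_digest
    # takes time independent of where the two values first differ.
    return hmac.compare_digest(candidate.encode("utf-8"), expected.encode("utf-8"))


print(key_matches("  open-sesame ", "open-sesame"))
print(key_matches("wrong", "open-sesame"))
```

If adopted, the check `if (download_key or "").strip() != expected:` could become `if not key_matches(download_key, expected):`, leaving the rest of `prepare_zip` unchanged.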