Ruben committed
Commit 3b56477 · 1 Parent(s): 28aa7d9

Fix permission error and add persistent storage fallback

Files changed (3)
  1. .gitignore +47 -0
  2. HF_SPACES_SETUP.md +138 -0
  3. config/database.py +13 -3
.gitignore ADDED
@@ -0,0 +1,47 @@
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ env/
+ venv/
+ ENV/
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+
+ # Database
+ data/
+ *.duckdb
+ *.duckdb.wal
+
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+ *~
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Logs
+ *.log
+
+ # Environment
+ .env
+ .env.local
HF_SPACES_SETUP.md ADDED
@@ -0,0 +1,138 @@
+ # Hugging Face Spaces Setup Guide
+
+ ## Quick Fix for Permission Error
+
+ If you see this error:
+ ```
+ PermissionError: [Errno 13] Permission denied: '/data'
+ ```
+
+ The app will automatically fall back to using a local `data/` directory. **However**, this means data won't persist between Space restarts.
+
+ ## Enable Persistent Storage (Recommended)
+
+ To enable persistent data storage on Hugging Face Spaces:
+
+ ### Option 1: Via Space Settings (Easiest)
+
+ 1. Go to your Space on Hugging Face
+ 2. Click **Settings** tab
+ 3. Scroll to **Storage** section
+ 4. Click **Enable persistent storage**
+ 5. Choose storage size (Free: Small - 20GB is plenty)
+ 6. Click **Save**
+ 7. **Restart your Space** (Factory reboot)
+
+ ### Option 2: Via README.md Header
+
+ Add this to your `README.md` file header (the YAML section at the top):
+
+ ```yaml
+ ---
+ title: Madrid Content Analyzer
+ emoji: 🏛️
+ colorFrom: blue
+ colorTo: green
+ sdk: gradio
+ sdk_version: 4.44.0
+ app_file: app.py
+ pinned: false
+ license: mit
+ storage: small # <- Add this line
+ ---
+ ```
+
+ Then commit and push:
+ ```bash
+ git add README.md
+ git commit -m "Enable persistent storage"
+ git push
+ ```
+
+ ## What Persistent Storage Does
+
+ When enabled:
+ - ✅ `/data` directory becomes available
+ - ✅ DuckDB database persists across Space restarts
+ - ✅ All your analyzed content is saved permanently
+ - ✅ Charts and statistics accumulate over time
+
+ Without it:
+ - ⚠️ Data stored in local `data/` directory
+ - ⚠️ Database is deleted when Space restarts/rebuilds
+ - ⚠️ You'll lose all historical data
+
+ ## Verify It's Working
+
+ After enabling persistent storage and restarting:
+
+ 1. Check the Space logs for:
+ ```
+ ✅ Using persistent storage at /data/madrid.duckdb
+ ```
+
+ 2. If you see:
+ ```
+ ⚠️ Cannot use /data directory
+ ✅ Using local storage at .../data/madrid.duckdb
+ ```
+ Then persistent storage is NOT enabled yet.
+
+ ## Configure API Key
+
+ While you're in Settings, also add your Groq API key:
+
+ 1. Go to **Repository secrets**
+ 2. Click **New secret**
+ 3. Name: `GROQ_API_KEY`
+ 4. Value: Your API key from https://console.groq.com
+ 5. Click **Save**
+
+ ## Complete Setup Checklist
+
+ - [ ] Space created on Hugging Face
+ - [ ] Persistent storage enabled (Settings → Storage)
+ - [ ] Space restarted after enabling storage
+ - [ ] GROQ_API_KEY added to secrets
+ - [ ] Code pushed to Space
+ - [ ] Space building successfully
+ - [ ] Check logs for "Using persistent storage at /data/madrid.duckdb"
+ - [ ] Trigger manual fetch (Settings tab in app)
+ - [ ] Verify data appears in Dashboard
+
+ ## Troubleshooting
+
+ ### "Building" stuck for a long time
+ - Normal for first build (installing dependencies)
+ - Usually takes 2-5 minutes
+ - Check build logs for errors
+
+ ### Space keeps restarting
+ - Check logs for errors
+ - Verify all dependencies in requirements.txt
+ - Make sure GROQ_API_KEY is set (or app will use fallback)
+
+ ### Data disappears after restart
+ - Persistent storage not enabled
+ - Follow steps above to enable it
+
+ ### "No space left on device"
+ - Your database grew too large
+ - Increase storage size in Settings
+ - Or implement data retention policy
+
+ ## Storage Limits
+
+ Free tier persistent storage options:
+ - **Small**: 20GB (recommended for this project)
+ - Can store millions of content items
+ - Monitor size in Settings tab of the app
+
+ ## Cost
+
+ Everything is **FREE**:
+ - ✅ Hugging Face Spaces hosting
+ - ✅ Persistent storage (up to 20GB free)
+ - ✅ Groq API (generous free tier)
+
+ No credit card required! 🎉
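
Review note on HF_SPACES_SETUP.md: the "No space left on device" entry and the "Monitor size" bullet both assume some way of checking how large the database has grown. The commit does not include one, so the following is only a minimal sketch of what such a check might look like. `storage_report` is a hypothetical helper name, and the path passed to it is assumed to be the `DB_PATH` defined in `config/database.py`.

```python
import shutil
from pathlib import Path


def storage_report(db_path):
    """Return the database file size and the free/total space on its volume, in bytes.

    Hypothetical helper for illustration; not part of this commit.
    """
    db_path = Path(db_path)
    # The database file may not exist yet on a freshly started Space.
    db_bytes = db_path.stat().st_size if db_path.exists() else 0
    # disk_usage needs an existing path, so inspect the parent directory.
    usage = shutil.disk_usage(db_path.parent)
    return {
        "db_bytes": db_bytes,
        "free_bytes": usage.free,
        "total_bytes": usage.total,
    }
```

Calling `storage_report(DB_PATH)` from the app's Settings tab would be one way to surface these numbers, though how the app actually displays them is not shown in this diff.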
config/database.py CHANGED
@@ -8,9 +8,19 @@ import os
  from pathlib import Path
 
  # Database path (persistent on HF Spaces!)
- DATA_DIR = Path('/data')
- DATA_DIR.mkdir(exist_ok=True)
- DB_PATH = DATA_DIR / 'madrid.duckdb'
+ # Try /data first (HF Spaces persistent storage), fall back to local directory
+ try:
+     DATA_DIR = Path('/data')
+     DATA_DIR.mkdir(exist_ok=True)
+     DB_PATH = DATA_DIR / 'madrid.duckdb'
+     print(f"✅ Using persistent storage at {DB_PATH}")
+ except (PermissionError, OSError) as e:
+     print(f"⚠️ Cannot use /data directory: {e}")
+     print("Using local directory for database storage")
+     DATA_DIR = Path(__file__).parent.parent / 'data'
+     DATA_DIR.mkdir(exist_ok=True)
+     DB_PATH = DATA_DIR / 'madrid.duckdb'
+     print(f"✅ Using local storage at {DB_PATH}")
 
  # Global connection
  _conn = None
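
Review note on config/database.py: the fallback above only triggers when `mkdir` itself raises. `Path.mkdir(exist_ok=True)` returns without error when the directory already exists, even if the process cannot write into it, so in that case the failure may only surface later when the database file is opened. Also, `PermissionError` is a subclass of `OSError`, so catching `OSError` alone covers both. A minimal sketch of a variant that probes writability before committing to a directory follows; `resolve_data_dir` is a hypothetical name and this is not the code in the commit.

```python
import tempfile
from pathlib import Path


def resolve_data_dir(candidates):
    """Return the first candidate directory that can be created and written to.

    Hypothetical helper for illustration; not part of this commit.
    """
    for candidate in candidates:
        directory = Path(candidate)
        try:
            directory.mkdir(parents=True, exist_ok=True)
            # Probe writability: mkdir(exist_ok=True) succeeds on an existing
            # directory even when the process cannot write into it.
            with tempfile.NamedTemporaryFile(dir=directory):
                pass
            return directory
        except OSError:
            # PermissionError is a subclass of OSError, so one clause covers both.
            continue
    raise RuntimeError("No writable data directory found")
```

Under these assumptions the module could set `DATA_DIR = resolve_data_dir([Path('/data'), Path(__file__).parent.parent / 'data'])` and derive `DB_PATH` from the result, keeping the same two locations the commit already uses.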