SOY NV AI committed on
Commit 9f9640b · 1 Parent(s): 9c8de54

Add PostgreSQL support and update database configuration for data persistence in Hugging Face Spaces

DATA_PERSISTENCE_SOLUTION.md ADDED
@@ -0,0 +1,156 @@
+# Fixing data persistence on Hugging Face Spaces
+
+## Cause of the problem
+
+Hugging Face Spaces runs on Docker containers, so **all data is lost whenever the container is restarted or updated.**
+
+Data currently stored locally:
+- Database: `instance/finance_analysis.db`
+- Uploaded files: `uploads/` folder
+- Vector DB: `vector_db/` folder
+- Logs: `logs/` folder
+
+## Solutions
+
+### Method 1: Use an external database (recommended)
+
+Using an external database such as PostgreSQL or MySQL keeps the data permanently.
+
+#### PostgreSQL example
+
+1. **Create a free PostgreSQL database on Supabase, Neon, or Railway**
+
+2. **Set the environment variable** (Hugging Face Spaces Settings > Repository secrets):
+   ```
+   DATABASE_URL=postgresql://user:password@host:port/database
+   ```
+
+3. **Add the PostgreSQL driver to requirements.txt**:
+   ```
+   psycopg2-binary
+   ```
+
+4. **Code change** (`app/core/config.py`):
+   ```python
+   SQLALCHEMY_DATABASE_URI: str = os.getenv(
+       'DATABASE_URL',
+       f'sqlite:///{PROJECT_ROOT / "instance" / "finance_analysis.db"}'
+   )
+   ```
+   The code already reads this environment variable, so setting `DATABASE_URL` is enough to switch to PostgreSQL automatically.
+
+### Method 2: Use external storage (for files)
+
+Store uploaded files and the vector DB in external storage.
+
+#### AWS S3 example
+
+1. **Install boto3**:
+   ```
+   pip install boto3
+   ```
+
+2. **Set environment variables**:
+   ```
+   AWS_ACCESS_KEY_ID=your_access_key
+   AWS_SECRET_ACCESS_KEY=your_secret_key
+   AWS_S3_BUCKET=your_bucket_name
+   ```
+
+3. **Code change**: switch the file upload/download logic to S3
+
+#### Google Cloud Storage example
+
+1. **Install google-cloud-storage**:
+   ```
+   pip install google-cloud-storage
+   ```
+
+2. **Set environment variables**:
+   ```
+   GCS_BUCKET_NAME=your_bucket_name
+   GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json
+   ```
+
+### Method 3: Use a Hugging Face Dataset (simple approach)
+
+Data can be stored through the Hugging Face Dataset API.
+
+1. **Install the datasets library**:
+   ```
+   pip install datasets
+   ```
+
+2. **Code example**:
+   ```python
+   from datasets import Dataset, load_dataset
+   import os
+
+   # Save data: `data` maps column names to equal-length lists
+   def save_to_hf_dataset(data, dataset_name):
+       dataset = Dataset.from_dict(data)
+       dataset.push_to_hub(dataset_name, token=os.getenv('HF_TOKEN'))
+
+   # Load data: push_to_hub stores the rows under the "train" split by default
+   def load_from_hf_dataset(dataset_name):
+       dataset = load_dataset(dataset_name, split='train', token=os.getenv('HF_TOKEN'))
+       return dataset.to_dict()
+   ```
+
+### Method 4: Periodic backup system
+
+Use a scheduler to back up the data at regular intervals.
+
+1. **Create a backup script** (`backup_data.py`):
+   ```python
+   import shutil
+   import os
+   from datetime import datetime
+   from huggingface_hub import HfApi
+
+   def backup_to_hf():
+       # Back up the database
+       if os.path.exists('instance/finance_analysis.db'):
+           timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
+           backup_name = f'backup_{timestamp}.db'
+           shutil.copy('instance/finance_analysis.db', backup_name)
+
+           # Upload to the Hugging Face Hub
+           api = HfApi()
+           api.upload_file(
+               path_or_fileobj=backup_name,
+               path_in_repo=f'backups/{backup_name}',
+               repo_id='your-username/your-repo',
+               token=os.getenv('HF_TOKEN')
+           )
+   ```
+
+2. **Scheduler setup**: use GitHub Actions or an external scheduler
+
+## Temporary workaround you can apply right away
+
+### Change the database path through an environment variable
+
+Use the Hugging Face Spaces persistent storage (if available):
+
+```python
+# Edit app/core/config.py
+SQLALCHEMY_DATABASE_URI: str = os.getenv(
+    'DATABASE_URL',
+    f'sqlite:///{os.getenv("HF_HOME", str(PROJECT_ROOT / "instance"))}/finance_analysis.db'
+)
+```
+
+## Recommendations
+
+1. **Production**: Method 1 (external database) + Method 2 (external storage)
+2. **Development/testing**: Method 3 (Hugging Face Dataset) or Method 4 (periodic backup)
+3. **Important data**: always use an external database
+
+## References
+
+- [Hugging Face Spaces documentation](https://huggingface.co/docs/hub/spaces)
+- [Supabase (free PostgreSQL)](https://supabase.com/)
+- [Neon (serverless PostgreSQL)](https://neon.tech/)
+- [Railway (PostgreSQL)](https://railway.app/)
+
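Note on Method 2 in the file above: step 3 of the AWS S3 example only says to switch the upload/download logic to S3. A minimal sketch of what that could look like follows, assuming the `AWS_S3_BUCKET` variable from step 2. The helper names (`make_s3_client`, `upload_to_s3`, `download_from_s3`) and the `uploads/` key prefix are hypothetical and not part of this commit. `boto3.client("s3")` reads `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` from the environment on its own.

```python
import os
from pathlib import Path


def make_s3_client():
    """Create a boto3 S3 client; credentials come from the AWS_* environment variables."""
    import boto3  # imported lazily so the helpers below can be exercised with a stub client
    return boto3.client("s3")


def upload_to_s3(client, local_path, bucket=None, prefix="uploads/"):
    """Upload one local file and return the object key it was stored under."""
    bucket = bucket or os.environ["AWS_S3_BUCKET"]
    key = prefix + Path(local_path).name
    client.upload_file(str(local_path), bucket, key)  # arguments: Filename, Bucket, Key
    return key


def download_from_s3(client, key, dest_dir, bucket=None):
    """Download one object into dest_dir and return the local path."""
    bucket = bucket or os.environ["AWS_S3_BUCKET"]
    dest = Path(dest_dir) / Path(key).name
    dest.parent.mkdir(parents=True, exist_ok=True)
    client.download_file(bucket, key, str(dest))  # arguments: Bucket, Key, Filename
    return dest
```

An upload handler could call `upload_to_s3(make_s3_client(), saved_path)` right after writing the file locally and store the returned key in the database instead of the local path.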
HUGGINGFACE_DEPLOY.md CHANGED
@@ -43,10 +43,17 @@ In Settings > Repository secrets of Hugging Face Spaces, the following environment variables
 
 #### Required environment variables
 - `SECRET_KEY`: Flask secret key (generate a random string)
-- `GEMINI_API_KEY`: Google Gemini API key (when using Gemini)
+
+#### Database environment variables (recommended: external database)
+- `DATABASE_URL`: database connection URL
+  - **PostgreSQL (recommended)**: `postgresql://user:password@host:port/database`
+    - Free PostgreSQL providers: [Supabase](https://supabase.com/), [Neon](https://neon.tech/), [Railway](https://railway.app/)
+  - **SQLite (default)**: used automatically when `DATABASE_URL` is not set
+    - ⚠️ **Caution**: Hugging Face Spaces is container-based, so with SQLite the data is lost on every restart.
+    - If you need permanent storage, use an external PostgreSQL database.
 
 #### Optional environment variables
-- `DATABASE_URL`: database URL (default: SQLite)
+- `GEMINI_API_KEY`: Google Gemini API key (when using Gemini)
 - `OLLAMA_BASE_URL`: Ollama server URL (default: http://localhost:11434)
 - `EMBEDDING_MODEL_NAME`: embedding model name
 - `RERANKER_MODEL_NAME`: reranker model name
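The `SECRET_KEY` entry asks for a random string without saying how to produce one. One possible way, sketched with the standard-library `secrets` module (the helper name `generate_secret_key` is hypothetical and not part of this commit):

```python
import secrets


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return a random hex string suitable as a Flask SECRET_KEY."""
    return secrets.token_hex(num_bytes)  # two hex characters per byte


if __name__ == "__main__":
    # Run once locally and paste the output into the Space's repository secrets.
    print(generate_secret_key())
```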
README.md CHANGED
@@ -35,9 +35,15 @@ Set the following environment variables in Settings > Repository secrets:
 ### Required
 - `SECRET_KEY`: Flask secret key (random string)
 
+### Database (recommended: use an external database)
+- `DATABASE_URL`: database connection URL
+  - **PostgreSQL (recommended)**: `postgresql://user:password@host:port/database`
+    - Free PostgreSQL providers: [Supabase](https://supabase.com/), [Neon](https://neon.tech/), [Railway](https://railway.app/)
+  - **SQLite (default)**: used automatically when `DATABASE_URL` is not set (⚠️ data is not stored permanently)
+  - **MySQL**: `mysql://user:password@host:port/database`
+
 ### Optional
 - `GEMINI_API_KEY`: Google Gemini API key
-- `DATABASE_URL`: database URL (default: SQLite)
 - `OLLAMA_BASE_URL`: Ollama server URL
 - `HUGGINGFACE_HUB_TOKEN`: Hugging Face token
 
app/core/config.py CHANGED
@@ -19,10 +19,17 @@ class Config:
 
     # Flask settings
     SECRET_KEY: str = os.getenv('SECRET_KEY', 'dev-secret-key-change-in-production')
-    SQLALCHEMY_DATABASE_URI: str = os.getenv(
-        'DATABASE_URL',
-        f'sqlite:///{PROJECT_ROOT / "instance" / "finance_analysis.db"}'
-    )
+
+    # Database URI settings
+    # PostgreSQL: postgresql://user:password@host:port/database
+    # SQLite (default): sqlite:///instance/finance_analysis.db
+    _database_url = os.getenv('DATABASE_URL', '')
+    if _database_url:
+        # Use DATABASE_URL when it is provided (PostgreSQL, MySQL, etc.)
+        SQLALCHEMY_DATABASE_URI: str = _database_url
+    else:
+        # Default: use SQLite
+        SQLALCHEMY_DATABASE_URI: str = f'sqlite:///{PROJECT_ROOT / "instance" / "finance_analysis.db"}'
     SQLALCHEMY_TRACK_MODIFICATIONS: bool = False
     MAX_CONTENT_LENGTH: int = 100 * 1024 * 1024  # 100MB
 
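One detail the `DATABASE_URL` branch in this file passes through unchanged: some providers still hand out connection strings that start with `postgres://`, while SQLAlchemy 1.4 and later only accept the `postgresql://` scheme. A minimal sketch of a normalisation step that could run before the value is assigned to `SQLALCHEMY_DATABASE_URI`; the function name `normalize_database_url` and its `sqlite_fallback` parameter are hypothetical and not part of this commit.

```python
import os


def normalize_database_url(raw_url: str, sqlite_fallback: str) -> str:
    """Return a SQLAlchemy-compatible URL, or the SQLite fallback if raw_url is blank."""
    url = (raw_url or "").strip()
    if not url:
        return sqlite_fallback
    if url.startswith("postgres://"):
        # Rewrite only the scheme; credentials, host, and path are left untouched.
        url = "postgresql://" + url[len("postgres://"):]
    return url


if __name__ == "__main__":
    print(normalize_database_url(
        os.getenv("DATABASE_URL", ""),
        "sqlite:///instance/finance_analysis.db",
    ))
```

Stripping whitespace before the emptiness check also means a secret saved as a blank string falls back to SQLite instead of reaching SQLAlchemy as an invalid URL.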
requirements.txt CHANGED
@@ -10,6 +10,8 @@ numpy==1.24.3
 google-generativeai==0.3.2
 pydantic==2.5.0
 pydantic-settings==2.1.0
+# Database drivers
+psycopg2-binary  # PostgreSQL support for external database
 # The package below is usually needed to connect Ollama with Python.
 ollama
 
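This hunk adds only `psycopg2-binary`, which covers `postgresql://` URLs; the `mysql://` form listed in README.md depends on a MySQL driver being installed separately. A minimal sketch of a startup check that reports a missing driver before the first query fails, assuming SQLAlchemy's default driver module for each scheme (`psycopg2` for `postgresql`, `MySQLdb` for `mysql`). The names `DEFAULT_DRIVERS` and `missing_driver` are hypothetical and not part of this commit.

```python
from importlib.util import find_spec
from typing import Optional
from urllib.parse import urlsplit

# URL scheme -> module that SQLAlchemy imports by default for that backend.
DEFAULT_DRIVERS = {
    "postgresql": "psycopg2",
    "mysql": "MySQLdb",
    "sqlite": "sqlite3",
}


def missing_driver(database_url: str) -> Optional[str]:
    """Return the default driver module that is not installed, or None if nothing is missing."""
    scheme = urlsplit(database_url).scheme
    if "+" in scheme:
        return None  # explicit driver such as "mysql+pymysql"; not covered by this sketch
    module = DEFAULT_DRIVERS.get(scheme)
    if module is None:
        return None  # unknown scheme: leave the decision to SQLAlchemy
    return None if find_spec(module) is not None else module
```

A caller could log or raise when `missing_driver(url)` returns a module name, which points at the package to add to `requirements.txt`.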