Junhoee committed
Commit 51285c9 · verified · 1 Parent(s): 6d0bcdd

Update README.md

Files changed (1)
  1. README.md +46 -31
README.md CHANGED
@@ -9,65 +9,80 @@ app_file: app.py
  pinned: false
  ---

- # Megumin ADK Agent

- This project is a Gradio app that queries the Q/A data in `data/processed/*.json` with a local RAG lookup and answers in the Megumin persona.

- ## Deploying to Hugging Face Spaces

- This repository is organized so that it can be deployed as a Gradio Space on Hugging Face Spaces.

- It requires the following:

- - `app.py` at the repository root
- - `requirements.txt` at the repository root
- - A Gemini API key registered as a Space Secret

- ## Secrets required on Spaces

- In the Hugging Face Spaces settings screen, register one of the following environment variables as a Secret.

- - `GOOGLE_API_KEY`
- - or `GEMINI_API_KEY`

- Recommended:
-
- ```text
- GOOGLE_API_KEY=your_actual_Gemini_API_key
- ```
-
- ## Running locally

  ```bash
  python app_gradio.py
  ```

- Or, using the same entry point as Spaces:

  ```bash
  python app.py
  ```

- ## Changing the model

- The default model is `gemini-2.5-flash-lite`.

- It can be overridden with an environment variable if needed.

- ```bash
- set MEGUMIN_AGENT_MODEL=gemini-2.5-flash-lite
  ```

- ## Dataset conversion

- To convert the original raw txt files into processed JSON:

  ```bash
- python scripts/convert_raw_to_processed.py
  ```

- Generated file:

- ```text
- data/processed/megumin_qa_dataset.json
  ```

  pinned: false
  ---

+ # Megumin Chatbot Spec

+ ## Overview

+ A Gradio-based chatbot that converses in the Megumin persona.
+ It uses a Google ADK Agent and, before answering, retrieves similar examples through RAG to use as a reference for speech style and character background.

+ ## Core components

+ - LLM: `gemini-3.1-flash-lite-preview`
+ - Agent: Google ADK `LlmAgent`
+ - UI: Gradio
+ - Retrieval: Gemini Embedding + FAISS
+ - Session: `InMemorySessionService`

+ ## How it works

+ 1. When a user question arrives, the Agent calls the RAG tool.
+ 2. The question embedding is used to find the top-3 similar examples in FAISS.
+ 3. The Agent replies in the Megumin persona, drawing on the retrieved answer examples together with the current question.
+ 4. In multi-turn conversations, the most recent 6 turns are kept and anything earlier is compressed into a short summary.
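
The history handling in step 4 can be sketched as follows. This is a minimal illustration, not the project's actual code: `compact_history`, `naive_summary`, and the `(user, bot)` tuple shape are hypothetical names chosen for the example, and the summarizer is a placeholder where a real app would more likely ask the LLM.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (user message, bot reply); assumed shape


def compact_history(
    turns: List[Turn],
    summarize: Callable[[List[Turn]], str],
    keep: int = 6,
) -> Tuple[str, List[Turn]]:
    """Split history into (summary of older turns, most recent `keep` turns)."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    if len(turns) <= keep:
        return "", list(turns)  # nothing old enough to compress
    older, recent = turns[:-keep], turns[-keep:]
    return summarize(older), list(recent)


def naive_summary(older: List[Turn], limit: int = 80) -> str:
    # Placeholder only: joins the older user messages and truncates them.
    return " / ".join(user for user, _ in older)[:limit]


history = [(f"q{i}", f"a{i}") for i in range(8)]
summary, recent = compact_history(history, naive_summary)
```

With eight turns, the two oldest are folded into `summary` and the last six are passed along unchanged.
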

+ ## Data layout

+ - `megumin_qa_dataset.json`: the original Q/A dataset
+ - `megumin_questions.faiss`: the question embedding index
+ - `megumin_questions_meta.json`: mapping between index entries and the original records
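
To show how an index row maps back to a record, here is a rough sketch of a top-3 lookup. It deliberately swaps FAISS for a brute-force cosine search in plain Python so it runs without extra packages. The metadata shape (a list of `question`/`answer` dicts in index order) and the tiny 2-dimensional vectors are assumptions made for illustration only.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, index_vecs, meta: List[Dict], k: int = 3) -> List[Dict]:
    """Brute-force stand-in for the FAISS search: score every row, keep the best k."""
    scored = sorted(
        ((cosine(query_vec, vec), row) for row, vec in enumerate(index_vecs)),
        reverse=True,
    )
    # Row i of the index is assumed to correspond to meta[i].
    return [dict(meta[row], score=score) for score, row in scored[:k]]


# Toy data; real embeddings would come from the Gemini embedding model.
index_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
meta = [
    {"question": "explosion magic", "answer": "A0"},
    {"question": "Kazuma", "answer": "A1"},
    {"question": "staff", "answer": "A2"},
    {"question": "eyepatch", "answer": "A3"},
]
hits = top_k([1.0, 0.1], index_vecs, meta)
```

Each hit carries the original record plus its similarity score, which is the information the Agent needs when it consults retrieved examples.
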

+ ## Running

  ```bash
  python app_gradio.py
  ```

+ Spaces entry point:

  ```bash
  python app.py
  ```

+ ## Required environment variables

+ - `GOOGLE_API_KEY`: Gemini API key
+ - `HF_TOKEN`: needed when reading a private dataset repo

+ Recommended:

+ ```env
+ GOOGLE_API_KEY=your_gemini_api_key
  ```
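
One way an app might check these variables at startup is sketched below. `load_settings` and the returned dictionary keys are hypothetical; the sketch only assumes that `GOOGLE_API_KEY` is mandatory and `HF_TOKEN` is optional, as the list above suggests.

```python
import os
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the API key (required) and the Hub token (optional) from the environment."""
    env = os.environ if env is None else env
    api_key = env.get("GOOGLE_API_KEY")
    if not api_key:
        raise RuntimeError("GOOGLE_API_KEY is not set")
    # HF_TOKEN only matters when the dataset repo is private, so it may be absent.
    return {"google_api_key": api_key, "hf_token": env.get("HF_TOKEN") or None}


settings = load_settings({"GOOGLE_API_KEY": "dummy-key"})
```

Failing fast here gives a clearer error than a rejected API call later in the first chat turn.
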

+ ## Building the index

+ To build the FAISS index from the original JSON:

  ```bash
+ python scripts/build_faiss_index.py
  ```

+ ## Namuwiki QA conversion

+ To convert selected Namuwiki documents into QA JSON:
+
+ ```bash
+ python scripts/crawl_namuwiki_to_qa.py --title 메구밍 --title 카즈마
  ```
+
+ Conversion rules:
+
+ - `question`: a subheading-style summary for retrieval, not phrased as a question
+ - `answer`: a neutral summary of roughly 200 characters
+ - chunk overlap: 1 to 2 sentences
+ - tables and images are excluded
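
The chunk overlap rule could look roughly like the sketch below. It is not taken from `crawl_namuwiki_to_qa.py`; `split_sentences`, `chunk_sentences`, and the 200-character budget applied to chunks are assumptions used to illustrate carrying one sentence from each chunk into the next.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    # Naive split on sentence-ending punctuation followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def chunk_sentences(sentences: List[str], max_chars: int = 200, overlap: int = 1) -> List[str]:
    """Pack sentences into chunks of about max_chars, repeating `overlap` sentences."""
    chunks: List[str] = []
    current: List[str] = []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            chunks.append(" ".join(current))
            # Carry the last sentence(s) forward so neighbouring chunks share context.
            current = current[-overlap:] if overlap else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


s1, s2, s3 = "A" * 90 + ".", "B" * 90 + ".", "C" * 90 + "."
chunks = chunk_sentences([s1, s2, s3])
```

Here the second chunk starts with the sentence that ended the first one, which is the overlap the rule describes. A chunk can exceed the budget slightly when the carried sentence and the next one are both long.
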
+
+ ## Deployment notes
+
+ On Hugging Face Spaces, the JSON, FAISS index, and metadata are downloaded from the dataset repo at runtime.
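
A runtime download along these lines might use `hf_hub_download` from `huggingface_hub`. The sketch assumes the three files sit at the root of the dataset repo; `fetch_assets` and the injectable `download` argument are hypothetical, and the repo id must be supplied by the caller.

```python
import os
from typing import Callable, Dict, Optional

ASSET_FILES = (
    "megumin_qa_dataset.json",
    "megumin_questions.faiss",
    "megumin_questions_meta.json",
)


def fetch_assets(
    repo_id: str,
    download: Optional[Callable[..., str]] = None,
    token: Optional[str] = None,
) -> Dict[str, str]:
    """Download each asset from a dataset repo and return {filename: local path}."""
    if download is None:
        # Imported lazily so the sketch can be exercised without the package.
        from huggingface_hub import hf_hub_download as download
    token = token or os.environ.get("HF_TOKEN")  # only needed for private repos
    return {
        name: download(repo_id=repo_id, filename=name, repo_type="dataset", token=token)
        for name in ASSET_FILES
    }
```

Passing a fake `download` callable makes the function easy to check offline, while the default path relies on the Hub client's local cache to avoid repeated downloads.
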