# Quick Start Guide

## Get Started in 5 Minutes

### Step 1: Install (1 minute)
```bash
setup.bat
```

### Step 2: Set Up the Hugging Face Token (2 minutes)

1. Go to https://huggingface.co/settings/tokens
2. Click "New token" → Name: `party-crawler` → Permission: **Write** → create the token and copy it
3. Open the `.env` file in Notepad and enter:
```
HF_TOKEN=paste_your_copied_token_here

HF_REPO_ID=your_username/minjoo-press-releases
HF_REPO_ID_PPP=your_username/ppp-press-releases
HF_REPO_ID_REBUILDING=your_username/rebuilding-press-releases
HF_REPO_ID_REFORM=your_username/reform-press-releases
HF_REPO_ID_BASIC_INCOME=your_username/basic-income-press-releases
HF_REPO_ID_JINBO=your_username/jinbo-press-releases
```

> **Important**: Replace `your_username` with your actual Hugging Face username!
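
Before running the crawler, it can help to confirm that the `.env` values are actually filled in. The following is a minimal sketch and not part of the project: `parse_env` and `check_env` are hypothetical helpers, the file is assumed to contain plain `KEY=value` lines as in the template above, and every key in the template is treated as required, which may be stricter than the project needs. How the project itself reads `.env` is not shown in this guide.

```python
from pathlib import Path

# Variable names taken from the .env template in this guide.
TEMPLATE_KEYS = [
    "HF_TOKEN",
    "HF_REPO_ID",
    "HF_REPO_ID_PPP",
    "HF_REPO_ID_REBUILDING",
    "HF_REPO_ID_REFORM",
    "HF_REPO_ID_BASIC_INCOME",
    "HF_REPO_ID_JINBO",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_env(values: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for key in TEMPLATE_KEYS:
        value = values.get(key, "")
        if not value:
            problems.append(f"{key} is missing or empty")
        elif key.startswith("HF_REPO_ID") and value.startswith("your_username/"):
            problems.append(f"{key} still uses the placeholder username")
    return problems


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print(".env not found in the current folder")
    else:
        for problem in check_env(parse_env(env_path.read_text(encoding="utf-8"))):
            print(problem)
```

Saved under any file name in the project folder and run with `python`, the script prints one line per problem and prints nothing when every value looks filled in.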



### Step 3: Run (2 minutes)

#### Collect All Parties at Once (recommended)

```bash
python main.py
```



#### Collect a Specific Party Only

```bash
python main.py --party minjoo        # Democratic Party of Korea
python main.py --party ppp           # People Power Party
python main.py --party rebuilding    # Rebuilding Korea Party
python main.py --party reform        # Reform Party
python main.py --party basic_income  # Basic Income Party
python main.py --party jinbo         # Progressive Party
```



#### Specify a Date Range

```bash
python main.py --start-date 2024-01-01
python main.py --party reform --start-date 2024-01-01 --end-date 2024-06-30
```
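
For readers curious how flags like these are typically handled, here is a rough sketch using the standard `argparse` module. It is not the project's actual `main.py`: `parse_date`, `build_parser`, and `parse_args` are hypothetical names, and the check for an inverted date range is an assumption. Only the `--party`, `--start-date`, and `--end-date` flags, the party codes, and the `YYYY-MM-DD` format come from this guide.

```python
import argparse
from datetime import date, datetime

# Party codes listed in this guide; "all" is the documented default.
PARTY_CODES = ["minjoo", "ppp", "rebuilding", "reform", "basic_income", "jinbo", "all"]


def parse_date(text: str) -> date:
    """Convert a YYYY-MM-DD string to a date, with a readable argparse error."""
    try:
        return datetime.strptime(text, "%Y-%m-%d").date()
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected YYYY-MM-DD, got {text!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the three flags shown in this guide."""
    parser = argparse.ArgumentParser(description="Hypothetical CLI sketch")
    parser.add_argument("--party", choices=PARTY_CODES, default="all")
    parser.add_argument("--start-date", type=parse_date, default=None)
    parser.add_argument("--end-date", type=parse_date, default=None)
    return parser


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse argv and reject a start date that falls after the end date."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.start_date and args.end_date and args.start_date > args.end_date:
        parser.error("--start-date must not be after --end-date")
    return args
```

Calling `parse_args(["--party", "reform", "--start-date", "2024-01-01"])` yields a namespace whose `start_date` is a `datetime.date`, which is easier to compare against article dates than a raw string.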

## Done!

Where the data is saved:
- **Local**: the `./data/` folder (CSV, Excel)
- **Hugging Face**: uploaded automatically to each party's repository

## Options Summary

| Command | Description |
|---------|-------------|
| `python main.py` | Incremental update for all 6 parties |
| `python main.py --party [code]` | A specific party only |
| `python main.py --start-date YYYY-MM-DD` | Specify a start date |
| `python unified_scheduler.py` | Run automatically every day (scheduler) |

## Party Codes

| Code | Party |
|------|-------|
| `minjoo` | Democratic Party of Korea (더불어민주당) |
| `ppp` | People Power Party (국민의힘) |
| `rebuilding` | Rebuilding Korea Party (조국혁신당) |
| `reform` | Reform Party (개혁신당) |
| `basic_income` | Basic Income Party (기본소득당) |
| `jinbo` | Progressive Party (진보당) |
| `all` | All parties (default) |
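
Each party code appears to pair with one of the repository variables in the `.env` template. The mapping below is inferred from the variable names and repository names in this guide, not taken from the project source, so treat it as an assumption; `REPO_ENV_VARS` and `repo_env_vars_for` are hypothetical names.

```python
# Party code -> .env variable holding that party's Hugging Face repo ID.
# Inferred from the .env template; the project's real lookup may differ.
REPO_ENV_VARS = {
    "minjoo": "HF_REPO_ID",
    "ppp": "HF_REPO_ID_PPP",
    "rebuilding": "HF_REPO_ID_REBUILDING",
    "reform": "HF_REPO_ID_REFORM",
    "basic_income": "HF_REPO_ID_BASIC_INCOME",
    "jinbo": "HF_REPO_ID_JINBO",
}


def repo_env_vars_for(party: str = "all") -> list[str]:
    """Return the .env variable names relevant to a party code."""
    if party == "all":
        return list(REPO_ENV_VARS.values())
    if party not in REPO_ENV_VARS:
        raise ValueError(f"unknown party code: {party!r}")
    return [REPO_ENV_VARS[party]]
```

This makes it easy to see which `.env` entry to double-check when only one party's upload fails.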

## Troubleshooting

| Problem | Solution |
|---------|----------|
| "HF_TOKEN is not set" (shown in Korean as "HF_TOKEN이 설정되지 않았습니다") | Check `HF_TOKEN` in the `.env` file |
| "Module not found" | Run `setup.bat` again |
| Crawling is slow | Increase `concurrent_requests` to 30 in `crawler_config.json` |
| Only a specific party fails | Run that party on its own with `python main.py --party [code]` to investigate |
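
The `concurrent_requests` change can be made by hand in a text editor, or scripted as below. This is a minimal sketch that assumes `crawler_config.json` is a JSON object with `concurrent_requests` as a top-level key; the file's real layout is not shown in this guide, and `set_concurrency` is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import Optional


def set_concurrency(config_path: Path, value: int = 30) -> Optional[int]:
    """Set concurrent_requests in the config file and return the previous value."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    previous = config.get("concurrent_requests")
    config["concurrent_requests"] = value
    # ensure_ascii=False keeps any Korean text in the file readable.
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return previous


if __name__ == "__main__":
    path = Path("crawler_config.json")
    if path.exists():
        old = set_concurrency(path)
        print(f"concurrent_requests: {old} -> 30")
```

A higher value sends more requests in parallel, so it may also put more load on the party websites being crawled.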

## Help

```bash
python main.py --help
```

Full documentation: `README.md`