Self-Employed Business Early Warning AI v2.0

Project Structure

early_warning_ai_v2/
│
├── README.md                    # Main guide
├── CHANGELOG_V2.md              # V2.0 improvements
├── requirements.txt             # Dependencies
├── LICENSE                      # MIT License
├── .gitignore                   # Files excluded from Git
│
├── data/                        # Data folder
│   ├── README.md                # Data preparation guide
│   ├── raw/                     # Put the CSV files here
│   │   └── .gitkeep
│   └── processed/               # (auto-generated)
│
├── models/                      # Trained models (auto-generated)
│
├── src/                         # Source code
│   ├── README.md                # Code documentation
│   ├── predictor.py             # Prediction class (Hugging Face style)
│   ├── feature_engineering.py   # Feature generation (47 features)
│   └── train.py                 # Training script
│
└── notebooks/                   # Jupyter notebooks
    └── train_model.ipynb        # Full training walkthrough

Key Features

1. Clean structure

  • Essential files only: just the code and documents that are actually needed
  • Clear directories: the purpose of each folder is obvious
  • Detailed guides: README.md files in the project root, data/, and src/

2. Practical design

  • Data separation: just drop the CSV files into data/raw/
  • Modular: each function lives in its own file
  • Extensible: new features or models are easy to add

3. Complete documentation

  • README.md: overview of the whole project
  • CHANGELOG_V2.md: detailed V2.0 improvements
  • src/README.md: source code explanation and how to modify it
  • data/README.md: data preparation guide

Quick Start

1. Installation

cd early_warning_ai_v2
pip install -r requirements.txt

2. Data preparation

Place the following 3 CSV files in the data/raw/ folder:

  • big_data_set1_f.csv
  • ds2_monthly_usage.csv
  • ds3_monthly_customers.csv

3. Training

# With the Jupyter notebook
jupyter notebook notebooks/train_model.ipynb

# Or with the script
python src/train.py

4. Prediction

from src.predictor import EarlyWarningPredictor

model = EarlyWarningPredictor.from_pretrained("models/")
result = model.predict(store_data)

Key File Descriptions

README.md

  • Overview of the whole project
  • Quick start guide
  • Usage instructions
  • Project structure

CHANGELOG_V2.md

  • All improvements from V1.0 → V2.0
  • Performance comparison table
  • Real improvement cases
  • Structural changes

src/predictor.py

  • Hugging Face style API
  • from_pretrained() method
  • Single and batch prediction
  • Risk factor analysis

src/feature_engineering.py

  • Automatic generation of 47 features
  • Sales, customers, operations, trend, volatility, seasonality
  • Extensible design

src/train.py

  • Full training pipeline
  • Command-line interface
  • Class imbalance handling with SMOTE
  • Automatic evaluation and saving

notebooks/train_model.ipynb

  • Visualization of the full training process
  • EDA (exploratory data analysis)
  • Step-by-step explanations
  • Performance evaluation and analysis

Changing the Data

Training on new data

Step 1: Place the 3 CSV files in data/raw/

Step 2: Run training

python src/train.py

Step 3: Check the generated models

ls models/
# xgboost_model.pkl, lightgbm_model.pkl, config.json, etc.

Parameter Tuning

Changing the prediction threshold

# threshold is an argument of predict() in src/predictor.py
result = model.predict(store_data, threshold=0.3)  # lower value = more sensitive
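
To see why a lower threshold makes the model more sensitive, the sketch below applies two thresholds to the same probabilities. It is an illustration only: flag_at_risk is a hypothetical helper, the probabilities are made up, and the default of 0.5 is an assumption rather than the documented default of predict().

```python
def flag_at_risk(probabilities, threshold=0.5):
    """Label a store 1 (at risk) when its closure probability meets the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


# Made-up closure probabilities for five stores.
probs = [0.10, 0.32, 0.45, 0.58, 0.91]

default_flags = flag_at_risk(probs)         # two stores flagged
sensitive_flags = flag_at_risk(probs, 0.3)  # four stores flagged
```

Lowering the threshold catches more stores that may close (higher recall) at the cost of more false alarms (lower precision).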

Changing the ensemble weights

In models/config.json (the first weight is XGBoost at 60%, the second is LightGBM at 40%; JSON does not allow comments, so keep the file free of them):
{
    "ensemble_weights": [0.6, 0.4]
}
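
As a rough sketch of how such weights are typically applied, the code below blends two models' probabilities with a weighted average. The blend function is hypothetical and the input probabilities are made up; the actual combination logic lives in src/predictor.py and may differ.

```python
import json

# Same content as the config example above (XGBoost first, LightGBM second).
config = json.loads('{"ensemble_weights": [0.6, 0.4]}')


def blend(xgb_probs, lgbm_probs, weights):
    """Weighted average of two models' probabilities, normalised by the weight sum."""
    w_xgb, w_lgbm = weights
    total = w_xgb + w_lgbm
    if total <= 0:
        raise ValueError("ensemble weights must sum to a positive number")
    return [
        (w_xgb * px + w_lgbm * pl) / total
        for px, pl in zip(xgb_probs, lgbm_probs)
    ]


# Made-up probabilities from each model for two stores.
blended = blend([0.80, 0.20], [0.60, 0.40], config["ensemble_weights"])
```

Dividing by the weight sum keeps the result a valid probability even if the configured weights do not add up to exactly 1.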

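Adding features

# Add to the FeatureEngineer class in src/feature_engineering.py
def _create_custom_features(self, df):
    features = {}
    # Add new features here ('col1' and 'col2' are placeholder column names)
    # Replace zeros in the denominator with NaN so the ratio never becomes inf
    denominator = df['col2'].replace(0, float('nan'))
    features['new_metric'] = df['col1'] / denominator
    return features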

V2.0 Key Improvements

1. Richer features (20 → 47)

  • Multi-period trend analysis
  • Seasonal pattern detection
  • Tracking of changes in customer behavior

2. Class imbalance handling

  • SMOTE applied
  • Recall improved by 17.5 percentage points

3. Model optimization

  • XGBoost + LightGBM ensemble
  • Automatic hyperparameter tuning

4. Performance gains

Metric      V1.0    V2.0    Change (percentage points)
Accuracy    94.3%   97.2%   +2.9
Recall      68.2%   85.7%   +17.5
Precision   76.5%   89.3%   +12.8
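
For reference, the three metrics are derived from confusion-matrix counts as sketched below. The counts are made up for illustration and are not the project's evaluation results; only the recall figures in the last line come from the table.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, recall and precision from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Made-up counts for illustration only; these are not the project's results.
metrics = classification_metrics(tp=30, fp=5, fn=10, tn=955)

# The table reports changes as percentage-point differences (V2.0 minus V1.0).
recall_gain_pp = round(85.7 - 68.2, 1)
```

When only a few percent of stores close, a model that never predicts closure still scores very high accuracy, so recall and precision are the more informative columns.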

Where to Find Documentation

  • Full guide: README.md
  • Improvements: CHANGELOG_V2.md
  • Code explanation: src/README.md
  • Data guide: data/README.md
  • Training process: notebooks/train_model.ipynb

Usage Tips

For the first training run

  1. Test with sample data (fast)
  2. Train on the full data (accurate)
  3. Evaluate performance, then tune parameters

When improving performance

  1. Add features: edit feature_engineering.py
  2. Hyperparameters: adjust in train.py
  3. Ensemble weights: edit config.json

When deploying

  1. Copy the entire models/ folder
  2. Use only src/predictor.py
  3. Build an API server (FastAPI recommended)

Cautions

Data preparation

  • All 3 CSV files are required
  • At least 3 months of monthly data
  • A closed-store ratio of 1-5% is recommended

Using the model

  • Predictions are for reference only; consult an expert for actual decisions
  • Periodic retraining is recommended (every 3-6 months)
  • Take differences between business types into account

Contributing and Contact

  • GitHub Issues: bug reports, feature suggestions
  • Pull Requests: code improvements and documentation fixes are welcome

License

MIT License - free to use, modify, and distribute