---
title: FinGraph
emoji: 🕸️
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 4.44.0
python_version: 3.10.14
app_file: app.py
pinned: false
---
# FinNode 🕸️
**An AI news knowledge graph platform built on Neo4j GraphRAG**
[![Python](https://img.shields.io/badge/Python-3.10%2B-blue.svg)](https://www.python.org/)
[![Neo4j](https://img.shields.io/badge/Neo4j-AuraDB-blue.svg)](https://neo4j.com/)
[![LangGraph](https://img.shields.io/badge/LangGraph-Pipeline-orange.svg)](https://langchain.com/)
[![Gradio](https://img.shields.io/badge/Gradio-UI-red.svg)](https://gradio.app/)
[![CI](https://github.com/yuje/FinGraph/actions/workflows/ci.yml/badge.svg)](https://github.com/yuje/FinGraph/actions/workflows/ci.yml)
---
## 📝 Report
> [Final proposal and project report (to be updated)]()
## 🎥 Demo Video
> [Service demo video link (to be updated)]()
---
## 1. Project Background and Purpose
AI technology and fintech trends change quickly, and conventional RAG (retrieval-augmented generation) alone struggles to capture the **complex "company-technology-service" relationships** that are scattered across many news articles.
**FinNode** collects AI-related articles from Naver News in real time, extracts entities and relations automatically through a **LangGraph pipeline**, and loads them into a **Neo4j knowledge graph**. On top of that graph it runs combined Vector and Cypher retrieval (GraphRAG). The result is a chatbot that goes beyond plain document search: it can reason about questions such as **"which companies and technology trends are currently most active in financial AI"** and answer with supporting evidence drawn from the graph.
---
## 2. System Architecture
```text
[Naver News]
      │  Selenium crawling
      ▼
[LangGraph Pipeline] (gpt-4o-mini)
  check_ai ──(not AI)──▶ skip
      │ (AI-related)
      ▼
  extract_entities
      │
      ▼
  extract_relations
      │
      ▼
[Neo4j AuraDB]
  Article / Content / AICompany / AITechnology / AIService / AIField / Media
      │
      ▼
[GraphRAG ToolsRetriever] ──▶ gpt-4o final answer generation
      │
      ▼
[Gradio chatbot UI (deployed on Hugging Face Spaces)]
```
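
The routing in the diagram can be sketched in plain Python as follows. This is only an illustration of the control flow, not the project's LangGraph code: the three stage functions are keyword and lookup stubs standing in for the `gpt-4o-mini` calls, and the state keys (`is_ai`, `entities`, `relations`) and the sample names `AcmeBank` and `FraudDetector` are hypothetical.

```python
import re

# Hypothetical stand-ins for LLM output; the real pipeline calls gpt-4o-mini.
AI_KEYWORDS = {"AI", "LLM"}
KNOWN_ENTITIES = {"AcmeBank": "AICompany", "FraudDetector": "AIService"}


def check_ai(state: dict) -> dict:
    """Stage 1: flag the article as AI-related or not (keyword stub)."""
    words = set(re.findall(r"\w+", state["text"]))
    return {**state, "is_ai": bool(AI_KEYWORDS & words)}


def extract_entities(state: dict) -> dict:
    """Stage 2: collect entity names found in the text (lookup stub)."""
    found = [name for name in KNOWN_ENTITIES if name in state["text"]]
    return {**state, "entities": found}


def extract_relations(state: dict) -> dict:
    """Stage 3: link consecutive entities with a placeholder relation."""
    ents = state["entities"]
    relations = [(a, "RELATED_TO", b) for a, b in zip(ents, ents[1:])]
    return {**state, "relations": relations}


def run_pipeline(text: str) -> dict:
    """Run check_ai, then both extraction stages only for AI-related text."""
    state = check_ai({"text": text})
    if not state["is_ai"]:
        return state  # the "skip" branch in the diagram
    return extract_relations(extract_entities(state))


result = run_pipeline("AcmeBank launched an AI service called FraudDetector.")
print(result["relations"])
```

Passing a state dictionary from stage to stage mirrors how a graph-based pipeline threads shared state through its nodes, which keeps each stage easy to test in isolation.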
---
## 3. Key Features
- **Real-time news crawling**: Collects articles from each Naver News category automatically with a headless Selenium browser
- **LangGraph AI pipeline**: Analyzes collected articles in three automated stages (`classification` → `entity extraction` → `relation extraction`)
- **Neo4j knowledge graph loading**: Writes extracted entities (Company, Tech, Service, etc.) and relations to the database in MERGE transactions, so no duplicates are created
- **GraphRAG chatbot**: Natural-language Q&A built on a ToolsRetriever that combines three retrievers
  - `Vector Retriever`: Semantic similarity search over article body chunks
  - `VectorCypher Retriever`: Runs a vector search, then returns the related graph (companies, technologies, services) for the matching articles (best suited to trend analysis)
  - `Text2Cypher Retriever`: Converts natural language to a Cypher query automatically and aggregates data
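
To illustrate how `MERGE` loads data without duplicates, the sketch below builds a parameterized Cypher statement for one extracted relation. The helper name `build_merge_query`, the `name` property key, and the sample values are assumptions for illustration; the project's actual queries may differ. Labels and relationship types cannot be passed as Cypher parameters, so the helper validates them before interpolating them into the query text.

```python
import re

# Only plain identifiers may be interpolated into the query text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_merge_query(src_label: str, rel_type: str, dst_label: str) -> str:
    """Return a Cypher statement that creates both nodes and the
    relationship only if they do not already exist."""
    for ident in (src_label, rel_type, dst_label):
        if not _IDENTIFIER.match(ident):
            raise ValueError(f"unsafe Cypher identifier: {ident!r}")
    return (
        f"MERGE (a:{src_label} {{name: $src}}) "
        f"MERGE (b:{dst_label} {{name: $dst}}) "
        f"MERGE (a)-[:{rel_type}]->(b)"
    )


# Hypothetical example values; node names travel as query parameters.
query = build_merge_query("AICompany", "DEVELOPS", "AIService")
params = {"src": "AcmeBank", "dst": "FraudDetector"}
print(query)
```

With the official `neo4j` Python driver, a statement like this would typically be executed with `session.run(query, params)`. Running it twice leaves the graph unchanged the second time, because each `MERGE` matches the existing pattern instead of creating a new one.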
---
## 4. Tech Stack
- **Language**: Python 3.10
- **AI / LLM**: LangChain, LangGraph, OpenAI (`gpt-4o`, `gpt-4o-mini`, `text-embedding-3-small`)
- **Database**: Neo4j (AuraDB Cloud)
- **Web / Crawling**: Gradio, Selenium, Pandas
- **CI/CD**: GitHub Actions, Hugging Face Spaces
---
## 5. Graph Schema
### Nodes and Relationships
| Type | Items |
|------|-----------|
| **Nodes** | `Article`, `Content`, `AICompany`, `AITechnology`, `AIService`, `AIField`, `Media`, `Category` |
| **Relationships (Edges)** | `HAS_CHUNK`, `PUBLISHED`, `BELONGS_TO`, `MENTIONS`, `DEVELOPS`, `INVESTS_IN`, `PARTNERS_WITH`, `APPLIES`, `USED_IN`, `RELATED_TO` |
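
Because an LLM can return labels that are not in this schema, it may help to filter extracted triples before loading them. The sketch below is one possible check, not the project's code: `filter_valid_relations` is a hypothetical helper, and since this README does not say which relationship connects which node types, it only verifies that each part of a triple appears in the table above.

```python
# Labels and relationship types copied from the schema table.
NODE_LABELS = frozenset({
    "Article", "Content", "AICompany", "AITechnology",
    "AIService", "AIField", "Media", "Category",
})
REL_TYPES = frozenset({
    "HAS_CHUNK", "PUBLISHED", "BELONGS_TO", "MENTIONS", "DEVELOPS",
    "INVESTS_IN", "PARTNERS_WITH", "APPLIES", "USED_IN", "RELATED_TO",
})


def filter_valid_relations(triples):
    """Keep (source_label, relation_type, target_label) triples whose
    three parts all appear in the schema; drop everything else."""
    return [
        (src, rel, dst)
        for src, rel, dst in triples
        if src in NODE_LABELS and rel in REL_TYPES and dst in NODE_LABELS
    ]


# Hypothetical extraction output: the last two triples use unknown names.
candidates = [
    ("AICompany", "DEVELOPS", "AIService"),
    ("AICompany", "ACQUIRES", "AICompany"),
    ("Person", "MENTIONS", "Article"),
]
print(filter_valid_relations(candidates))
```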
---
## 6. Installation and Run Guide
### Prerequisites
- Python 3.10+
- A Neo4j AuraDB instance (or a local Neo4j)
- OpenAI API Key
### Running Locally
```bash
# 1. Clone the repository
git clone https://github.com/yuje/FinGraph.git
cd FinGraph
# 2. Create a virtual environment and install dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# 3. Set environment variables
cp .env.example .env
# Fill in the OpenAI key and Neo4j connection details in .env
# 4. Run the Gradio app
python app.py
```
Then open `http://localhost:7860` in a browser.
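
If the app fails at startup, a missing setting in `.env` is a likely cause. The sketch below checks for required variables before launch. The variable names are assumptions (this README does not list them); use the names that actually appear in `.env.example`.

```python
import os

# Assumed variable names; replace with the keys from .env.example.
REQUIRED_VARS = ("OPENAI_API_KEY", "NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing settings:", ", ".join(missing))
else:
    print("All required settings are present.")
```

Note that `os.environ` only sees values that have been exported to the process, so this check is meaningful after the `.env` file has been loaded by whatever mechanism the app uses.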
---
## 7. Deployment (Hugging Face Spaces)
Automatic deployment from GitHub to Hugging Face Spaces is configured in `deploy.yml`.
Each push to the `main` branch syncs the Space automatically.
1. **Issue a Hugging Face token**: Under Settings → Tokens, create a token with Write permission
2. **Register GitHub Secrets**: Add `HF_TOKEN` and `HF_REPO` (e.g., yuje/FinNode)
3. **Register HF Space Secrets**: Add the same entries as in `.env` (OpenAI and Neo4j keys)