---
title: FinGraph
emoji: 🕸️
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 6.14.0
python_version: 3.10.14
app_file: app.py
pinned: false
---
# FinNode 🕸️
**An AI news knowledge graph platform built on Neo4j GraphRAG**
[![Python](https://img.shields.io/badge/Python-3.10%2B-blue.svg)](https://www.python.org/)
[![Neo4j](https://img.shields.io/badge/Neo4j-AuraDB-blue.svg)](https://neo4j.com/)
[![LangGraph](https://img.shields.io/badge/LangGraph-Pipeline-orange.svg)](https://langchain.com/)
[![Gradio](https://img.shields.io/badge/Gradio-UI-red.svg)](https://gradio.app/)
[![CI](https://github.com/yuje/FinGraph/actions/workflows/ci.yml/badge.svg)](https://github.com/yuje/FinGraph/actions/workflows/ci.yml)
---
## 📝 Report
> [Final proposal and project report (to be updated)]()
## 🎥 Demo Video
> [Service demo video link (to be updated)]()
---
## 1. Project Background and Purpose
AI technology and fintech trends change quickly, and conventional RAG (retrieval-augmented generation) alone struggles to capture the **complex relationships among companies, technologies, and services** that are scattered across many news articles.
**FinNode** collects AI-related articles from Naver News in real time, automatically extracts entities and relations through a **LangGraph pipeline**, and loads them into a **Neo4j knowledge graph**. On top of that graph it runs combined Vector and Cypher retrieval (GraphRAG). The result is a chatbot that goes beyond plain document search: it can reason about questions such as **"which companies and technology trends are currently most active in financial AI"** and answer them with cited article sources.
---
## 2. System Architecture
```text
[Naver News]
│ Selenium crawling
▼
[LangGraph Pipeline] (gpt-4o-mini)
check_ai ──(not AI)──▶ skip
│ (AI-related)
▼
extract_entities
│
▼
extract_relations
│
▼
[Neo4j AuraDB]
Article / Content / AICompany / AITechnology / AIService / AIField / Media
│
▼
[GraphRAG ToolsRetriever] ──▶ gpt-4o final answer generation
│
▼
[Gradio chatbot UI (deployed on Hugging Face Spaces)]
```
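The ingestion branch of the diagram can be sketched in plain Python. This is only an illustration of the control flow, not the project's code: the real pipeline runs as LangGraph nodes backed by `gpt-4o-mini`, while the keyword tables and the `run_pipeline` helper below are hypothetical stand-ins so the branching can run without an API key.

```python
from typing import Optional, TypedDict


class ArticleState(TypedDict, total=False):
    text: str
    is_ai: bool
    entities: list[str]
    relations: list[tuple[str, str, str]]


# Hypothetical stand-ins for the LLM calls (assumption, not from the project).
AI_KEYWORDS = ("AI", "LLM", "GPT")
KNOWN_ENTITIES = {"OpenAI": "AICompany", "GPT-4o": "AITechnology"}


def check_ai(state: ArticleState) -> ArticleState:
    """Stage 1: decide whether the article is AI-related."""
    is_ai = any(keyword in state["text"] for keyword in AI_KEYWORDS)
    return {**state, "is_ai": is_ai}


def extract_entities(state: ArticleState) -> ArticleState:
    """Stage 2: collect the entities mentioned in the article."""
    found = [name for name in KNOWN_ENTITIES if name in state["text"]]
    return {**state, "entities": found}


def extract_relations(state: ArticleState) -> ArticleState:
    """Stage 3: toy rule that links every company to every technology."""
    companies = [e for e in state["entities"] if KNOWN_ENTITIES[e] == "AICompany"]
    techs = [e for e in state["entities"] if KNOWN_ENTITIES[e] == "AITechnology"]
    relations = [(c, "DEVELOPS", t) for c in companies for t in techs]
    return {**state, "relations": relations}


def run_pipeline(text: str) -> Optional[ArticleState]:
    """Run the three stages; return None for the 'skip' branch."""
    state = check_ai({"text": text})
    if not state["is_ai"]:
        return None  # non-AI articles never reach the graph
    return extract_relations(extract_entities(state))
```

In LangGraph the `if not state["is_ai"]` check would typically become a conditional edge after `check_ai`; the plain function call chain here just makes the same decision explicit.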
---
## 3. Key Features
- **Real-time news crawling**: A headless Selenium browser automatically collects Naver News articles by category.
- **LangGraph AI pipeline**: Collected articles go through a three-stage automatic analysis (`classification` → `entity extraction` → `relation extraction`).
- **Neo4j knowledge graph loading**: Extracted entities (Company, Tech, Service, etc.) and relations are written to the database in MERGE transactions, so no duplicates are created.
- **GraphRAG chatbot**: Natural-language Q&A built on a ToolsRetriever that combines three retrievers.
  - `Vector Retriever`: semantic similarity search over article body chunks
  - `VectorCypher Retriever`: vector search followed by retrieval of the matching article's related graph (companies, technologies, services); best suited to trend analysis
  - `Text2Cypher Retriever`: automatic conversion of natural language into Cypher queries, plus data aggregation
- **Context-aware hybrid RAG and intelligent fallback routing**:
  - The knowledge graph results retrieved from Neo4j, the user's question, and the **recent conversation history (the last 3 messages)** are analyzed together by a GPT-4o-based self-check guardrail (`_is_context_sufficient`) that runs on every request.
  - When the retrieved results are sufficient to answer the question, the chatbot runs in `GraphRAG` mode and gives a fact-based answer that includes the specific article sources (URL, title, etc.).
  - When relevant information is missing, when the question falls outside finance/IT news (for example math formulas or small talk), or when the question depends on earlier turns, the chatbot analyzes the history and switches to a general-knowledge answer (`general` mode), which reduces hallucination and keeps responses stable.
- **User experience (UX) and UI polish**:
  - **Slim chat bubbles**: The Gradio Markdown renderer gives inner `<p>` and `<li>` tags unusually large vertical padding and margins. These are reduced (`.message` padding set to `10px 14px`, inner margins tightened) to produce compact, readable chat bubbles.
  - **Live search progress**: By hooking into the LangGraph conversation stream, the UI shows `"🔍 검색 진행 중..."` ("Searching...") and `"💡 답변 생성 중..."` ("Generating answer...") as each step runs, which shortens the perceived wait during multi-step RAG latency.
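The fallback routing described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions: in the project the sufficiency check is a GPT-4o call named `_is_context_sufficient`, whereas `naive_is_sufficient`, `recent_history`, and `route` are hypothetical names, and the word-overlap rule is only a placeholder so the example runs offline.

```python
from typing import Callable

HISTORY_WINDOW = 3  # the README states that the last 3 messages are considered

Judge = Callable[[str, str, list[dict]], bool]


def recent_history(history: list[dict], window: int = HISTORY_WINDOW) -> list[dict]:
    """Keep only the most recent messages for the sufficiency check."""
    return history[-window:]


def naive_is_sufficient(question: str, context: str, recent: list[dict]) -> bool:
    """Placeholder judge: true if any question word of 4+ letters is in the context.

    The `recent` messages are accepted but ignored here; an LLM-based judge
    would read them to resolve follow-up questions.
    """
    words = [w.strip("?.,!").lower() for w in question.split()]
    lowered = context.lower()
    return any(len(w) >= 4 and w in lowered for w in words)


def route(question: str, context: str, history: list[dict], judge: Judge) -> str:
    """Return 'GraphRAG' when retrieved context can answer, else 'general'."""
    recent = recent_history(history)
    if context and judge(question, context, recent):
        return "GraphRAG"
    return "general"
```

Passing the judge in as a callable keeps the routing decision testable without an API key; an LLM-backed judge with the same signature could be swapped in later.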
---
## 4. Tech Stack
- **Language**: Python 3.10
- **AI / LLM**: LangChain, LangGraph, OpenAI (`gpt-4o`, `text-embedding-3-small`)
- **Database**: Neo4j (AuraDB Cloud)
- **Web / Crawling**: Gradio, Selenium, Pandas
- **CI/CD**: GitHub Actions, Hugging Face Spaces
---
## 5. Graph Schema
### Nodes and Relationships
| Type | Labels |
|------|-----------|
| **Nodes** | `Article`, `Content`, `AICompany`, `AITechnology`, `AIService`, `AIField`, `Media`, `Category` |
| **Relationships (Edges)** | `HAS_CHUNK`, `PUBLISHED`, `BELONGS_TO`, `MENTIONS`, `DEVELOPS`, `INVESTS_IN`, `PARTNERS_WITH`, `APPLIES`, `USED_IN`, `RELATED_TO` |
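To show how this schema supports duplicate-free loading with MERGE, here is a small sketch that builds a Cypher statement for one extracted triple. It is an assumption-laden example rather than the project's loader: `build_merge` is a hypothetical helper, and the `name` property used as the merge key is assumed, since the README does not list node properties. Labels and relationship types are interpolated into the query text, so they are checked against the schema above first; the entity values themselves stay as query parameters.

```python
NODE_LABELS = {
    "Article", "Content", "AICompany", "AITechnology",
    "AIService", "AIField", "Media", "Category",
}
REL_TYPES = {
    "HAS_CHUNK", "PUBLISHED", "BELONGS_TO", "MENTIONS", "DEVELOPS",
    "INVESTS_IN", "PARTNERS_WITH", "APPLIES", "USED_IN", "RELATED_TO",
}


def build_merge(src_label: str, rel_type: str, dst_label: str) -> str:
    """Build a MERGE statement for (src)-[rel]->(dst) with $src/$dst parameters."""
    for label in (src_label, dst_label):
        if label not in NODE_LABELS:
            raise ValueError(f"unknown node label: {label}")
    if rel_type not in REL_TYPES:
        raise ValueError(f"unknown relationship type: {rel_type}")
    # MERGE matches an existing node or relationship, or creates it if absent.
    # The 'name' key is an assumed property, not confirmed by the README.
    return (
        f"MERGE (a:{src_label} {{name: $src}})\n"
        f"MERGE (b:{dst_label} {{name: $dst}})\n"
        f"MERGE (a)-[:{rel_type}]->(b)"
    )


query = build_merge("AICompany", "DEVELOPS", "AITechnology")
```

The resulting text could then be sent through the Neo4j Python driver with the values bound as parameters, for example `session.run(query, src="OpenAI", dst="GPT-4o")`. Running the same statement twice leaves a single pair of nodes and a single relationship.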
---
## 6. Installation and Usage Guide
### Prerequisites
- Python 3.10+
- A Neo4j AuraDB instance (or a local Neo4j)
- OpenAI API Key
### Running Locally
```bash
# 1. Clone the repository
git clone https://github.com/yuje/FinGraph.git
cd FinGraph
# 2. Create a virtual environment and install dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# 3. Set environment variables
cp .env.example .env
# Fill in the OpenAI key and Neo4j connection details in .env
# 4. Run the Gradio app
python app.py
```
Then open `http://localhost:7860` in your browser.
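Before launching the app it can help to confirm that `.env` is filled in. The sketch below parses `.env`-style text and reports empty or missing entries. The key names in `REQUIRED_KEYS` are assumptions for illustration only; the actual names are whatever `.env.example` in the repository defines.

```python
# Hypothetical key names; check .env.example for the real ones.
REQUIRED_KEYS = ("OPENAI_API_KEY", "NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


sample = (
    "OPENAI_API_KEY=sk-test\n"
    "# Neo4j connection\n"
    'NEO4J_URI="neo4j+s://example.databases.neo4j.io"\n'
    "NEO4J_USERNAME=\n"
)
print(missing_keys(parse_env(sample)))
```

To check a real file, read it with `pathlib.Path(".env").read_text()` and pass the text to `parse_env`.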
---
## 7. Deployment (Hugging Face Spaces)
Automatic deployment from GitHub to Hugging Face Spaces is configured through `deploy.yml`.
Every push to the `main` branch is synced automatically.
1. **Issue a Hugging Face token**: Under Settings → Tokens, create a token with Write permission.
2. **Register GitHub Secrets**: Add `HF_TOKEN` and `HF_REPO` (e.g. yuje/FinNode).
3. **Register HF Space Secrets**: Add the same entries as in `.env` (OpenAI and Neo4j keys).
---
## 8. References & Open-Source Credits
- **GraphRAG ToolsRetriever**: [graphrag-tools-retriever (GitHub)](https://github.com/gongwon-nayeon/graphrag-tools-retriever)
  - Used as the main reference model for this project's hybrid retrieval and GraphRAG routing (Context-Sufficient Fallback) design.
- **Neo4j and LLM integration in practice**: [WikiDocs Neo4j GraphRAG guide](https://wikidocs.net/340866)