title: FinGraph
emoji: 🕸️
colorFrom: indigo
colorTo: indigo
sdk: gradio
sdk_version: 6.14.0
python_version: 3.10.14
app_file: app.py
pinned: false

FinNode 🕸️

An AI news knowledge graph platform built on Neo4j GraphRAG

Python Neo4j LangGraph Gradio CI


📝 Report

Final proposal and project report (to be updated)

🎥 Demo Video

Service demo video link (to be updated)


1. Project Background and Purpose

AI technology and fintech trends change quickly, and conventional RAG (retrieval-augmented generation) alone struggles to capture the complex "company-technology-service" relationships scattered across many news articles.

FinNode collects AI-related articles from Naver News in real time, automatically extracts entities and relations through a LangGraph pipeline, and loads them into a Neo4j knowledge graph. On top of that graph it performs combined Vector and Cypher retrieval (GraphRAG). This lets the chatbot go beyond plain document search: it reasons about questions such as **"which companies and technology trends are currently most active in financial AI"** and answers them with cited evidence.


2. System Architecture

[Naver News]
     │ Selenium crawling
     ▼
[LangGraph Pipeline] (gpt-4o-mini)
  check_ai ──(not AI)──▶ skip
     │ (AI-related)
     ▼
  extract_entities
     │
     ▼
  extract_relations
     │
     ▼
[Neo4j AuraDB]
  Article / Content / AICompany / AITechnology / AIService / AIField / Media
     │
     ▼
[GraphRAG ToolsRetriever] ──▶ gpt-4o generates the final answer
     │
     ▼
[Gradio chatbot UI (deployed on Hugging Face Spaces)]
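The control flow of the pipeline stage in the diagram can be sketched in plain Python. This is a minimal illustration, not the project's LangGraph code: the `ArticleState` fields, the `run_pipeline` helper, and the keyword-based stubs standing in for the gpt-4o-mini calls are all assumptions made for the example. Only the stage names (`check_ai`, `extract_entities`, `extract_relations`) and the skip branch come from the diagram.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class ArticleState:
    """State handed from stage to stage (field names are illustrative)."""
    text: str
    is_ai: bool = False
    entities: List[Tuple[str, str]] = field(default_factory=list)
    relations: List[Tuple[str, str, str]] = field(default_factory=list)


def run_pipeline(
    state: ArticleState,
    check_ai: Callable[[str], bool],
    extract_entities: Callable[[str], list],
    extract_relations: Callable[[str, list], list],
) -> Optional[ArticleState]:
    """Run the three stages in order; return None when the article is skipped."""
    state.is_ai = check_ai(state.text)
    if not state.is_ai:
        return None  # "not AI" branch of the diagram: skip the article
    state.entities = extract_entities(state.text)
    state.relations = extract_relations(state.text, state.entities)
    return state


# Deterministic stand-ins for the LLM calls, for demonstration only.
def stub_check_ai(text: str) -> bool:
    return "AI" in text


def stub_extract_entities(text: str) -> list:
    known = {"AcmeBank": "AICompany", "FraudNet": "AIService"}
    return [(name, label) for name, label in known.items() if name in text]


def stub_extract_relations(text: str, entities: list) -> list:
    names = [name for name, _ in entities]
    if "AcmeBank" in names and "FraudNet" in names:
        return [("AcmeBank", "DEVELOPS", "FraudNet")]
    return []


result = run_pipeline(
    ArticleState("AcmeBank launches FraudNet, an AI service"),
    stub_check_ai, stub_extract_entities, stub_extract_relations,
)
print(result.relations if result else "skipped")
```

In a LangGraph implementation the `if not state.is_ai` check would typically become a conditional edge after the `check_ai` node; the sketch keeps it as an ordinary branch so it runs without any dependencies.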

3. Key Features

  • Real-time news crawling: a headless Selenium browser automatically collects Naver News articles by category.
  • LangGraph AI pipeline: collected articles go through a three-stage automatic analysis (AI relevance check → entity extraction → relation extraction).
  • Neo4j knowledge graph loading: extracted entities (Company, Tech, Service, etc.) and relations are written with MERGE transactions so the database holds no duplicates.
  • GraphRAG chatbot: natural-language Q&A built on a ToolsRetriever that combines three retrievers.
    • Vector Retriever: semantic similarity search over article body chunks.
    • VectorCypher Retriever: runs a vector search, then returns the related graph (companies, technologies, services) of the matching articles; suited to trend analysis.
    • Text2Cypher Retriever: converts natural language to Cypher queries automatically and aggregates data.
  • Context-aware hybrid RAG and intelligent fallback routing:
    • For each query, a GPT-4o-based self-assessment guardrail (_is_context_sufficient) jointly evaluates the knowledge graph information retrieved from Neo4j, the user's question, and the **recent conversation history (the last 3 messages)**.
    • When the retrieved results are sufficient to answer the question, the system runs in GraphRAG mode and gives a fact-based answer that includes specific article sources (URL, title, etc.).
    • When relevant information is missing, when the question falls outside finance/IT news (e.g., math formulas or small talk), or when the question depends on earlier turns, the system takes the history into account and switches to a general-knowledge answer (general mode), which reduces hallucination and keeps responses stable.
  • User experience (UX) and UI polish:
    • Slim chat bubbles: the Gradio Markdown renderer gives inner <p>, <li>, and similar tags unusually large vertical padding and margins. These are reduced (.message padding set to 10px 14px, inner margins tuned) to produce compact, readable chat bubbles.
    • Live progress display: by hooking into the LangGraph conversation stream, the UI shows "🔍 Searching..." and "💡 Generating answer..." in real time, so the wait during multi-step RAG feels shorter to the user.
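The fallback routing and the progress messages described above can be combined in one small generator. The sketch below is a hypothetical illustration, not the project's implementation: `answer_stream`, its parameters, and the stub functions are invented for the example, and the sufficiency check is passed in as a plain callable where the project uses a GPT-4o-based `_is_context_sufficient` guardrail.

```python
from typing import Callable, Iterator, List


def answer_stream(
    question: str,
    history: List[str],
    retrieve: Callable[[str], List[str]],
    is_context_sufficient: Callable[[str, List[str], List[str]], bool],
    generate: Callable[[str, str, List[str], List[str]], str],
) -> Iterator[str]:
    """Yield two progress messages, then the final answer as the last item."""
    recent = history[-3:]  # only the three most recent messages are considered
    yield "Searching..."
    context = retrieve(question)
    # Guardrail: use GraphRAG mode only if the retrieved context can support an answer.
    sufficient = bool(context) and is_context_sufficient(question, context, recent)
    mode = "graphrag" if sufficient else "general"
    yield "Generating answer..."
    yield generate(mode, question, context if sufficient else [], recent)


# Deterministic stand-ins for retrieval and the LLM calls, for demonstration only.
def stub_retrieve(question: str) -> List[str]:
    if "AcmeBank" in question:
        return ["AcmeBank DEVELOPS FraudNet (https://example.com/a1)"]
    return []


def stub_judge(question: str, context: List[str], recent: List[str]) -> bool:
    return any(word in chunk for chunk in context for word in question.split())


def stub_generate(mode: str, question: str, context: List[str], recent: List[str]) -> str:
    return f"[{mode}] " + (context[0] if context else "general-knowledge reply")


for message in answer_stream("What does AcmeBank develop?", [],
                             stub_retrieve, stub_judge, stub_generate):
    print(message)
```

Because the function is a generator, a chat UI can display each yielded string as it arrives and replace it with the next one, which is one way to surface progress while the slower retrieval and generation steps run.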

4. Tech Stack

  • Language: Python 3.10
  • AI / LLM: LangChain, LangGraph, OpenAI (gpt-4o, text-embedding-3-small)
  • Database: Neo4j (AuraDB Cloud)
  • Web / Crawling: Gradio, Selenium, Pandas
  • CI/CD: GitHub Actions, Hugging Face Spaces

5. Graph Schema

Nodes and Relationships

Category | Contents
Nodes | Article, Content, AICompany, AITechnology, AIService, AIField, Media, Category
Relationships (Edges) | HAS_CHUNK, PUBLISHED, BELONGS_TO, MENTIONS, DEVELOPS, INVESTS_IN, PARTNERS_WITH, APPLIES, USED_IN, RELATED_TO
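As an illustration of how this schema pairs with duplicate-free MERGE loading, the sketch below builds a parameterized Cypher statement for one extracted triple. The `build_merge` helper and the `name` property used as the merge key are assumptions made for the example; the label and relationship names are taken from the table. Labels and relationship types are interpolated into the query text rather than passed as parameters, so the sketch checks them against the schema first.

```python
NODE_LABELS = {
    "Article", "Content", "AICompany", "AITechnology",
    "AIService", "AIField", "Media", "Category",
}
REL_TYPES = {
    "HAS_CHUNK", "PUBLISHED", "BELONGS_TO", "MENTIONS", "DEVELOPS",
    "INVESTS_IN", "PARTNERS_WITH", "APPLIES", "USED_IN", "RELATED_TO",
}


def build_merge(src_label, src_name, rel_type, dst_label, dst_name):
    """Return (cypher, params) that upserts two nodes and one relationship."""
    # Reject anything outside the schema before it reaches the query text.
    for label in (src_label, dst_label):
        if label not in NODE_LABELS:
            raise ValueError(f"unknown node label: {label}")
    if rel_type not in REL_TYPES:
        raise ValueError(f"unknown relationship type: {rel_type}")
    # MERGE matches an existing node or relationship, or creates it if absent.
    # The `name` property as the merge key is an assumption of this sketch.
    cypher = (
        f"MERGE (a:{src_label} {{name: $src}}) "
        f"MERGE (b:{dst_label} {{name: $dst}}) "
        f"MERGE (a)-[:{rel_type}]->(b)"
    )
    return cypher, {"src": src_name, "dst": dst_name}


query, params = build_merge("AICompany", "AcmeBank", "DEVELOPS", "AIService", "FraudNet")
print(query)
print(params)
```

The returned pair can be handed to a Neo4j driver session as the query text and its parameters. Running the same statement twice leaves the graph unchanged, which is the property the MERGE-based loading relies on.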

6. Installation and Usage Guide

Prerequisites

  • Python 3.10+
  • A Neo4j AuraDB instance (or a local Neo4j)
  • OpenAI API Key

Running Locally

# 1. Clone the repository
git clone https://github.com/yuje/FinGraph.git
cd FinGraph

# 2. Create a virtual environment and install dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# 3. Set environment variables
cp .env.example .env
# Fill in the OpenAI key and Neo4j connection details in the .env file

# 4. Run the Gradio app
python app.py

Open http://localhost:7860 in a browser.
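Before launching the app, it can help to confirm that the settings from step 3 are actually present in the environment. The check below is a hypothetical sketch: the variable names are assumptions, since the contents of .env.example are not shown here, and should be replaced with the names that file actually defines.

```python
import os

# Assumed variable names; replace them with the ones listed in .env.example.
REQUIRED_VARS = ("OPENAI_API_KEY", "NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD")


def missing_settings(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_settings()
if missing:
    print("Missing settings:", ", ".join(missing))
else:
    print("All required settings are present.")
```

Note that the check reads process environment variables only. If the app loads .env itself at startup, export the values in the shell first or pass a dict of the parsed values as `env`.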


7. Deployment (Hugging Face Spaces)

Automatic deployment from GitHub to Hugging Face Spaces is configured through deploy.yml. Every push to the main branch is synced automatically.

  1. Issue a Hugging Face token: create a token with Write permission under Settings → Tokens.
  2. Register GitHub Secrets: add HF_TOKEN and HF_REPO (e.g., yuje/FinNode).
  3. Register HF Space Secrets: add the same entries as in .env (OpenAI and Neo4j keys).

8. References and Open-Source Credits