metadata
title: ParseAI Document Processor
emoji: 📊
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.30.0
app_file: app.py
pinned: true

ParseAI - Document Processing and Analysis

ParseAI is a tool for processing and analyzing PDF documents. It extracts text from documents, summarizes them, and finds related documents through vector search.

🚀 Key Features

  • PDF document upload and text extraction
  • Document content summarization
  • Vector-based document search
  • User-friendly Gradio-based web interface

🛠️ Tech Stack

  • Backend: FastAPI
  • Frontend: Gradio
  • NLP: NLTK, Hugging Face Transformers
  • Vector Store: Sentence Transformers
  • Container: Docker
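
To make the vector search idea concrete, here is a minimal sketch of ranking documents by cosine similarity. It is not ParseAI's actual code: the function names, file names, and 3-dimensional vectors are invented for illustration, and a real setup would presumably obtain the embeddings from a Sentence Transformers model instead of hard-coding them.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs, top_k=3):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (hypothetical); real embeddings have hundreds of dimensions.
docs = {
    "invoice.pdf": [0.9, 0.1, 0.0],
    "contract.pdf": [0.1, 0.9, 0.2],
    "manual.pdf": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

results = rank_documents(query, docs, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

With these toy vectors, invoice.pdf ranks first because its embedding points in nearly the same direction as the query.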

🚀 Running Locally

Prerequisites

  • Docker and Docker Compose
  • Python 3.9+

ํ™˜๊ฒฝ ๋ณ€์ˆ˜ ์„ค์ •

.env ํŒŒ์ผ์„ ์ƒ์„ฑํ•˜๊ณ  ๋‹ค์Œ ๋ณ€์ˆ˜๋“ค์„ ์„ค์ •ํ•˜์„ธ์š”:

# Hugging Face Hub configuration
HUGGINGFACE_HUB_TOKEN=your_hf_token_here

# Application configuration
UPLOAD_FOLDER=/app/data/uploads
NLTK_DATA=/app/nltk_data
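
As an illustration of how an application might consume these variables, the sketch below reads them into a small settings object. It is an assumption-laden example, not ParseAI's code: the `Settings` class and `load_settings` function are hypothetical, the fallback paths simply mirror the values listed above, and treating the token as required is a guess.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three variables from .env."""
    hf_token: str
    upload_folder: str
    nltk_data: str


def load_settings(env=None):
    """Build Settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    token = env.get("HUGGINGFACE_HUB_TOKEN", "")
    # Assumption: the token is mandatory and the placeholder must be replaced.
    if not token or token == "your_hf_token_here":
        raise ValueError("HUGGINGFACE_HUB_TOKEN is missing or still a placeholder")
    return Settings(
        hf_token=token,
        # Fallbacks mirror the example values shown in this README.
        upload_folder=env.get("UPLOAD_FOLDER", "/app/data/uploads"),
        nltk_data=env.get("NLTK_DATA", "/app/nltk_data"),
    )


# Usage with an explicit mapping, so the example does not depend on your shell.
example = load_settings({"HUGGINGFACE_HUB_TOKEN": "hf_example_token"})
print(example.upload_folder)
```

Passing a mapping instead of reading `os.environ` directly keeps the function easy to test. When the container is started with `--env-file .env`, calling `load_settings()` with no arguments would pick up the same values from the process environment.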

Running with Docker

  1. Build the Docker image:

    docker build -t parseai .
    
  2. Run the container:

    docker run -d -p 7860:7860 --env-file .env parseai
    
  3. Open the app in a web browser:

    http://localhost:7860
    

🌐 Deploying to Hugging Face Spaces

  1. Push this repository to Hugging Face Spaces.
  2. In the repository settings, set the following environment variables:
    • HUGGINGFACE_HUB_TOKEN: your Hugging Face API token
    • UPLOAD_FOLDER: /app/data/uploads
    • NLTK_DATA: /app/nltk_data

📝 Usage

  1. Upload a PDF file in the "Document Upload" (문서 업로드) tab.
  2. Enter keywords in the "Document Search" (문서 검색) tab to search for related documents.

📊 Health Check

You can check the application's status at the following endpoint:

GET /health
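
A small script can poll this endpoint, for example after starting the container. The sketch below uses only the Python standard library. It assumes that the endpoint is served on the same port as the web interface (7860) and that an HTTP 200 response means the app is healthy. The README does not specify the response body, so the body is not inspected, and `is_healthy` is a hypothetical helper name.

```python
import urllib.request


def is_healthy(base_url="http://localhost:7860", timeout=5.0):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection failures, timeouts, and HTTP error statuses
        # (urllib's URLError and HTTPError are subclasses of OSError).
        return False


if __name__ == "__main__":
    print("healthy" if is_healthy() else "not reachable or unhealthy")
```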

📄 License

This project is distributed under the MIT License.