---
title: ParseAI Document Processor
emoji: 📊
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.30.0
app_file: app.py
pinned: true
---

# ParseAI - Document Processing and Analysis

ParseAI is a tool for processing and analyzing PDF documents. It extracts text from documents, summarizes it, and finds related documents through vector search.

## 🚀 Key Features

- PDF document upload and text extraction
- Document content summarization
- Vector-based document search
- User-friendly Gradio-based web interface

## 🛠️ Tech Stack

- **Backend**: FastAPI
- **Frontend**: Gradio
- **NLP**: NLTK, Hugging Face Transformers
- **Vector Store**: Sentence Transformers
- **Container**: Docker

## 🚀 Running Locally

### Prerequisites

- Docker and Docker Compose
- Python 3.9+

### Environment Variables

Create a `.env` file and set the following variables:

```bash
# Hugging Face Hub configuration
HUGGINGFACE_HUB_TOKEN=your_hf_token_here

# Application configuration
UPLOAD_FOLDER=/app/data/uploads
NLTK_DATA=/app/nltk_data
```

### Running with Docker

1. Build the Docker image:

   ```bash
   docker build -t parseai .
   ```

2. Run the container:

   ```bash
   docker run -d -p 7860:7860 --env-file .env parseai
   ```

3. Open the app in a web browser:

   ```
   http://localhost:7860
   ```

## 🌐 Deploying to Hugging Face Spaces

1. Push this repository to Hugging Face Spaces.
2. In the repository settings, set the following environment variables:
   - `HUGGINGFACE_HUB_TOKEN`: Hugging Face API token
   - `UPLOAD_FOLDER`: `/app/data/uploads`
   - `NLTK_DATA`: `/app/nltk_data`

## 📝 Usage

1. Upload a PDF file in the **Document Upload** tab.
2. Enter keywords in the **Document Search** tab to search for related documents.

## 📊 Status Check

You can check the application status at the following endpoint:

```
GET /health
```

## 📄 License

This project is distributed under the [MIT License](LICENSE).
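
## 🧪 Example: Polling the Health Endpoint

The `GET /health` endpoint mentioned in this README can be polled from a script, for example to wait until the container is ready. The snippet below is a minimal sketch using only the Python standard library. It assumes that `/health` is served on the same host and port as the web interface (`http://localhost:7860`) and that a healthy application answers with HTTP 200; neither detail is confirmed by this README. The helper names `health_url` and `is_healthy` are hypothetical and not part of ParseAI.

```python
import urllib.error
import urllib.request


def health_url(base_url: str) -> str:
    """Join the base URL and the /health path without doubling slashes."""
    return base_url.rstrip("/") + "/health"


def is_healthy(
    base_url: str = "http://localhost:7860",  # assumption: same port as the UI
    timeout: float = 5.0,
    opener=urllib.request.urlopen,  # injectable so the check can be tested offline
) -> bool:
    """Return True if GET /health answers with HTTP 200, False otherwise."""
    try:
        with opener(health_url(base_url), timeout=timeout) as response:
            # Assumption: a healthy app responds with status 200.
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP error.
        return False
```

Calling `is_healthy()` after `docker run` returns `False` until the server accepts connections, so it can be wrapped in a retry loop in a deployment script.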