Spaces:
Runtime error
Runtime error
File size: 2,996 Bytes
717ee1a 651b4cf 717ee1a 651b4cf 717ee1a 651b4cf 717ee1a 651b4cf | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 | ---
title: DocTalk - Chat With PDF
emoji: ππ¬
colorFrom: indigo
colorTo: pink
sdk: streamlit
sdk_version: "1.35.0"
app_file: app.py
pinned: false
---
# ππ¬ DocTalk - Chat With PDF
An intelligent, completely free-to-run PDF chat application powered by Google's Gemma-2-2b-it model. Optimized for CPU usage on Hugging Face Spaces.
## β¨ Features
### π€ **Core Engine**
* **Model:** Google Gemma-2-2B-IT (Instruction Tuned)
* **Architecture:** Runs entirely locally on CPU (no GPU required)
* **Performance:** Optimized with FAISS for instant vector retrieval
### π― **Key Capabilities**
* β‘ **CPU Optimized** - Runs smoothly on Hugging Face Free Tier
* π€ **Easy Upload** - Simple sidebar PDF upload
* π§ **Smart Context** - Uses `all-MiniLM-L6-v2` for precise semantic search
* π¬ **Memory** - Maintains chat history within the session
* π **Secure** - Handles Hugging Face tokens via environment secrets
## π How to Use
### 1. Set Up Authentication
* This app requires a **Hugging Face Access Token** (Read permissions) to download the Gemma model.
* **For Users:** Enter your token in the app sidebar if prompted (or set it in Space secrets).
### 2. Upload Your PDF
* Navigate to the sidebar
* Click "Browse files" to upload your PDF document
* Click **"π Process Document"**
### 3. Start Chatting!
* Wait for the "β
Ready to chat!" notification
* Type your question in the chat input at the bottom
* Receive concise, context-aware answers from Gemma-2
## π οΈ Technical Stack
* **Frontend**: Streamlit
* **LLM**: google/gemma-2-2b-it
* **Embeddings**: sentence-transformers/all-MiniLM-L6-v2
* **Vector Store**: FAISS (Facebook AI Similarity Search)
* **PDF Processing**: PyPDFLoader
* **Orchestration**: LangChain
## π¦ Installation (Local)
To run this app on your own machine:
https://huggingface.co/spaces/ChiragKaushikCK/Chat_with_PDF
**π Features Breakdown**
FAISS Vector Search
Replaces heavy database lookups with lightweight, in-memory similarity search.
Ensures responses are strictly grounded in your uploaded document.
Pre-loaded Models
The embedding models are cached (@st.cache_resource) to ensure the app feels snappy after the initial cold start.
Gemma-2-2B-IT
Google's latest lightweight open model.
Instruction-tuned for better Q&A performance compared to base models.
Small enough (~2.6B params) to fit in standard RAM.
**β οΈ Limitations**
Speed: Since this runs on CPU, generating long answers may take a few seconds.
Memory: Designed for standard PDFs. Extremely large files (500+ pages) might hit RAM limits on free tiers.
Session: Chat history is cleared if the page is refreshed.
π€ Contributing
Contributions are welcome! Please feel free to submit issues or pull requests to improve the UI or add new features.
π License
MIT License
π Links
Google Gemma Models
LangChain Documentation
Streamlit
<div align="center"> Made with β€οΈ with Streamlit and Gemma model, by Tannu Yadav </div>
|