---
title: DocTalk - Chat With PDF
emoji: 📄💬
colorFrom: indigo
colorTo: pink
sdk: streamlit
sdk_version: 1.35.0
app_file: app.py
pinned: false
---
# 📄💬 DocTalk - Chat With PDF

An intelligent, completely free-to-run PDF chat application powered by Google's Gemma-2-2b-it model, optimized for CPU usage on Hugging Face Spaces.
## ✨ Features

### 🤖 Core Engine

- Model: Google Gemma-2-2B-IT (instruction-tuned)
- Architecture: Runs entirely locally on CPU (no GPU required)
- Performance: FAISS-backed vector retrieval for fast semantic lookups
### 🎯 Key Capabilities

- ⚡ CPU Optimized - Runs smoothly on the Hugging Face free tier
- 📤 Easy Upload - Simple sidebar PDF upload
- 🧠 Smart Context - Uses all-MiniLM-L6-v2 for precise semantic search
- 💬 Memory - Maintains chat history within the session
- 🔒 Secure - Handles Hugging Face tokens via environment secrets
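Under the hood, "smart context" means comparing the question's embedding against each chunk's embedding by cosine similarity and keeping the closest chunks. A library-free sketch with toy 3-dimensional vectors standing in for real all-MiniLM-L6-v2 embeddings (which are 384-dimensional):

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|); in the app, vectors come from all-MiniLM-L6-v2
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" for three chunks and a query (illustrative values only)
chunks = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.2],
    "contact info": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

# Rank chunks by similarity to the query, highest first
ranked = sorted(chunks, key=lambda name: cosine_similarity(query, chunks[name]),
                reverse=True)
print(ranked[0])  # the chunk most relevant to the query
```

FAISS does the same ranking, but with an index that stays fast as the number of chunks grows.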
## 🚀 How to Use

1. **Set Up Authentication**
   - This app requires a Hugging Face access token (read permission) to download the Gemma model.
   - For users: enter your token in the app sidebar if prompted, or set it in the Space secrets.
2. **Upload Your PDF**
   - Navigate to the sidebar
   - Click "Browse files" to upload your PDF document
   - Click "🚀 Process Document"
3. **Start Chatting!**
   - Wait for the "✅ Ready to chat!" notification
   - Type your question in the chat input at the bottom
   - Receive concise, context-aware answers from Gemma-2
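"Process Document" essentially splits the extracted PDF text into overlapping chunks before they are embedded and indexed. A minimal illustration of that splitting step (the chunk size and overlap here are illustrative, not the app's actual settings):

```python
def split_text(text, chunk_size=20, overlap=5):
    # Slide a window of `chunk_size` characters, stepping back `overlap`
    # characters each time so context at chunk boundaries is preserved.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "DocTalk splits PDFs into overlapping chunks for retrieval."
chunks = split_text(doc)
print(len(chunks))
```

The app uses LangChain's text splitters for this, which additionally try to break on sentence and paragraph boundaries rather than at a fixed character count.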
## 🛠️ Technical Stack

- Frontend: Streamlit
- LLM: google/gemma-2-2b-it
- Embeddings: sentence-transformers/all-MiniLM-L6-v2
- Vector Store: FAISS (Facebook AI Similarity Search)
- PDF Processing: PyPDFLoader
- Orchestration: LangChain
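The orchestration step ultimately reduces to stuffing the retrieved chunks into a prompt for the LLM. A library-free sketch of that assembly (the template wording is an assumption, not the app's exact prompt):

```python
def build_prompt(question, retrieved_chunks):
    # Concatenate retrieved chunks into a context block, then instruct the
    # model to answer only from that context (RAG-style grounding).
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    "What is the refund window?",
    ["Refunds are accepted within 30 days.", "Items must be unused."],
)
print(prompt)
```

LangChain wraps this pattern in its prompt templates and retrieval chains, but the idea is exactly this string assembly.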
## 📦 Installation (Local)

To run this app on your own machine, start from the Space repository:

https://huggingface.co/spaces/ChiragKaushikCK/Chat_with_PDF
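A typical local setup looks like the following. The repository URL comes from the Space above; the `requirements.txt` filename is the usual Spaces convention and is assumed here, as is the `HF_TOKEN` variable name — check `app.py` for the exact names it reads.

```shell
# Clone the Space repository (it is a plain git repo)
git clone https://huggingface.co/spaces/ChiragKaushikCK/Chat_with_PDF
cd Chat_with_PDF

# Install dependencies into a fresh virtual environment
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Provide your Hugging Face read token, then launch the app
export HF_TOKEN=<your-read-token>
streamlit run app.py
```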
## 🌟 Features Breakdown

### FAISS Vector Search

Replaces heavy database lookups with lightweight, in-memory similarity search. Ensures responses are strictly grounded in your uploaded document.

### Pre-loaded Models

The embedding models are cached (`@st.cache_resource`) so the app feels snappy after the initial cold start.

### Gemma-2-2B-IT

Google's lightweight open model, instruction-tuned for better Q&A performance than base models. Small enough (~2.6B params) to fit in standard RAM.
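A back-of-the-envelope check on that RAM claim: weight memory is roughly parameter count times bytes per parameter. The precisions below are illustrative, and the estimate excludes activations and runtime overhead:

```python
params = 2.6e9  # ~2.6B parameters, per the figure above

# Approximate weight memory at common precisions
for name, bytes_per_param in [("float32", 4), ("float16/bfloat16", 2)]:
    gib = params * bytes_per_param / 1024**3
    print(f"{name}: ~{gib:.1f} GiB")
```

At half precision the weights fit comfortably in the ~16 GB of RAM a free CPU Space provides; full float32 is tighter but still feasible.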
## ⚠️ Limitations

- Speed: Since this runs on CPU, generating long answers may take a few seconds.
- Memory: Designed for standard PDFs; extremely large files (500+ pages) may hit RAM limits on free tiers.
- Session: Chat history is cleared when the page is refreshed.
## 🤝 Contributing

Contributions are welcome! Feel free to submit issues or pull requests to improve the UI or add new features.

## 📄 License

MIT License
## 🔗 Links

- Google Gemma Models
- LangChain Documentation
- Streamlit