MOHITRAJDEO12345 committed on
Commit
e12aa8b
·
1 Parent(s): ad82d21

readme updated

Files changed (1)
  1. README.md +72 -4
README.md CHANGED
@@ -12,9 +12,77 @@ short_description: The DocuMind system, as outlined and implemented in this rep
12
  license: mit
13
  ---
14
 
15
- # Welcome to Streamlit!
16
 
17
- Edit `/src/streamlit_app.py` to customize this app to your heart's desire. :heart:
 
18
 
19
- If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community
20
- forums](https://discuss.streamlit.io).
15
+ # DocuMind: Advanced Document Intelligence Platform
16
 
17
+ ## Overview
18
+ DocuMind is an AI-powered document intelligence platform that transforms static PDF documents into interactive knowledge sources. It leverages Google's Gemini AI, ChromaDB, and Streamlit to provide semantic search, conversational question answering, and source attribution with confidence scores.
19
 
20
+ ## Features
21
+ - Intelligent PDF ingestion and chunking
22
+ - Semantic search with Google Generative AI embeddings
23
+ - AI-powered question answering (Gemini 2.0)
24
+ - Source attribution: page numbers, file names, content previews
25
+ - Confidence scoring system (Very High to Very Low)
26
+ - Modern, responsive Streamlit web interface
27
+ - Dockerized for easy deployment (Hugging Face Spaces supported)
28
+
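The confidence scoring feature listed above can be illustrated with a minimal sketch. It assumes each retrieved chunk carries a similarity score in the range 0 to 1; the function name `confidence_label`, the three intermediate labels, and all thresholds are hypothetical and are not taken from the DocuMind source.

```python
def confidence_label(similarity: float) -> str:
    """Map a similarity score in [0, 1] to a coarse confidence label."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be in [0, 1]")
    # Cutoffs and the middle labels are illustrative assumptions,
    # not DocuMind's actual values.
    bands = [
        (0.90, "Very High"),
        (0.75, "High"),
        (0.50, "Medium"),
        (0.25, "Low"),
    ]
    for cutoff, label in bands:
        if similarity >= cutoff:
            return label
    return "Very Low"


if __name__ == "__main__":
    for score in (0.95, 0.8, 0.5, 0.3, 0.1):
        print(f"{score:.2f} -> {confidence_label(score)}")
```

Keeping the cutoffs in one ordered list makes them easy to tune once real retrieval scores are available.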
29
+ ## Installation Guide
30
+
31
+ ### 1. Clone the Repository
32
+ ```bash
33
+ git clone https://huggingface.co/spaces/KingArthur111/DocuMind.git
34
+ cd DocuMind
35
+ ```
36
+
37
+ ### 2. Set Up Python Environment
38
+ ```bash
39
+ python -m venv .venv
40
+ source .venv/bin/activate # On Windows: .venv\Scripts\activate
41
+ pip install --upgrade pip
42
+ pip install -r requirements.txt
43
+ ```
44
+
45
+ ### 3. Run Locally
46
+ ```bash
47
+ streamlit run src/streamlit_app.py
48
+ ```
49
+
50
+ ### 4. Docker Deployment
51
+ Build and run the app in Docker:
52
+ ```bash
53
+ docker build -t documind .
54
+ docker run -p 8501:8501 documind
55
+ ```
56
+
57
+ ### 5. Hugging Face Spaces
58
+ Push the repository to your Hugging Face Space; the Space builds automatically from the provided Dockerfile.
59
+
60
+ ## Usage
61
+ 1. Upload one or more PDF documents.
62
+ 2. Ask questions in natural language.
63
+ 3. View answers with source citations, page numbers, and confidence scores.
64
+ 4. Explore document context and preview relevant content.
65
+
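The retrieval step behind the usage flow above (find the chunks closest to a question, then cite their file and page) can be sketched in plain Python. This is only an illustration under stated assumptions: the `Chunk` class, `cosine`, `top_k`, and the two-dimensional toy embeddings are hypothetical, and the actual app relies on Google Generative AI embeddings and ChromaDB rather than hand-written similarity code.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical record for one piece of an ingested PDF."""
    file_name: str
    page: int
    text: str
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_embedding: list[float], chunks: list[Chunk], k: int = 2):
    """Return the k (score, chunk) pairs most similar to the query."""
    scored = [(cosine(query_embedding, c.embedding), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data: real embeddings would come from an embedding model.
chunks = [
    Chunk("report.pdf", 3, "Revenue grew in Q2.", [0.9, 0.1]),
    Chunk("report.pdf", 7, "Headcount was flat.", [0.1, 0.9]),
    Chunk("notes.pdf", 1, "Q2 revenue summary.", [0.8, 0.3]),
]

if __name__ == "__main__":
    for score, chunk in top_k([1.0, 0.0], chunks):
        # Each hit keeps the file name and page needed for a citation.
        print(f"{chunk.file_name} p.{chunk.page} ({score:.2f}): {chunk.text}")
```

Carrying `file_name` and `page` alongside each embedding is what lets an answer point back to its source.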
66
+ ## Screenshots
67
+ Add screenshots here to showcase:
68
+ - The document upload and QA interface
69
+ - Example answer with source attribution and confidence scores
70
+
71
+ ```
72
+ ![DocuMind Upload Screen](screenshots/upload.png)
73
+ ![DocuMind QA Screen](screenshots/qa.png)
74
+ ```
75
+
76
+ ## Future Upgrades
77
+ - Real-world challenge: RAG systems struggle with limited context windows and multi-step reasoning.
78
+ - Vision: an AI that remembers conversations and connects the dots.
79
+ - Build an advanced RAG system that maintains conversation memory, handles multi-turn queries, and retrieves from multiple data sources (documents, databases, APIs).
80
+ - Include advanced chunking, re-ranking, and query expansion techniques.
81
+ - Tech Stack: LangChain/LlamaIndex, vector databases, Redis, FastAPI, advanced embedding models
82
+ - Success Metrics: Handle 10+ turn conversations, improve accuracy to 90%
83
+
84
+ ## References
85
+ See [WHITEPAPER.md](WHITEPAPER.md) for a full technical and business overview.
86
+
87
+ ---
88
+ Built with ❤️ using Streamlit, Gemini AI, and ChromaDB