Commit 678baaf by snakeeee · 1 parent: adc176e

cache models for faster startup

Files changed (2):
  1. Dockerfile +6 -0
  2. README.md +119 -10
Dockerfile CHANGED
@@ -4,8 +4,14 @@ WORKDIR /app
 
 COPY . /app
 
+# Install dependencies
 RUN pip install --no-cache-dir -r requirements.txt
 
+# Pre-download models so they are cached in the image
+RUN python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')"
+
+RUN python -c "from sentence_transformers import CrossEncoder; CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')"
+
 EXPOSE 7860
 
 CMD ["uvicorn","main:app","--host","0.0.0.0","--port","7860"]
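A possible refinement of the change above (not part of this commit): pinning the model cache to a fixed directory makes the baked-in weights independent of the runtime user's home directory, and a single RUN layer keeps the image history flatter. This is a hedged sketch; the `SENTENCE_TRANSFORMERS_HOME` value and the consolidated RUN are assumptions, not the committed Dockerfile:

```dockerfile
# Hypothetical variant: fix the cache location so the pre-downloaded
# weights are found at runtime regardless of which user runs the app.
# sentence-transformers consults SENTENCE_TRANSFORMERS_HOME when
# resolving its model cache.
ENV SENTENCE_TRANSFORMERS_HOME=/app/model-cache

# Download both models in one layer at build time.
RUN python -c "from sentence_transformers import SentenceTransformer, CrossEncoder; \
    SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2'); \
    CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')"
```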
README.md CHANGED
@@ -1,10 +1,119 @@
----
-title: Scholar Rag Engine
-emoji: 🚀
-colorFrom: gray
-colorTo: red
-sdk: docker
-pinned: false
----
-
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
+# Scholar RAG Engine
+
+Scholar RAG Engine is a Retrieval-Augmented Generation (RAG) system designed for answering questions from PDFs and web pages.
+
+The system extracts content, builds semantic indexes, retrieves relevant context, and generates answers using an LLM.
+
+## Features
+
+- PDF document indexing
+- Website content scraping
+- Hybrid semantic retrieval
+- ColBERT-style retrieval
+- Cross-encoder reranking
+- LLM answer generation
+- Modern UI with dark mode
+- Expandable retrieved context viewer
+
+## Architecture
+
+Pipeline:
+
+User Query
+↓
+Retriever (ColBERT)
+↓
+Reranker (Cross Encoder)
+↓
+Context Compression
+↓
+LLM (Gemini)
+↓
+Final Answer
+
+## Tech Stack
+
+Backend:
+- FastAPI
+- Python
+
+Retrieval:
+- Sentence Transformers
+- FAISS
+- ColBERT-style token similarity
+
+Ranking:
+- Cross Encoder (MS MARCO)
+
+LLM:
+- Google Gemini API
+
+Frontend:
+- HTML
+- CSS
+- JavaScript
+
+Deployment:
+- Hugging Face Spaces
+- Docker
+
+## Project Structure
+scholar-rag-engine
+│
+├── main.py
+├── ingestion.py
+├── chunking.py
+├── scraper.py
+├── retrieval_colbert.py
+├── reranker.py
+├── LLM.py
+├── requirements.txt
+├── Dockerfile
+│
+└── templates
+    └── index.html
+
+
+## Installation
+
+Clone the repository
+git clone https://github.com/mr-snake-mr/scholar-rag-engine
+cd scholar-rag-engine
+
+
+Install dependencies
+pip install -r requirements.txt
+
+
+Run the server
+uvicorn main:app --reload
+
+
+Open in browser
+http://localhost:8000
+
+
+## Environment Variables
+
+Set your Gemini API key:
+GOOGLE_API_KEY=your_gemini_api_key
+
+
+## Deployment
+
+This project is deployed on Hugging Face Spaces using Docker.
+https://huggingface.co/spaces/snakeeee/scholar-rag-engine
+
+
+## Future Improvements
+
+- Streaming responses
+- Chat-style UI
+- Multi-document support
+- Vector database integration
+- GPU acceleration
+
+## Author
+
+Developed as an AI-powered research assistant project.
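The ColBERT-style retrieval step named in the README above scores a document by late interaction: each query token embedding is compared against every document token embedding, and the per-query-token maxima are summed (MaxSim). A minimal pure-Python sketch with toy 2-d vectors standing in for real token embeddings; the function names and data are illustrative, not the repository's API:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def maxsim_score(query_tokens, doc_tokens):
    """ColBERT-style late interaction: match each query token to its
    best-scoring document token, then sum those maxima."""
    return sum(max(cosine(q, d) for d in doc_tokens) for q in query_tokens)

# Toy 2-d "embeddings": the query matches doc_a better than doc_b.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
doc_b = [[-1.0, 0.0], [0.0, -1.0]]

print(maxsim_score(query, doc_a))  # 2.0 (both query tokens match exactly)
print(maxsim_score(query, doc_b) < maxsim_score(query, doc_a))  # True
```

In the real pipeline the token embeddings would come from the Sentence Transformers model pre-cached in the Dockerfile, and the top MaxSim hits would then be passed to the cross-encoder reranker.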