Arif committed · Commit ca592ac · 1 parent: 68d0867

Updated readme

Files changed (1): README.md (+169 −176)
# RAG Portfolio Project

A state-of-the-art Retrieval-Augmented Generation (RAG) system leveraging modern generative AI and vector search technologies. This project demonstrates how to build a production-grade system that enables advanced question answering, document search, and contextual generation on your own infrastructure—private, scalable, and fast.

---

## Table of Contents
- Project Overview
- Features
- Tech Stack
- Getting Started
- Architecture
- API Endpoints
- Usage Examples
- Testing
- Project Structure
- Troubleshooting
- Contributing
- License

---

## Project Overview
This project showcases how to combine large language models (LLMs), local vector databases, and a modern Python web API for secure, high-performance knowledge and document retrieval. All LLM operations run locally—no data leaves your machine.

It is ideal for internal research, enterprise QA, knowledge management, and compliance-sensitive AI tasks.

---
## Features
- **Local LLM Inference:** Runs entirely on your machine using Ollama and open-source models (e.g., Llama 3.1).
- **Vector Database Search:** Uses Qdrant for fast, scalable semantic retrieval.
- **Flexible Document Ingestion:** Upload PDF, DOCX, or TXT files for indexing and search.
- **FastAPI Back End:** High-concurrency, type-safe REST API with automatic documentation.
- **Modern Python Package Management:** Built with `uv` for blazing-fast dependency resolution.
- **Modular, Extensible Codebase:** Clean architecture, easy to extend and maintain.
- **Privacy and Security:** No cloud calls—ideal for regulated sectors.
- **Fully Containerizable:** Easily deploy with Docker.

---
## Tech Stack
- **LLM:** Ollama (local inference engine), Llama 3.1
- **Vector DB:** Qdrant
- **Embeddings:** Sentence Transformers
- **API:** FastAPI + Uvicorn
- **Package Manager:** `uv`
- **Code Editor:** Cursor (recommended)
- **Testing & Quality:** Pytest, Black, Ruff
- **DevOps:** Docker-ready

---
## Getting Started

### 1. Prerequisites
- Python 3.10+
- `uv` package manager
- Ollama installed locally
- Qdrant (Docker recommended)

### 2. Setup

```bash
# Clone the repository
git clone https://github.com/YOUR_USERNAME/rag-portfolio-project.git
cd rag-portfolio-project

# Install dependencies
uv sync

# Copy and configure environment variables
cp .env.example .env
# (Update .env if needed)
```

### 3. Start Qdrant (Vector DB)

```bash
docker run -p 6333:6333 qdrant/qdrant
```

### 4. Pull Ollama LLM Model

```bash
ollama pull llama3.1
```

### 5. Run the FastAPI Application

```bash
uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

### 6. Open API Documentation
Access the interactive docs at [http://localhost:8000/docs](http://localhost:8000/docs).

---
## Architecture

```text
        ┌──────────────┐
        │     User     │
        └──────┬───────┘
               │
        ┌──────▼───────┐
        │ FastAPI REST │
        │   Backend    │
        └──────┬───────┘
               │
     ┌─────────┴──────────┐
     │                    │
┌────▼──────┐    ┌────────▼──────────┐
│ Document  │    │  Query, RAG Chain │
│ Ingestion │    │   & Generation    │
└────┬──────┘    └───────────────────┘
     │
┌────▼───────┐
│ Embedding  │
│ Generation │
└────┬───────┘
     │
┌────▼───────┐
│   Qdrant   │
│ Vector DB  │
└────┬───────┘
     │
┌────▼───────┐
│ Ollama LLM │
└────────────┘
```

**Workflow:**
- Documents are split into semantic chunks and indexed as vectors.
- Sentence Transformers generate embeddings.
- Qdrant retrieves the most relevant contexts.
- Ollama answers using retrieved context (true RAG).
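The workflow above can be illustrated with a small, dependency-free sketch. This is a conceptual toy only: the function names (`chunk_text`, `top_k_chunks`) and the hand-made vectors are illustrative, while the actual pipeline uses Sentence Transformers for embeddings and Qdrant for nearest-neighbour search.

```python
import math

def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Ingestion: split a document into overlapping word-window chunks."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for i in range(0, len(words), step):
        chunks.append(" ".join(words[i:i + chunk_size]))
        if i + chunk_size >= len(words):
            break
    return chunks

def cosine(a: list[float], b: list[float]) -> float:
    """Similarity measure used for semantic retrieval."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_chunks(query_vec: list[float],
                 index: list[tuple[str, list[float]]],
                 k: int = 2) -> list[str]:
    """Retrieval: return the k chunks most similar to the query vector."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

In the real system, each chunk's vector comes from a Sentence Transformers model, the index lives in Qdrant, and the retrieved chunks are passed to Ollama as context for answer generation.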

---

## API Endpoints

| Method | Path           | Description                     |
|--------|----------------|---------------------------------|
| GET    | `/`            | Root endpoint                   |
| GET    | `/health`      | Check system status             |
| POST   | `/ingest/file` | Upload and index document       |
| POST   | `/query`       | Query system for answer         |
| DELETE | `/reset`       | Reset vector database (danger!) |

Interactive docs are available at [http://localhost:8000/docs](http://localhost:8000/docs).
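For programmatic access, `/query` can also be called from Python. A minimal stdlib sketch, assuming only the request fields shown in the curl examples (`question`, `top_k`); the response schema is defined by the app's Pydantic models and is not shown here:

```python
import json
import urllib.request

def build_query_request(question: str, top_k: int = 5,
                        base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a POST request for the /query endpoint (illustrative helper)."""
    payload = json.dumps({"question": question, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# With the API running, send it like:
#   with urllib.request.urlopen(build_query_request("What is RAG?")) as resp:
#       answer = json.load(resp)
```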

---

## Usage Examples

**1. Upload a Document (.pdf/.docx/.txt):**

```bash
curl -X POST "http://localhost:8000/ingest/file" \
  -H "accept: application/json" \
  -F "file=@your_document.pdf"
```

**2. Query the System:**

```bash
curl -X POST "http://localhost:8000/query" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the key insight in the uploaded document?", "top_k": 5}'
```

**3. Reset Collection:**

```bash
curl -X DELETE "http://localhost:8000/reset"
```

---

## Testing
- Unit tests are provided in `/tests` using Pytest.
- Run all tests:

```bash
uv run pytest
```

- Ensure formatting and linting:

```bash
uv run black app/ tests/
uv run ruff check app/ tests/
```

---

## Project Structure

```text
rag-portfolio-project/
├── .env                     # Environment config
├── pyproject.toml           # Dependencies config
├── README.md                # This documentation
├── app/
│   ├── main.py              # FastAPI app
│   ├── config.py            # Config loader
│   ├── models/              # Pydantic schemas
│   ├── core/                # LLM, embeddings, vector DB
│   ├── services/            # Document ingestion, RAG chain
│   └── api/                 # API routes and dependencies
├── data/
│   ├── documents/           # Raw document storage
│   └── processed/           # Chunked files
├── tests/
│   └── test_rag.py          # Unit tests
└── scripts/
    ├── setup_qdrant.py      # DB utilities
    └── ingest_documents.py  # Bulk ingestion
```

---
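A loader like the `config.py` above can be approximated in a few lines of stdlib Python. This is a sketch only, assuming simple `KEY=VALUE` lines in `.env`; the project's real loader may differ:

```python
def load_env(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env: dict[str, str] = {}
    try:
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                env[key.strip()] = value.strip().strip('"')
    except FileNotFoundError:
        pass  # Fall back to an empty config if no .env is present.
    return env
```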
## Troubleshooting
- **Missing modules?** Run `uv add <module-name>` for any missing Python packages.
- **Ollama model not found?** Double-check the model name with `ollama list` and update `.env`.
- **Qdrant not running?** Ensure the Docker container is up (`docker ps`).
- **File upload errors?** Check that `python-multipart` is installed.

---

## Contributing
Contributions are welcome! Fork the repository, open issues, or submit pull requests for enhancements or bug fixes.

---

## License
Open-source under the MIT License.

---

## Questions?
Contact the repository owner or open an issue – happy to help!