---
title: "Generative AI Portfolio Project"
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: "0.0.0" # (or blank, Spaces fills this)
app_file: app.py
pinned: false
---
# RAG Portfolio Project
A state-of-the-art Retrieval-Augmented Generation (RAG) system leveraging modern generative AI and vector search technologies. This project demonstrates how to build a production-grade system that enables advanced question answering, document search, and contextual generation on your own infrastructure: private, scalable, and fast.
---
## Table of Contents
- Project Overview
- Features
- Tech Stack
- Getting Started
- Architecture
- API Endpoints
- Usage Examples
- Testing
- Project Structure
- Troubleshooting
- Contributing
- License
---
## Project Overview
This project showcases how to combine large language models (LLMs), local vector databases, and a modern Python web API for secure, high-performance knowledge and document retrieval. All LLM operations run locally; no data leaves your machine.
Ideal for applications in internal research, enterprise QA, knowledge management, or compliance-sensitive AI tasks.
---
## Features
- **Local LLM Inference:** Runs entirely on your machine using Ollama and open-source models (e.g., Llama 3.1).
- **Vector Database Search:** Uses Qdrant for fast, scalable semantic retrieval.
- **Flexible Document Ingestion:** Upload PDF, DOCX, or TXT files for indexing and search.
- **FastAPI Back End:** High-concurrency, type-safe REST API with automatic documentation.
- **Modern Python Package Management:** Built with `uv` for blazing-fast dependency resolution.
- **Modular, Extensible Codebase:** Clean architecture, easy to extend and maintain.
- **Privacy and Security:** No cloud calls; ideal for regulated sectors.
- **Fully Containerizable:** Easily deploy with Docker.
---
## Tech Stack
- **LLM:** Ollama (local inference engine), Llama 3.1
- **Vector DB:** Qdrant
- **Embeddings:** Sentence Transformers
- **API:** FastAPI + Uvicorn
- **Package Manager:** uv
- **Code Editor:** Cursor (recommended)
- **Testing & Quality:** Pytest, Black, Ruff
- **DevOps:** Docker-ready
---
## Getting Started
### 1. Prerequisites
- Python 3.10+
- `uv` package manager
- Ollama installed locally
- Qdrant (Docker recommended)
### 2. Setup
First, fork the repository, then:

```bash
git clone https://github.com/YOUR_USERNAME/rag-portfolio-project.git
cd rag-portfolio-project
uv sync
cp .env.example .env
```

Update `.env` if needed.
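The contents of `.env.example` are not shown in this README; a typical configuration for this stack might look like the following. All variable names here are illustrative assumptions, not the project's actual keys — check `.env.example` for the real ones:

```ini
# Hypothetical settings -- adjust names to match .env.example
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1
QDRANT_URL=http://localhost:6333
QDRANT_COLLECTION=documents
EMBEDDING_MODEL=all-MiniLM-L6-v2
```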
### 3. Start Qdrant (Vector DB)

```bash
docker run -p 6333:6333 qdrant/qdrant
```
### 4. Pull Ollama LLM Model

```bash
ollama pull llama3.1
```
### 5. Run the FastAPI Application

```bash
uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```
### 6. Open API Documentation
Access at: [http://localhost:8000/docs](http://localhost:8000/docs)
---
## Architecture
```text
           ┌──────────────┐
           │     User     │
           └──────┬───────┘
                  │
           ┌──────▼───────┐
           │ FastAPI REST │
           │   Backend    │
           └──────┬───────┘
        ┌─────────┴──────────┐
        │                    │
 ┌──────▼─────┐      ┌───────▼────────┐
 │  Document  │      │   Query, RAG   │
 │  Ingestion │      │  Chain & Gen.  │
 └──────┬─────┘      └────────────────┘
        │
 ┌──────▼──────┐
 │  Embedding  │
 │  Generation │
 └──────┬──────┘
        │
 ┌──────▼──────┐
 │   Qdrant    │
 │  Vector DB  │
 └──────┬──────┘
        │
 ┌──────▼──────┐
 │  Ollama LLM │
 └─────────────┘
```
**Workflow:**
- Documents are split into semantic chunks and indexed as vectors.
- Sentence Transformers generate embeddings.
- Qdrant retrieves the most relevant contexts.
- Ollama answers using retrieved context (true RAG).
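As an illustration of the first workflow step, here is a minimal overlapping text splitter in Python. This is a sketch only; the README does not specify the project's actual chunking logic, and the function name and parameters are assumptions:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks so context isn't lost at boundaries.

    chunk_size and overlap are measured in characters for simplicity;
    a production splitter would typically work on tokens or sentences.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk would then be embedded and upserted into Qdrant, as in the steps above.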
---
## API Endpoints
| Method | Path | Description |
|--------|----------------|-----------------------------------|
| GET | `/` | Root endpoint |
| GET | `/health` | Check system status |
| POST | `/ingest/file` | Upload and index document |
| POST | `/query` | Query system for answer |
| DELETE | `/reset` | Reset vector database (danger!) |
Docs available at [http://localhost:8000/docs](http://localhost:8000/docs)
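For programmatic access (rather than curl), a small stdlib-only Python client for the `/query` endpoint could look like this. The request body mirrors the curl usage example; the response schema is not documented in this README, so inspect the actual JSON at `/docs`:

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # default host/port from this README


def build_query_body(question: str, top_k: int = 5) -> bytes:
    """Serialize the JSON body expected by POST /query."""
    return json.dumps({"question": question, "top_k": top_k}).encode("utf-8")


def query_rag(question: str, top_k: int = 5) -> dict:
    """Send a question to the RAG API and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{API_BASE}/query",
        data=build_query_body(question, top_k),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Usage: call `query_rag("your question", top_k=5)` once the server from step 5 is running.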
---
## Usage Examples
1. Upload a Document (.pdf/.docx/.txt)

```bash
curl -X POST "http://localhost:8000/ingest/file" \
  -H "accept: application/json" \
  -F "file=@your_document.pdf"
```

2. Query the System

```bash
curl -X POST "http://localhost:8000/query" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the key insight in the uploaded document?", "top_k": 5}'
```

3. Reset Collection

```bash
curl -X DELETE "http://localhost:8000/reset"
```
---
## Testing
- Unit tests in `/tests` using Pytest.
- Run all tests:

```bash
uv run pytest
```
- Ensure formatting and linting:

```bash
uv run black app/ tests/
uv run ruff check app/ tests/
```
---
## Project Structure
```text
rag-portfolio-project/
├── .env
├── pyproject.toml
├── README.md
├── app/
│   ├── main.py
│   ├── config.py
│   ├── models/
│   ├── core/
│   ├── services/
│   └── api/
├── data/
│   ├── documents/
│   └── processed/
├── tests/
│   └── test_rag.py
└── scripts/
    ├── setup_qdrant.py
    └── ingest_documents.py
```
---
## Troubleshooting
- **Missing Modules?** Run `uv add <module-name>`
- **Ollama Model Not Found?** Check with `ollama list` or update `.env`
- **Qdrant Not Running?** Ensure the Docker container is up (`docker ps`)
- **File Upload Errors?** Install `python-multipart`
---
## Contributing
Contributions are welcome! Fork the repo, open issues, or submit pull requests for enhancements or bug fixes.
---
## License
Open-source under the MIT License.
---
## Questions?
Contact the repository owner or open an issue; happy to help!