---
title: "Generative AI Portfolio Project"
emoji: πŸ€–
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: "0.0.0" # optional; Spaces fills this in
app_file: app.py
pinned: false
---
# RAG Portfolio Project
A state-of-the-art Retrieval-Augmented Generation (RAG) system leveraging modern generative AI and vector search technologies. This project demonstrates how to build a production-grade system that enables advanced question answering, document search, and contextual generation on your own infrastructureβ€”private, scalable, and fast.
---
## Table of Contents
- Project Overview
- Features
- Tech Stack
- Getting Started
- Architecture
- API Endpoints
- Usage Examples
- Testing
- Project Structure
- Troubleshooting
- Contributing
- License
---
## Project Overview
This project showcases how to combine large language models (LLMs), local vector databases, and a modern Python web API for secure, high-performance knowledge and document retrieval. All LLM operations run locallyβ€”no data leaves your machine.
Ideal for applications in internal research, enterprise QA, knowledge management, or compliance-sensitive AI tasks.
---
## Features
- **Local LLM Inference:** Runs entirely on your machine using Ollama and open-source models (e.g., Llama 3.1).
- **Vector Database Search:** Uses Qdrant for fast, scalable semantic retrieval.
- **Flexible Document Ingestion:** Upload PDF, DOCX, or TXT files for indexing and search.
- **FastAPI Back End:** High-concurrency, type-safe REST API with automatic documentation.
- **Modern Python Package Management:** Built with `uv` for blazing-fast dependency resolution.
- **Modular, Extensible Codebase:** Clean architecture, easy to extend and maintain.
- **Privacy and Security:** No cloud callsβ€”ideal for regulated sectors.
- **Fully Containerizable:** Easily deploy with Docker.
---
## Tech Stack
- **LLM:** Ollama (local inference engine), Llama 3.1
- **Vector DB:** Qdrant
- **Embeddings:** Sentence Transformers
- **API:** FastAPI + Uvicorn
- **Package Manager:** uv
- **Code Editor:** Cursor (recommended)
- **Testing & Quality:** Pytest, Black, Ruff
- **DevOps:** Docker-ready
---
## Getting Started
### 1. Prerequisites
- Python 3.10+
- `uv` package manager
- Ollama installed locally
- Qdrant (Docker recommended)
### 2. Setup
First, fork the repository, then clone your fork:

```bash
git clone https://github.com/YOUR_USERNAME/rag-portfolio-project.git
cd rag-portfolio-project
uv sync
cp .env.example .env
```

Update `.env` if needed.
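A typical configuration might look like the sketch below; the variable names are hypothetical, so use whatever keys `.env.example` actually defines:

```
# Hypothetical example values; see .env.example for the real keys.
QDRANT_URL=http://localhost:6333
OLLAMA_MODEL=llama3.1
EMBEDDING_MODEL=all-MiniLM-L6-v2
```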
### 3. Start Qdrant (Vector DB)

```bash
docker run -p 6333:6333 qdrant/qdrant
```
### 4. Pull the Ollama LLM Model

```bash
ollama pull llama3.1
```
### 5. Run the FastAPI Application

```bash
uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```
### 6. Open API Documentation
Access at: [http://localhost:8000/docs](http://localhost:8000/docs)
---
## Architecture
```text
             β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
             β”‚    User    β”‚
             β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜
                    β”‚
            β”Œβ”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”
            β”‚ FastAPI REST β”‚
            β”‚   Backend    β”‚
            β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
       β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”
       β”‚                     β”‚
β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”        β”Œβ”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Document β”‚        β”‚ Query, RAG     β”‚
β”‚ Ingestionβ”‚        β”‚ Chain & Gen.   β”‚
β””β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜        β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    β”‚
β”Œβ”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Embedding  β”‚
β”‚ Generation β”‚
β””β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    β”‚
β”Œβ”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Qdrant    β”‚
β”‚  Vector DB  β”‚
β””β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    β”‚
β”Œβ”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Ollama LLM  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```
**Workflow:**
- Documents are split into semantic chunks.
- Sentence Transformers generate an embedding for each chunk, which is indexed in Qdrant.
- At query time, Qdrant retrieves the most relevant chunks for the question.
- Ollama generates an answer grounded in the retrieved context (true RAG), as sketched below.
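A minimal sketch of that query path, assuming the embedding model `all-MiniLM-L6-v2`, a Qdrant collection named `documents`, and a `text` payload field (all assumptions; the real implementation lives in `app/services/`):

```python
# Illustrative RAG query path; model, collection, and payload names are assumed.
from sentence_transformers import SentenceTransformer
from qdrant_client import QdrantClient
import ollama

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
qdrant = QdrantClient(url="http://localhost:6333")

def answer(question: str, top_k: int = 5) -> str:
    # 1. Embed the question with the same model used at ingestion time.
    query_vector = embedder.encode(question).tolist()

    # 2. Retrieve the top-k most similar chunks from Qdrant.
    hits = qdrant.search(
        collection_name="documents",  # assumed collection name
        query_vector=query_vector,
        limit=top_k,
    )
    context = "\n\n".join(hit.payload["text"] for hit in hits)

    # 3. Ask the local LLM to answer using only the retrieved context.
    response = ollama.chat(
        model="llama3.1",
        messages=[{
            "role": "user",
            "content": f"Answer using only this context:\n{context}\n\nQuestion: {question}",
        }],
    )
    return response["message"]["content"]
```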
---
## API Endpoints
| Method | Path | Description |
|--------|----------------|-----------------------------------|
| GET | `/` | Root endpoint |
| GET | `/health` | Check system status |
| POST | `/ingest/file` | Upload and index document |
| POST | `/query` | Query system for answer |
| DELETE | `/reset` | Reset vector database (danger!) |
Docs available at [http://localhost:8000/docs](http://localhost:8000/docs)
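To illustrate how a route like `/query` might be defined with FastAPI's type-checked request models, here is a hypothetical sketch (the project's real routes live in `app/api/`):

```python
# Hypothetical sketch of the /query route; request fields match the docs above.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="RAG Portfolio Project")

class QueryRequest(BaseModel):
    question: str
    top_k: int = 5  # number of context chunks to retrieve

class QueryResponse(BaseModel):
    answer: str

def answer(question: str, top_k: int) -> str:
    """Stand-in for the RAG chain sketched in the Architecture section."""
    return f"(answer to {question!r} from top {top_k} chunks)"

@app.post("/query", response_model=QueryResponse)
def query(request: QueryRequest) -> QueryResponse:
    # Delegate to the RAG chain and return a typed response.
    return QueryResponse(answer=answer(request.question, request.top_k))
```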
---
## Usage Examples
1. Upload a document (`.pdf`/`.docx`/`.txt`):

```bash
curl -X POST "http://localhost:8000/ingest/file" \
  -H "accept: application/json" \
  -F "file=@your_document.pdf"
```

2. Query the system:

```bash
curl -X POST "http://localhost:8000/query" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the key insight in the uploaded document?", "top_k": 5}'
```

3. Reset the collection:

```bash
curl -X DELETE "http://localhost:8000/reset"
```
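The same calls can be made from Python; this quick sketch uses the `requests` library against the endpoints documented above:

```python
# Python equivalents of the curl examples above.
import requests

BASE_URL = "http://localhost:8000"

# Upload and index a document.
with open("your_document.pdf", "rb") as f:
    resp = requests.post(f"{BASE_URL}/ingest/file", files={"file": f})
resp.raise_for_status()

# Ask a question against the indexed documents.
resp = requests.post(
    f"{BASE_URL}/query",
    json={"question": "What is the key insight in the uploaded document?", "top_k": 5},
)
print(resp.json())
```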
---
## Testing
- Unit tests in `/tests` using Pytest.
- Run all tests:

  ```bash
  uv run pytest
  ```

- Check formatting and linting:

  ```bash
  uv run black app/ tests/
  uv run ruff check app/ tests/
  ```
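A minimal test could exercise the `/health` endpoint with FastAPI's `TestClient` (an illustrative sketch; the project's actual tests live in `tests/test_rag.py`):

```python
# Illustrative health-check test.
from fastapi.testclient import TestClient

from app.main import app  # the FastAPI application instance

client = TestClient(app)

def test_health() -> None:
    response = client.get("/health")
    assert response.status_code == 200
```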
---
## Project Structure
```text
rag-portfolio-project/
β”œβ”€β”€ .env
β”œβ”€β”€ pyproject.toml
β”œβ”€β”€ README.md
β”œβ”€β”€ app/
β”‚   β”œβ”€β”€ main.py
β”‚   β”œβ”€β”€ config.py
β”‚   β”œβ”€β”€ models/
β”‚   β”œβ”€β”€ core/
β”‚   β”œβ”€β”€ services/
β”‚   └── api/
β”œβ”€β”€ data/
β”‚   β”œβ”€β”€ documents/
β”‚   └── processed/
β”œβ”€β”€ tests/
β”‚   └── test_rag.py
└── scripts/
    β”œβ”€β”€ setup_qdrant.py
    └── ingest_documents.py
```
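For reference, `scripts/setup_qdrant.py` plausibly creates the collection along these lines; the collection name and vector size here are assumptions, so check the script itself for the actual values:

```python
# Sketch of what scripts/setup_qdrant.py might do (names/sizes are assumptions).
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

client = QdrantClient(url="http://localhost:6333")

# all-MiniLM-L6-v2 (the assumed embedding model) produces 384-dim vectors.
if not client.collection_exists("documents"):
    client.create_collection(
        collection_name="documents",
        vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    )
```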
---
## Troubleshooting
- **Missing Modules?** Run `uv add <module-name>`
- **Ollama Model Not Found?** Check with `ollama list` or update `.env`
- **Qdrant Not Running?** Ensure the Docker container is up (`docker ps`)
- **File Upload Errors?** Install `python-multipart`
---
## Contributing
Contributions are welcome! Fork the repo, open issues, or submit pull requests for enhancements or bug fixes.
---
## License
Open-source under the MIT License.
---
## Questions?
Contact the repository owner or open an issue – happy to help!