Arif
Fixed the readme for huggingface
9b4876f
metadata
title: Generative AI Portfolio Project
emoji: πŸ€–
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: 0.0.0
app_file: app.py
pinned: false

RAG Portfolio Project

A state-of-the-art Retrieval-Augmented Generation (RAG) system leveraging modern generative AI and vector search technologies. This project demonstrates how to build a production-grade system that enables advanced question answering, document search, and contextual generation on your own infrastructureβ€”private, scalable, and fast.


Table of Contents

  • Project Overview
  • Features
  • Tech Stack
  • Getting Started
  • Architecture
  • API Endpoints
  • Usage Examples
  • Testing
  • Project Structure
  • Troubleshooting
  • Contributing
  • License

Project Overview

This project showcases how to combine large language models (LLMs), local vector databases, and a modern Python web API for secure, high-performance knowledge and document retrieval. All LLM operations run locallyβ€”no data leaves your machine.

Ideal for applications in internal research, enterprise QA, knowledge management, or compliance-sensitive AI tasks.


Features

  • Local LLM Inference: Runs entirely on your machine using Ollama and open-source models (e.g., Llama 3.1).
  • Vector Database Search: Uses Qdrant for fast, scalable semantic retrieval.
  • Flexible Document Ingestion: Upload PDF, DOCX, or TXT files for indexing and search.
  • FastAPI Back End: High-concurrency, type-safe REST API with automatic documentation.
  • Modern Python Package Management: Built with uv for blazing-fast dependency resolution.
  • Modular, Extensible Codebase: Clean architecture, easy to extend and maintain.
  • Privacy and Security: No cloud callsβ€”ideal for regulated sectors.
  • Fully Containerizable: Easily deploy with Docker.

Tech Stack

  • LLM: Ollama (local inference engine), Llama 3.1
  • Vector DB: Qdrant
  • Embeddings: Sentence Transformers
  • API: FastAPI + Uvicorn
  • Package Manager: uv
  • Code Editor: Cursor (recommended)
  • Testing & Quality: Pytest, Black, Ruff
  • DevOps: Docker-ready

Getting Started

1. Prerequisites

  • Python 3.10+
  • uv package manager
  • Ollama installed locally
  • Qdrant (Docker recommended)

2. Setup

first fork my repository git clone https://github.com/YOUR_USERNAME/rag-portfolio-project.git cd rag-portfolio-project uv sync cp .env.example .env

(Update .env if needed)

3. Start Qdrant (Vector DB)

docker run -p 6333:6333 qdrant/qdrant

text

4. Pull Ollama LLM Model

ollama pull llama3.1

text

5. Run the FastAPI Application

uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

text

6. Open API Documentation

Access at: http://localhost:8000/docs


Architecture

text β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ User β”‚ β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β” β”‚ FastAPI REST β”‚ β”‚ Backend β”‚ β””β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚ β”Œβ”€β”€β”€β–Όβ”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Document β”‚ β”‚ Query, RAG β”‚ β”‚ Ingestionβ”‚ β”‚ Chain & Gen. β”‚ β””β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”Œβ”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Embedding β”‚ β”‚ Generation β”‚ β””β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”Œβ”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Qdrant β”‚ β”‚ Vector DB β”‚ β””β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”Œβ”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Ollama LLM β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

text

Workflow:

  • Documents are split into semantic chunks and indexed as vectors.
  • Sentence Transformers generate embeddings.
  • Qdrant retrieves the most relevant contexts.
  • Ollama answers using retrieved context (true RAG).

API Endpoints

Method Path Description
GET / Root endpoint
GET /health Check system status
POST /ingest/file Upload and index document
POST /query Query system for answer
DELETE /reset Reset vector database (danger!)

Docs available at http://localhost:8000/docs


Usage Examples

  1. Upload a Document (.pdf/.docx/.txt) curl -X POST "http://localhost:8000/ingest/file" -H "accept: application/json" -F "file=@your_document.pdf"

  2. Query the System curl -X POST "http://localhost:8000/query" -H "Content-Type: application/json" -d '{"question": "What is the key insight in the uploaded document?", "top_k": 5}'

  3. Reset Collection curl -X DELETE "http://localhost:8000/reset"

text


Testing

  • Unit tests in /tests using Pytest.
  • Run all tests: uv run pytest

text

  • Ensure formatting and linting: uv run black app/ tests/ uv run ruff app/ tests/

text


Project Structure

rag-portfolio-project/ β”œβ”€β”€ .env β”œβ”€β”€ pyproject.toml β”œβ”€β”€ README.md β”œβ”€β”€ app/ β”‚ β”œβ”€β”€ main.py β”‚ β”œβ”€β”€ config.py β”‚ β”œβ”€β”€ models/ β”‚ β”œβ”€β”€ core/ β”‚ β”œβ”€β”€ services/ β”‚ └── api/ β”œβ”€β”€ data/ β”‚ β”œβ”€β”€ documents/ β”‚ └── processed/ β”œβ”€β”€ tests/ β”‚ └── test_rag.py └── scripts/ β”œβ”€β”€ setup_qdrant.py └── ingest_documents.py

text


Troubleshooting

  • Missing Modules? Run uv add <module-name>
  • Ollama Model Not Found? Check with ollama list or update .env
  • Qdrant Not Running? Ensure the Docker container is up (docker ps)
  • File Upload Errors? Install python-multipart

Contributing

Contributions are welcome! Fork the repo, open issues, or submit pull requests for enhancements or bug fixes.


License

Open-source under the MIT License.


Questions?

Contact the repository owner or open an issue – happy to help!