---
title: Generative AI Portfolio Project
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: 0.0.0
app_file: app.py
pinned: false
---
# RAG Portfolio Project

A state-of-the-art Retrieval-Augmented Generation (RAG) system leveraging modern generative AI and vector search technologies. This project demonstrates how to build a production-grade system that enables advanced question answering, document search, and contextual generation on your own infrastructure: private, scalable, and fast.
## Table of Contents
- Project Overview
- Features
- Tech Stack
- Getting Started
- Architecture
- API Endpoints
- Usage Examples
- Testing
- Project Structure
- Troubleshooting
- Contributing
- License
## Project Overview

This project showcases how to combine large language models (LLMs), local vector databases, and a modern Python web API for secure, high-performance knowledge and document retrieval. All LLM operations run locally; no data leaves your machine.

Ideal for applications in internal research, enterprise QA, knowledge management, or compliance-sensitive AI tasks.
## Features

- Local LLM Inference: Runs entirely on your machine using Ollama and open-source models (e.g., Llama 3.1).
- Vector Database Search: Uses Qdrant for fast, scalable semantic retrieval.
- Flexible Document Ingestion: Upload PDF, DOCX, or TXT files for indexing and search.
- FastAPI Back End: High-concurrency, type-safe REST API with automatic documentation.
- Modern Python Package Management: Built with `uv` for blazing-fast dependency resolution.
- Modular, Extensible Codebase: Clean architecture, easy to extend and maintain.
- Privacy and Security: No cloud calls; ideal for regulated sectors.
- Fully Containerizable: Easily deploy with Docker.
## Tech Stack
- LLM: Ollama (local inference engine), Llama 3.1
- Vector DB: Qdrant
- Embeddings: Sentence Transformers
- API: FastAPI + Uvicorn
- Package Manager: uv
- Code Editor: Cursor (recommended)
- Testing & Quality: Pytest, Black, Ruff
- DevOps: Docker-ready
## Getting Started

### 1. Prerequisites

- Python 3.10+
- `uv` package manager
- Ollama installed locally
- Qdrant (Docker recommended)

### 2. Setup

First, fork the repository, then:

```bash
git clone https://github.com/YOUR_USERNAME/rag-portfolio-project.git
cd rag-portfolio-project
uv sync
cp .env.example .env
```

(Update `.env` if needed.)
### 3. Start Qdrant (Vector DB)

```bash
docker run -p 6333:6333 qdrant/qdrant
```

### 4. Pull the Ollama LLM Model

```bash
ollama pull llama3.1
```

### 5. Run the FastAPI Application

```bash
uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

### 6. Open the API Documentation

Access at: http://localhost:8000/docs
Architecture
text ββββββββββββββ β User β βββββββ¬βββββββ β ββββββββΌββββββββ β FastAPI REST β β Backend β βββββββ¬βββββββββ ββββββββββββββ΄βββββββββββββ β β βββββΌββββββ βββββββββΌβββββββββ β Document β β Query, RAG β β Ingestionβ β Chain & Gen. β βββββ¬βββββββ ββββββββββββββββββ β βββββΌβββββββββ β Embedding β β Generation β βββββ¬βββββββββ β βββββΌββββββββββ β Qdrant β β Vector DB β βββββ¬ββββββββββ β βββββΌββββββββββ β Ollama LLM β βββββββββββββββ
text
Workflow:
- Documents are split into semantic chunks and indexed as vectors.
- Sentence Transformers generate embeddings.
- Qdrant retrieves the most relevant contexts.
- Ollama answers using retrieved context (true RAG).
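The retrieval step of this workflow can be sketched in plain Python. Here a toy bag-of-letters embedding and cosine similarity stand in for Sentence Transformers and Qdrant; this is a simplification to show the idea, not the project's actual code:

```python
import math

def embed(text: str) -> list[float]:
    # Toy embedding: normalized letter counts. The real system uses
    # Sentence Transformers to produce dense semantic vectors.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are unit-normalized, so the dot product is cosine similarity.
    return sum(x * y for x, y in zip(a, b))

def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank all chunks by similarity to the query; Qdrant does this at scale.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = ["Qdrant stores vectors", "Ollama runs the LLM", "FastAPI serves requests"]
print(top_k("vector storage", chunks, k=1))
```

In the full pipeline, the top-k chunks returned here would be stuffed into the prompt that Ollama answers from.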
## API Endpoints

| Method | Path | Description |
|---|---|---|
| GET | `/` | Root endpoint |
| GET | `/health` | Check system status |
| POST | `/ingest/file` | Upload and index a document |
| POST | `/query` | Query the system for an answer |
| DELETE | `/reset` | Reset the vector database (danger!) |

Interactive docs are available at http://localhost:8000/docs.
## Usage Examples

Upload a Document (.pdf/.docx/.txt):

```bash
curl -X POST "http://localhost:8000/ingest/file" \
  -H "accept: application/json" \
  -F "file=@your_document.pdf"
```

Query the System:

```bash
curl -X POST "http://localhost:8000/query" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the key insight in the uploaded document?", "top_k": 5}'
```

Reset the Collection:

```bash
curl -X DELETE "http://localhost:8000/reset"
```
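The same query can be issued from Python with only the standard library. This sketch builds the request that mirrors the curl example above (the `API` base URL is an assumption matching the default port; sending it requires the server to be running):

```python
import json
from urllib import request

API = "http://localhost:8000"  # assumed default; adjust if the server runs elsewhere

def build_query(question: str, top_k: int = 5) -> request.Request:
    # Mirrors the curl example: POST /query with a JSON body.
    body = json.dumps({"question": question, "top_k": top_k}).encode()
    return request.Request(
        f"{API}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_query("What is the key insight in the uploaded document?")
print(req.full_url, req.get_method())  # → http://localhost:8000/query POST

# With the API running, send it like this:
# with request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```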
## Testing

Unit tests live in `/tests` and use Pytest.

Run all tests:

```bash
uv run pytest
```

Ensure formatting and linting:

```bash
uv run black app/ tests/
uv run ruff check app/ tests/
```
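A test in `tests/test_rag.py` might look like the sketch below. `chunk_text` is a hypothetical stand-in for the project's ingestion chunker (which splits semantically, not by character count); real tests would import the actual helper from the app package:

```python
# Sketch of a unit test; chunk_text is hypothetical, included so the
# example is self-contained.
def chunk_text(text: str, size: int = 200) -> list[str]:
    # Fixed-size character chunking; the real splitter is semantic.
    return [text[i:i + size] for i in range(0, len(text), size)]

def test_chunks_cover_input_without_overlap():
    text = "x" * 450
    chunks = chunk_text(text, size=200)
    assert "".join(chunks) == text              # nothing lost or duplicated
    assert len(chunks) == 3                     # 200 + 200 + 50
    assert all(len(c) <= 200 for c in chunks)   # no chunk exceeds the limit
```

Pytest discovers any function named `test_*` in the `tests/` directory automatically.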
Project Structure
rag-portfolio-project/ βββ .env βββ pyproject.toml βββ README.md βββ app/ β βββ main.py β βββ config.py β βββ models/ β βββ core/ β βββ services/ β βββ api/ βββ data/ β βββ documents/ β βββ processed/ βββ tests/ β βββ test_rag.py βββ scripts/ βββ setup_qdrant.py βββ ingest_documents.py
text
## Troubleshooting

- Missing Modules? Run `uv add <module-name>`.
- Ollama Model Not Found? Check with `ollama list` or update `.env`.
- Qdrant Not Running? Ensure the Docker container is up (`docker ps`).
- File Upload Errors? Install `python-multipart`.
## Contributing
Contributions are welcome! Fork the repo, open issues, or submit pull requests for enhancements or bug fixes.
## License
Open-source under the MIT License.
## Questions?

Contact the repository owner or open an issue; happy to help!