Spaces:
Sleeping
Sleeping
| title: "Generative AI Portfolio Project" | |
| emoji: π€ | |
| colorFrom: blue | |
| colorTo: purple | |
| sdk: docker | |
| sdk_version: "0.0.0" # (or blank, Spaces fills this) | |
| app_file: app.py | |
| pinned: false | |
| # RAG Portfolio Project | |
| A state-of-the-art Retrieval-Augmented Generation (RAG) system leveraging modern generative AI and vector search technologies. This project demonstrates how to build a production-grade system that enables advanced question answering, document search, and contextual generation on your own infrastructureβprivate, scalable, and fast. | |
| --- | |
| ## Table of Contents | |
| - Project Overview | |
| - Features | |
| - Tech Stack | |
| - Getting Started | |
| - Architecture | |
| - API Endpoints | |
| - Usage Examples | |
| - Testing | |
| - Project Structure | |
| - Troubleshooting | |
| - Contributing | |
| - License | |
| --- | |
| ## Project Overview | |
| This project showcases how to combine large language models (LLMs), local vector databases, and a modern Python web API for secure, high-performance knowledge and document retrieval. All LLM operations run locallyβno data leaves your machine. | |
| Ideal for applications in internal research, enterprise QA, knowledge management, or compliance-sensitive AI tasks. | |
| --- | |
| ## Features | |
| - **Local LLM Inference:** Runs entirely on your machine using Ollama and open-source models (e.g., Llama 3.1). | |
| - **Vector Database Search:** Uses Qdrant for fast, scalable semantic retrieval. | |
| - **Flexible Document Ingestion:** Upload PDF, DOCX, or TXT files for indexing and search. | |
| - **FastAPI Back End:** High-concurrency, type-safe REST API with automatic documentation. | |
| - **Modern Python Package Management:** Built with `uv` for blazing-fast dependency resolution. | |
| - **Modular, Extensible Codebase:** Clean architecture, easy to extend and maintain. | |
| - **Privacy and Security:** No cloud callsβideal for regulated sectors. | |
| - **Fully Containerizable:** Easily deploy with Docker. | |
| --- | |
| ## Tech Stack | |
| - **LLM:** Ollama (local inference engine), Llama 3.1 | |
| - **Vector DB:** Qdrant | |
| - **Embeddings:** Sentence Transformers | |
| - **API:** FastAPI + Uvicorn | |
| - **Package Manager:** uv | |
| - **Code Editor:** Cursor (recommended) | |
| - **Testing & Quality:** Pytest, Black, Ruff | |
| - **DevOps:** Docker-ready | |
| --- | |
| ## Getting Started | |
| ### 1. Prerequisites | |
| - Python 3.10+ | |
| - `uv` package manager | |
| - Ollama installed locally | |
| - Qdrant (Docker recommended) | |
| ### 2. Setup | |
| first fork my repository | |
| git clone https://github.com/YOUR_USERNAME/rag-portfolio-project.git | |
| cd rag-portfolio-project | |
| uv sync | |
| cp .env.example .env | |
| (Update .env if needed) | |
| ### 3. Start Qdrant (Vector DB) | |
| docker run -p 6333:6333 qdrant/qdrant | |
| text | |
| ### 4. Pull Ollama LLM Model | |
| ollama pull llama3.1 | |
| text | |
| ### 5. Run the FastAPI Application | |
| uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 8000 | |
| text | |
| ### 6. Open API Documentation | |
| Access at: [http://localhost:8000/docs](http://localhost:8000/docs) | |
| --- | |
| ## Architecture | |
| text | |
| ββββββββββββββ | |
| β User β | |
| βββββββ¬βββββββ | |
| β | |
| ββββββββΌββββββββ | |
| β FastAPI REST β | |
| β Backend β | |
| βββββββ¬βββββββββ | |
| ββββββββββββββ΄βββββββββββββ | |
| β β | |
| βββββΌββββββ βββββββββΌβββββββββ | |
| β Document β β Query, RAG β | |
| β Ingestionβ β Chain & Gen. β | |
| βββββ¬βββββββ ββββββββββββββββββ | |
| β | |
| βββββΌβββββββββ | |
| β Embedding β | |
| β Generation β | |
| βββββ¬βββββββββ | |
| β | |
| βββββΌββββββββββ | |
| β Qdrant β | |
| β Vector DB β | |
| βββββ¬ββββββββββ | |
| β | |
| βββββΌββββββββββ | |
| β Ollama LLM β | |
| βββββββββββββββ | |
| text | |
| **Workflow:** | |
| - Documents are split into semantic chunks and indexed as vectors. | |
| - Sentence Transformers generate embeddings. | |
| - Qdrant retrieves the most relevant contexts. | |
| - Ollama answers using retrieved context (true RAG). | |
| --- | |
| ## API Endpoints | |
| | Method | Path | Description | | |
| |--------|----------------|-----------------------------------| | |
| | GET | `/` | Root endpoint | | |
| | GET | `/health` | Check system status | | |
| | POST | `/ingest/file` | Upload and index document | | |
| | POST | `/query` | Query system for answer | | |
| | DELETE | `/reset` | Reset vector database (danger!) | | |
| Docs available at [http://localhost:8000/docs](http://localhost:8000/docs) | |
| --- | |
| ## Usage Examples | |
| 1. Upload a Document (.pdf/.docx/.txt) | |
| curl -X POST "http://localhost:8000/ingest/file" | |
| -H "accept: application/json" | |
| -F "file=@your_document.pdf" | |
| 2. Query the System | |
| curl -X POST "http://localhost:8000/query" | |
| -H "Content-Type: application/json" | |
| -d '{"question": "What is the key insight in the uploaded document?", "top_k": 5}' | |
| 3. Reset Collection | |
| curl -X DELETE "http://localhost:8000/reset" | |
| text | |
| --- | |
| ## Testing | |
| - Unit tests in `/tests` using Pytest. | |
| - Run all tests: | |
| uv run pytest | |
| text | |
| - Ensure formatting and linting: | |
| uv run black app/ tests/ | |
| uv run ruff app/ tests/ | |
| text | |
| --- | |
| ## Project Structure | |
| rag-portfolio-project/ | |
| βββ .env | |
| βββ pyproject.toml | |
| βββ README.md | |
| βββ app/ | |
| β βββ main.py | |
| β βββ config.py | |
| β βββ models/ | |
| β βββ core/ | |
| β βββ services/ | |
| β βββ api/ | |
| βββ data/ | |
| β βββ documents/ | |
| β βββ processed/ | |
| βββ tests/ | |
| β βββ test_rag.py | |
| βββ scripts/ | |
| βββ setup_qdrant.py | |
| βββ ingest_documents.py | |
| text | |
| --- | |
| ## Troubleshooting | |
| - **Missing Modules?** Run `uv add <module-name>` | |
| - **Ollama Model Not Found?** Check with `ollama list` or update `.env` | |
| - **Qdrant Not Running?** Ensure the Docker container is up (`docker ps`) | |
| - **File Upload Errors?** Install `python-multipart` | |
| --- | |
| ## Contributing | |
| Contributions are welcome! Fork the repo, open issues, or submit pull requests for enhancements or bug fixes. | |
| --- | |
| ## License | |
| Open-source under the MIT License. | |
| --- | |
| ## Questions? | |
| Contact the repository owner or open an issue β happy to help! |