Update README.md
README.md
CHANGED
@@ -1,4 +1,15 @@
-
## 📖 Overview
LocalAGI is a multimodal Retrieval-Augmented Generation (RAG) application that acts as an intelligent, interactive bartender. By combining state-of-the-art computer vision with vector search, the application allows users to upload a photo of any liquor bottle and instantly receive curated cocktail recipes utilizing that specific spirit from a custom-ingested library.
@@ -10,10 +21,10 @@ Engineered to run entirely on CPU-bound cloud environments (like Hugging Face Sp
* **Custom Knowledge Base (RAG):** Ingests raw `.txt` and `.pdf` recipe books, intelligently splitting them into discrete recipe chunks using RegEx and LangChain, and stores them in a local Chroma vector database.
* **Smart Cropping Pipeline:** Implements YOLOv8 to locate bottles or glasses in an image, applying dynamic 25% padding to isolate the label and strip away background noise.
* **Hardware-Optimized Processing:** Features custom logic to downscale images and restrict token generation limits, allowing complex 2-billion-parameter models to run efficiently on free-tier cloud CPUs.
-
* **Interactive UI:** A Gradio
## 🛠️ Technical Stack
-
* **Frontend/UI:** Gradio
* **Computer Vision:** Ultralytics YOLOv8 (Object Detection)
* **Vision-Language Model:** HuggingFaceTB/SmolVLM-Instruct (Label OCR & Context)
* **Vector Database:** ChromaDB
@@ -1,4 +1,15 @@
+
---
+
title: LocalAGI AI Mixologist
+
emoji: 🍸
+
colorFrom: indigo
+
colorTo: purple
+
sdk: gradio
+
sdk_version: "5.0"
+
app_file: app.py
+
pinned: false
+
---
+
+
# 🍸 LocalAGI: The AI Mixologist
## 📖 Overview
LocalAGI is a multimodal Retrieval-Augmented Generation (RAG) application that acts as an intelligent, interactive bartender. By combining state-of-the-art computer vision with vector search, the application allows users to upload a photo of any liquor bottle and instantly receive curated cocktail recipes utilizing that specific spirit from a custom-ingested library.
@@ -10,10 +21,10 @@
* **Custom Knowledge Base (RAG):** Ingests raw `.txt` and `.pdf` recipe books, intelligently splitting them into discrete recipe chunks using RegEx and LangChain, and stores them in a local Chroma vector database.
* **Smart Cropping Pipeline:** Implements YOLOv8 to locate bottles or glasses in an image, applying dynamic 25% padding to isolate the label and strip away background noise.
* **Hardware-Optimized Processing:** Features custom logic to downscale images and restrict token generation limits, allowing complex 2-billion-parameter models to run efficiently on free-tier cloud CPUs.
+
* **Interactive UI:** A Gradio interface featuring a conversational chat format, session state memory, and a hidden "Vision Debug" gallery for real-time insight into the AI's detection process.
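
The Smart Cropping bullet above mentions 25% padding around a detected bottle or glass. The arithmetic might look roughly like the sketch below. It assumes the detector yields a pixel box `(x1, y1, x2, y2)` and that the padding is applied on each side as a fraction of the box's own width and height, clamped to the image bounds. The function name `pad_box` and its signature are hypothetical and are not taken from the project's code.

```python
def pad_box(box, image_size, pad=0.25):
    """Expand an (x1, y1, x2, y2) pixel box by `pad` of its size on each side.

    Hypothetical helper: the project's actual padding rule is not shown in
    the README, so per-side padding relative to the box size is an assumption.
    """
    x1, y1, x2, y2 = box
    img_w, img_h = image_size
    dx = (x2 - x1) * pad  # horizontal margin added on the left and right
    dy = (y2 - y1) * pad  # vertical margin added on the top and bottom
    # Clamp so the padded box never extends past the image edges.
    return (
        max(0, int(x1 - dx)),
        max(0, int(y1 - dy)),
        min(img_w, int(x2 + dx)),
        min(img_h, int(y2 + dy)),
    )
```

The returned tuple is in (left, upper, right, lower) order, which is the order Pillow's `Image.crop` expects, so it could be handed to a crop call directly.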
## 🛠️ Technical Stack
+
* **Frontend/UI:** Gradio
* **Computer Vision:** Ultralytics YOLOv8 (Object Detection)
* **Vision-Language Model:** HuggingFaceTB/SmolVLM-Instruct (Label OCR & Context)
* **Vector Database:** ChromaDB
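
Before anything reaches the vector database, the Custom Knowledge Base feature splits recipe books into one chunk per recipe using RegEx. The sketch below illustrates that idea with the standard `re` module only. It assumes each recipe starts with a line of the form `Recipe: <name>`; that delimiter, the `RECIPE_HEADER` pattern, and the `split_recipes` function are all hypothetical, since the README does not show the actual pattern or the LangChain splitter configuration.

```python
import re

# Assumed delimiter: each recipe begins with a line like "Recipe: Negroni".
# The real recipe books may use a different heading convention.
RECIPE_HEADER = re.compile(r"^Recipe:\s*(?P<name>.+)$", re.MULTILINE)


def split_recipes(text):
    """Return one {"name", "text"} dict per recipe heading found in `text`."""
    matches = list(RECIPE_HEADER.finditer(text))
    chunks = []
    for i, match in enumerate(matches):
        # A chunk runs from its heading to the next heading (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chunks.append(
            {
                "name": match.group("name").strip(),
                "text": text[match.start():end].strip(),
            }
        )
    return chunks
```

Each chunk's `text` could then be embedded as a single document, with `name` kept as metadata, so that a retrieved hit always corresponds to one complete recipe. Text before the first heading is dropped in this sketch.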