metadata
title: Vision Engine
emoji: 👁️
colorFrom: purple
colorTo: pink
sdk: docker
app_port: 7860
Learning OS Vision Engine (MVP)
A service that extracts information from images: natural-language descriptions of their content and verbatim text via OCR.
Features
- ✅ Vision Analysis: Describes image content using Hugging Face Router models.
- ✅ OCR: Verbatim text extraction via Tesseract.
- ✅ Local Fallback: CPU-friendly captioning/OCR when the API is unavailable.
- ✅ Media Integration: Directly downloads from the Media Service.
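The API-first, local-fallback behaviour listed above can be sketched as follows. This is a minimal illustration and not the service's actual code: `analyze_with_fallback`, the `remote` and `local` callables, and the shape of the returned dictionary are all hypothetical names chosen for the example.

```python
from typing import Callable

# Hypothetical signature: an analyzer takes raw image bytes and returns text.
Analyzer = Callable[[bytes], str]


def analyze_with_fallback(image: bytes, remote: Analyzer, local: Analyzer) -> dict:
    """Try the remote (API) analyzer first; use the local one if it raises."""
    try:
        return {"source": "api", "text": remote(image)}
    except Exception as exc:
        # Broad catch on purpose: any API failure (network error, missing
        # token, rate limit) should trigger the CPU-friendly local path.
        return {"source": "local", "text": local(image), "api_error": str(exc)}
```

Passing the analyzers in as callables keeps the fallback logic independent of any particular model client, so it can be exercised with simple stubs.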
Env vars
- MEDIA_SERVICE_URL (default http://127.0.0.1:7860)
- HF_TOKEN (required for Hugging Face API calls)
- HF_VISION_MODEL (default CohereLabs/aya-vision-32b:cohere)
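One way these variables might be read at startup is sketched below. The `Settings` class, `load_settings` function, and `api_enabled` property are assumptions made for illustration; only the variable names and default values come from the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three documented variables."""
    media_service_url: str
    hf_token: Optional[str]
    hf_vision_model: str

    @property
    def api_enabled(self) -> bool:
        # Without a token, only the local fallback path would be usable.
        return bool(self.hf_token)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, applying the documented defaults."""
    return Settings(
        media_service_url=env.get("MEDIA_SERVICE_URL", "http://127.0.0.1:7860").rstrip("/"),
        hf_token=env.get("HF_TOKEN") or None,  # treat an empty string as unset
        hf_vision_model=env.get("HF_VISION_MODEL", "CohereLabs/aya-vision-32b:cohere"),
    )
```

Note that the default MEDIA_SERVICE_URL points at port 7860, the same port this service listens on, so when both run on one machine one of them presumably needs a different port.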
Run (Windows)
py -3.11 -m venv venv
.\venv\Scripts\activate
pip install -r requirements.txt
python -m uvicorn app.main:app --reload --port 7860