metadata
title: Vision Engine
emoji: 👁️
colorFrom: purple
colorTo: pink
sdk: docker
app_port: 7860

Learning OS Vision Engine (MVP)

A service that extracts information from images: natural-language descriptions of the content and verbatim text via OCR.

Features

  • Vision Analysis: Describes image content using Hugging Face Router models.
  • OCR: Verbatim text extraction via Tesseract.
  • Local Fallback: CPU-friendly captioning and OCR when the Hugging Face API is unavailable.
  • Media Integration: Downloads images directly from the Media Service.

Env vars

  • MEDIA_SERVICE_URL (default http://127.0.0.1:7860)
  • HF_TOKEN (required for Hugging Face API calls)
  • HF_VISION_MODEL (default CohereLabs/aya-vision-32b:cohere)
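
One way these variables might be read at startup, using the defaults listed above. This is a sketch under assumptions: the `Settings` class and `load_settings` function are hypothetical and may not match how the app loads its configuration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    media_service_url: str
    hf_token: Optional[str]
    hf_vision_model: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build settings from environment variables, applying the README defaults."""
    env = os.environ if env is None else env
    return Settings(
        # Strip a trailing slash so paths can be appended consistently.
        media_service_url=env.get("MEDIA_SERVICE_URL", "http://127.0.0.1:7860").rstrip("/"),
        # An unset or empty token is treated as "no API access" (assumption).
        hf_token=env.get("HF_TOKEN") or None,
        hf_vision_model=env.get("HF_VISION_MODEL", "CohereLabs/aya-vision-32b:cohere"),
    )


settings = load_settings()
print(settings.media_service_url, settings.hf_vision_model)
```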

Run (Windows)

py -3.11 -m venv venv
.\venv\Scripts\activate
pip install -r requirements.txt
python -m uvicorn app.main:app --reload --port 7860

Open: http://127.0.0.1:7860/docs
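
The `/docs` page suggests a FastAPI app, which by default also serves its schema at `/openapi.json`. Assuming that holds here, the available routes could be listed from a script as sketched below. `fetch_schema` and `list_routes` are hypothetical helpers, and the sample schema is made up for illustration; it does not describe this service's real endpoints.

```python
import json
from typing import Dict, List, Tuple
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def fetch_schema(base_url: str = "http://127.0.0.1:7860") -> Dict:
    """Download the OpenAPI schema from a running server (assumes FastAPI defaults)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as resp:
        return json.load(resp)


def list_routes(schema: Dict) -> List[Tuple[str, str]]:
    """Return sorted (METHOD, path) pairs found in an OpenAPI schema dict."""
    routes = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            # Skip non-operation keys such as "parameters" or "summary".
            if method.lower() in HTTP_METHODS:
                routes.append((method.upper(), path))
    return sorted(routes)


# Made-up sample schema; with the server running, use list_routes(fetch_schema()).
sample = {"paths": {"/example": {"post": {}}, "/health-example": {"get": {}}}}
for method, path in list_routes(sample):
    print(method, path)
```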