CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

AISVIZ-BOT is a RAG-based conversational chatbot built with Gradio that answers questions about AISdb (Automatic Identification System Database) documentation. The assistant is named "Stormy" and helps users learn about AIS data processing, machine learning research, and maritime vessel tracking.

Running the Application

python app.py

Required environment variables (in .env), depending on which provider is selected:

  • HF_TOKEN - HuggingFace inference token (default provider)
  • GOOGLE_API_KEY - Google Generative AI API key
  • OPENAI_API_KEY - OpenAI API key
  • ANTHROPIC_API_KEY - Anthropic API key
  • USER_AGENT=myagent
  • GRPC_VERBOSITY=ERROR
  • GLOG_minloglevel=2

set_envs() is a no-op: keys are read from the environment at LLM construction time, or supplied via the Model Settings panel in the UI. Never re-introduce a getpass prompt there; it will hang in Docker / HF Spaces.

Architecture

Data Flow

  1. Initialization: Scrapes 50+ AISdb documentation URLs → chunks documents → creates embeddings → stores in ChromaDB
  2. Chat Request: User input → session lookup in LFU cache → contextualize with chat history → retrieve from vector store (k = MAX_SIZE) → generate response via RAG chain → stream in 8-char chunks
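
The final streaming step can be sketched as a plain generator. The function name here is illustrative (not the repo's actual helper), and the 10 ms inter-chunk delay is omitted so the sketch stays synchronous:

```python
def stream_in_chunks(text: str, chunk_size: int = 8):
    """Yield successive fixed-size slices of a response string,
    mirroring the 8-char streaming described above."""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]

# An 18-char response yields slices of 8, 8, and 2 chars.
chunks = list(stream_in_chunks("Stormy says hello!"))
```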

Key Components

  • app.py: Main Gradio interface with ocean/maritime themed UI, async echo() that offloads respond() via asyncio.to_thread, streams in 8-char chunks (10 ms delay), example questions, model settings panel, and collapsible help section
  • configs/config.py: URLs to scrape, LLM settings, embedding model config, system prompt, MODEL_REGISTRY and PROVIDER_ENV_KEYS for the multi-provider switcher
  • llm_setup/llm_setup.py: Conversational RAG chain setup with LangChain; manages session-based chat history. create_llm() is a factory over Google Gemini / OpenAI / Anthropic / HuggingFace. The HuggingFace branch must pass provider="hf-inference" and task="conversational" to HuggingFaceEndpoint; otherwise huggingface_hub raises StopIteration, which surfaces as RuntimeError: generator raised StopIteration
  • services/scraper.py: Web scraping service that preserves per-document source URL metadata
  • stores/chroma.py: ChromaDB vector store with HuggingFace embeddings (BAAI/bge-base-en-v1.5), skips re-ingestion if already populated
  • processing/documents.py: Document loading with RecursiveCharacterTextSplitter using configurable chunk size/overlap and structure-aware separators
  • processing/texts.py: Text cleaning that preserves document structure (newlines, paragraphs) while removing control characters
  • caching/lfu.py: LFU cache for session-based chat histories (capacity: 50 sessions). Exposes get / put / delete; never replace llm_svc.store with a plain dict, because the rest of the code calls these methods.
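
The LFU cache contract described above (get / put / delete with capacity-bounded eviction of the least-frequently-used session) can be sketched roughly as follows. The method names match the description, but the internals are an assumption, not the repo's actual implementation:

```python
class LFUCache:
    """Minimal LFU cache sketch: evicts the least-frequently-used
    key when capacity is reached (ties broken by insertion order)."""

    def __init__(self, capacity: int = 50):
        self.capacity = capacity
        self.data = {}   # key -> value
        self.freq = {}   # key -> access count

    def get(self, key):
        if key not in self.data:
            return None
        self.freq[key] += 1
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            # Evict the least-frequently-used key to make room.
            victim = min(self.freq, key=self.freq.get)
            self.delete(victim)
        self.data[key] = value
        self.freq[key] = self.freq.get(key, 0) + 1

    def delete(self, key):
        self.data.pop(key, None)
        self.freq.pop(key, None)
```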

Tech Stack

  • LLM: Pluggable; default is HuggingFace (meta-llama/Llama-3.1-8B-Instruct via hf-inference); also supports Google Gemini, OpenAI, Anthropic
  • Embeddings: HuggingFace BAAI/bge-base-en-v1.5 (CPU)
  • RAG Framework: LangChain
  • Vector Store: ChromaDB
  • UI: Gradio 5.x
  • Deployment: HuggingFace Spaces

Configuration Values (in configs/config.py)

  • Chunk size: 768 chars with 100 char overlap
  • Chunk separators: "\n\n", "\n", ". ", " " (space), "" (empty string); structure-aware fallback from paragraphs down to characters
  • Max retrieved documents: 100
  • LFU cache capacity: 50 sessions
  • ChromaDB deduplication: skips ingestion on restart if data exists
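
The chunk-size and overlap settings above can be exercised with a simplified splitter. This is a toy stand-in for LangChain's RecursiveCharacterTextSplitter: it applies only the fixed-window and overlap parameters and omits the separator hierarchy for brevity:

```python
def split_with_overlap(text: str, chunk_size: int = 768, overlap: int = 100):
    """Toy fixed-window splitter: successive windows of chunk_size chars,
    each sharing `overlap` chars with the previous window."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# 1500 chars with the configured 768/100 settings -> 3 chunks,
# each adjacent pair sharing a 100-char overlap.
chunks = split_with_overlap("x" * 1500)
```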