# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
AIVIZ-BOT is a RAG-based conversational chatbot built with Gradio that answers questions about AISdb (Automatic Identification System Database) documentation. The assistant is named "Stormy" and helps users learn about AIS data processing, machine learning research, and maritime vessel tracking.
## Running the Application

```bash
python app.py
```
Required environment variables (in `.env`), depending on which provider is selected:

- `HF_TOKEN` - HuggingFace inference token (default provider)
- `GOOGLE_API_KEY` - Google Generative AI API key
- `OPENAI_API_KEY` - OpenAI API key
- `ANTHROPIC_API_KEY` - Anthropic API key
- `USER_AGENT=myagent`
- `GRPC_VERBOSITY=ERROR`
- `GLOG_minloglevel=2`
`set_envs()` is a no-op: keys are read from the environment at LLM construction time, or supplied via the Model Settings panel in the UI. Never re-introduce a `getpass` prompt there; it will hang in Docker / HF Spaces.
## Architecture

### Data Flow
- Initialization: Scrapes 50+ AISdb documentation URLs → chunks documents → creates embeddings → stores in ChromaDB
- Chat Request: User input → session lookup in LFU cache → contextualize with chat history → retrieve from vector store (k = `MAX_SIZE`) → generate response via RAG chain → stream in 8-char chunks
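The streaming tail of the chat flow can be sketched as below; `respond()` here is a stand-in for the real blocking RAG call in `app.py`, and the chunk size and delay come from the description above:

```python
import asyncio

CHUNK_CHARS = 8      # stream granularity from the data-flow description
CHUNK_DELAY = 0.01   # 10 ms between chunks

def respond(message: str) -> str:
    # Placeholder for the blocking RAG pipeline (session lookup,
    # retrieval, generation); the real respond() lives in app.py.
    return f"Echo: {message}"

async def echo(message: str):
    """Offload the blocking call to a thread, then stream the reply."""
    full = await asyncio.to_thread(respond, message)
    partial = ""
    for i in range(0, len(full), CHUNK_CHARS):
        partial += full[i : i + CHUNK_CHARS]
        await asyncio.sleep(CHUNK_DELAY)
        yield partial  # Gradio re-renders the growing string each yield
```

Yielding the accumulated prefix (rather than each 8-char slice) matches how Gradio chat handlers expect streamed output: each yield replaces the displayed message.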
### Key Components
- `app.py`: Main Gradio interface with ocean/maritime themed UI; async `echo()` offloads `respond()` via `asyncio.to_thread` and streams in 8-char chunks (10 ms delay); example questions, model settings panel, and collapsible help section
- `configs/config.py`: URLs to scrape, LLM settings, embedding model config, system prompt, `MODEL_REGISTRY` and `PROVIDER_ENV_KEYS` for the multi-provider switcher
- `llm_setup/llm_setup.py`: Conversational RAG chain setup with LangChain; manages session-based chat history. `create_llm()` is a factory over Google Gemini / OpenAI / Anthropic / HuggingFace. The HuggingFace branch must pass `provider="hf-inference"` and `task="conversational"` to `HuggingFaceEndpoint`, otherwise `huggingface_hub` raises `StopIteration` → `RuntimeError: generator raised StopIteration`
- `services/scraper.py`: Web scraping service that preserves per-document source URL metadata
- `stores/chroma.py`: ChromaDB vector store with HuggingFace embeddings (`BAAI/bge-base-en-v1.5`); skips re-ingestion if already populated
- `processing/documents.py`: Document loading with `RecursiveCharacterTextSplitter` using configurable chunk size/overlap and structure-aware separators
- `processing/texts.py`: Text cleaning that preserves document structure (newlines, paragraphs) while removing control characters
- `caching/lfu.py`: LFU cache for session-based chat histories (capacity: 50 sessions). Exposes `get`/`put`/`delete`; never replace `llm_svc.store` with a plain `dict`, since the rest of the code calls these methods.
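The `get`/`put`/`delete` contract of `caching/lfu.py` can be illustrated with a minimal LFU sketch; this is not the repository's implementation, only the interface shape the rest of the code depends on:

```python
from collections import defaultdict

class LFUCache:
    """Minimal LFU sketch: evicts the least-frequently-used session
    history once `capacity` (50 in the app) is exceeded."""

    def __init__(self, capacity: int = 50):
        self.capacity = capacity
        self.data = {}                # session_id -> chat history
        self.freq = defaultdict(int)  # session_id -> access count

    def get(self, key, default=None):
        if key not in self.data:
            return default
        self.freq[key] += 1
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            # Evict the entry with the lowest access count.
            victim = min(self.data, key=lambda k: self.freq[k])
            self.delete(victim)
        self.data[key] = value
        self.freq[key] += 1

    def delete(self, key):
        self.data.pop(key, None)
        self.freq.pop(key, None)
```

Code elsewhere calls exactly these three methods, which is why swapping the store for a plain `dict` breaks the app.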
## Tech Stack
- LLM: Pluggable; default is HuggingFace (`meta-llama/Llama-3.1-8B-Instruct` via `hf-inference`); also supports Google Gemini, OpenAI, Anthropic
- Embeddings: HuggingFace `BAAI/bge-base-en-v1.5` (CPU)
- RAG Framework: LangChain
- Vector Store: ChromaDB
- UI: Gradio 5.x
- Deployment: HuggingFace Spaces
## Configuration Values (in `configs/config.py`)
- Chunk size: 768 chars with 100 char overlap
- Chunk separators: `"\n\n"`, `"\n"`, `"."`, `","`, `" "` (structure-aware)
- Max retrieved documents: 100
- LFU cache capacity: 50 sessions
- ChromaDB deduplication: skips ingestion on restart if data exists
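These values might live in `configs/config.py` roughly as follows; the variable names are assumptions, only the values come from this document:

```python
# Hypothetical config fragment; actual names in configs/config.py may differ.
CHUNK_SIZE = 768                               # characters per chunk
CHUNK_OVERLAP = 100                            # chars shared by adjacent chunks
SEPARATORS = ["\n\n", "\n", ".", ",", " "]     # structure-aware split order
MAX_SIZE = 100                                 # retriever k (max docs per query)
LFU_CAPACITY = 50                              # cached chat sessions
```

Listing separators from coarsest (`"\n\n"`, paragraph breaks) to finest (`" "`) is what makes the splitting structure-aware: the splitter tries each in order before falling back to the next.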