metadata
title: CarsRUS
emoji: 🚗
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 5.8.0
app_file: app.py
pinned: false
short_description: Advanced RAG Chatbot for car comparisons

AutoGuru AI — CarsRUS

A lightweight RAG chatbot that answers Hebrew/English questions about specific cars and generates structured comparisons. It retrieves context from a local knowledge base scraped from auto.co.il and generates the final answer with Google Gemini (or OpenRouter, when a key is configured).

What this repo contains

  • Gradio app: app.py
  • RAG engine (retrieval + prompting + memory): rag_engine.py
  • Scraper + dataset: data_ingestion/scraper.py, data_ingestion/scraped_data.json

How it works (high level)

  • Normalize car names (e.g. rs3 → audi_rs3)
  • Retrieve relevant chunks using hybrid search (vectors + keyword matches)
  • (If comparison) extract structured specs with regex and include them in the prompt
  • Generate the final answer with Gemini, using the retrieved context
  • Remember a few recent turns for follow‑ups
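
The first two steps above can be sketched roughly as follows. This is a minimal illustration, not the code in rag_engine.py: the alias table, the function names, the `alpha` weight, and the toy vectors are all assumptions, and a real setup would get its vectors from an embedding model rather than hand-written lists.

```python
import math
import re

# Hypothetical alias table; the real mapping in rag_engine.py may differ.
ALIASES = {"rs3": "audi_rs3", "audi rs3": "audi_rs3"}


def normalize_car_name(text):
    """Map a free-form car mention to a canonical key, e.g. 'RS3' to 'audi_rs3'."""
    key = re.sub(r"\s+", " ", text.strip().lower())
    return ALIASES.get(key, key.replace(" ", "_"))


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, chunk_text):
    """Fraction of query tokens that also appear in the chunk."""
    q = set(re.findall(r"\w+", query.lower()))
    c = set(re.findall(r"\w+", chunk_text.lower()))
    return len(q & c) / len(q) if q else 0.0


def hybrid_search(query, query_vec, chunks, alpha=0.7, top_k=3):
    """Rank chunks by a weighted sum of vector similarity and keyword overlap."""
    scored = []
    for chunk in chunks:
        score = (alpha * cosine(query_vec, chunk["vec"])
                 + (1 - alpha) * keyword_score(query, chunk["text"]))
        scored.append((score, chunk["text"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

The weighted sum is only one way to combine the two signals; the actual engine may merge vector and keyword results differently.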

Deploy to Hugging Face

Before pushing to CarsRUS on Hugging Face, run tests and follow the steps in DEPLOY.md. Quick check:

cd CarsRUS
./run_tests.sh   # or: python test_business_logic.py && python test_rag.py

Quick start (local)

Prerequisites

  • Python 3.10+ recommended (Torch + sentence-transformers)
  • A Gemini API key from Google AI Studio (or an OpenRouter API key; see Configuration)

Install & run

pip install -r CarsRUS/requirements.txt

export gemini_api="YOUR_GEMINI_API_KEY"

# Optional: reduce 429s by spacing requests (seconds)
export GEMINI_REQUEST_DELAY="8"

python CarsRUS/app.py

Open the local Gradio URL printed in the terminal.


Configuration

Environment variables

  • OPENROUTER_API_KEY (recommended on HF): OpenRouter API key. It is tried first, for faster answers and to avoid Gemini rate limits. On Hugging Face, add it as a Repository secret. Locally, set it in .env as openRouter_API_KEY or OPENROUTER_API_KEY.
  • gemini_api: Gemini API key. It is used when the OpenRouter key is not set or the OpenRouter call fails. Required if you don’t use OpenRouter.
  • GEMINI_REQUEST_DELAY (optional): minimum seconds between Gemini requests per process (default: 0.5).
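
The lookup order described above could look something like the sketch below. The variable names and the 0.5 default come from this README; the function names, the error message wording, and the choice to fall back to the default on an unparsable delay are assumptions for illustration. The sketch only picks the preferred provider and does not cover falling back to Gemini when an OpenRouter call fails.

```python
import os

# Environment names listed in this README, checked in this order.
OPENROUTER_ENV_NAMES = ("OPENROUTER_API_KEY", "openRouter_API_KEY")


def resolve_provider(env=None):
    """Return (provider, key), preferring OpenRouter over Gemini."""
    env = os.environ if env is None else env
    for name in OPENROUTER_ENV_NAMES:
        key = (env.get(name) or "").strip()
        if key:
            return "openrouter", key
    key = (env.get("gemini_api") or "").strip()
    if key:
        return "gemini", key
    raise RuntimeError("Configuration Error: set OPENROUTER_API_KEY or gemini_api")


def request_delay(env=None, default=0.5):
    """Read GEMINI_REQUEST_DELAY in seconds; use the default if unset or invalid."""
    env = os.environ if env is None else env
    try:
        return max(0.0, float(env.get("GEMINI_REQUEST_DELAY", default)))
    except (TypeError, ValueError):
        return default
```

Passing a plain dict as `env` makes the lookup easy to exercise without touching the real environment.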

Rate limiting behavior (important)

Gemini can return rate-limit errors (HTTP 429). The RAG engine implements:

  • Thread-safe throttling (prevents concurrent requests from spiking)
  • Exponential backoff + jitter (keeps retrying up to ~2 minutes)
  • No caching of error/rate-limit responses (so a temporary 429 won’t “poison” the cache)
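
The three behaviors above can be illustrated with a small sketch. It is not the engine's actual code: `RateLimitError`, `Throttle`, `call_with_backoff`, and the `base`/`cap` values are hypothetical, and only the roughly two-minute retry budget and the 0.5 second default spacing are taken from this README.

```python
import random
import threading
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 from the model provider."""


class Throttle:
    """Thread-safe spacing: each caller reserves the next free time slot."""

    def __init__(self, min_interval=0.5, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def wait(self):
        # Reserve a slot under the lock, then sleep outside it so other
        # threads can queue up for later slots in the meantime.
        with self._lock:
            now = self._clock()
            wait_for = max(0.0, self._next_allowed - now)
            self._next_allowed = max(now, self._next_allowed) + self._min_interval
        if wait_for:
            self._sleep(wait_for)


def call_with_backoff(fn, throttle, cache, cache_key, max_total=120.0,
                      base=1.0, cap=30.0, sleep=time.sleep, rng=random.random):
    """Call fn with throttling and jittered exponential backoff on 429s."""
    if cache_key in cache:
        return cache[cache_key]
    waited, attempt = 0.0, 0
    while True:
        throttle.wait()
        try:
            result = fn()
        except RateLimitError:
            # Exponential delay, capped, scaled by a jitter factor in [0.5, 1.0).
            delay = min(cap, base * 2 ** attempt) * (0.5 + rng() / 2)
            if waited + delay > max_total:
                raise  # retry budget exhausted; nothing is cached
            sleep(delay)
            waited += delay
            attempt += 1
            continue
        cache[cache_key] = result  # only successful answers reach the cache
        return result
```

Because the cache is written only after a successful call, a request that ends in a rate-limit error is retried from scratch the next time the same question is asked.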

Data ingestion (scraping)

The app loads its local knowledge base from:

  • CarsRUS/data_ingestion/scraped_data.json

To re-scrape the articles and overwrite the JSON:

python CarsRUS/data_ingestion/scraper.py

Notes:

  • The scraper targets a fixed list of article URLs inside data_ingestion/scraper.py.
  • Website structure changes can break scraping; if that happens, update the selectors in the scraper.

Project layout

CarsRUS/
  app.py                      # Gradio UI + chat routing
  rag_engine.py                # Retrieval + prompting + memory + Gemini calling
  requirements.txt
  data_ingestion/
    scraper.py                 # Scrape auto.co.il pages
    scraped_data.json          # Cached dataset used by the app

Usage examples

  • General: “How is the Audi RS3?”
  • Comparison: “Compare RS3 vs Elantra N”
  • Follow-up: “ומה לגבי הבטיחות?” (uses recent conversation context)
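
The follow-up example depends on a few recent turns being kept in memory. A minimal sketch of that idea, assuming a bounded queue of (user, assistant) pairs; the class name, the `max_turns` value, and the context format are hypothetical and may not match rag_engine.py:

```python
from collections import deque


class ConversationMemory:
    """Keep only the most recent turns; older ones are dropped automatically."""

    def __init__(self, max_turns=3):
        self._turns = deque(maxlen=max_turns)

    def add(self, user, assistant):
        self._turns.append((user, assistant))

    def as_context(self):
        """Render the remembered turns as plain text for inclusion in a prompt."""
        lines = []
        for user, assistant in self._turns:
            lines.append(f"User: {user}")
            lines.append(f"Assistant: {assistant}")
        return "\n".join(lines)
```

Since the queue lives in process memory, it is lost on restart, which matches the limitation noted later in this README.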

Troubleshooting

“Configuration Error” / no API key

  • Set at least one key: OPENROUTER_API_KEY or gemini_api. Locally, set it in your environment; on Hugging Face, add it under Settings → Repository secrets (OPENROUTER_API_KEY is recommended there).
  • If you see “OpenRouter key not set” in the logs and hit Gemini rate limits, add OPENROUTER_API_KEY to the Space secrets so the app uses OpenRouter first.

Rate limit / 429 errors

  • Wait a bit and retry (the app also retries automatically).
  • Increase GEMINI_REQUEST_DELAY (e.g. 8-12).
  • If you expect traffic, consider a higher-quota key.

Model not available (404 / not supported)

  • The engine tries multiple Gemini model IDs; if your key/region doesn’t support one, it will fall back to others.
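
The fallback described above might be structured along these lines. This is a sketch under assumptions: `ModelUnavailableError`, `generate_with_fallback`, and the `call_model` callback are hypothetical names, and the model IDs are supplied by the caller rather than taken from the engine.

```python
class ModelUnavailableError(Exception):
    """Hypothetical stand-in for a 404 / 'not supported' response for a model ID."""


def generate_with_fallback(prompt, model_ids, call_model):
    """Try each model ID in order; return (model_id, answer) from the first that works."""
    last_error = None
    for model_id in model_ids:
        try:
            return model_id, call_model(model_id, prompt)
        except ModelUnavailableError as exc:
            last_error = exc  # remember why this model failed and try the next one
    raise RuntimeError("No configured model is available") from last_error
```

Only the "model unavailable" error triggers a fallback here; other failures, such as rate limits, propagate to the caller so they can be handled by the retry logic instead.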

Notes & limitations

  • This is a dataset-grounded chatbot: answers are only as good as scraped_data.json.
  • The in-memory cache and history reset when the process restarts.
  • Always verify critical specs from an official source before making decisions.

License

Proof-of-concept / educational use.

Last updated: 2026-01-28