---
title: CarsRUS
emoji: 🚗
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 5.8.0
app_file: app.py
pinned: false
short_description: 'Advanced RAG Chatbot for car comparisons'
---
## AutoGuru AI — CarsRUS
A lightweight RAG chatbot that answers **Hebrew/English** questions about specific cars and generates **structured comparisons** using a local knowledge base scraped from `auto.co.il`, plus **Google Gemini** for final generation.
### What this repo contains
- **Gradio app**: `app.py`
- **RAG engine (retrieval + prompting + memory)**: `rag_engine.py`
- **Scraper + dataset**: `data_ingestion/scraper.py` and `data_ingestion/scraped_data.json`
### How it works (high level)
- **Normalize** car names (e.g. `rs3` → `audi_rs3`)
- **Retrieve** relevant chunks using hybrid search (vectors + keyword matches)
- **(If comparison)** extract structured specs with regex and include them in the prompt
- **Generate** the final answer with Gemini, using the retrieved context
- **Remember** a few recent turns for follow‑ups
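The normalization and hybrid-retrieval steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `rag_engine.py`: the alias table, the `hybrid_score` weighting (`alpha`), and the plain-list vectors are all assumptions made for the example.

```python
import math
import re

# Hypothetical alias table; the real mapping in rag_engine.py may differ.
ALIASES = {
    "rs3": "audi_rs3",
    "audi rs3": "audi_rs3",
    "elantra n": "hyundai_elantra_n",
}

def normalize_car_name(text: str) -> str:
    """Map a free-form car mention to a canonical key (snake_case fallback)."""
    key = re.sub(r"\s+", " ", text.strip().lower())
    return ALIASES.get(key, key.replace(" ", "_"))

def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keyword_overlap(query: str, chunk: str) -> float:
    """Fraction of query words that also appear in the chunk."""
    q = set(re.findall(r"\w+", query.lower()))
    c = set(re.findall(r"\w+", chunk.lower()))
    return len(q & c) / len(q) if q else 0.0

def hybrid_score(query_vec, chunk_vec, query, chunk, alpha=0.7) -> float:
    """Weighted blend of vector similarity and keyword overlap (alpha is assumed)."""
    return alpha * cosine(query_vec, chunk_vec) + (1 - alpha) * keyword_overlap(query, chunk)
```

In a real setup the vectors would come from an embedding model such as `sentence-transformers`; chunks are then ranked by `hybrid_score` and the top few are passed to the generator.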
---
## Deploy to Hugging Face
Before pushing to [CarsRUS on Hugging Face](https://huggingface.co/spaces/galbendavids/CarsRUS/tree/main), run tests and follow the steps in **[DEPLOY.md](DEPLOY.md)**. Quick check:
```bash
cd CarsRUS
./run_tests.sh # or: python test_business_logic.py && python test_rag.py
```
---
## Quick start (local)
### Prerequisites
- Python **3.10+** recommended (Torch + `sentence-transformers`)
- A Gemini API key from [Google AI Studio](https://aistudio.google.com/apikey)
### Install & run
```bash
pip install -r CarsRUS/requirements.txt
export gemini_api="YOUR_GEMINI_API_KEY"
# Optional: reduce 429s by spacing requests (seconds)
export GEMINI_REQUEST_DELAY="8"
python CarsRUS/app.py
```
Open the local Gradio URL printed in the terminal.
---
## Configuration
### Environment variables
- **`OPENROUTER_API_KEY`** *(recommended on HF)*: OpenRouter API key. When set, OpenRouter is tried first, which gives faster answers and avoids Gemini rate limits. On Hugging Face, add it as a repository secret. Locally, set it in `.env` as either `openRouter_API_KEY` or `OPENROUTER_API_KEY`.
- **`gemini_api`**: Gemini API key — used when OpenRouter is not set or fails. Required if you don’t use OpenRouter.
- **`GEMINI_REQUEST_DELAY`** *(optional)*: minimum seconds between Gemini requests per process (default: `0.5`).
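The lookup order described above might look like the sketch below. The variable names and the `0.5` default come from this README; the helper names `resolve_provider` and `request_delay`, and the exact order in which the OpenRouter names are checked, are assumptions for illustration.

```python
import os

# Env names are taken from this README; the lookup order is an assumption.
OPENROUTER_ENV_NAMES = ("OPENROUTER_API_KEY", "openRouter_API_KEY")
GEMINI_ENV_NAME = "gemini_api"

def resolve_provider(env=os.environ):
    """Return (provider, key), preferring OpenRouter and falling back to Gemini."""
    for name in OPENROUTER_ENV_NAMES:
        key = env.get(name, "").strip()
        if key:
            return "openrouter", key
    key = env.get(GEMINI_ENV_NAME, "").strip()
    if key:
        return "gemini", key
    raise RuntimeError("Set OPENROUTER_API_KEY or gemini_api")

def request_delay(env=os.environ) -> float:
    """Minimum seconds between Gemini requests (default 0.5)."""
    return float(env.get("GEMINI_REQUEST_DELAY", "0.5"))
```

Passing a plain dict as `env` makes the helpers easy to exercise without touching the real environment.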
### Rate limiting behavior (important)
Gemini can return rate limits (HTTP 429). The RAG engine implements:
- **Thread-safe throttling** (prevents concurrent requests from spiking)
- **Exponential backoff + jitter** (keeps retrying up to ~2 minutes)
- **No caching of error/rate-limit responses** (so a temporary 429 won’t “poison” the cache)
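A minimal sketch of the first two mechanisms, assuming a generic callable `fn` that raises a stand-in `RateLimitError` on HTTP 429. The class and function names, the `base` and `cap` values, and the "full jitter" variant are illustrative choices, not necessarily what `rag_engine.py` does.

```python
import random
import threading
import time

class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the provider."""

class Throttle:
    """Thread-safe spacing: each caller reserves the next free time slot."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def wait(self) -> None:
        with self._lock:
            now = time.monotonic()
            sleep_for = max(0.0, self._next_allowed - now)
            self._next_allowed = max(now, self._next_allowed) + self.min_interval
        if sleep_for:
            time.sleep(sleep_for)  # sleep outside the lock so others can queue

def call_with_backoff(fn, throttle, max_total=120.0, base=1.0, cap=30.0, sleep=time.sleep):
    """Retry fn on RateLimitError with exponential backoff and full jitter."""
    start = time.monotonic()
    attempt = 0
    while True:
        throttle.wait()
        try:
            return fn()
        except RateLimitError:
            delay = random.uniform(0, min(cap, base * 2 ** attempt))
            if time.monotonic() - start + delay > max_total:
                raise  # give up after roughly max_total seconds
            sleep(delay)
            attempt += 1
```

Because the exception propagates instead of being returned as a value, a caller that caches only successful return values never stores a rate-limit response.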
---
## Data ingestion (scraping)
The app loads its local knowledge base from:
- `CarsRUS/data_ingestion/scraped_data.json`
To re-scrape the articles and overwrite the JSON:
```bash
python CarsRUS/data_ingestion/scraper.py
```
Notes:
- The scraper targets a fixed list of article URLs inside `data_ingestion/scraper.py`.
- Website structure changes can break scraping; if that happens, update the selectors in the scraper.
---
## Project layout
```text
CarsRUS/
  app.py                 # Gradio UI + chat routing
  rag_engine.py          # Retrieval + prompting + memory + Gemini calling
  requirements.txt
  data_ingestion/
    scraper.py           # Scrape auto.co.il pages
    scraped_data.json    # Cached dataset used by the app
```
---
## Usage examples
- **General**: “How is the Audi RS3?”
- **Comparison**: “Compare RS3 vs Elantra N”
- **Follow-up**: “ומה לגבי הבטיחות?” (Hebrew for “And what about safety?”; uses recent conversation context)
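Follow-ups like the last example depend on a short conversation history. One simple way to keep "a few recent turns" is a bounded deque, sketched below. The `ChatMemory` class, its `max_turns` default, and the context format are hypothetical and may not match the engine's actual memory.

```python
from collections import deque

class ChatMemory:
    """Keeps only the most recent turns; older ones are dropped automatically."""

    def __init__(self, max_turns: int = 3):
        self._turns = deque(maxlen=max_turns)

    def add(self, user: str, assistant: str) -> None:
        self._turns.append((user, assistant))

    def as_context(self) -> str:
        """Render the remembered turns as text to prepend to the next prompt."""
        return "\n".join(f"User: {u}\nAssistant: {a}" for u, a in self._turns)
```

Since the deque lives in process memory, this history is lost on restart.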
---
## Troubleshooting
### “Configuration Error” / no API key
- Set at least one of **OPENROUTER_API_KEY** (recommended on HF; add it under Settings → Repository secrets) or **gemini_api**, either in your local environment or as an HF Space secret.
- If you see “OpenRouter key not set” in logs and get Gemini rate limits, add **OPENROUTER_API_KEY** in the Space secrets so the app uses OpenRouter first.
### Rate limit / 429 errors
- Wait a bit and retry (the app also retries automatically).
- Increase `GEMINI_REQUEST_DELAY` (e.g. `8`–`12`).
- If you expect traffic, consider a higher-quota key.
### Model not available (404 / not supported)
- The engine tries multiple Gemini model IDs; if your key/region doesn’t support one, it will fall back to others.
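The fallback behavior can be pictured as a loop over candidate model IDs. In this sketch, `ModelUnavailable`, `generate_with_fallback`, and the `call_model` callable are hypothetical; the model IDs passed in are whatever the caller configures.

```python
class ModelUnavailable(Exception):
    """Hypothetical error for a 404 / "not supported" response for a model ID."""

def generate_with_fallback(prompt, model_ids, call_model):
    """Try each model ID in order and return (model_id, answer) from the first that works."""
    last_error = None
    for model_id in model_ids:
        try:
            return model_id, call_model(model_id, prompt)
        except ModelUnavailable as exc:
            last_error = exc  # remember why this one failed, then try the next
    raise RuntimeError("No configured model is available") from last_error
```

Only availability errors trigger the fallback here; any other exception surfaces immediately so real bugs are not masked.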
---
## Notes & limitations
- This is a **dataset-grounded** chatbot: answers are only as good as `scraped_data.json`.
- The in-memory cache and history reset when the process restarts.
- Always verify critical specs from an official source before making decisions.
---
## License
Proof-of-concept / educational use.
**Last updated**: 2026-01-28