galbendavids committed · Commit 849c690 · verified · 1 Parent(s): 37bbf25
DEPLOY.md ADDED
@@ -0,0 +1,104 @@
+ # Deploy CarsRUS to Hugging Face Spaces
+
+ This guide ensures you can push new versions to [CarsRUS on Hugging Face](https://huggingface.co/spaces/galbendavids/CarsRUS/tree/main) with confidence.
+
+ ---
+
+ ## 1. Pre-push checklist (run locally)
+
+ Before pushing, run the test suite from the **CarsRUS** directory:
+
+ ```bash
+ cd CarsRUS
+
+ # One-liner: run all required tests
+ chmod +x run_tests.sh   # once, if needed
+ ./run_tests.sh
+
+ # Or run manually:
+ # 1) Business-logic tests (QA – expected behavior from request_file.txt)
+ python test_business_logic.py
+
+ # 2) RAG engine tests (init, search, normalization, embeddings)
+ python test_rag.py
+
+ # 3) Optional: agent flow (requires gemini_api in env for full run)
+ # python test_agent.py
+ ```
+
+ - **All of `test_business_logic.py` and `test_rag.py` must pass** before you push.
+ - If `test_business_logic.py` or `test_rag.py` fails, fix the issue before deploying.
+
+ ---
+
+ ## 2. Push to Hugging Face
+
+ ### First-time setup (once per machine)
+
+ 1. **Install Hugging Face CLI and log in** (if not already):
+    ```bash
+    pip install huggingface_hub
+    huggingface-cli login
+    ```
+    Use a token with **write** access to the Space (create at [hf.co/settings/tokens](https://huggingface.co/settings/tokens)).
+
+ 2. **Clone or link the Space repo** (if you don't have it yet):
+    ```bash
+    git clone https://huggingface.co/spaces/galbendavids/CarsRUS CarsRUS-hf
+    cd CarsRUS-hf
+    ```
+    Or, if your code already lives in a repo that you push to HF:
+    ```bash
+    cd /path/to/your/CarsRUS   # e.g. your workspace CarsRUS folder
+    git remote add hf https://huggingface.co/spaces/galbendavids/CarsRUS
+    ```
+
+ ### Push a new version
+
+ From the **root of the repo that HF Space uses** (e.g. `CarsRUS` or `CarsRUS-hf`):
+
+ ```bash
+ # 1) Run tests (see Pre-push checklist above)
+ python test_business_logic.py && python test_rag.py
+
+ # 2) Commit changes (if needed)
+ git add .
+ git commit -m "Your release message, e.g. agentic rag update"
+
+ # 3) Push to Hugging Face
+ git push hf main
+ # or, if your default remote is HF:
+ # git push origin main
+ ```
+
+ - Space repo usually uses branch **`main`**. If your Space is set to another branch, push to that branch instead.
+ - After push, Hugging Face will rebuild and restart the Space; check the Space **Logs** for errors.
+
+ ---
+
+ ## 3. What the tests guarantee (QA)
+
+ | Test file | What it checks |
+ |-----------|----------------|
+ | **test_business_logic.py** | Supported car list matches knowledge base; unsupported car (e.g. BMW X5) returns refusal with supported list; single supported car → no refusal; comparison with 2 supported cars → no refusal; comparison with 1 supported → refusal; car name normalization (RS3→audi_rs3, etc.); chat handles missing `gemini_api` without crashing. |
+ | **test_rag.py** | RAG engine init, hybrid search, car normalization, lazy embedding load. |
+ | **test_agent.py** | Agent graph and `prepare_generation` (optional; full run needs `gemini_api`). |
+
+ ---
+
+ ## 4. Space configuration on Hugging Face
+
+ - **SDK**: Gradio (see `README.md` → `sdk: gradio`, `app_file: app.py`).
+ - **Secrets**: In the Space **Settings → Repository secrets**, set:
+   - **`gemini_api`**: your Gemini API key (required for chat).
+ - **Hardware**: Default CPU is enough; GPU is optional for faster embedding if you change the app later.
+
+ ---
+
+ ## 5. After deploy
+
+ 1. Open the Space: `https://huggingface.co/spaces/galbendavids/CarsRUS`
+ 2. Check **Logs** for startup errors (e.g. missing `scraped_data.json` or dependencies).
+ 3. Send a test query (e.g. "Tell me about the Audi RS3") and confirm the answer is grounded and not a generic error.
+
+ If something fails in production, re-run `test_business_logic.py` and `test_rag.py` locally to confirm behavior matches expectations.
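Note on the pre-push checklist: the gate described in DEPLOY.md (run `test_business_logic.py`, then `test_rag.py`, and stop if either fails) can also be sketched in Python. This is a minimal sketch, not part of the commit; `run_gate` is a hypothetical helper, and the script names in the usage note are the ones DEPLOY.md lists.

```python
import subprocess
import sys
from typing import Sequence


def run_gate(commands: Sequence[Sequence[str]]) -> int:
    """Run each command in order; return the first non-zero exit code, else 0."""
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Stop at the first failure so later steps never run on a broken tree.
            print(f"Gate failed: {' '.join(cmd)} (exit {result.returncode})")
            return result.returncode
    return 0
```

A caller would pass something like `[[sys.executable, "test_business_logic.py"], [sys.executable, "test_rag.py"]]` and hand the return value to `sys.exit`, which mirrors what `set -e` does in the shell scripts.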
README.md CHANGED
@@ -28,6 +28,17 @@ A lightweight RAG chatbot that answers **Hebrew/English** questions about specif

  ---

+ ## Deploy to Hugging Face
+
+ Before pushing to [CarsRUS on Hugging Face](https://huggingface.co/spaces/galbendavids/CarsRUS/tree/main), run tests and follow the steps in **[DEPLOY.md](DEPLOY.md)**. Quick check:
+
+ ```bash
+ cd CarsRUS
+ ./run_tests.sh   # or: python test_business_logic.py && python test_rag.py
+ ```
+
+ ---
+
  ## Quick start (local)

  ### Prerequisites
__pycache__/app.cpython-37.pyc CHANGED
Binary files a/__pycache__/app.cpython-37.pyc and b/__pycache__/app.cpython-37.pyc differ
 
__pycache__/rag_engine.cpython-311.pyc CHANGED
Binary files a/__pycache__/rag_engine.cpython-311.pyc and b/__pycache__/rag_engine.cpython-311.pyc differ
 
__pycache__/rag_engine.cpython-37.pyc CHANGED
Binary files a/__pycache__/rag_engine.cpython-37.pyc and b/__pycache__/rag_engine.cpython-37.pyc differ
 
agent.py CHANGED
@@ -144,7 +144,7 @@ def run_stream(engine: RAGEngine, graph, query: str, api_key: str):
      """
      initial: AgentState = {"query": query, "api_key": api_key}
      last_state: AgentState = initial
-     for _node_name, state in graph.stream(initial):
+     for state in graph.stream(initial, stream_mode="values"):
          last_state = state
          steps_log = state.get("steps_log") or []
          refusal = state.get("refusal")
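Note on the `agent.py` change: the old loop unpacked each streamed item into `(_node_name, state)`, while the new loop asks for `stream_mode="values"` and treats each item as a full state dict. The sketch below imitates the two item shapes with plain generators so the difference can be seen without LangGraph installed. It assumes the default mode yields one `{node_name: partial_update}` dict per step, which appears to be the LangGraph default for compiled state graphs but is not stated in this diff; `fake_stream_updates`, `fake_stream_values`, and the node names are hypothetical stand-ins.

```python
from typing import Dict, Iterator

State = Dict[str, object]


def fake_stream_updates(initial: State) -> Iterator[Dict[str, State]]:
    """Stand-in for the assumed default shape: {node_name: partial_update} per step."""
    yield {"retrieve": {"steps_log": ["retrieved 3 chunks"]}}
    yield {"generate": {"answer": "..."}}


def fake_stream_values(initial: State) -> Iterator[State]:
    """Stand-in for the "values" shape: the full accumulated state after each step."""
    state = dict(initial)
    for update in fake_stream_updates(initial):
        for partial in update.values():
            state.update(partial)
        yield dict(state)


initial = {"query": "Tell me about the Audi RS3"}

# Old loop shape: a one-key dict cannot be unpacked into two names.
unpack_error = None
try:
    for _node_name, state in fake_stream_updates(initial):
        pass
except ValueError as exc:
    unpack_error = exc

# New loop shape: every item is a complete state dict, so .get() works directly.
last_state = initial
for state in fake_stream_values(initial):
    last_state = state
```

Under that assumption, the old tuple unpacking fails on the first item, and the new loop ends with `last_state` holding the query plus everything the nodes added.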
deploy.sh ADDED
@@ -0,0 +1,27 @@
+ #!/usr/bin/env bash
+ # Run tests, then push CarsRUS to Hugging Face (run from CarsRUS directory).
+ # Usage: cd CarsRUS && ./deploy.sh
+ set -e
+ cd "$(dirname "$0")"
+ echo "=== 1. Running tests ==="
+ python test_business_logic.py
+ python test_rag.py
+ echo ""
+ echo "=== 2. Git add / commit / push ==="
+ GIT_ROOT=$(git rev-parse --show-toplevel 2>/dev/null || true)
+ if [ -z "$GIT_ROOT" ]; then
+   echo "Not in a git repo. Commit and push manually from your repo root."
+   exit 1
+ fi
+ # Path from repo root to this folder (e.g. Desktop/carsRUS/CarsRUS)
+ REL_PATH=$(git rev-parse --show-prefix 2>/dev/null | sed 's|/$||')
+ if [ -z "$REL_PATH" ]; then
+   REL_PATH="."
+ fi
+ cd "$GIT_ROOT"
+ git add "${REL_PATH}/DEPLOY.md" "${REL_PATH}/README.md" "${REL_PATH}/run_tests.sh" "${REL_PATH}/test_business_logic.py" "${REL_PATH}/deploy.sh" 2>/dev/null || true
+ git add "${REL_PATH}/app.py" "${REL_PATH}/agent.py" "${REL_PATH}/rag_engine.py" "${REL_PATH}/requirements.txt" "${REL_PATH}/test_agent.py" "${REL_PATH}/test_rag.py" 2>/dev/null || true
+ git status --short "${REL_PATH}" | head -20
+ git commit -m "CarsRUS: deploy with DevOps/QA tests and DEPLOY.md" || true
+ git push origin main
+ echo "Done. Space: https://huggingface.co/spaces/galbendavids/CarsRUS"
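Note on the push target: `deploy.sh` pushes to `origin main`, whereas DEPLOY.md's main example is `git push hf main`. Which remote actually points at the Space depends on the local setup. One way to check before pushing, sketched here under stated assumptions, is to parse the text printed by `git remote -v` and look for the Space URL. `parse_remotes` and `pick_space_remote` are hypothetical helpers, not part of the commit.

```python
from typing import Dict, Optional


def parse_remotes(remote_v_output: str) -> Dict[str, str]:
    """Parse `git remote -v` text into {name: url}, preferring the push URL."""
    remotes: Dict[str, str] = {}
    for line in remote_v_output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        name, url = parts[0], parts[1]
        kind = parts[2] if len(parts) > 2 else ""
        if name not in remotes or kind == "(push)":
            remotes[name] = url
    return remotes


def pick_space_remote(remotes: Dict[str, str], space: str) -> Optional[str]:
    """Return the name of the remote whose URL points at the given Space, or None."""
    needle = f"huggingface.co/spaces/{space}"
    for name, url in remotes.items():
        if needle in url:
            return name
    return None
```

With the remote name resolved this way, a deploy script could push to that remote instead of hard-coding `origin`, and stop with a clear message when no remote matches.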
requirements.txt CHANGED
@@ -6,4 +6,4 @@ sentence-transformers
  numpy<2.0.0
  torch>=2.0.0
  langgraph>=0.2.0
- langchain-core>=0.3.0
+ langchain-core>=0.3.0
tests/run_tests.sh ADDED
@@ -0,0 +1,12 @@
+ #!/usr/bin/env bash
+ # Run all tests before pushing to Hugging Face.
+ # Usage: from CarsRUS directory: ./run_tests.sh or bash run_tests.sh
+ set -e
+ cd "$(dirname "$0")"
+ echo "Running business-logic tests (test_business_logic.py)..."
+ python test_business_logic.py
+ echo ""
+ echo "Running RAG engine tests (test_rag.py)..."
+ python test_rag.py
+ echo ""
+ echo "All tests passed. Safe to push to Hugging Face."
tests/test_agent.py ADDED
@@ -0,0 +1,72 @@
+ """
+ Test script for the LangGraph agent pipeline.
+ Runs several queries with a short wait between them to verify the full flow.
+ Requires gemini_api in environment for real LLM calls; otherwise only tests prepare_generation (no API).
+ """
+
+ import os
+ import time
+ from rag_engine import RAGEngine
+ from agent import build_agent_graph, run_stream
+
+
+ def main():
+     print("Loading RAG Engine and building agent graph...")
+     engine = RAGEngine()
+     graph = build_agent_graph(engine)
+     print("OK.\n")
+
+     api_key = os.environ.get("gemini_api")
+     if not api_key:
+         print("⚠️ gemini_api not set. Testing only prepare_generation (no LLM calls).\n")
+         test_queries = [
+             "Tell me about the Audi RS3",
+             "Compare Audi RS3 vs Hyundai Elantra N",
+             "מה דעתך על BMW X5?",  # should trigger refusal
+         ]
+         for i, query in enumerate(test_queries, 1):
+             print(f"--- Test {i}: prepare_generation ---")
+             print(f"Query: {query!r}")
+             refusal, sys_p, user_p, steps = engine.prepare_generation(query)
+             if refusal:
+                 print(f"Refusal (expected for unsupported car): {refusal[:150]}...")
+             else:
+                 print(f"Steps: {len(steps)}; system_prompt length: {len(sys_p or '')}; user_prompt length: {len(user_p or '')}")
+             print()
+         print("Done (prepare_generation only). Set gemini_api to run full agent.")
+         return
+
+     test_queries = [
+         "Tell me about the Audi RS3",
+         "Compare Audi RS3 vs Hyundai Elantra N",
+         "מה היתרונות של קיה EV9?",
+         "מה דעתך על BMW X5?",  # should trigger refusal (unsupported model)
+     ]
+     wait_seconds = 8
+
+     for i, query in enumerate(test_queries, 1):
+         print(f"--- Test {i}/{len(test_queries)} ---")
+         print(f"Query: {query!r}")
+         last_output = None
+         step_count = 0
+         try:
+             for out in run_stream(engine, graph, query, api_key):
+                 last_output = out
+                 step_count += 1
+             if last_output:
+                 preview = last_output[:400] + "..." if len(last_output) > 400 else last_output
+                 print(f"Steps yielded: {step_count}; final length: {len(last_output)}")
+                 print(f"Final preview:\n{preview}\n")
+             else:
+                 print("No output received.\n")
+         except Exception as e:
+             print(f"Error: {e}\n")
+         if i < len(test_queries):
+             print(f"Waiting {wait_seconds}s before next query...")
+             time.sleep(wait_seconds)
+
+     print("All tests finished.")
+
+
+ if __name__ == "__main__":
+     main()
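Note on imports: this commit lists the test files under `tests/`, while `test_agent.py` imports `rag_engine` and `agent` directly and the usage comments say to run from the CarsRUS directory. If those modules sit one level above `tests/` (an assumption, the diff does not show the final layout), a test started from inside `tests/` would need the project root on `sys.path`. A minimal sketch of such a bootstrap follows; `add_project_root` is a hypothetical helper.

```python
import sys
from pathlib import Path


def add_project_root(test_file: str, levels_up: int = 1) -> Path:
    """Put the directory `levels_up` above the test file's folder on sys.path (once)."""
    # parents[0] is the folder containing the file; parents[1] is one level above it.
    root = Path(test_file).resolve().parents[levels_up]
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return root
```

A test module would call `add_project_root(__file__)` before its `from rag_engine import RAGEngine` line. The other two test files use `sys.path.insert(0, os.path.dirname(__file__))`, which adds the test file's own folder rather than its parent.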
tests/test_business_logic.py ADDED
@@ -0,0 +1,180 @@
+ #!/usr/bin/env python
+ """
+ Business-logic test suite for CarsRUS (QA / DevOps).
+
+ Validates expected behavior from request_file.txt:
+ - Ingest automotive review content → searchable knowledge base
+ - Respond based on retrieved knowledge (no hallucination for unsupported cars)
+ - Supported cars: Citroen C3, Audi RS3, Kia EV9, MG S6, Hyundai Elantra N, Aion HT, Genesis GV80, Link & Co 01
+ - Unsupported car questions → refusal with supported list
+ - Comparison: 2 supported cars → proceed; 1 or 0 → refusal
+ - Car name normalization (e.g. RS3 → audi_rs3, קיה EV9 → kia_ev9)
+
+ Run before pushing to Hugging Face: python test_business_logic.py
+ """
+
+ import os
+ import sys
+
+ sys.path.insert(0, os.path.dirname(__file__))
+
+
+ def test_supported_cars_list():
+     """Supported models must match the knowledge base (scraped articles)."""
+     from rag_engine import RAGEngine
+
+     engine = RAGEngine()
+     display = engine._supported_cars_display()
+     expected = [
+         "Citroen C3",
+         "Audi RS3",
+         "Kia EV9",
+         "MG S6",
+         "Hyundai Elantra N",
+         "Aion HT",
+         "Genesis GV80",
+         "Link & Co 01",
+     ]
+     assert set(display) == set(expected), f"Supported cars mismatch: got {display}"
+     assert len(display) == 8, f"Expected 8 supported models, got {len(display)}"
+     print("✅ test_supported_cars_list passed")
+
+
+ def test_unsupported_car_returns_refusal():
+     """Asking about a car not in the knowledge base must return a refusal with supported list."""
+     from rag_engine import RAGEngine
+
+     engine = RAGEngine()
+     # Hebrew: "What do you think about BMW X5?"
+     query = "מה דעתך על BMW X5?"
+     refusal, sys_prompt, user_prompt, steps = engine.prepare_generation(query)
+     assert refusal is not None, "Unsupported car query must return refusal"
+     assert sys_prompt is None and user_prompt is None, "Refusal must not return prompts"
+     assert "Citroen C3" in refusal or "Audi RS3" in refusal, "Refusal must list supported models"
+     assert "לא נמצא" in refusal or "not in my knowledge" in refusal or "not in my knowledge base" in refusal
+     print("✅ test_unsupported_car_returns_refusal passed")
+
+
+ def test_supported_car_single_no_refusal():
+     """Single supported car question must NOT refuse; must return prompts for generation."""
+     from rag_engine import RAGEngine
+
+     engine = RAGEngine()
+     query = "Tell me about the Audi RS3"
+     refusal, sys_prompt, user_prompt, steps = engine.prepare_generation(query)
+     assert refusal is None, "Supported car query must not refuse"
+     assert sys_prompt and user_prompt, "Must return system and user prompts for LLM"
+     assert len(steps) >= 1, "Steps log must be populated"
+     print("✅ test_supported_car_single_no_refusal passed")
+
+
+ def test_comparison_two_supported_no_refusal():
+     """Comparison of two supported cars must NOT refuse."""
+     from rag_engine import RAGEngine
+
+     engine = RAGEngine()
+     query = "Compare Audi RS3 vs Hyundai Elantra N"
+     refusal, sys_prompt, user_prompt, steps = engine.prepare_generation(query)
+     assert refusal is None, "Two supported cars comparison must not refuse"
+     assert sys_prompt and user_prompt
+     print("✅ test_comparison_two_supported_no_refusal passed")
+
+
+ def test_comparison_one_supported_refusal():
+     """Comparison mentioning only one supported car (or one unsupported) must refuse."""
+     from rag_engine import RAGEngine
+
+     engine = RAGEngine()
+     # "Compare RS3 vs BMW X5": only RS3 is supported
+     query = "Compare RS3 vs BMW X5"
+     refusal, sys_prompt, user_prompt, steps = engine.prepare_generation(query)
+     assert refusal is not None, "Comparison with unsupported car must refuse"
+     assert "supported" in refusal.lower() or "נתמכים" in refusal
+     print("✅ test_comparison_one_supported_refusal passed")
+
+
+ def test_car_name_normalization():
+     """Normalize car names: RS3 → audi_rs3, קיה EV9 → kia_ev9, Citroen C3 → citroen_c3."""
+     from rag_engine import RAGEngine
+
+     engine = RAGEngine()
+     cases = [
+         ("Audi RS3", "audi_rs3"),
+         ("RS3", "audi_rs3"),
+         ("קיה EV9", "kia_ev9"),
+         ("Citroen C3", "citroen_c3"),
+         ("Kia EV9", "kia_ev9"),
+     ]
+     for text, expected in cases:
+         got = engine._normalize_car_name(text)
+         assert got == expected, f"Normalize {text!r}: expected {expected}, got {got}"
+     print("✅ test_car_name_normalization passed")
+
+
+ def test_rag_engine_initialization_and_chunks():
+     """RAG engine must load chunks from scraped_data.json (knowledge base exists)."""
+     from rag_engine import RAGEngine
+
+     engine = RAGEngine()
+     assert len(engine.chunks) > 0, "Knowledge base must have at least one chunk"
+     assert len(engine.chunk_metadata) == len(engine.chunks)
+     print("✅ test_rag_engine_initialization_and_chunks passed")
+
+
+ def test_hybrid_search_returns_relevant_results():
+     """Hybrid search must return results for a supported car query."""
+     from rag_engine import RAGEngine
+
+     engine = RAGEngine()
+     results = engine._hybrid_search("Tell me about the Audi RS3", top_k=3)
+     assert len(results) >= 1, "Search must return at least one result for supported car"
+     assert "metadata" in results[0] and "text" in results[0]
+     assert "title" in results[0]["metadata"]
+     print("✅ test_hybrid_search_returns_relevant_results passed")
+
+
+ def test_chat_function_requires_gemini_key():
+     """App chat must handle missing API key with clear error (no crash)."""
+     from app import chat_function
+
+     # Temporarily unset if set
+     old_key = os.environ.pop("gemini_api", None)
+     try:
+         out = list(chat_function("Tell me about Audi RS3", []))
+         assert len(out) >= 1
+         assert "gemini" in out[0].lower() or "API key" in out[0] or "Configuration" in out[0]
+     finally:
+         if old_key is not None:
+             os.environ["gemini_api"] = old_key
+     print("✅ test_chat_function_requires_gemini_key passed")
+
+
+ def run_all():
+     """Run all business-logic tests. Exit 0 if all pass, 1 otherwise."""
+     tests = [
+         test_supported_cars_list,
+         test_car_name_normalization,
+         test_rag_engine_initialization_and_chunks,
+         test_unsupported_car_returns_refusal,
+         test_supported_car_single_no_refusal,
+         test_comparison_two_supported_no_refusal,
+         test_comparison_one_supported_refusal,
+         test_hybrid_search_returns_relevant_results,
+         test_chat_function_requires_gemini_key,
+     ]
+     failed = []
+     for t in tests:
+         try:
+             t()
+         except Exception as e:
+             failed.append((t.__name__, e))
+             print(f"❌ {t.__name__} failed: {e}")
+     if failed:
+         print(f"\n❌ {len(failed)} test(s) failed: {[n for n, _ in failed]}")
+         return 1
+     print("\n✅ All business-logic tests passed.")
+     return 0
+
+
+ if __name__ == "__main__":
+     sys.exit(run_all())
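Note on `test_chat_function_requires_gemini_key`: the test removes `gemini_api` from the environment with `os.environ.pop` and restores it in a `finally` block. The same save-and-restore pattern can be packaged as a context manager so that other tests can reuse it. The sketch below is an illustration only; `env_unset` is a hypothetical name and is not part of the commit.

```python
import os
from contextlib import contextmanager
from typing import Iterator, Optional


@contextmanager
def env_unset(name: str) -> Iterator[Optional[str]]:
    """Temporarily remove an environment variable, restoring the prior state on exit."""
    old_value = os.environ.pop(name, None)
    try:
        yield old_value
    finally:
        if old_value is not None:
            # The variable existed before: put the original value back.
            os.environ[name] = old_value
        else:
            # The variable did not exist before: drop anything set inside the block.
            os.environ.pop(name, None)
```

Inside a test this would read `with env_unset("gemini_api"): out = list(chat_function(...))`. Unlike the inline version, the sketch also clears a value that the code under test may have set while the variable was absent.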
tests/test_rag.py ADDED
@@ -0,0 +1,163 @@
+ #!/usr/bin/env python
+ """
+ Simple test file for RAG Engine
+ Tests basic initialization and search functionality
+ """
+
+ import sys
+ import os
+
+ # Add project to path
+ sys.path.insert(0, os.path.dirname(__file__))
+
+ def test_initialization():
+     """Test RAG engine initialization"""
+     print("\n" + "="*60)
+     print("TEST 1: RAG Engine Initialization")
+     print("="*60)
+
+     from rag_engine import RAGEngine
+
+     try:
+         engine = RAGEngine()
+         print(f"✅ Engine initialized successfully")
+         print(f" - Chunks loaded: {len(engine.chunks)}")
+         print(f" - Metadata entries: {len(engine.chunk_metadata)}")
+         print(f" - Keyword index entries: {len(engine.keyword_index)}")
+         print(f" - Embeddings: {engine.embeddings}")
+         return True, engine
+     except Exception as e:
+         print(f"❌ Initialization failed: {e}")
+         import traceback
+         traceback.print_exc()
+         return False, None
+
+
+ def test_search(engine):
+     """Test hybrid search functionality"""
+     print("\n" + "="*60)
+     print("TEST 2: Hybrid Search")
+     print("="*60)
+
+     try:
+         query = "Tell me about the Audi RS3"
+         print(f"Testing search for: '{query}'")
+
+         results = engine._hybrid_search(query, top_k=3)
+         print(f"✅ Search successful")
+         print(f" - Results found: {len(results)}")
+
+         if results:
+             print(f" - Top result score: {results[0]['score']:.3f}")
+             print(f" - Top result title: {results[0]['metadata']['title']}")
+
+         return True
+     except Exception as e:
+         print(f"❌ Search failed: {e}")
+         import traceback
+         traceback.print_exc()
+         return False
+
+
+ def test_car_normalization(engine):
+     """Test car name normalization"""
+     print("\n" + "="*60)
+     print("TEST 3: Car Name Normalization")
+     print("="*60)
+
+     test_cases = [
+         ("Audi RS3", "audi_rs3"),
+         ("RS3", "audi_rs3"),
+         ("קיה EV9", "kia_ev9"),
+         ("Citroen C3", "citroen_c3"),
+     ]
+
+     passed = 0
+     failed = 0
+
+     for text, expected in test_cases:
+         result = engine._normalize_car_name(text)
+         if result == expected:
+             print(f"✅ '{text}' → {result}")
+             passed += 1
+         else:
+             print(f"❌ '{text}' → {result} (expected {expected})")
+             failed += 1
+
+     print(f" - Passed: {passed}/{len(test_cases)}")
+     return failed == 0
+
+
+ def test_embeddings(engine):
+     """Test that embeddings are lazy loaded"""
+     print("\n" + "="*60)
+     print("TEST 4: Lazy Embedding Loading")
+     print("="*60)
+
+     try:
+         # Check initial state
+         if engine.embeddings is None:
+             print("✅ Embeddings are None at startup (lazy loading working)")
+         else:
+             print("⚠️ Embeddings already loaded (not lazy)")
+
+         # Trigger embedding generation
+         query = "Test query"
+         engine._hybrid_search(query, top_k=1)
+
+         if engine.embeddings is not None:
+             print(f"✅ Embeddings generated after first search")
+             print(f" - Shape: {engine.embeddings.shape}")
+             print(f" - Expected chunks: {len(engine.chunks)}")
+             return True
+         else:
+             print(f"❌ Embeddings not generated")
+             return False
+
+     except Exception as e:
+         print(f"❌ Embedding test failed: {e}")
+         import traceback
+         traceback.print_exc()
+         return False
+
+
+ def main():
+     """Run all tests"""
+     print("\n" + "="*60)
+     print("CARSRUS RAG ENGINE TEST SUITE")
+     print("="*60)
+
+     # Test 1: Initialization
+     success, engine = test_initialization()
+     if not success:
+         print("\n❌ TESTS FAILED - Initialization error")
+         return 1
+
+     # Test 2: Normalization
+     if not test_car_normalization(engine):
+         print("\n⚠️ Some normalization tests failed")
+
+     # Test 3: Search
+     if not test_search(engine):
+         print("\n❌ TESTS FAILED - Search error")
+         return 1
+
+     # Test 4: Embeddings
+     if not test_embeddings(engine):
+         print("\n⚠️ Embedding test had issues")
+
+     # Summary
+     print("\n" + "="*60)
+     print("✅ ALL CRITICAL TESTS PASSED")
+     print("="*60)
+     print("\nRAG Engine is ready for deployment!")
+     print("- Initialization: ✅")
+     print("- Data loading: ✅")
+     print("- Search functionality: ✅")
+     print("- Lazy loading: ✅")
+
+     return 0
+
+
+ if __name__ == "__main__":
+     exit(main())
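Note on exit codes: in `main()` above, a failed normalization or embedding check only prints a warning and the script still returns 0, so `run_tests.sh` (which relies on `set -e`) would report "All tests passed" in that case. If every check is meant to gate the push, as DEPLOY.md's "must pass" wording suggests, the results can be collected and folded into a single exit code. The following is a minimal sketch of that idea; `run_checks` is a hypothetical helper, not the author's code.

```python
from typing import Callable, Dict, List, Tuple


def run_checks(checks: List[Tuple[str, Callable[[], bool]]]) -> Tuple[int, Dict[str, bool]]:
    """Run every check, record pass/fail per name, and return (exit_code, results)."""
    results: Dict[str, bool] = {}
    for name, check in checks:
        try:
            results[name] = bool(check())
        except Exception as exc:
            # A crashing check counts as a failure instead of aborting the run.
            print(f"{name} raised {exc!r}")
            results[name] = False
    exit_code = 0 if all(results.values()) else 1
    return exit_code, results
```

Applied to this file, the list would hold entries such as `("normalization", lambda: test_car_normalization(engine))`, and `main()` would return the computed exit code. Whether normalization and lazy loading should block a deploy is a project decision the diff leaves open.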