DanielKiani committed on
Commit
eb6c9fd
·
1 Parent(s): 01bd2ce

Fix unresolved merge conflicts

Files changed (4)
  1. README.md +74 -72
  2. requirements.txt +1 -15
  3. scripts/app.py +1 -164
  4. scripts/main.py +1 -113
README.md CHANGED
@@ -10,7 +10,7 @@ This project demonstrates an end-to-end workflow, integrating data processing, l
  ![demo_tab1](assets/tab1.png)
  ![demo_tab2](assets/tab2.png)
 
- You can find the Web demo here ➡ [Web Demo](https://huggingface.co/spaces/Deathshot78/ReviewSense)
+ You can find the Web demo here ➡ [Web Demo](https://huggingface.co/spaces/DanielKiani/ReviewSense)
  **[Note]: running this model on the CPU takes a while to complete; you can relax and get a cup of coffee while the model generates responses! ☕**
 
  ---
@@ -24,10 +24,10 @@ You can find the Web demo here ➡ [Web Demo](https://huggingface.co/spaces/Deat
  - [🔧 Challenges & Limitations](#-challenges--limitations)
  - [💡 Prompt Engineering Journey](#-prompt-engineering-journey)
  - [🔮 Future Improvements](#-future-improvements)
- - [⚙️ Setup and Installation](#️-setup-and-installation)
+ - [⚙️ Setup and Installation](#%EF%B8%8F-setup-and-installation)
  - [▶️ Usage](#️-usage)
  - [📁 Project Structure (v2.0)](#-project-structure-v20)
- - [🛠️ Technologies and Models (v2.0)](#️-technologies-and-models-v20)
+ - [🛠️ Technologies and Models (v2.0)](#%EF%B8%8F-technologies-and-models-v20)
  - [📜 Version History](#-version-history)
 
  ---
@@ -44,17 +44,17 @@ This chatbot allows users to ask specific questions about product reviews and re
 
  Version 2.0 represents a major leap in functionality and architecture:
 
- 1. **🤖 RAG Chatbot Implementation:** Added an interactive chatbot (Phase 2) that uses Retrieval-Augmented Generation (RAG) to answer user questions based on review context.
- 2. **🧠 Single LLM Architecture:** Replaced the multiple specialized models (DistilBERT, DistilBART, DeBERTa, POS Tagger) from v1.0 with a single, powerful Mistral 7B GGUF model, executed locally via `LlamaCpp`. This model now handles:
- * Batch Analysis (Summary, Aspects, Sentiment - Phase 1) with higher quality.
- * RAG-based Question Answering (Phase 2).
- * Intent Classification (Guardrail for Phase 2).
- 3. **📄 Dynamic Context Management:** The chatbot can now operate on a default set of reviews or dynamically update its knowledge base using user-uploaded `.txt` or `.csv` files.
- 4. **💬 Conversational Memory:** Integrated LangChain's `ConversationBufferMemory`, allowing the chatbot to understand follow-up questions.
- 5. **🛡️ Intent Classification Guardrail:** Implemented a robust intent classifier (using the same LLM) to prevent the chatbot from answering off-topic questions, ensuring responses stay grounded in product reviews.
- 6. **🖥️ Unified Gradio UI:** Developed a two-tab Gradio interface (`app.py`) providing access to both the Batch Analyzer and the RAG Chatbot in a single application.
- 7. **💻 Local Execution Script:** Added `main.py` for command-line execution of batch analysis or interactive chat without the Gradio UI.
- 8. **🧱 Modular Code Structure:** Refactored core logic into `src/pipeline.py`, improving organization and maintainability.
+ 1. **🤖 RAG Chatbot Implementation:** Added an interactive chatbot (Phase 2) that uses Retrieval-Augmented Generation (RAG) to answer user questions based on review context.
+ 2. **🧠 Single LLM Architecture:** Replaced the multiple specialized models (DistilBERT, DistilBART, DeBERTa, POS Tagger) from v1.0 with a single, powerful Mistral 7B GGUF model, executed locally via `LlamaCpp`. This model now handles:
+ - Batch Analysis (Summary, Aspects, Sentiment - Phase 1) with higher quality.
+ - RAG-based Question Answering (Phase 2).
+ - Intent Classification (Guardrail for Phase 2).
+ 3. **📄 Dynamic Context Management:** The chatbot can now operate on a default set of reviews or dynamically update its knowledge base using user-uploaded `.txt` or `.csv` files.
+ 4. **💬 Conversational Memory:** Integrated LangChain's `ConversationBufferMemory`, allowing the chatbot to understand follow-up questions.
+ 5. **🛡️ Intent Classification Guardrail:** Implemented a robust intent classifier (using the same LLM) to prevent the chatbot from answering off-topic questions, ensuring responses stay grounded in product reviews.
+ 6. **🖥️ Unified Gradio UI:** Developed a two-tab Gradio interface (`app.py`) providing access to both the Batch Analyzer and the RAG Chatbot in a single application.
+ 7. **💻 Local Execution Script:** Added `main.py` for command-line execution of batch analysis or interactive chat without the Gradio UI.
+ 8. **🧱 Modular Code Structure:** Refactored core logic into `src/pipeline.py`, improving organization and maintainability.
 
  ---
 
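The single-LLM design in item 2 hinges on LangChain's `LlamaCpp` wrapper. A minimal sketch of loading a Mistral 7B GGUF model this way (the model path and sampling settings below are placeholders, not the repo's configuration):

```python
from langchain_community.llms import LlamaCpp

# Placeholder path: point this at a downloaded GGUF file.
llm = LlamaCpp(
    model_path="models/mistral-7b-instruct-v0.1.Q4_K_M.gguf",
    n_ctx=4096,       # room for the prompt plus retrieved review snippets
    temperature=0.1,  # keep analysis output fairly deterministic
)

# Mistral Instruct models expect the [INST] ... [/INST] wrapper.
print(llm.invoke("[INST] Summarize in one line: battery dies fast, screen is great. [/INST]"))
```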
@@ -63,14 +63,14 @@ Version 2.0 represents a major leap in functionality and architecture:
  Includes all features from v1.0 (now powered by Mistral 7B) **plus**:
 
  - **Interactive RAG Chatbot:**
- * Ask specific questions about product reviews (e.g., "How is the battery life?", "Is the app reliable?").
- * Answers synthesized directly from provided review context using RAG.
- * **Conversational Memory:** Understands follow-up questions ("What about the screen?").
- * **Grounded Responses:** Designed to answer only based on the reviews provided.
- * **Intent Guardrail:** Filters out and responds appropriately to off-topic questions.
+ - Ask specific questions about product reviews (e.g., "How is the battery life?", "Is the app reliable?").
+ - Answers synthesized directly from provided review context using RAG.
+ - **Conversational Memory:** Understands follow-up questions ("What about the screen?").
+ - **Grounded Responses:** Designed to answer only based on the reviews provided.
+ - **Intent Guardrail:** Filters out and responds appropriately to off-topic questions.
  - **Dynamic Context Loading:**
- * Chatbot operates on default reviews or context loaded from user-uploaded files (`.txt`/`.csv`).
- * Clear indication of the currently active context.
+ - Chatbot operates on default reviews or context loaded from user-uploaded files (`.txt`/`.csv`).
+ - Clear indication of the currently active context.
  - **Unified LLM Backend:** All NLP tasks (analysis, Q&A, classification) handled by a single Mistral 7B GGUF model running locally.
  - **Dual Interface:** Accessible via Gradio web UI (`app.py`) or command line (`main.py`).
 
@@ -79,22 +79,24 @@ Includes all features from v1.0 (now powered by Mistral 7B) **plus**:
 
  ## 🧠 How It Works: The v2.0 Pipeline
 
  **Phase 1: Batch Analysis (via `analyze_reviews_only` or `analyze_reviews_logic`)**
- 1. User provides review text (paste or file).
- 2. The text is passed to the Mistral LLM using three distinct prompts (Summarization, Aspect Extraction, Sentiment Analysis).
- 3. The LLM generates the three analysis outputs.
+
+ 1. User provides review text (paste or file).
+ 2. The text is passed to the Mistral LLM using three distinct prompts (Summarization, Aspect Extraction, Sentiment Analysis).
+ 3. The LLM generates the three analysis outputs.
 
  **Phase 2: RAG Chatbot (via `ask_question_with_guardrail` or `get_chatbot_response`)**
- 1. User asks a question.
- 2. **Intent Classification:** The query is first sent to the Mistral LLM with the `intent_prompt` (few-shot) to classify it as "Product" or "Off-Topic". Robust parsing checks the LLM output.
- 3. **Routing:**
- * If "Off-Topic", a canned response is returned.
- * If "Product", proceed to RAG.
- 4. **Context Retrieval:** The user's question is used to query the current FAISS vector store (containing embeddings of the active review context) to retrieve the top `k` relevant review snippets.
- 5. **Conversational Chain Execution (`ConversationalRetrievalChain`):**
- * **Condense Question:** If there's chat history, the LLM uses `CONDENSE_QUESTION_PROMPT` to rephrase the current question into a standalone query.
- * **RAG Generation:** The condensed question and retrieved context snippets are passed to the LLM with the strict `qa_prompt`. The LLM synthesizes an answer based *only* on the provided context.
- * **Memory Update:** The question and final answer are added to the `ConversationBufferMemory`.
- 6. **Response:** The synthesized answer is returned to the user.
+
+ 1. User asks a question.
+ 2. **Intent Classification:** The query is first sent to the Mistral LLM with the `intent_prompt` (few-shot) to classify it as "Product" or "Off-Topic". Robust parsing checks the LLM output.
+ 3. **Routing:**
+ - If "Off-Topic", a canned response is returned.
+ - If "Product", proceed to RAG.
+ 4. **Context Retrieval:** The user's question is used to query the current FAISS vector store (containing embeddings of the active review context) to retrieve the top `k` relevant review snippets.
+ 5. **Conversational Chain Execution (`ConversationalRetrievalChain`):**
+ - **Condense Question:** If there's chat history, the LLM uses `CONDENSE_QUESTION_PROMPT` to rephrase the current question into a standalone query.
+ - **RAG Generation:** The condensed question and retrieved context snippets are passed to the LLM with the strict `qa_prompt`. The LLM synthesizes an answer based *only* on the provided context.
+ - **Memory Update:** The question and final answer are added to the `ConversationBufferMemory`.
+ 6. **Response:** The synthesized answer is returned to the user.
 
  ---
 
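The Phase 2 flow in the hunk above maps almost one-to-one onto stock LangChain components. A condensed sketch of that wiring, with placeholder reviews, model path, and `k` (the repo's real setup lives in `src/pipeline.py`):

```python
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import LlamaCpp
from langchain_community.vectorstores import FAISS

llm = LlamaCpp(model_path="models/mistral-7b-instruct-v0.1.Q4_K_M.gguf", n_ctx=4096)

# Embed the active review context into a FAISS store (step 4).
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
reviews = ["Battery easily lasts a full work day.", "The screen is far too dim outdoors."]
vector_store = FAISS.from_texts(reviews, embeddings)

# Memory backs the condense-question step and is updated after each turn (step 5).
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vector_store.as_retriever(search_kwargs={"k": 3}),
    memory=memory,
)

print(chain.invoke({"question": "How is the battery life?"})["answer"])
```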
@@ -102,15 +104,15 @@ Includes all features from v1.0 (now powered by Mistral 7B) **plus**:
 
  Developing v2.0 involved significant experimentation and revealed several challenges:
 
- 1. **Consistent Instruction Following:** While powerful, the Mistral 7B GGUF model sometimes struggled to consistently follow complex negative constraints or nuanced instructions in prompts, especially within the RAG chain. This led to:
- * **Context Leakage:** Occasionally including irrelevant details from retrieved chunks (e.g., mentioning webcam when asked about battery).
- * **Hallucination:** Making up information (e.g., mentioning "phone" for laptop battery, inventing prices or product names).
- * **Over-Cautiousness:** Incorrectly stating "cannot find information" even when relevant details were present in the context, particularly for negative aspects (e.g., hardware issues).
- * **Misinterpretation:** Failing to correctly understand the specific user question (e.g., "taste" vs. "type", comparison questions).
- 2. **Prompt Engineering Complexity:** Finding the right prompt structure required extensive iteration. Simple prompts lacked control, while overly complex prompts sometimes confused the model. Few-shot prompting proved essential for reliable intent classification. Balancing strictness (for grounding) with flexibility (to allow synthesis) in the RAG prompt was difficult.
- 3. **Intent Classification Brittleness:** Getting the LLM to output *only* the classification label required moving from zero-shot, to strict instructions, to few-shot examples, and finally adding robust parsing logic (`parse_intent`) to handle noisy LLM outputs reliably.
- 4. **Performance:** Running the 7B parameter GGUF model on a CPU is significantly slower than using smaller models or GPU acceleration. Batch analysis and RAG responses take noticeable time.
- 5. **Evaluation Bottleneck:** Using external APIs (like OpenAI) for RAGAs evaluation can incur costs and hit rate limits. Using the local model for evaluation is free but slower and potentially less objective.
+ 1. **Consistent Instruction Following:** While powerful, the Mistral 7B GGUF model sometimes struggled to consistently follow complex negative constraints or nuanced instructions in prompts, especially within the RAG chain. This led to:
+ - **Context Leakage:** Occasionally including irrelevant details from retrieved chunks (e.g., mentioning webcam when asked about battery).
+ - **Hallucination:** Making up information (e.g., mentioning "phone" for laptop battery, inventing prices or product names).
+ - **Over-Cautiousness:** Incorrectly stating "cannot find information" even when relevant details were present in the context, particularly for negative aspects (e.g., hardware issues).
+ - **Misinterpretation:** Failing to correctly understand the specific user question (e.g., "taste" vs. "type", comparison questions).
+ 2. **Prompt Engineering Complexity:** Finding the right prompt structure required extensive iteration. Simple prompts lacked control, while overly complex prompts sometimes confused the model. Few-shot prompting proved essential for reliable intent classification. Balancing strictness (for grounding) with flexibility (to allow synthesis) in the RAG prompt was difficult.
+ 3. **Intent Classification Brittleness:** Getting the LLM to output *only* the classification label required moving from zero-shot, to strict instructions, to few-shot examples, and finally adding robust parsing logic (`parse_intent`) to handle noisy LLM outputs reliably.
+ 4. **Performance:** Running the 7B parameter GGUF model on a CPU is significantly slower than using smaller models or GPU acceleration. Batch analysis and RAG responses take noticeable time.
+ 5. **Evaluation Bottleneck:** Using external APIs (like OpenAI) for RAGAs evaluation can incur costs and hit rate limits. Using the local model for evaluation is free but slower and potentially less objective.
 
  ---
 
@@ -120,10 +122,10 @@ Achieving the final, relatively stable performance required significant iteratio
 
  **Intent Classification (`intent_prompt`):**
 
- * Initial attempts with simple zero-shot prompts failed, with the model providing verbose, incorrect classifications.
- * Adding strict formatting rules (`MUST BE EXACTLY...`) helped but wasn't sufficient.
- * **Few-Shot Prompting** (providing explicit examples within the prompt) proved crucial for forcing the model to output the correct labels, although often with extra text.
- * **Robust Parsing (`parse_intent`)** was added to reliably extract the core "Product" or "Off-Topic" keyword from the model's potentially noisy output.
+ - Initial attempts with simple zero-shot prompts failed, with the model providing verbose, incorrect classifications.
+ - Adding strict formatting rules (`MUST BE EXACTLY...`) helped but wasn't sufficient.
+ - **Few-Shot Prompting** (providing explicit examples within the prompt) proved crucial for forcing the model to output the correct labels, although often with extra text.
+ - **Robust Parsing (`parse_intent`)** was added to reliably extract the core "Product" or "Off-Topic" keyword from the model's potentially noisy output.
 
  **Final `intent_template`:**
 
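The `parse_intent` helper itself is not shown in this diff; a minimal sketch consistent with the behavior described above (keyword search over a noisy completion, failing closed to "Off-Topic") might look like:

```python
def parse_intent(raw_output: str) -> str:
    """Extract 'Product' or 'Off-Topic' from a noisy LLM completion."""
    text = raw_output.lower()
    first_product = text.find("product")
    first_off_topic = text.find("off-topic")
    if first_product == -1 and first_off_topic == -1:
        return "Off-Topic"  # fail closed when no label is recognizable
    if first_off_topic == -1:
        return "Product"
    if first_product == -1 or first_off_topic < first_product:
        return "Off-Topic"
    return "Product"

assert parse_intent("Classification: Product. The user asks about...") == "Product"
assert parse_intent("This is Off-Topic because...") == "Off-Topic"
```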
@@ -155,13 +157,13 @@ Classification:"""
 
  **RAG Generation (`qa_system_prompt`):**
 
- * Initial simple prompts led to significant hallucination and context leakage.
+ - Initial simple prompts led to significant hallucination and context leakage.
 
- * Adding strict rules improved grounding but sometimes made the model overly cautious, failing to find information present in the context.
+ - Adding strict rules improved grounding but sometimes made the model overly cautious, failing to find information present in the context.
 
- * Explicitly addressing failure modes (like comparisons) helped for those specific cases.
+ - Explicitly addressing failure modes (like comparisons) helped for those specific cases.
 
- * Experimenting with different chain types (`stuff`, `map_reduce`, `refine`) showed limitations related to context window size and model instruction following. `stuff` with `ConversationalRetrievalChain` proved most practical.
+ - Experimenting with different chain types (`stuff`, `map_reduce`, `refine`) showed limitations related to context window size and model instruction following. `stuff` with `ConversationalRetrievalChain` proved most practical.
 
  **Final qa_system_prompt (within qa_prompt):**
 
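For orientation, a custom QA prompt is injected into a `stuff`-type `ConversationalRetrievalChain` through `combine_docs_chain_kwargs`. The template text below is purely illustrative, since the project's final `qa_system_prompt` is elided from this diff; `llm`, `vector_store`, and `memory` are the objects from the pipeline sketch earlier:

```python
from langchain.chains import ConversationalRetrievalChain
from langchain.prompts import PromptTemplate

# Illustrative grounding prompt only; NOT the repo's final qa_system_prompt.
qa_prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=(
        "Answer using ONLY the review excerpts below. If the answer is not "
        "there, say you cannot find that information.\n\n"
        "Reviews:\n{context}\n\nQuestion: {question}\nAnswer:"
    ),
)

chain = ConversationalRetrievalChain.from_llm(
    llm=llm,                      # objects from the pipeline sketch above
    retriever=vector_store.as_retriever(),
    memory=memory,
    chain_type="stuff",           # the chain type the journey above settles on
    combine_docs_chain_kwargs={"prompt": qa_prompt},
)
```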
@@ -193,17 +195,17 @@ This iterative process demonstrates the practical challenges and refinement need
 
  ## 🔮 Future Improvements
 
- * **RAG Evaluation**: Fully implement and integrate RAGAs (or TruLens) evaluation using the local LLM or a free tier API to get quantitative metrics on Faithfulness, Answer Relevancy, etc.
+ - **RAG Evaluation**: Fully implement and integrate RAGAs (or TruLens) evaluation using the local LLM or a free tier API to get quantitative metrics on Faithfulness, Answer Relevancy, etc.
 
- * **LLM Upgrade**: Experiment with larger or more advanced instruction-tuned models (e.g., Mixtral GGUF, Llama 3 70/8B Instruct GGUF, or API-based models like GPT-4/Claude 3) to achieve higher consistency in instruction following and synthesis.
+ - **LLM Upgrade**: Experiment with larger or more advanced instruction-tuned models (e.g., Mixtral GGUF, Llama 3 70/8B Instruct GGUF, or API-based models like GPT-4/Claude 3) to achieve higher consistency in instruction following and synthesis.
 
- * **Advanced Retrieval**: Explore more sophisticated retrieval techniques (e.g., HyDE, MultiQueryRetriever, Re-ranking) to improve the quality of context chunks passed to the LLM, potentially reducing generation errors.
+ - **Advanced Retrieval**: Explore more sophisticated retrieval techniques (e.g., HyDE, MultiQueryRetriever, Re-ranking) to improve the quality of context chunks passed to the LLM, potentially reducing generation errors.
 
- * **Batch Processing for Analysis**: Re-implement batch processing for Phase 1 using techniques like `map_reduce` to handle large numbers of reviews that exceed the LLM's context window.
+ - **Batch Processing for Analysis**: Re-implement batch processing for Phase 1 using techniques like `map_reduce` to handle large numbers of reviews that exceed the LLM's context window.
 
- * **Error Handling & UI**: Add more granular error handling and user feedback in the Gradio UI (e.g., clearer messages if context loading fails).
+ - **Error Handling & UI**: Add more granular error handling and user feedback in the Gradio UI (e.g., clearer messages if context loading fails).
 
- * **Automated Testing**: Implement unit and integration tests using `pytest` for the core logic in `src/pipeline.py`.
+ - **Automated Testing**: Implement unit and integration tests using `pytest` for the core logic in `src/pipeline.py`.
 
  ---
 
@@ -238,11 +240,11 @@ Run the Gradio app:
  python app.py
  ```
 
- Access the interface in your browser:
+ Access the interface in your browser:
 
- * **Tab 1 ("Batch Analyzer"):** Paste reviews or upload a file to perform Summary, Aspect Extraction, and Sentiment Analysis. This does not affect the chatbot context.
+ - **Tab 1 ("Batch Analyzer"):** Paste reviews or upload a file to perform Summary, Aspect Extraction, and Sentiment Analysis. This does not affect the chatbot context.
 
- * **Tab 2 ("Ask a Question"):** Chat with the RAG bot. Use the file upload and "Update Chatbot Context" button within this tab to change the reviews the chatbot uses. Use "Reset Chatbot Context to Default" to revert to the built-in laptop reviews. Use "Reset Chat Memory" to clear the conversation history.
+ - **Tab 2 ("Ask a Question"):** Chat with the RAG bot. Use the file upload and "Update Chatbot Context" button within this tab to change the reviews the chatbot uses. Use "Reset Chatbot Context to Default" to revert to the built-in laptop reviews. Use "Reset Chat Memory" to clear the conversation history.
 
  ---
 
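The two-tab layout described above follows Gradio's standard `Blocks`/`Tab` pattern. A stripped-down sketch with hypothetical stand-in handlers (the real callbacks live in `app.py` and `src/pipeline.py`):

```python
import gradio as gr

# Hypothetical stand-ins for the repo's real analysis/chat callbacks.
def analyze(text):
    return f"(batch analysis of {len(text.splitlines())} review lines goes here)"

def chat(message, history):
    return history + [(message, "(RAG answer goes here)")], ""

with gr.Blocks() as demo:
    with gr.Tab("Batch Analyzer"):
        reviews_box = gr.Textbox(lines=8, label="Reviews")
        analysis_box = gr.Textbox(label="Analysis")
        gr.Button("Analyze").click(analyze, inputs=reviews_box, outputs=analysis_box)
    with gr.Tab("Ask a Question"):
        chatbot = gr.Chatbot()
        question_box = gr.Textbox(label="Question")
        question_box.submit(chat, inputs=[question_box, chatbot],
                            outputs=[chatbot, question_box])

demo.launch()
```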
@@ -266,30 +268,30 @@ ReviewSense/
 
  **Core Technologies**
 
- * Python 3.10+
+ - Python 3.10+
 
- * LangChain: Orchestration, Chains (ConversationalRetrievalChain), Memory, Prompts
+ - LangChain: Orchestration, Chains (ConversationalRetrievalChain), Memory, Prompts
 
- * llama-cpp-python: Local execution of GGUF models on CPU
+ - llama-cpp-python: Local execution of GGUF models on CPU
 
- * FAISS (faiss-cpu): Efficient vector similarity search
+ - FAISS (faiss-cpu): Efficient vector similarity search
 
- * Sentence-Transformers (all-MiniLM-L6-v2): Text embeddings
+ - Sentence-Transformers (all-MiniLM-L6-v2): Text embeddings
 
- * Gradio: Interactive web UI
+ - Gradio: Interactive web UI
 
- * PyTorch (dependency via transformers/sentence-transformers)
+ - PyTorch (dependency via transformers/sentence-transformers)
 
- * Pandas, NumPy (standard data handling)
+ - Pandas, NumPy (standard data handling)
 
  **Core LLM**
 
- * Mistral 7B Instruct v0.1 (GGUF Q4_K_M): Used for all NLP tasks (Analysis, RAG Generation, Intent Classification). Downloaded from TheBloke on Hugging Face.
+ - Mistral 7B Instruct v0.1 (GGUF Q4_K_M): Used for all NLP tasks (Analysis, RAG Generation, Intent Classification). Downloaded from TheBloke on Hugging Face.
 
  ---
 
  ## 📜 Version History
 
- * v2.0 (Current): RAG Chatbot, Single Mistral 7B model, Dynamic Context, Memory, Guardrails, Gradio UI, Code Refactoring.
+ - v2.0 (Current): RAG Chatbot, Single Mistral 7B model, Dynamic Context, Memory, Guardrails, Gradio UI, Code Refactoring.
 
- * v1.0: [https://github.com/DanielKiani/ReviewSense/releases/tag/v1.0] - Initial Batch Analysis Engine using multiple specialized models (DistilBERT, DistilBART, etc.). Focused on Sentiment, Aspects, and Summarization. (See v1.0 README for full details).
+ - v1.0: [https://github.com/DanielKiani/ReviewSense/releases/tag/v1.0] - Initial Batch Analysis Engine using multiple specialized models (DistilBERT, DistilBART, etc.). Focused on Sentiment, Aspects, and Summarization. (See v1.0 README for full details).
 
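Tying together the FAISS and MiniLM entries above, rebuilding the chatbot context from an uploaded `.txt` or `.csv` can be sketched as follows; `build_context` is a hypothetical helper, and the assumption that a CSV keeps its reviews in the first column is mine, not the repo's:

```python
import pandas as pd
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

def build_context(path: str, embeddings) -> FAISS:
    """Hypothetical helper: embed an uploaded .txt (one review per line) or .csv."""
    if path.endswith(".csv"):
        # Assumption: reviews sit in the first column.
        texts = pd.read_csv(path).iloc[:, 0].astype(str).tolist()
    else:
        with open(path, encoding="utf-8") as f:
            texts = [line.strip() for line in f if line.strip()]
    return FAISS.from_texts(texts, embeddings)

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_store = build_context("uploaded_reviews.csv", embeddings)
```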
requirements.txt CHANGED
@@ -1,4 +1,3 @@
- <<<<<<< HEAD
  langchain==0.3.27
  langchain-community==0.3.31
  gradio==5.49.1
@@ -14,17 +13,4 @@ datasets==4.0.0
  numpy==2.0.2
  accelerate==1.11.0
  aiohttp==3.13.1
- huggingface-hub==0.35.3
- =======
- torch==2.8.0
- transformers==4.56.1
- pytorch-lightning==2.5.5
- torchmetrics==1.8.2
- sentencepiece==0.2.1
- pandas==2.2.2
- scikit-learn==1.6.1
- gradio==5.44.1
- matplotlib==3.10.0
- seaborn==0.13.2
- wordcloud==1.9.4
- >>>>>>> e6de3c4338f79386345fa6e4bba5b0666ad808da
+ huggingface-hub==0.35.3
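This commit exists because markers like the `<<<<<<< HEAD` above were committed verbatim. A throwaway check (not part of the repo) for verifying the tree is clean of leftover markers:

```python
from pathlib import Path

MARKERS = ("<<<<<<<", "=======", ">>>>>>>")

def find_conflict_markers(root: str = ".") -> None:
    """Print any line that still begins with a Git conflict marker."""
    for path in Path(root).rglob("*"):
        if path.suffix in {".py", ".txt", ".md"} and path.is_file():
            for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), start=1):
                # Note: may also flag Markdown "=======" underlines; review hits manually.
                if line.startswith(MARKERS):
                    print(f"{path}:{lineno}: {line.strip()}")

find_conflict_markers()
```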
scripts/app.py CHANGED
@@ -1,4 +1,3 @@
- <<<<<<< HEAD
  # app.py
 
  import gradio as gr
@@ -278,166 +277,4 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
  # --- Launch Command ---
  if __name__ == "__main__":
      chat_memory.clear() # Clear memory each time app starts
-     demo.launch(debug=True)
- =======
- import gradio as gr
- import os
- import torch
- import pandas as pd
- import re
-
- # --- IMPORTANT ---
- # This script assumes you have a 'models.py' file in the same directory
- # containing the definitions for all model and inference classes.
- try:
-     from models import (
-         ReviewSummarizer,
-         AspectAnalyzer,
-         AspectExtractor,
-         FineTunedSentimentClassifier
-     )
- except ImportError:
-     print("CRITICAL ERROR: Make sure 'models.py' exists and contains the required classes.")
-     # Define dummy classes if imports fail, so Gradio can at least launch with an error message.
-     class ReviewSummarizer: pass
-     class AspectAnalyzer: pass
-     class AspectExtractor: pass
-     class FineTunedSentimentClassifier: pass
-
- # --- Configuration ---
- # --- IMPORTANT: UPDATE THIS PATH ---
- # You need to provide the path to the best checkpoint file that was saved
- # during the training of your sentiment model.
- SENTIMENT_CHECKPOINT_PATH = "checkpoints/sentiment-binary-best-checkpoint.ckpt" # <-- CHANGE THIS
-
- # --- Pre-defined Aspect Dictionaries for Different Product Categories ---
- ASPECT_DICTIONARIES = {
-     "Phone": ['camera', 'battery', 'battery life', 'screen', 'performance', 'price', 'design'],
-     "Coffee Maker": ['ease of use', 'design', 'noise level', 'coffee quality', 'brew time', 'cleaning'],
-     "Book": ['plot', 'characters', 'writing style', 'pacing', 'ending'],
-     "Default": ['quality', 'price', 'service', 'design', 'features'] # A fallback list
- }
-
-
- # --- 1. Load All Models (Global Objects) ---
- print("--- Initializing all models for the Gradio App ---")
- sentiment_classifier, summarizer, aspect_analyzer, aspect_extractor = None, None, None, None
- try:
-     summarizer = ReviewSummarizer(force_cpu=True)
-     aspect_analyzer = AspectAnalyzer(force_cpu=True)
-     aspect_extractor = AspectExtractor(force_cpu=True)
-
-     if not os.path.exists(SENTIMENT_CHECKPOINT_PATH):
-         print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!")
-         print("!!! WARNING: Sentiment checkpoint path not found or not set. !!!")
-         print(f"!!! Please update the 'SENTIMENT_CHECKPOINT_PATH' variable in app.py")
-         print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!")
-     else:
-         sentiment_classifier = FineTunedSentimentClassifier(
-             checkpoint_path=SENTIMENT_CHECKPOINT_PATH, force_cpu=True
-         )
-     print("\n--- All models loaded successfully ---\n")
- except Exception as e:
-     print(f"An error occurred during model initialization: {e}")
-
-
- # --- 2. Define the Core Analysis Function ---
- def analyze_review(review_text, product_category):
-     if not review_text:
-         return {"ERROR": "Please enter a review."}, "", None
-
-     # --- a. Overall Sentiment Analysis ---
-     if sentiment_classifier:
-         sentiment_result = sentiment_classifier.classify(review_text)
-         sentiment_output = {
-             sentiment_result['label']: f"{sentiment_result['score']:.2f}"
-         }
-     else:
-         sentiment_output = {"ERROR": "Fine-tuned model not loaded. Check path."}
-
-     # --- b. Review Summarization ---
-     if summarizer:
-         summary_output = summarizer.summarize(review_text)
-     else:
-         summary_output = "ERROR: Summarizer model not loaded."
-
-     # --- c. Dynamic Aspect Extraction & Analysis ---
-     aspect_df = None
-     if aspect_extractor and aspect_analyzer:
-         aspect_dictionary = ASPECT_DICTIONARIES.get(product_category, ASPECT_DICTIONARIES["Default"])
-         extracted_aspects = aspect_extractor.extract(review_text, aspect_dictionary=aspect_dictionary)
-
-         if extracted_aspects:
-             aspect_results = aspect_analyzer.analyze(review_text, extracted_aspects)
-             aspect_df = pd.DataFrame([
-                 {'Aspect': aspect, 'Sentiment': result['sentiment'], 'Score': f"{result['score']:.2f}"}
-                 for aspect, result in aspect_results.items()
-             ])
-
-     return sentiment_output, summary_output, aspect_df
-
-
- # --- 3. Build the Gradio Interface ---
- with gr.Blocks(theme=gr.themes.Soft()) as demo:
-     gr.Markdown("# 🛍️ ReviewSense: Product Review Analysis Engine")
-     gr.Markdown(
-         "Enter a product review and select the product category. The tool will automatically "
-         "detect relevant features and provide an overall sentiment score, a summary, and a "
-         "breakdown of sentiment towards each feature."
-     )
-
-     with gr.Row():
-         with gr.Column(scale=2):
-             review_input = gr.Textbox(
-                 lines=10,
-                 label="Enter Product Review Here",
-                 placeholder="e.g., The camera is amazing, but the battery life is terrible..."
-             )
-             category_input = gr.Dropdown(
-                 choices=list(ASPECT_DICTIONARIES.keys()),
-                 label="Select Product Category",
-                 value="Phone"
-             )
-             analyze_button = gr.Button("Analyze Review", variant="primary")
-
-         with gr.Column(scale=1):
-             gr.Markdown("### Overall Sentiment")
-             sentiment_output = gr.Label()
-
-             gr.Markdown("### Generated Summary")
-             summary_output = gr.Textbox(lines=5, label="Summary", interactive=False)
-
-             gr.Markdown("### Detected Aspect Sentiments")
-             aspect_output = gr.DataFrame(headers=["Aspect", "Sentiment", "Score"], label="Aspects", interactive=False)
-
-     # Connect the button to the function
-     analyze_button.click(
-         fn=analyze_review,
-         inputs=[review_input, category_input],
-         outputs=[sentiment_output, summary_output, aspect_output]
-     )
-
-     gr.Examples(
-         examples=[
-             [
-                 "The camera on this phone is incredible, the pictures are professional quality. However, the battery life is a total disaster, it barely lasts half a day with light use. The screen is bright and responsive, which I love.",
-                 "Phone"
-             ],
-             [
-                 "I am absolutely in love with this coffee maker! It's incredibly easy to use, brews a perfect cup every single time, and the design looks fantastic on my countertop. It's also surprisingly quiet.",
-                 "Coffee Maker"
-             ],
-             [
-                 "An amazing story with characters that felt so real. The plot had me hooked from the first page, though I felt the ending was a bit rushed.",
-                 "Book"
-             ]
-         ],
-         inputs=[review_input, category_input]
-     )
-
-
- # --- 4. Launch the App ---
- if __name__ == "__main__":
-     print("Launching Gradio App...")
-     demo.launch()
- >>>>>>> e6de3c4338f79386345fa6e4bba5b0666ad808da
+     demo.launch(debug=True)
scripts/main.py CHANGED
@@ -1,4 +1,3 @@
- <<<<<<< HEAD
  # main.py
 
  import torch
@@ -210,115 +209,4 @@ if __name__ == "__main__":
          break
      print("\n--- Chat session ended. ---")
 
-     print("\n--- Local Execution Finished ---")
- =======
- import os
- import torch
- import pandas as pd
-
- try:
-     from data_prepare import ReviewDataset, ReviewDataModule
-     from models import SentimentClassifier, ReviewSummarizer, AspectAnalyzer, FineTunedSentimentClassifier, AspectExtractor
- except ImportError:
-     print("CRITICAL ERROR: Make sure 'review_summarizer.py', 'aspect_extractor.py', and 'sentiment_classifier_model.py' are in the same directory.")
-     exit()
-
- # --- Configuration ---
- # --- IMPORTANT: UPDATE THIS PATH ---
- # You need to provide the path to the best checkpoint file that was saved
- # during the training of your sentiment model.
- SENTIMENT_CHECKPOINT_PATH = "checkpoints/sentiment-binary-best-checkpoint.ckpt"
-
- # --- Pre-defined Aspect Dictionaries for Different Product Categories ---
- ASPECT_DICTIONARIES = {
-     "Phone": ['camera', 'battery', 'battery life', 'screen', 'performance', 'price', 'design'],
-     "Coffee Maker": ['ease of use', 'design', 'noise level', 'coffee quality', 'brew time', 'cleaning'],
-     "Book": ['plot', 'characters', 'writing style', 'pacing', 'ending'],
-     "Default": ['quality', 'price', 'service', 'design', 'features'] # A fallback list
- }
-
- def main():
-     """
-     Main function to run the command-line review analysis tool.
-     """
-     # --- 1. Load All Models ---
-     print("--- Initializing all models ---")
-     sentiment_classifier, summarizer, aspect_analyzer, aspect_extractor = None, None, None, None
-     try:
-         summarizer = ReviewSummarizer(force_cpu=True)
-         aspect_analyzer = AspectAnalyzer(force_cpu=True)
-         aspect_extractor = AspectExtractor(force_cpu=True)
-
-         if not os.path.exists(SENTIMENT_CHECKPOINT_PATH):
-             print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!")
-             print("!!! WARNING: Sentiment checkpoint path not found or not set. !!!")
-             print(f"!!! Please update the 'SENTIMENT_CHECKPOINT_PATH' variable in main.py")
-             print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!")
-         else:
-             sentiment_classifier = FineTunedSentimentClassifier(
-                 checkpoint_path=SENTIMENT_CHECKPOINT_PATH, force_cpu=True
-             )
-         print("\n--- All models loaded successfully ---\n")
-     except Exception as e:
-         print(f"An error occurred during model initialization: {e}")
-         return
-
-     # --- 2. Interactive Loop ---
-     while True:
-         print("\n==================================================")
-         print(" Product Review Analysis Tool ")
-         print("==================================================")
-
-         # Get user input
-         review_text = input("Enter the product review text (or type 'quit' to exit):\n> ")
-         if review_text.lower() == 'quit':
-             break
-
-         print("\nAvailable Product Categories:")
-         for i, category in enumerate(ASPECT_DICTIONARIES.keys(), 1):
-             print(f"{i}. {category}")
-
-         category_choice = input(f"Select a product category (1-{len(ASPECT_DICTIONARIES)}):\n> ")
-         try:
-             category_idx = int(category_choice) - 1
-             product_category = list(ASPECT_DICTIONARIES.keys())[category_idx]
-         except (ValueError, IndexError):
-             print("Invalid choice. Using 'Default' category.")
-             product_category = "Default"
-
-         # --- 3. Run Analysis ---
-         print("\n--- Analyzing Review... ---")
-
-         # a. Overall Sentiment
-         sentiment_result = sentiment_classifier.classify(review_text)
-
-         # b. Summary
-         summary_result = summarizer.summarize(review_text)
-
-         # c. Aspect Extraction and Analysis
-         aspect_dictionary = ASPECT_DICTIONARIES.get(product_category)
-         extracted_aspects = aspect_extractor.extract(review_text, aspect_dictionary)
-         aspect_results = None
-         if extracted_aspects:
-             aspect_results = aspect_analyzer.analyze(review_text, extracted_aspects)
-
-         # --- 4. Display Results ---
-         print("\n-------------------- ANALYSIS RESULTS --------------------")
-         print(f"\n[ Overall Sentiment ]")
-         print(f" - Sentiment: {sentiment_result['label']} (Score: {sentiment_result['score']:.2f})")
-
-         print(f"\n[ Generated Summary ]")
-         print(f" - {summary_result}")
-
-         print(f"\n[ Detected Aspect Sentiments ]")
-         if aspect_results:
-             for aspect, result in aspect_results.items():
-                 print(f" - {aspect.title()}: {result['sentiment']} (Score: {result['score']:.2f})")
-         else:
-             print(" - No relevant aspects from the dictionary were detected in the review.")
-         print("----------------------------------------------------------")
-
-
- if __name__ == "__main__":
-     main()
- >>>>>>> e6de3c4338f79386345fa6e4bba5b0666ad808da
+     print("\n--- Local Execution Finished ---")