aiqualitylab committed
Commit a00a036 · verified · 1 Parent(s): c10c63a

Update README.md

Files changed (1): README.md +197 -189
README.md CHANGED
@@ -1,190 +1,198 @@
- # 🐛 QA Bug Triage Pipeline
-
- > A modern RAG workflow for turning messy app reviews into structured, searchable QA bug intelligence.
-
- [![Python](https://img.shields.io/badge/Python-3.10+-blue?style=flat-square&logo=python&logoColor=white)](https://python.org)
- [![OpenAI](https://img.shields.io/badge/GPT--4o-OpenAI-412991?style=flat-square&logo=openai&logoColor=white)](https://openai.com)
- [![Gradio](https://img.shields.io/badge/Gradio-UI-orange?style=flat-square&logo=gradio&logoColor=white)](https://gradio.app)
- [![ChromaDB](https://img.shields.io/badge/ChromaDB-Vector%20Store-teal?style=flat-square)](https://trychroma.com)
- [![License](https://img.shields.io/badge/License-MIT-green?style=flat-square)](LICENSE)
-
- **🔗 Links:** [Hugging Face Demo](https://huggingface.co/spaces/aiqualitylab/qa-bug-triage) · [GitHub Repository](https://github.com/aiqualitylab/qa-bug-triage)
-
- ---
-
- ## 📖 Overview
-
- Teams often receive product feedback as noisy, repetitive, and unstructured review text. This project converts those reviews into structured bug reports with an LLM, stores them in a local vector database, and makes them easy to search and summarize.
-
- The result is a lightweight **bug triage assistant** built with Python, Gradio, OpenAI, ChromaDB, and RAG evaluation tooling.
-
- ---
-
- ## ✨ What It Does
-
- | Capability | Description |
- |---|---|
- | 📥 Review collection | Fetches real Google Play reviews |
- | 🔀 Query routing | Classifies incoming text before triage |
- | 🗂️ Structured triage | Generates JSON bug reports with consistent fields |
- | 🔍 Hybrid retrieval | Combines semantic retrieval with BM25 keyword matching |
- | 🤖 AI summaries | Produces concise summaries for triage and search results |
- | 🗑️ Store reset | Clears persisted bugs directly from the UI |
-
- ---
-
- ## 🏗️ Architecture
-
- ```
- Google Play Reviews
-         │
-         ▼
- ┌──────────────┐
- │ Query Router │ ──→ feature request / general complaint (dropped)
- └──────────────┘
-         │ bug report
-         ▼
- ┌──────────────┐
- │    Triage    │ ──→ structured JSON bug record
- └──────────────┘
-         │
-         ▼
- ┌──────────────┐
- │   ChromaDB   │ ──→ vector + BM25 hybrid index
- └──────────────┘
-         │
-         ▼
- ┌──────────────┐
- │  AI Summary  │ ──→ concise triage output
- └──────────────┘
- ```
-
- ---
-
- ## 🚀 Quick Start
-
- ```powershell
- # Windows PowerShell
- python -m venv .venv
- .\.venv\Scripts\Activate.ps1
- pip install -r requirements.txt
- python app.py
- ```
-
- Then open the local Gradio URL in your browser.
-
- ---
-
- ## 🔑 API Keys
-
- This app uses **BYOK (Bring Your Own Key)**:
-
- - Paste your OpenAI API key into the masked field in the UI
- - The key input is masked and never committed to the repository
-
- > ⚠️ **Never commit API keys to source control.**
-
- ---
-
- ## 🖥️ How To Use
-
- 1. **Collect** — fetch and triage live Google Play reviews
- 2. **Triage** — analyze a single custom review
- 3. **Search** — retrieve similar bugs via hybrid retrieval
- 4. **Clear bugs** — reset the ChromaDB store
-
- ---
-
- ## 📁 Project Structure
-
- ```
- qa-bug-triage/
- ├── app.py  # Gradio app and interaction flows
- ├── collect.py  # Google Play review collection
- ├── triage.py  # Routing and structured triage logic
- ├── rag.py  # Chroma storage and hybrid retrieval
- └── eval/
-     ├── eval.py  # RAG evaluation script
-     ├── eval_dataset.json  # Evaluation dataset
-     └── results.json  # Latest saved evaluation metrics
- ```
-
- ---
-
- ## 📊 Evaluation
-
- Run the evaluation suite:
-
- ```powershell
- python eval\eval.py --api-key YOUR_OPENAI_API_KEY
- ```
-
- **Latest results:**
-
- | Metric | Score |
- |---|---|
- | Answer Relevancy | `0.868` |
- | Faithfulness | `0.292` |
- | Context Precision | `0.020` |
-
- ---
-
- ## 💰 Cost Estimate
-
- **Target:** under `$0.50` for a short demo session.
-
- | Parameter | Value |
- |---|---|
- | Token range | ~8k – 20k tokens |
- | Typical cost | < $0.50 per session |
- | Recommended max reviews | 5 – 10 |
-
- **Tips to keep costs low:**
- - Keep max reviews between 5 and 10
- - Avoid repeated large collect runs
- - Use short test inputs for manual triage validation
-
- ---
-
- ## 🛠️ Tech Stack
-
- | Tool | Role |
- |---|---|
- | [Python](https://python.org) | Core language |
- | [Gradio](https://gradio.app) | Web UI |
- | [OpenAI GPT-4o](https://openai.com) | LLM for triage and summaries |
- | [ChromaDB](https://trychroma.com) | Vector store |
- | [rank-bm25](https://github.com/dorianbrown/rank_bm25) | Keyword retrieval |
- | [RAGAS](https://docs.ragas.io) | RAG evaluation framework |
- | [google-play-scraper](https://github.com/JoMingyu/google-play-scraper) | Review data source |
-
- ---
-
- ## ✅ Functionalities Implemented
-
- ### Requirements covered
-
- - [x] RAG project written in Python
- - [x] Uses at least one LLM
- - [x] Public repository with collection and curation scripts
- - [x] README with project explanation and setup
- - [x] BYOK input in the UI — see [API Keys](#-api-keys)
- - [x] Cost estimate included — see [Cost Estimate](#-cost-estimate)
- - [x] API key requirements listed — see [API Keys](#-api-keys)
- - [x] More than 5 optional techniques covered (7 total — see below)
-
- ### Techniques implemented
-
- - [x] Streaming responses in the UI — `app.py`
- - [x] Dynamic few-shot prompting using similar bugs — `triage.py`
- - [x] Evaluation code and dataset included — `eval/eval.py`, `eval/eval_dataset.json`
- - [x] Domain-specific app for QA bug triage — `triage.py`, `app.py`
- - [x] Structured JSON data curation for RAG — `triage.py`
- - [x] Hybrid retrieval with semantic search and BM25 — `rag.py`
- - [x] Query routing in the active app flow — `triage.py`
-
- ---
-
- ## 📄 License
-
  MIT © [aiqualitylab](https://github.com/aiqualitylab)
 
+ ---
+ license: mit
+ title: ' QA Bug Triage Pipeline'
+ sdk: gradio
+ emoji: 🏆
+ colorFrom: red
+ colorTo: gray
+ ---
+ # 🐛 QA Bug Triage Pipeline
+
+ > A modern RAG workflow for turning messy app reviews into structured, searchable QA bug intelligence.
+
+ [![Python](https://img.shields.io/badge/Python-3.10+-blue?style=flat-square&logo=python&logoColor=white)](https://python.org)
+ [![OpenAI](https://img.shields.io/badge/GPT--4o-OpenAI-412991?style=flat-square&logo=openai&logoColor=white)](https://openai.com)
+ [![Gradio](https://img.shields.io/badge/Gradio-UI-orange?style=flat-square&logo=gradio&logoColor=white)](https://gradio.app)
+ [![ChromaDB](https://img.shields.io/badge/ChromaDB-Vector%20Store-teal?style=flat-square)](https://trychroma.com)
+ [![License](https://img.shields.io/badge/License-MIT-green?style=flat-square)](LICENSE)
+
+ **🔗 Links:** [Hugging Face Demo](https://huggingface.co/spaces/aiqualitylab/qa-bug-triage) · [GitHub Repository](https://github.com/aiqualitylab/qa-bug-triage)
+
+ ---
+
+ ## 📖 Overview
+
+ Teams often receive product feedback as noisy, repetitive, and unstructured review text. This project converts those reviews into structured bug reports with an LLM, stores them in a local vector database, and makes them easy to search and summarize.
+
+ The result is a lightweight **bug triage assistant** built with Python, Gradio, OpenAI, ChromaDB, and RAG evaluation tooling.
+
+ ---
+
+ ## ✨ What It Does
+
+ | Capability | Description |
+ |---|---|
+ | 📥 Review collection | Fetches real Google Play reviews |
+ | 🔀 Query routing | Classifies incoming text before triage |
+ | 🗂️ Structured triage | Generates JSON bug reports with consistent fields |
+ | 🔍 Hybrid retrieval | Combines semantic retrieval with BM25 keyword matching |
+ | 🤖 AI summaries | Produces concise summaries for triage and search results |
+ | 🗑️ Store reset | Clears persisted bugs directly from the UI |
+
+ ---
+
+ ## 🏗️ Architecture
+
+ ```
+ Google Play Reviews
+         │
+         ▼
+ ┌──────────────┐
+ │ Query Router │ ──→ feature request / general complaint (dropped)
+ └──────────────┘
+         │ bug report
+         ▼
+ ┌──────────────┐
+ │    Triage    │ ──→ structured JSON bug record
+ └──────────────┘
+         │
+         ▼
+ ┌──────────────┐
+ │   ChromaDB   │ ──→ vector + BM25 hybrid index
+ └──────────────┘
+         │
+         ▼
+ ┌──────────────┐
+ │  AI Summary  │ ──→ concise triage output
+ └──────────────┘
+ ```
+
+ ---
+
+ ## 🚀 Quick Start
+
+ ```powershell
+ # Windows PowerShell
+ python -m venv .venv
+ .\.venv\Scripts\Activate.ps1
+ pip install -r requirements.txt
+ python app.py
+ ```
+
+ Then open the local Gradio URL in your browser.
+
+ ---
+
+ ## 🔑 API Keys
+
+ This app uses **BYOK (Bring Your Own Key)**:
+
+ - Paste your OpenAI API key into the masked field in the UI
+ - The key input is masked and never committed to the repository
+
+ > ⚠️ **Never commit API keys to source control.**
+
+ ---
+
+ ## 🖥️ How To Use
+
+ 1. **Collect** — fetch and triage live Google Play reviews
+ 2. **Triage** — analyze a single custom review
+ 3. **Search** — retrieve similar bugs via hybrid retrieval
+ 4. **Clear bugs** — reset the ChromaDB store
+
+ ---
+
+ ## 📁 Project Structure
+
+ ```
+ qa-bug-triage/
+ ├── app.py  # Gradio app and interaction flows
+ ├── collect.py  # Google Play review collection
+ ├── triage.py  # Routing and structured triage logic
+ ├── rag.py  # Chroma storage and hybrid retrieval
+ └── eval/
+     ├── eval.py  # RAG evaluation script
+     ├── eval_dataset.json  # Evaluation dataset
+     └── results.json  # Latest saved evaluation metrics
+ ```
+
+ ---
+
+ ## 📊 Evaluation
+
+ Run the evaluation suite:
+
+ ```powershell
+ python eval\eval.py --api-key YOUR_OPENAI_API_KEY
+ ```
+
+ **Latest results:**
+
+ | Metric | Score |
+ |---|---|
+ | Answer Relevancy | `0.868` |
+ | Faithfulness | `0.292` |
+ | Context Precision | `0.020` |
+
+ ---
+
+ ## 💰 Cost Estimate
+
+ **Target:** under `$0.50` for a short demo session.
+
+ | Parameter | Value |
+ |---|---|
+ | Token range | ~8k – 20k tokens |
+ | Typical cost | < $0.50 per session |
+ | Recommended max reviews | 5 – 10 |
+
+ **Tips to keep costs low:**
+ - Keep max reviews between 5 and 10
+ - Avoid repeated large collect runs
+ - Use short test inputs for manual triage validation
+
+ ---
+
+ ## 🛠️ Tech Stack
+
+ | Tool | Role |
+ |---|---|
+ | [Python](https://python.org) | Core language |
+ | [Gradio](https://gradio.app) | Web UI |
+ | [OpenAI GPT-4o](https://openai.com) | LLM for triage and summaries |
+ | [ChromaDB](https://trychroma.com) | Vector store |
+ | [rank-bm25](https://github.com/dorianbrown/rank_bm25) | Keyword retrieval |
+ | [RAGAS](https://docs.ragas.io) | RAG evaluation framework |
+ | [google-play-scraper](https://github.com/JoMingyu/google-play-scraper) | Review data source |
+
+ ---
+
+ ## ✅ Functionalities Implemented
+
+ ### Requirements covered
+
+ - [x] RAG project written in Python
+ - [x] Uses at least one LLM
+ - [x] Public repository with collection and curation scripts
+ - [x] README with project explanation and setup
+ - [x] BYOK input in the UI — see [API Keys](#-api-keys)
+ - [x] Cost estimate included — see [Cost Estimate](#-cost-estimate)
+ - [x] API key requirements listed — see [API Keys](#-api-keys)
+ - [x] More than 5 optional techniques covered (7 total — see below)
+
+ ### Techniques implemented
+
+ - [x] Streaming responses in the UI — `app.py`
+ - [x] Dynamic few-shot prompting using similar bugs — `triage.py`
+ - [x] Evaluation code and dataset included — `eval/eval.py`, `eval/eval_dataset.json`
+ - [x] Domain-specific app for QA bug triage — `triage.py`, `app.py`
+ - [x] Structured JSON data curation for RAG — `triage.py`
+ - [x] Hybrid retrieval with semantic search and BM25 — `rag.py`
+ - [x] Query routing in the active app flow — `triage.py`
+
+ ---
+
+ ## 📄 License
+
  MIT © [aiqualitylab](https://github.com/aiqualitylab)
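
The README in this diff says hybrid retrieval combines semantic search with BM25 keyword matching, but it does not say how the two result lists are merged. One common option is reciprocal rank fusion (RRF). The sketch below assumes that choice and is not taken from `rag.py`; the function name, the constant `k=60`, and the bug ids are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document scores 1 / (k + rank) in every list it appears in;
    the per-list scores are summed. This is an assumed fusion rule,
    not necessarily what the repository does.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result orders from the two retrievers.
semantic_hits = ["bug-12", "bug-07", "bug-31"]  # e.g. vector-store order
keyword_hits = ["bug-07", "bug-44", "bug-12"]   # e.g. BM25 order

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

Because RRF works on ranks rather than raw scores, it avoids having to normalise cosine distances against BM25 scores, which live on different scales.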