---
license: mit
title: 'QA Bug Triage Pipeline'
sdk: gradio
emoji: 🏆
colorFrom: red
colorTo: gray
---

# 🐛 QA Bug Triage Pipeline

> A modern RAG workflow for turning messy app reviews into structured, searchable QA bug intelligence.

[![Python](https://img.shields.io/badge/Python-3.10+-blue?style=flat-square&logo=python&logoColor=white)](https://python.org)
[![OpenAI](https://img.shields.io/badge/GPT--4o-OpenAI-412991?style=flat-square&logo=openai&logoColor=white)](https://openai.com)
[![Gradio](https://img.shields.io/badge/Gradio-UI-orange?style=flat-square&logo=gradio&logoColor=white)](https://gradio.app)
[![ChromaDB](https://img.shields.io/badge/ChromaDB-Vector%20Store-teal?style=flat-square)](https://trychroma.com)
[![License](https://img.shields.io/badge/License-MIT-green?style=flat-square)](LICENSE)

**🔗 Links:** [Hugging Face Demo](https://huggingface.co/spaces/aiqualitylab/qa-bug-triage) · [GitHub Repository](https://github.com/aiqualitylab/qa-bug-triage)

---

## 📖 Overview

Teams often receive product feedback as noisy, repetitive, and unstructured review text. This project converts those reviews into structured bug reports with an LLM, stores them in a local vector database, and makes them easy to search and summarize.

The result is a lightweight **bug triage assistant** built with Python, Gradio, OpenAI, ChromaDB, and RAG evaluation tooling.

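
To make the "structured bug report" idea concrete, here is a minimal sketch of how an LLM's JSON reply could be validated and flattened into text for indexing. The field names (`title`, `severity`, `component`, `summary`) and both helper functions are hypothetical, chosen only for illustration; the real schema lives in `triage.py` and may differ. Rejecting incomplete JSON before storage is one plausible way to keep records consistent.

```python
import json
from dataclasses import dataclass

# Hypothetical schema: these field names are illustrative assumptions,
# not necessarily the fields that triage.py emits.
REQUIRED_FIELDS = ("title", "severity", "component", "summary")


@dataclass
class BugRecord:
    title: str
    severity: str
    component: str
    summary: str


def parse_bug_record(raw_json: str) -> BugRecord:
    """Parse an LLM's JSON reply into a BugRecord, rejecting incomplete output."""
    data = json.loads(raw_json)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return BugRecord(**{name: str(data[name]) for name in REQUIRED_FIELDS})


def to_document(record: BugRecord) -> str:
    """Flatten a record into a single string for embedding and keyword search."""
    return f"[{record.severity}] {record.component}: {record.title}. {record.summary}"
```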
---

## ✨ What It Does

| Capability | Description |
|---|---|
| 📥 Review collection | Fetches real Google Play reviews |
| 🔀 Query routing | Classifies incoming text before triage |
| 🗂️ Structured triage | Generates JSON bug reports with consistent fields |
| 🔍 Hybrid retrieval | Combines semantic retrieval with BM25 keyword matching |
| 🤖 AI summaries | Produces concise summaries for triage and search results |
| 🗑️ Store reset | Clears persisted bugs directly from the UI |

---

## 🏗️ Architecture

```
Google Play Reviews
        │
        ▼
┌──────────────┐
│ Query Router │ ──→ feature request / general complaint (dropped)
└──────────────┘
        │ bug report
        ▼
┌──────────────┐
│    Triage    │ ──→ structured JSON bug record
└──────────────┘
        │
        ▼
┌──────────────┐
│   ChromaDB   │ ──→ vector + BM25 hybrid index
└──────────────┘
        │
        ▼
┌──────────────┐
│  AI Summary  │ ──→ concise triage output
└──────────────┘
```

---

## 🚀 Quick Start

```powershell
# Windows PowerShell
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
python app.py
```

Then open the local Gradio URL in your browser.

---

## 🔑 API Keys

This app uses **BYOK (Bring Your Own Key)**:

- Paste your OpenAI API key into the masked field in the UI
- The key input is masked and never committed to the repository

> ⚠️ **Never commit API keys to source control.**

---

## 🖥️ How To Use

1. **Collect**: fetch and triage live Google Play reviews
2. **Triage**: analyze a single custom review
3. **Search**: retrieve similar bugs via hybrid retrieval
4. **Clear bugs**: reset the ChromaDB store

---

## 📁 Project Structure

```
qa-bug-triage/
├── app.py                   # Gradio app and interaction flows
├── collect.py               # Google Play review collection
├── triage.py                # Routing and structured triage logic
├── rag.py                   # Chroma storage and hybrid retrieval
└── eval/
    ├── eval.py              # RAG evaluation script
    ├── eval_dataset.json    # Evaluation dataset
    └── results.json         # Latest saved evaluation metrics
```

---

## 📊 Evaluation

Run the evaluation suite:

```powershell
python eval\eval.py --api-key YOUR_OPENAI_API_KEY
```

**Latest results:**

| Metric | Score |
|---|---|
| Answer Relevancy | `0.868` |
| Faithfulness | `0.292` |
| Context Precision | `0.020` |

---

## 💰 Cost Estimate

**Target:** under `$0.50` for a short demo session.

| Parameter | Value |
|---|---|
| Token range | ~8k – 20k tokens |
| Typical cost | < $0.50 per session |
| Recommended max reviews | 5 – 10 |

**Tips to keep costs low:**

- Keep max reviews between 5 and 10
- Avoid repeated large collect runs
- Use short test inputs for manual triage validation

---

## 🛠️ Tech Stack

| Tool | Role |
|---|---|
| [Python](https://python.org) | Core language |
| [Gradio](https://gradio.app) | Web UI |
| [OpenAI GPT-4o](https://openai.com) | LLM for triage and summaries |
| [ChromaDB](https://trychroma.com) | Vector store |
| [rank-bm25](https://github.com/dorianbrown/rank_bm25) | Keyword retrieval |
| [RAGAS](https://docs.ragas.io) | RAG evaluation framework |
| [google-play-scraper](https://github.com/JoMingyu/google-play-scraper) | Review data source |

---

## ✅ Functionalities Implemented

### Requirements covered

- [x] RAG project written in Python
- [x] Uses at least one LLM
- [x] Public repository with collection and curation scripts
- [x] README with project explanation and setup
- [x] BYOK input in the UI (see [API Keys](#-api-keys))
- [x] Cost estimate included (see [Cost Estimate](#-cost-estimate))
- [x] API key requirements listed (see [API Keys](#-api-keys))
- [x] More than 5 optional techniques covered (7 total, listed below)

### Techniques implemented

- [x] Streaming responses in the UI: `app.py`
- [x] Dynamic few-shot prompting using similar bugs: `triage.py`
- [x] Evaluation code and dataset included: `eval/eval.py`, `eval/eval_dataset.json`
- [x] Domain-specific app for QA bug triage: `triage.py`, `app.py`
- [x] Structured JSON data curation for RAG: `triage.py`
- [x] Hybrid retrieval with semantic search and BM25: `rag.py`
- [x] Query routing in the active app flow: `triage.py`

---

## 📄 License

MIT © [aiqualitylab](https://github.com/aiqualitylab)