qa-bug-triage / README.md
aiqualitylab's picture
Update README.md
a00a036 verified

A newer version of the Gradio SDK is available: 6.16.0

Upgrade
metadata
license: mit
title: ' QA Bug Triage Pipeline'
sdk: gradio
emoji: πŸ†
colorFrom: red
colorTo: gray

πŸ› QA Bug Triage Pipeline

A modern RAG workflow for turning messy app reviews into structured, searchable QA bug intelligence.

Python OpenAI Gradio ChromaDB License

πŸ”— Links: Hugging Face Demo Β· GitHub Repository


πŸ“– Overview

Teams often receive product feedback as noisy, repetitive, and unstructured review text. This project converts those reviews into structured bug reports with an LLM, stores them in a local vector database, and makes them easy to search and summarize.

The result is a lightweight bug triage assistant built with Python, Gradio, OpenAI, ChromaDB, and RAG evaluation tooling.


✨ What It Does

Capability Description
πŸ“₯ Review collection Fetches real Google Play reviews
πŸ”€ Query routing Classifies incoming text before triage
πŸ—‚οΈ Structured triage Generates JSON bug reports with consistent fields
πŸ” Hybrid retrieval Combines semantic retrieval with BM25 keyword matching
πŸ€– AI summaries Produces concise summaries for triage and search results
πŸ—‘οΈ Store reset Clears persisted bugs directly from the UI

πŸ—οΈ Architecture

Google Play Reviews
        β”‚
        β–Ό
  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚ Query Router β”‚  ──→  feature request / general complaint (dropped)
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        β”‚ bug report
        β–Ό
  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚   Triage    β”‚  ──→  structured JSON bug record
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        β”‚
        β–Ό
  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚  ChromaDB   β”‚  ──→  vector + BM25 hybrid index
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
        β”‚
        β–Ό
  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚ AI Summary  β”‚  ──→  concise triage output
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸš€ Quick Start

# Windows PowerShell
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
python app.py

Then open the local Gradio URL in your browser.


πŸ”‘ API Keys

This app uses BYOK (Bring Your Own Key):

  • Paste your OpenAI API key into the masked field in the UI
  • The key input is masked and never committed to the repository

⚠️ Never commit API keys to source control.


πŸ–₯️ How To Use

  1. Collect β€” fetch and triage live Google Play reviews
  2. Triage β€” analyze a single custom review
  3. Search β€” retrieve similar bugs via hybrid retrieval
  4. Clear bugs β€” reset the ChromaDB store

πŸ“ Project Structure

qa-bug-triage/
β”œβ”€β”€ app.py                  # Gradio app and interaction flows
β”œβ”€β”€ collect.py              # Google Play review collection
β”œβ”€β”€ triage.py               # Routing and structured triage logic
β”œβ”€β”€ rag.py                  # Chroma storage and hybrid retrieval
└── eval/
    β”œβ”€β”€ eval.py             # RAG evaluation script
    β”œβ”€β”€ eval_dataset.json   # Evaluation dataset
    └── results.json        # Latest saved evaluation metrics

πŸ“Š Evaluation

Run the evaluation suite:

python eval\eval.py --api-key YOUR_OPENAI_API_KEY

Latest results:

Metric Score
Answer Relevancy 0.868
Faithfulness 0.292
Context Precision 0.020

πŸ’° Cost Estimate

Target: under $0.50 for a short demo session.

Parameter Value
Token range ~8k – 20k tokens
Typical cost < $0.50 per session
Recommended max reviews 5 – 10

Tips to keep costs low:

  • Keep max reviews between 5 and 10
  • Avoid repeated large collect runs
  • Use short test inputs for manual triage validation

πŸ› οΈ Tech Stack

Tool Role
Python Core language
Gradio Web UI
OpenAI GPT-4o LLM for triage and summaries
ChromaDB Vector store
rank-bm25 Keyword retrieval
RAGAS RAG evaluation framework
google-play-scraper Review data source

βœ… Functionalities Implemented

Requirements covered

  • RAG project written in Python
  • Uses at least one LLM
  • Public repository with collection and curation scripts
  • README with project explanation and setup
  • BYOK input in the UI β€” see API Keys
  • Cost estimate included β€” see Cost Estimate
  • API key requirements listed β€” see API Keys
  • More than 5 optional techniques covered (7 total β€” see below)

Techniques implemented

  • Streaming responses in the UI β€” app.py
  • Dynamic few-shot prompting using similar bugs β€” triage.py
  • Evaluation code and dataset included β€” eval/eval.py, eval/eval_dataset.json
  • Domain-specific app for QA bug triage β€” triage.py, app.py
  • Structured JSON data curation for RAG β€” triage.py
  • Hybrid retrieval with semantic search and BM25 β€” rag.py
  • Query routing in the active app flow β€” triage.py

πŸ“„ License

MIT Β© aiqualitylab