Upload folder using huggingface_hub
- README.md +117 -10
- __pycache__/app.cpython-313.pyc +0 -0
- app.py +1575 -0
- requirements.txt +3 -0
README.md
CHANGED
@@ -1,10 +1,117 @@
- ---
- title: Vaultwise Knowledge
- emoji:
- colorFrom:
- colorTo:
- sdk:
-
-
-
-
---
title: Vaultwise Knowledge
emoji: "\U0001F4DA"
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 5.29.0
app_file: app.py
pinned: false
license: mit
---

# Vaultwise -- Knowledge Management Platform

**Interactive demo for [Vaultwise](https://github.com/dbhavery/vaultwise), a knowledge management platform with document ingestion, vector search, AI-powered Q&A, training generation, and analytics.**

Vaultwise is a full-stack application (FastAPI + React) designed for teams that need to organize, search, and learn from their internal knowledge base. This demo showcases the core search and analytics capabilities using a built-in 30-article corpus for a fictional SaaS company.

## Demo Tabs

| Tab | What It Does |
|-----|--------------|
| **Knowledge Search** | TF-IDF vector search over 30 knowledge base articles. Enter a query, get ranked results with relevance scores and highlighted matching terms. |
| **AI Q&A** | Natural language question answering grounded in the knowledge base. Finds the best-matching article via TF-IDF, then generates an answer with source citation and relevant excerpt. |
| **Training Generator** | Select any article to auto-generate a training module: learning objectives, structured content outline, and a 5-question multiple-choice quiz. |
| **Knowledge Gap Analytics** | Dashboard with article distribution by category, freshness scores, view counts, and search query frequency analysis. |

## Search Algorithm

The TF-IDF search engine is implemented from scratch using only Python and numpy -- no sklearn, no external NLP libraries.

### How It Works

**1. Tokenization**

Input text is lowercased, punctuation-stripped, and split into tokens. A stop word list filters out common English words that carry no semantic weight.
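
The tokenization step can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the function name `tokenize` and the abbreviated stop word set are assumptions made for the example, and the real list is much longer.

```python
import string

# Hypothetical, abbreviated stop word list for illustration only.
STOP_WORDS = frozenset({"a", "an", "the", "and", "or", "to", "of", "is", "in"})


def tokenize(text: str) -> list[str]:
    """Lowercase, delete ASCII punctuation, split on whitespace, drop stop words."""
    lowered = text.lower()
    # str.translate with a deletion table removes every punctuation character.
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in no_punct.split() if tok not in STOP_WORDS]


print(tokenize("The API key, and the rate-limit!"))
```

Because punctuation is deleted rather than replaced by spaces in this sketch, hyphenated words such as `rate-limit` collapse into a single token.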
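
**2. Term Frequency (TF)**

Uses augmented term frequency, which scales each raw count by the largest count in the same document, to prevent bias toward longer documents:

```
TF(t, d) = 0.5 + 0.5 * (count(t, d) / max_count(d))
```

Here count(t, d) is the number of times term t occurs in document d, and max_count(d) is the count of the most frequent term in d, so TF lies between 0.5 and 1.0 for every term present in the document.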
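
A small sketch of the augmented TF formula, assuming the document has already been tokenized. The function name `augmented_tf` is hypothetical and chosen only for this example.

```python
from collections import Counter


def augmented_tf(tokens: list[str]) -> dict[str, float]:
    """Return 0.5 + 0.5 * count / max_count for each term in one document."""
    counts = Counter(tokens)
    if not counts:
        return {}  # an empty document has no term frequencies
    max_count = max(counts.values())
    return {term: 0.5 + 0.5 * (n / max_count) for term, n in counts.items()}


print(augmented_tf(["api", "api", "key"]))
```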

**3. Inverse Document Frequency (IDF)**

Measures how rare a term is across the corpus. Terms appearing in fewer documents receive higher weight:

```
IDF(t) = log(N / (1 + df(t)))
```

Where N is the total number of documents and df(t) is the number of documents containing term t. The +1 in the denominator prevents division by zero for terms that appear in no document. One side effect of this form: a term present in every document gets a slightly negative weight, since log(N / (N + 1)) < 0.
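
The IDF formula above can be sketched like this, assuming the corpus is a list of token lists. The function name is hypothetical, and the natural logarithm is an assumption since the README does not state the log base.

```python
import math
from collections import Counter


def inverse_document_frequency(docs: list[list[str]]) -> dict[str, float]:
    """Compute log(N / (1 + df(t))) for every term seen in the corpus."""
    n_docs = len(docs)
    doc_freq: Counter[str] = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    return {term: math.log(n_docs / (1 + df)) for term, df in doc_freq.items()}


corpus = [["api", "key"], ["api", "billing"], ["sso"], ["sso", "api"]]
print(inverse_document_frequency(corpus))
```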

**4. TF-IDF Weight**

The final weight for each term in each document:

```
W(t, d) = TF(t, d) * IDF(t)
```
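
Combining the two quantities is a per-term product. The sketch below uses plain dictionaries as sparse vectors to stay dependency-free, whereas the README states the actual engine uses numpy. The name `tfidf_weights` and the choice to drop terms missing from the IDF table (for example, query words outside the vocabulary) are assumptions.

```python
def tfidf_weights(tf: dict[str, float], idf: dict[str, float]) -> dict[str, float]:
    """Multiply each term's TF by its corpus IDF, skipping unknown terms."""
    return {term: value * idf[term] for term, value in tf.items() if term in idf}


tf = {"api": 1.0, "key": 0.75, "unseen": 0.75}
idf = {"api": 0.5, "key": 2.0}
print(tfidf_weights(tf, idf))
```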

**5. Cosine Similarity**

Queries are converted to TF-IDF vectors using the same vocabulary and IDF values. Ranking uses cosine similarity between the query vector and each document vector:

```
similarity(q, d) = (q . d) / (||q|| * ||d||)
```

Cosine similarity depends only on the angle between the two vectors, not on their magnitudes, which makes the ranking largely independent of document length.
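
As a rough sketch of the ranking step, the following computes cosine similarity over sparse dictionary vectors and returns the best matches. The helper names `cosine_similarity` and `rank`, the dictionary representation, and the zero-vector guard are assumptions made for the example.

```python
import math


def cosine_similarity(q: dict[str, float], d: dict[str, float]) -> float:
    """Cosine of the angle between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * d.get(term, 0.0) for term, weight in q.items())
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    if norm_q == 0.0 or norm_d == 0.0:
        return 0.0
    return dot / (norm_q * norm_d)


def rank(query: dict[str, float], docs: dict[str, dict[str, float]], top_k: int = 5):
    """Score every document against the query and return the top_k (id, score) pairs."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


docs = {"KB-008": {"api": 1.0, "rate": 1.0}, "KB-005": {"billing": 1.0}}
print(rank({"api": 1.0}, docs))
```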

### Architecture (Full Platform)

```
Frontend (React + Vite)
    |
    v
API Gateway (FastAPI)
    |
    +-- Document Ingestion Pipeline
    |     PDF, HTML, Markdown parsing
    |     Chunking and metadata extraction
    |
    +-- Search Engine
    |     TF-IDF vectorization
    |     Cosine similarity ranking
    |     Query expansion and filtering
    |
    +-- AI Q&A Module
    |     Context retrieval via search
    |     LLM-powered answer generation
    |     Source citation and grounding
    |
    +-- Training Generator
    |     Article analysis
    |     Outline and quiz generation
    |     Learning objective extraction
    |
    +-- Analytics Engine
          Usage tracking
          Freshness scoring
          Gap identification
```

## Running Locally

```bash
pip install gradio numpy matplotlib
python app.py
```

## Links

- **Source code:** [github.com/dbhavery/vaultwise](https://github.com/dbhavery/vaultwise)
- **Author:** [Don Havery](https://github.com/dbhavery)
__pycache__/app.cpython-313.pyc
ADDED
Binary file (67.2 kB)
app.py
ADDED
|
@@ -0,0 +1,1575 @@
"""
Vaultwise -- Knowledge Management Platform
Interactive demo showcasing TF-IDF search, AI Q&A, training generation, and analytics.

All search functionality is implemented from scratch using numpy.
No sklearn or external NLP libraries required.
"""

import math
import re
import string
from collections import Counter
from typing import Optional

import gradio as gr
import matplotlib
import matplotlib.pyplot as plt
import numpy as np

matplotlib.use("Agg")

# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------

APP_TITLE = "Vaultwise -- Knowledge Management Platform"
ACCENT_COLOR = "#3b82f6"
TOP_K_RESULTS = 5

CATEGORIES = [
    "Onboarding",
    "Billing",
    "API",
    "Security",
    "Integrations",
    "Infrastructure",
    "Support",
    "Compliance",
]

STOP_WORDS = frozenset(
    {
        "a", "an", "the", "and", "or", "but", "in", "on", "at", "to", "for",
        "of", "with", "by", "from", "is", "it", "as", "are", "was", "were",
        "be", "been", "being", "have", "has", "had", "do", "does", "did",
        "will", "would", "could", "should", "may", "might", "shall", "can",
        "this", "that", "these", "those", "i", "you", "he", "she", "we",
        "they", "me", "him", "her", "us", "them", "my", "your", "his",
        "its", "our", "their", "what", "which", "who", "whom", "how",
        "when", "where", "why", "not", "no", "all", "each", "every",
        "both", "few", "more", "most", "other", "some", "such", "than",
        "too", "very", "just", "about", "if", "so", "also", "up", "out",
        "into", "over", "after", "before", "between", "under", "through",
        "during", "above", "below", "any", "only", "own", "same", "then",
        "there", "here", "once", "while", "now", "new", "get", "use",
    }
)


# ---------------------------------------------------------------------------
# Knowledge Base -- 30 articles for a fictional SaaS company "NovaCRM"
# ---------------------------------------------------------------------------

# Values are strings except "views" (int) and "freshness" (float).
KNOWLEDGE_BASE: list[dict[str, str | int | float]] = [
    # --- Onboarding ---
    {
        "id": "KB-001",
        "title": "Getting Started with NovaCRM",
        "category": "Onboarding",
        "content": (
            "Welcome to NovaCRM. This guide walks new users through initial account "
            "setup, workspace configuration, and first-time login. After signing up, "
            "you will receive a verification email. Click the link to activate your "
            "account. Once logged in, navigate to Settings > Workspace to configure "
            "your company name, timezone, and default currency. Invite team members "
            "from the Team Management page by entering their email addresses. Each "
            "new member receives an onboarding checklist that tracks their setup "
            "progress through profile completion, integration connections, and first "
            "deal creation."
        ),
        "views": 4521,
        "freshness": 0.95,
    },
    {
        "id": "KB-002",
        "title": "User Roles and Permissions Overview",
        "category": "Onboarding",
        "content": (
            "NovaCRM supports four user roles: Admin, Manager, Agent, and Viewer. "
            "Admins have full system access including billing, user management, and "
            "API key generation. Managers can create teams, assign leads, and view "
            "team analytics. Agents can manage their own contacts, deals, and tasks. "
            "Viewers have read-only access to dashboards and reports. Custom roles "
            "can be created under Settings > Roles with granular permission toggles "
            "for each module. Role inheritance allows child roles to automatically "
            "receive parent permissions. Audit logs track all permission changes."
        ),
        "views": 3187,
        "freshness": 0.88,
    },
    {
        "id": "KB-003",
        "title": "Importing Contacts and Data Migration",
        "category": "Onboarding",
        "content": (
            "NovaCRM supports CSV, Excel, and vCard imports for contact migration. "
            "Navigate to Contacts > Import to upload your file. The mapping wizard "
            "automatically detects common fields like name, email, phone, and company. "
            "For custom fields, drag and drop column headers to match your schema. "
            "Duplicate detection runs automatically using email address matching with "
            "configurable merge rules. For large migrations over 50,000 records, use "
            "the bulk import API endpoint which processes records asynchronously and "
            "sends a completion webhook. Migration history is available under Settings "
            "> Data > Import History with rollback capabilities for the last 30 days."
        ),
        "views": 2843,
        "freshness": 0.82,
    },
    {
        "id": "KB-004",
        "title": "Setting Up Your Sales Pipeline",
        "category": "Onboarding",
        "content": (
            "The sales pipeline in NovaCRM is fully customizable. Go to Pipeline > "
            "Settings to create stages. Default stages include Lead, Qualified, "
            "Proposal, Negotiation, and Closed Won or Closed Lost. Each stage has "
            "configurable probability percentages for revenue forecasting. Drag deals "
            "between stages on the Kanban board or update them in list view. "
            "Automation rules can trigger actions when deals move between stages, "
            "such as sending follow-up emails, creating tasks, or notifying managers. "
            "Pipeline analytics show conversion rates between stages, average deal "
            "velocity, and bottleneck identification."
        ),
        "views": 3654,
        "freshness": 0.91,
    },
    # --- Billing ---
    {
        "id": "KB-005",
        "title": "Subscription Plans and Pricing",
        "category": "Billing",
        "content": (
            "NovaCRM offers three subscription tiers: Starter at 29 dollars per user "
            "per month, Professional at 79 dollars per user per month, and Enterprise "
            "with custom pricing. Starter includes contact management, basic pipeline, "
            "email integration, and 5 GB storage. Professional adds workflow automation, "
            "advanced analytics, API access, and 50 GB storage. Enterprise includes "
            "custom integrations, dedicated support, SSO, audit logs, and unlimited "
            "storage. All plans include a 14-day free trial with no credit card "
            "required. Annual billing provides a 20 percent discount."
        ),
        "views": 5102,
        "freshness": 0.97,
    },
    {
        "id": "KB-006",
        "title": "Managing Invoices and Payment Methods",
        "category": "Billing",
        "content": (
            "Access your billing dashboard at Settings > Billing > Invoices. NovaCRM "
            "accepts credit cards via Stripe and bank transfers for Enterprise plans. "
            "Invoices are generated on the first of each month and sent to the billing "
            "email address. Download invoices as PDF from the billing history page. "
            "To update payment methods, navigate to Settings > Billing > Payment "
            "Methods and add a new card or bank account. Failed payments trigger "
            "automatic retry on days 3, 7, and 14. After three failures, the account "
            "enters a 7-day grace period before suspension. Tax ID and VAT numbers "
            "can be configured for proper invoice formatting."
        ),
        "views": 1876,
        "freshness": 0.85,
    },
    {
        "id": "KB-007",
        "title": "Upgrading and Downgrading Your Plan",
        "category": "Billing",
        "content": (
            "Plan changes take effect immediately. When upgrading, you are charged a "
            "prorated amount for the remaining billing cycle. When downgrading, the "
            "new rate applies at the next billing cycle and a credit is issued for "
            "the difference. Navigate to Settings > Billing > Change Plan to see "
            "available options. Feature access adjusts automatically upon plan change. "
            "Data retention is maintained during downgrades, but access to premium "
            "features is restricted. If your current usage exceeds the new plan limits, "
            "you will receive a warning with 30 days to reduce usage before enforcement."
        ),
        "views": 1342,
        "freshness": 0.79,
    },
    # --- API ---
    {
        "id": "KB-008",
        "title": "REST API Authentication and Rate Limits",
        "category": "API",
        "content": (
            "The NovaCRM REST API uses Bearer token authentication. Generate API keys "
            "at Settings > API > Keys. Each key has configurable scopes: read, write, "
            "delete, and admin. Include the token in the Authorization header as "
            "Bearer followed by the token value. Rate limits depend on your plan: "
            "Starter allows 100 requests per minute, Professional allows 1000, and "
            "Enterprise allows 10000. Rate limit headers X-RateLimit-Remaining and "
            "X-RateLimit-Reset are included in every response. Exceeding the limit "
            "returns HTTP 429 with a Retry-After header. API keys can be rotated "
            "without downtime using the key rotation endpoint."
        ),
        "views": 4210,
        "freshness": 0.93,
    },
    {
        "id": "KB-009",
        "title": "API Endpoints for Contact Management",
        "category": "API",
        "content": (
            "Contact CRUD operations are available at the /api/v2/contacts endpoint. "
            "GET returns a paginated list with default page size of 50. Use query "
            "parameters for filtering: status, created_after, tags, and owner_id. "
            "POST creates a new contact with required fields email and name. PUT "
            "updates an existing contact by ID. PATCH allows partial updates. DELETE "
            "moves a contact to trash with 30-day recovery. Bulk operations are "
            "supported via /api/v2/contacts/bulk with a maximum batch size of 1000. "
            "Response format follows JSON API specification with included relationships "
            "for deals, activities, and notes."
        ),
        "views": 3890,
        "freshness": 0.90,
    },
    {
        "id": "KB-010",
        "title": "Webhooks and Event Subscriptions",
        "category": "API",
        "content": (
            "NovaCRM supports webhooks for real-time event notifications. Configure "
            "webhook endpoints at Settings > API > Webhooks. Available events include "
            "contact.created, contact.updated, deal.stage_changed, deal.won, "
            "deal.lost, task.completed, and email.opened. Each webhook delivery "
            "includes an HMAC-SHA256 signature in the X-Webhook-Signature header "
            "for payload verification. Failed deliveries are retried with exponential "
            "backoff up to 5 times over 24 hours. Webhook logs show delivery status, "
            "response codes, and payload details for the last 30 days. Test endpoints "
            "can be configured to receive sample payloads during development."
        ),
        "views": 2156,
        "freshness": 0.87,
    },
    {
        "id": "KB-011",
        "title": "GraphQL API for Advanced Queries",
        "category": "API",
        "content": (
            "NovaCRM provides a GraphQL endpoint at /api/graphql for complex data "
            "queries. The schema supports contacts, deals, activities, teams, and "
            "reports. Introspection is enabled for development environments. Query "
            "depth is limited to 10 levels to prevent abuse. Mutations support "
            "creating, updating, and deleting records with input validation. "
            "Subscriptions are available for real-time updates via WebSocket "
            "connections. The GraphQL playground is accessible at /api/graphql/explore "
            "with auto-complete and documentation. Batch queries are supported with "
            "a maximum of 5 operations per request to maintain performance."
        ),
        "views": 1567,
        "freshness": 0.84,
    },
    # --- Security ---
    {
        "id": "KB-012",
        "title": "Single Sign-On (SSO) Configuration",
        "category": "Security",
        "content": (
            "Enterprise plans support SAML 2.0 and OpenID Connect for SSO. Navigate "
            "to Settings > Security > SSO to configure your identity provider. "
            "Supported providers include Okta, Azure AD, Google Workspace, and "
            "OneLogin. Upload your IdP metadata XML or enter the SSO URL, entity "
            "ID, and X.509 certificate manually. User provisioning can be automated "
            "via SCIM 2.0 for user lifecycle management. JIT (Just-In-Time) "
            "provisioning creates user accounts on first login. SSO enforcement "
            "can be toggled to require all users to authenticate through the IdP, "
            "with bypass codes available for emergency admin access."
        ),
        "views": 1890,
        "freshness": 0.86,
    },
    {
        "id": "KB-013",
        "title": "Two-Factor Authentication Setup",
        "category": "Security",
        "content": (
            "NovaCRM supports TOTP-based two-factor authentication via authenticator "
            "apps such as Google Authenticator, Authy, and Microsoft Authenticator. "
            "Enable 2FA at Profile > Security > Two-Factor Authentication. Scan the "
            "QR code with your authenticator app and enter the verification code to "
            "complete setup. Backup codes are generated during setup for account "
            "recovery. Admins can enforce mandatory 2FA for all users or specific "
            "roles under Settings > Security > Authentication Policies. SMS-based "
            "2FA is available as a fallback option. Hardware security keys following "
            "the FIDO2 WebAuthn standard are supported on Professional and Enterprise "
            "plans."
        ),
        "views": 2567,
        "freshness": 0.92,
    },
    {
        "id": "KB-014",
        "title": "Data Encryption and Privacy Policies",
        "category": "Security",
        "content": (
            "All data in NovaCRM is encrypted at rest using AES-256 and in transit "
            "using TLS 1.3. Database backups are encrypted with separate keys stored "
            "in AWS KMS. Personal data fields support field-level encryption for "
            "enhanced privacy compliance. Data residency options allow choosing "
            "between US, EU, and APAC regions for primary storage. NovaCRM is SOC 2 "
            "Type II certified and GDPR compliant. Data Processing Agreements are "
            "available for Enterprise customers. Right to erasure requests are "
            "processed within 72 hours. Automated data retention policies can be "
            "configured per data type with minimum 30-day and maximum 7-year ranges."
        ),
        "views": 3210,
        "freshness": 0.94,
    },
    {
        "id": "KB-015",
        "title": "Audit Logging and Compliance Reporting",
        "category": "Security",
        "content": (
            "NovaCRM maintains comprehensive audit logs of all user actions, API "
            "calls, and system events. Access audit logs at Settings > Security > "
            "Audit Log. Logs include timestamp, user identity, action performed, "
            "affected resource, IP address, and user agent. Logs are retained for "
            "one year on Professional plans and seven years on Enterprise. Export "
            "audit logs as CSV or JSON for external SIEM integration. Compliance "
            "reports for SOC 2, HIPAA, and GDPR can be generated on demand. "
            "Scheduled compliance reports can be configured to run weekly or monthly "
            "with automatic delivery to designated compliance officers."
        ),
        "views": 1456,
        "freshness": 0.81,
    },
|
| 337 |
+
# --- Integrations ---
|
| 338 |
+
{
|
| 339 |
+
"id": "KB-016",
|
| 340 |
+
"title": "Slack Integration for Team Notifications",
|
| 341 |
+
"category": "Integrations",
|
| 342 |
+
"content": (
|
| 343 |
+
"Connect NovaCRM to Slack for real-time deal and activity notifications. "
|
| 344 |
+
"Navigate to Settings > Integrations > Slack and click Connect. "
|
| 345 |
+
"Authorize NovaCRM to access your Slack workspace. Configure notification "
|
| 346 |
+
"channels for different event types: deal updates, new leads, task "
|
| 347 |
+
"assignments, and system alerts. Use slash commands to query CRM data "
|
| 348 |
+
"directly from Slack: /novacrm search retrieves contacts, /novacrm deal "
|
| 349 |
+
"shows deal details, and /novacrm report generates quick summaries. "
|
| 350 |
+
"Interactive buttons in notifications allow agents to update deal stages, "
|
| 351 |
+
"add notes, and schedule follow-ups without leaving Slack."
|
| 352 |
+
),
|
| 353 |
+
"views": 2987,
|
| 354 |
+
"freshness": 0.89,
|
| 355 |
+
},
|
| 356 |
+
{
|
| 357 |
+
"id": "KB-017",
|
| 358 |
+
"title": "Email Integration with Gmail and Outlook",
|
| 359 |
+
"category": "Integrations",
|
| 360 |
+
"content": (
|
| 361 |
+
"NovaCRM syncs with Gmail and Outlook for bidirectional email tracking. "
|
| 362 |
+
"Go to Settings > Integrations > Email to connect your account via OAuth. "
|
| 363 |
+
"Incoming emails from known contacts are automatically linked to their "
|
| 364 |
+
"CRM records. Email templates with merge fields can be created and shared "
|
| 365 |
+
"across the team. Tracking pixels detect email opens and link clicks with "
|
| 366 |
+
"timestamps. Scheduled sending allows queuing emails for optimal delivery "
|
| 367 |
+
"times. Email sequences enable automated multi-step outreach campaigns "
|
| 368 |
+
"with configurable delays and exit conditions. Unsubscribe handling "
|
| 369 |
+
"complies with CAN-SPAM and GDPR requirements automatically."
|
| 370 |
+
),
|
| 371 |
+
"views": 4102,
|
| 372 |
+
"freshness": 0.91,
|
| 373 |
+
},
|
| 374 |
+
{
|
| 375 |
+
"id": "KB-018",
|
| 376 |
+
"title": "Zapier and Make Integration Hub",
|
| 377 |
+
"category": "Integrations",
|
| 378 |
+
"content": (
|
| 379 |
+
"NovaCRM integrates with over 3000 applications through Zapier and Make "
|
| 380 |
+
"connectors. Common automation recipes include syncing new contacts to "
|
| 381 |
+
"email marketing platforms, creating support tickets from deal notes, "
|
| 382 |
+
"and updating accounting software when deals close. The NovaCRM Zapier "
|
| 383 |
+
"app supports triggers for contact events, deal changes, and form "
|
| 384 |
+
"submissions. Actions include creating contacts, updating deals, and "
|
| 385 |
+
"adding notes. Multi-step Zaps enable complex workflows spanning "
|
| 386 |
+
"multiple applications. Make scenarios support parallel branches for "
|
| 387 |
+
"simultaneous actions across different services."
|
| 388 |
+
),
|
| 389 |
+
"views": 1789,
|
| 390 |
+
"freshness": 0.83,
|
| 391 |
+
},
|
| 392 |
+
{
|
| 393 |
+
"id": "KB-019",
|
| 394 |
+
"title": "Calendar Sync with Google and Microsoft",
|
| 395 |
+
"category": "Integrations",
|
| 396 |
+
"content": (
|
| 397 |
+
"Synchronize your calendar with NovaCRM for seamless meeting management. "
|
| 398 |
+
"Connect Google Calendar or Microsoft Outlook Calendar at Settings > "
|
| 399 |
+
"Integrations > Calendar. Two-way sync ensures meetings created in either "
|
| 400 |
+
"platform appear in both. Meeting links from Zoom, Teams, and Google "
|
| 401 |
+
"Meet are automatically detected and added to CRM activities. The booking "
|
| 402 |
+
"page feature generates shareable scheduling links with configurable "
|
| 403 |
+
"availability windows, buffer times, and round-robin assignment for teams. "
|
| 404 |
+
"Meeting outcomes can be logged directly from calendar events with "
|
| 405 |
+
"predefined disposition codes and next-step actions."
|
| 406 |
+
),
|
| 407 |
+
"views": 2345,
|
| 408 |
+
"freshness": 0.88,
|
| 409 |
+
},
|
| 410 |
+
# --- Infrastructure ---
|
| 411 |
+
{
|
| 412 |
+
"id": "KB-020",
|
| 413 |
+
"title": "System Architecture and Performance",
|
| 414 |
+
"category": "Infrastructure",
|
| 415 |
+
"content": (
|
| 416 |
+
"NovaCRM runs on a microservices architecture deployed on AWS. The API "
|
| 417 |
+
"layer uses load-balanced application servers behind CloudFront CDN. "
|
| 418 |
+
"PostgreSQL with read replicas handles primary data storage. Redis "
|
| 419 |
+
"provides caching and session management. Elasticsearch powers full-text "
|
| 420 |
+
"search across contacts, deals, and communications. Background job "
|
| 421 |
+
"processing uses a distributed task queue for email sending, report "
|
| 422 |
+
"generation, and data imports. The platform maintains a 99.9 percent uptime "
|
| 423 |
+
"SLA with automated failover across availability zones. Response times "
|
| 424 |
+
"average under 200 milliseconds for API calls."
|
| 425 |
+
),
|
| 426 |
+
"views": 987,
|
| 427 |
+
"freshness": 0.76,
|
| 428 |
+
},
|
| 429 |
+
{
|
| 430 |
+
"id": "KB-021",
|
| 431 |
+
"title": "Backup and Disaster Recovery Procedures",
|
| 432 |
+
"category": "Infrastructure",
|
| 433 |
+
"content": (
|
| 434 |
+
"NovaCRM performs automated database backups every 6 hours with point-in-time "
|
| 435 |
+
"recovery capability for the last 35 days. Backups are stored in a separate "
|
| 436 |
+
"AWS region from production data. Full disaster recovery tests are conducted "
|
| 437 |
+
"quarterly with documented recovery time objectives of 4 hours and recovery "
|
| 438 |
+
"point objectives of 1 hour. Customer data exports can be scheduled daily "
|
| 439 |
+
"or weekly via Settings > Data > Automated Exports in CSV or JSON format. "
|
| 440 |
+
"Enterprise customers can configure custom backup schedules and retention "
|
| 441 |
+
"policies. Backup verification runs automated integrity checks after each "
|
| 442 |
+
"snapshot to ensure recoverability."
|
| 443 |
+
),
|
| 444 |
+
"views": 654,
|
| 445 |
+
"freshness": 0.72,
|
| 446 |
+
},
|
| 447 |
+
{
|
| 448 |
+
"id": "KB-022",
|
| 449 |
+
"title": "Status Page and Incident Response",
|
| 450 |
+
"category": "Infrastructure",
|
| 451 |
+
"content": (
|
| 452 |
+
"Monitor NovaCRM system status at status.novacrm.com. The status page "
|
| 453 |
+
"shows real-time availability for all services: API, web application, "
|
| 454 |
+
"email delivery, webhook processing, and integrations. Subscribe to "
|
| 455 |
+
"status updates via email, SMS, or RSS. During incidents, updates are "
|
| 456 |
+
"posted every 15 minutes until resolution. Post-incident reports are "
|
| 457 |
+
"published within 48 hours with root cause analysis and preventive "
|
| 458 |
+
"measures. Scheduled maintenance windows are announced 72 hours in "
|
| 459 |
+
"advance. Enterprise customers receive priority notification through "
|
| 460 |
+
"a dedicated Slack channel and direct account manager communication."
|
| 461 |
+
),
|
| 462 |
+
"views": 1123,
|
| 463 |
+
"freshness": 0.80,
|
| 464 |
+
},
|
| 465 |
+
# --- Support ---
|
| 466 |
+
{
|
| 467 |
+
"id": "KB-023",
|
| 468 |
+
"title": "Contacting Support and SLA Details",
|
| 469 |
+
"category": "Support",
|
| 470 |
+
"content": (
|
| 471 |
+
"NovaCRM support is available through multiple channels. Starter plans "
|
| 472 |
+
"include email support with 24-hour response time during business hours. "
|
| 473 |
+
"Professional plans add live chat with 4-hour response time and phone "
|
| 474 |
+
"support during extended hours. Enterprise plans include a dedicated "
|
| 475 |
+
"account manager, 1-hour response time for critical issues, and 24/7 "
|
| 476 |
+
"phone support. Submit tickets at support.novacrm.com or via the in-app "
|
| 477 |
+
"help widget. Priority levels range from P1 for system-wide outages to "
|
| 478 |
+
"P4 for feature requests. Escalation procedures are documented in the "
|
| 479 |
+
"support portal with clear timelines for each priority level."
|
| 480 |
+
),
|
| 481 |
+
"views": 3456,
|
| 482 |
+
"freshness": 0.90,
|
| 483 |
+
},
|
| 484 |
+
{
|
| 485 |
+
"id": "KB-024",
|
| 486 |
+
"title": "Troubleshooting Common Login Issues",
|
| 487 |
+
"category": "Support",
|
| 488 |
+
"content": (
|
| 489 |
+
"Common login problems include forgotten passwords, expired sessions, "
|
| 490 |
+
"and browser compatibility issues. To reset your password, click Forgot "
|
| 491 |
+
"Password on the login page and enter your registered email. Reset links "
|
| 492 |
+
"expire after 24 hours. If your account is locked after 5 failed attempts, "
|
| 493 |
+
"wait 30 minutes or contact support for immediate unlock. Clear browser "
|
| 494 |
+
"cache and cookies if you experience persistent session errors. NovaCRM "
|
| 495 |
+
"supports Chrome, Firefox, Safari, and Edge in their last two major "
|
| 496 |
+
"versions. Disable browser extensions if you encounter rendering issues. "
|
| 497 |
+
"For SSO login problems, verify your IdP configuration and check the "
|
| 498 |
+
"SSO debug log at Settings > Security > SSO > Debug."
|
| 499 |
+
),
|
| 500 |
+
"views": 5678,
|
| 501 |
+
"freshness": 0.93,
|
| 502 |
+
},
|
| 503 |
+
{
|
| 504 |
+
"id": "KB-025",
|
| 505 |
+
"title": "Feature Request and Feedback Process",
|
| 506 |
+
"category": "Support",
|
| 507 |
+
"content": (
|
| 508 |
+
"Submit feature requests through the NovaCRM feedback portal at "
|
| 509 |
+
"feedback.novacrm.com. Each request can be voted on by other users to "
|
| 510 |
+
"help prioritize development. The product team reviews submissions "
|
| 511 |
+
"monthly and updates status to Under Review, Planned, In Development, "
|
| 512 |
+
"or Released. Public roadmap visibility is available for Professional "
|
| 513 |
+
"and Enterprise plans. Beta features can be enabled per account at "
|
| 514 |
+
"Settings > Labs. Beta participants provide feedback through in-app "
|
| 515 |
+
"surveys and dedicated Slack channels. Feature request history and "
|
| 516 |
+
"status tracking are available in your account dashboard."
|
| 517 |
+
),
|
| 518 |
+
"views": 890,
|
| 519 |
+
"freshness": 0.77,
|
| 520 |
+
},
|
| 521 |
+
# --- Compliance ---
|
| 522 |
+
{
|
| 523 |
+
"id": "KB-026",
|
| 524 |
+
"title": "GDPR Compliance and Data Subject Requests",
|
| 525 |
+
"category": "Compliance",
|
| 526 |
+
"content": (
|
| 527 |
+
"NovaCRM provides built-in tools for GDPR compliance. The Data Subject "
|
| 528 |
+
"Request portal at Settings > Compliance > DSR handles right of access, "
|
| 529 |
+
"right to rectification, right to erasure, and data portability requests. "
|
| 530 |
+
"Automated workflows process erasure requests within 72 hours, removing "
|
| 531 |
+
"personal data from all systems including backups within 30 days. Consent "
|
| 532 |
+
"management tracks legal basis for data processing per contact. Data "
|
| 533 |
+
"processing records are maintained automatically. Cookie consent banners "
|
| 534 |
+
"are configurable for customer-facing forms. Annual GDPR compliance "
|
| 535 |
+
"assessments are available for Enterprise customers with documentation "
|
| 536 |
+
"support for supervisory authority inquiries."
|
| 537 |
+
),
|
| 538 |
+
"views": 2134,
|
| 539 |
+
"freshness": 0.89,
|
| 540 |
+
},
|
| 541 |
+
{
|
| 542 |
+
"id": "KB-027",
|
| 543 |
+
"title": "HIPAA Compliance for Healthcare Customers",
|
| 544 |
+
"category": "Compliance",
|
| 545 |
+
"content": (
|
| 546 |
+
"NovaCRM Enterprise supports HIPAA compliance for healthcare organizations. "
|
| 547 |
+
"A Business Associate Agreement is available upon request. HIPAA-compliant "
|
| 548 |
+
"configurations include field-level encryption for Protected Health "
|
| 549 |
+
"Information, access controls with minimum necessary permissions, and "
|
| 550 |
+
"enhanced audit logging for PHI access. Automatic session timeout after "
|
| 551 |
+
"15 minutes of inactivity is enforced. PHI data is stored in dedicated "
|
| 552 |
+
"encrypted partitions with separate key management. Employee training "
|
| 553 |
+
"records for HIPAA awareness are trackable within the compliance module. "
|
| 554 |
+
"Breach notification workflows automate the required 60-day reporting "
|
| 555 |
+
"timeline with documentation templates."
|
| 556 |
+
),
|
| 557 |
+
"views": 876,
|
| 558 |
+
"freshness": 0.75,
|
| 559 |
+
},
|
| 560 |
+
# --- Mixed / Advanced ---
|
| 561 |
+
{
|
| 562 |
+
"id": "KB-028",
|
| 563 |
+
"title": "Workflow Automation Rules Engine",
|
| 564 |
+
"category": "Integrations",
|
| 565 |
+
"content": (
|
| 566 |
+
"The NovaCRM rules engine enables no-code workflow automation. Create "
|
| 567 |
+
"rules at Automations > Rules with trigger-condition-action logic. "
|
| 568 |
+
"Triggers include record creation, field changes, time-based schedules, "
|
| 569 |
+
"and webhook events. Conditions support AND/OR logic with field "
|
| 570 |
+
"comparisons, formula evaluation, and related record checks. Actions "
|
| 571 |
+
"include sending emails, creating tasks, updating fields, sending "
|
| 572 |
+
"notifications, and calling external webhooks. Rules execute in real-time "
|
| 573 |
+
"with a maximum chain depth of 5 to prevent infinite loops. Execution "
|
| 574 |
+
"logs track every rule firing with input data, conditions evaluated, "
|
| 575 |
+
"and actions performed for debugging and audit purposes."
|
| 576 |
+
),
|
| 577 |
+
"views": 3890,
|
| 578 |
+
"freshness": 0.92,
|
| 579 |
+
},
|
| 580 |
+
{
|
| 581 |
+
"id": "KB-029",
|
| 582 |
+
"title": "Custom Reporting and Dashboard Builder",
|
| 583 |
+
"category": "Infrastructure",
|
| 584 |
+
"content": (
|
| 585 |
+
"Build custom reports and dashboards with the NovaCRM report builder. "
|
| 586 |
+
"Navigate to Analytics > Reports > New Report to start. Choose from "
|
| 587 |
+
"report types: tabular, summary, matrix, and chart. Data sources include "
|
| 588 |
+
"contacts, deals, activities, emails, and custom objects. Apply filters, "
|
| 589 |
+
"groupings, and calculated fields using formula syntax. Schedule reports "
|
| 590 |
+
"for automatic delivery via email in PDF or Excel format. Dashboards "
|
| 591 |
+
"support drag-and-drop widget placement with resizable components. "
|
| 592 |
+
"Available widgets include metric cards, bar charts, line graphs, pie "
|
| 593 |
+
"charts, funnels, and data tables. Share dashboards with teams or "
|
| 594 |
+
"specific users with view or edit permissions."
|
| 595 |
+
),
|
| 596 |
+
"views": 2678,
|
| 597 |
+
"freshness": 0.87,
|
| 598 |
+
},
|
| 599 |
+
{
|
| 600 |
+
"id": "KB-030",
|
| 601 |
+
"title": "Mobile App Features and Offline Mode",
|
| 602 |
+
"category": "Support",
|
| 603 |
+
"content": (
|
| 604 |
+
"The NovaCRM mobile app is available for iOS and Android. Download from "
|
| 605 |
+
"the App Store or Google Play Store. The mobile app supports contact "
|
| 606 |
+
"management, deal updates, task management, and activity logging. Push "
|
| 607 |
+
"notifications alert you to new leads, deal changes, and task deadlines. "
|
| 608 |
+
"Offline mode caches your most recent 500 contacts and 100 deals for "
|
| 609 |
+
"access without internet connectivity. Changes made offline sync "
|
| 610 |
+
"automatically when the connection is restored, with conflict resolution for "
|
| 611 |
+
"simultaneous edits. Business card scanning uses OCR to create contacts "
|
| 612 |
+
"from photos. Voice notes can be attached to any record and are "
|
| 613 |
+
"automatically transcribed using speech recognition."
|
| 614 |
+
),
|
| 615 |
+
"views": 3210,
|
| 616 |
+
"freshness": 0.85,
|
| 617 |
+
},
|
| 618 |
+
]
|
| 619 |
+
|
| 620 |
+
|
| 621 |
+
# ---------------------------------------------------------------------------
|
| 622 |
+
# TF-IDF Engine -- implemented from scratch
|
| 623 |
+
# ---------------------------------------------------------------------------
|
| 624 |
+
|
| 625 |
+
|
| 626 |
+
def tokenize(text: str) -> list[str]:
|
| 627 |
+
"""Lowercase, strip punctuation, split into tokens, remove stop words."""
|
| 628 |
+
text = text.lower()
|
| 629 |
+
text = text.translate(str.maketrans("", "", string.punctuation))
|
| 630 |
+
tokens = text.split()
|
| 631 |
+
return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]
|
| 632 |
+
|
| 633 |
+
|
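A short illustration of what `tokenize` produces, offered as a sketch rather than as part of the commit. It restates the function with a small, hypothetical `STOP_WORDS` set, because the real set is defined elsewhere in `app.py` and does not appear in this part of the diff. The point it shows: punctuation is deleted, not replaced by a space, so a hyphenated term collapses into one token.

```python
import string

# Hypothetical stand-in; the real STOP_WORDS set lives elsewhere in app.py.
STOP_WORDS = {"the", "a", "is", "for", "to", "and", "of", "in", "how", "do", "i"}


def tokenize(text: str) -> list[str]:
    """Lowercase, delete punctuation, split on whitespace, drop stop words."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [t for t in text.split() if t not in STOP_WORDS and len(t) > 1]


tokens = tokenize("How do I enable Two-Factor Authentication (2FA)?")
print(tokens)  # ['enable', 'twofactor', 'authentication', '2fa']
```

Because "Two-Factor" becomes `twofactor`, a query typed as "two factor" would not match that fused token. Replacing punctuation with a space before splitting is one possible alternative if that behavior is unwanted.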
| 634 |
+
def compute_term_frequency(tokens: list[str]) -> dict[str, float]:
|
| 635 |
+
"""Compute augmented term frequency: 0.5 + 0.5 * (count / max_count).
|
| 636 |
+
|
| 637 |
+
Augmented TF dampens the bias toward longer documents.
|
| 638 |
+
"""
|
| 639 |
+
counts = Counter(tokens)
|
| 640 |
+
if not counts:
|
| 641 |
+
return {}
|
| 642 |
+
max_count = max(counts.values())
|
| 643 |
+
return {
|
| 644 |
+
term: 0.5 + 0.5 * (count / max_count)
|
| 645 |
+
for term, count in counts.items()
|
| 646 |
+
}
|
| 647 |
+
|
| 648 |
+
|
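A worked example of the augmented term frequency formula, `0.5 + 0.5 * (count / max_count)`, as a minimal sketch. The helper name `augmented_tf` and the sample tokens are made up for illustration.

```python
from collections import Counter


def augmented_tf(tokens: list[str]) -> dict[str, float]:
    """Scale each count against the most frequent term in the same document."""
    counts = Counter(tokens)
    if not counts:
        return {}
    max_count = max(counts.values())
    return {term: 0.5 + 0.5 * (count / max_count) for term, count in counts.items()}


tf = augmented_tf(["backup", "backup", "restore", "schedule"])
print(tf)  # {'backup': 1.0, 'restore': 0.75, 'schedule': 0.75}
```

Every term that occurs at all scores between 0.5 and 1.0, so a term that occurs once in a long article is not pushed toward zero.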
| 649 |
+
def compute_idf(corpus_tokens: list[list[str]], vocabulary: list[str]) -> dict[str, float]:
|
| 650 |
+
"""Compute inverse document frequency: log(N / (1 + df)).
|
| 651 |
+
|
| 652 |
+
The +1 avoids division by zero for terms absent from every document. Terms found in N - 1 or more documents get a weight of zero or less.
|
| 653 |
+
"""
|
| 654 |
+
num_documents = len(corpus_tokens)
|
| 655 |
+
idf_values: dict[str, float] = {}
|
| 656 |
+
for term in vocabulary:
|
| 657 |
+
document_frequency = sum(
|
| 658 |
+
1 for doc_tokens in corpus_tokens if term in doc_tokens
|
| 659 |
+
)
|
| 660 |
+
idf_values[term] = math.log(num_documents / (1 + document_frequency))
|
| 661 |
+
return idf_values
|
| 662 |
+
|
| 663 |
+
|
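A small sketch of how `log(N / (1 + df))` behaves as document frequency grows, assuming N = 30 to match the demo corpus size stated in the README. The helper name `smoothed_idf` is illustrative and not part of `app.py`. Whether any term in the actual corpus reaches these frequencies has not been checked here.

```python
import math


def smoothed_idf(num_documents: int, document_frequency: int) -> float:
    """Same formula as compute_idf, applied to a single term."""
    return math.log(num_documents / (1 + document_frequency))


N = 30
rare = smoothed_idf(N, 1)         # log(30 / 2)  -> positive
common = smoothed_idf(N, 29)      # log(30 / 30) -> exactly 0.0
everywhere = smoothed_idf(N, 30)  # log(30 / 31) -> slightly negative

print(rare, common, everywhere)
```

A term present in 29 of the 30 articles contributes nothing to the score, and a term present in all 30 gets a small negative weight. If that is undesirable, `log((1 + N) / (1 + df)) + 1` is one commonly used smoothed variant that stays positive.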
| 664 |
+
def build_tfidf_matrix(
|
| 665 |
+
corpus_tokens: list[list[str]],
|
| 666 |
+
vocabulary: list[str],
|
| 667 |
+
idf_values: dict[str, float],
|
| 668 |
+
) -> np.ndarray:
|
| 669 |
+
"""Build a TF-IDF matrix of shape (num_documents, vocab_size)."""
|
| 670 |
+
vocab_index = {term: idx for idx, term in enumerate(vocabulary)}
|
| 671 |
+
matrix = np.zeros((len(corpus_tokens), len(vocabulary)), dtype=np.float64)
|
| 672 |
+
|
| 673 |
+
for doc_idx, tokens in enumerate(corpus_tokens):
|
| 674 |
+
tf_values = compute_term_frequency(tokens)
|
| 675 |
+
for term, tf_score in tf_values.items():
|
| 676 |
+
if term in vocab_index:
|
| 677 |
+
col_idx = vocab_index[term]
|
| 678 |
+
matrix[doc_idx, col_idx] = tf_score * idf_values[term]
|
| 679 |
+
|
| 680 |
+
return matrix
|
| 681 |
+
|
| 682 |
+
|
| 683 |
+
def cosine_similarity_vector(matrix: np.ndarray, query_vector: np.ndarray) -> np.ndarray:
|
| 684 |
+
"""Compute cosine similarity between each row of matrix and query_vector."""
|
| 685 |
+
dot_products = matrix @ query_vector
|
| 686 |
+
matrix_norms = np.linalg.norm(matrix, axis=1)
|
| 687 |
+
query_norm = np.linalg.norm(query_vector)
|
| 688 |
+
|
| 689 |
+
denominator = matrix_norms * query_norm
|
| 690 |
+
# Avoid division by zero for zero-norm vectors
|
| 691 |
+
denominator = np.where(denominator == 0, 1.0, denominator)
|
| 692 |
+
return dot_products / denominator
|
| 693 |
+
|
| 694 |
+
|
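A pure-Python restatement of the cosine ranking step, included as a hedged illustration and not as a replacement for the NumPy version. The function name `cosine_similarity` and the toy vectors are assumptions made for the example. The zero-norm guard mirrors the intent of the `np.where` line by scoring an all-zero document as 0.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


query = [1.0, 0.0, 1.0]
docs = [
    [2.0, 0.0, 2.0],  # same direction as the query
    [0.0, 3.0, 0.0],  # shares no terms with the query
    [0.0, 0.0, 0.0],  # empty document
]
scores = [cosine_similarity(doc, query) for doc in docs]
print(scores)
```

The first document scores approximately 1.0 despite having larger weights, since cosine similarity depends on direction and not magnitude. The other two score 0.0.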
| 695 |
+
class TFIDFSearchEngine:
|
| 696 |
+
"""TF-IDF search engine with cosine similarity ranking."""
|
| 697 |
+
|
| 698 |
+
def __init__(self, articles: list[dict]) -> None:
|
| 699 |
+
self.articles = articles
|
| 700 |
+
self._corpus_tokens: list[list[str]] = []
|
| 701 |
+
self._vocabulary: list[str] = []
|
| 702 |
+
self._idf: dict[str, float] = {}
|
| 703 |
+
self._tfidf_matrix: np.ndarray = np.array([])
|
| 704 |
+
self._vocab_index: dict[str, int] = {}
|
| 705 |
+
self._build_index()
|
| 706 |
+
|
| 707 |
+
def _build_index(self) -> None:
|
| 708 |
+
"""Tokenize all articles and precompute the TF-IDF matrix."""
|
| 709 |
+
self._corpus_tokens = [
|
| 710 |
+
tokenize(article["title"] + " " + article["content"])
|
| 711 |
+
for article in self.articles
|
| 712 |
+
]
|
| 713 |
+
|
| 714 |
+
vocab_set: set[str] = set()
|
| 715 |
+
for tokens in self._corpus_tokens:
|
| 716 |
+
vocab_set.update(tokens)
|
| 717 |
+
self._vocabulary = sorted(vocab_set)
|
| 718 |
+
self._vocab_index = {term: idx for idx, term in enumerate(self._vocabulary)}
|
| 719 |
+
|
| 720 |
+
self._idf = compute_idf(self._corpus_tokens, self._vocabulary)
|
| 721 |
+
self._tfidf_matrix = build_tfidf_matrix(
|
| 722 |
+
self._corpus_tokens, self._vocabulary, self._idf
|
| 723 |
+
)
|
| 724 |
+
|
| 725 |
+
def search(self, query: str, top_k: int = TOP_K_RESULTS) -> list[dict]:
|
| 726 |
+
"""Search the corpus and return top_k results with scores and matched terms."""
|
| 727 |
+
query_tokens = tokenize(query)
|
| 728 |
+
if not query_tokens:
|
| 729 |
+
return []
|
| 730 |
+
|
| 731 |
+
query_tf = compute_term_frequency(query_tokens)
|
| 732 |
+
query_vector = np.zeros(len(self._vocabulary), dtype=np.float64)
|
| 733 |
+
for term, tf_score in query_tf.items():
|
| 734 |
+
if term in self._vocab_index:
|
| 735 |
+
col_idx = self._vocab_index[term]
|
| 736 |
+
query_vector[col_idx] = tf_score * self._idf.get(term, 0.0)
|
| 737 |
+
|
| 738 |
+
if np.linalg.norm(query_vector) == 0:
|
| 739 |
+
return []
|
| 740 |
+
|
| 741 |
+
similarities = cosine_similarity_vector(self._tfidf_matrix, query_vector)
|
| 742 |
+
top_indices = np.argsort(similarities)[::-1][:top_k]
|
| 743 |
+
|
| 744 |
+
results = []
|
| 745 |
+
query_term_set = set(query_tokens)
|
| 746 |
+
for idx in top_indices:
|
| 747 |
+
score = float(similarities[idx])
|
| 748 |
+
if score <= 0:
|
| 749 |
+
continue
|
| 750 |
+
article = self.articles[idx]
|
| 751 |
+
doc_term_set = set(self._corpus_tokens[idx])
|
| 752 |
+
matched_terms = sorted(query_term_set & doc_term_set)
|
| 753 |
+
results.append({
|
| 754 |
+
"article": article,
|
| 755 |
+
"score": score,
|
| 756 |
+
"matched_terms": matched_terms,
|
| 757 |
+
})
|
| 758 |
+
|
| 759 |
+
return results
|
| 760 |
+
|
| 761 |
+
def get_best_match(self, query: str) -> Optional[dict]:
|
| 762 |
+
"""Return the single best matching article, or None."""
|
| 763 |
+
results = self.search(query, top_k=1)
|
| 764 |
+
return results[0] if results else None
|
| 765 |
+
|
| 766 |
+
|
| 767 |
+
# ---------------------------------------------------------------------------
|
| 768 |
+
# Initialize the search engine (module-level singleton)
|
| 769 |
+
# ---------------------------------------------------------------------------
|
| 770 |
+
|
| 771 |
+
search_engine = TFIDFSearchEngine(KNOWLEDGE_BASE)
|
| 772 |
+
|
| 773 |
+
|
| 774 |
+
# ---------------------------------------------------------------------------
|
| 775 |
+
# Tab 1: Knowledge Search
|
| 776 |
+
# ---------------------------------------------------------------------------
|
| 777 |
+
|
| 778 |
+
|
| 779 |
+
def _highlight_terms(text: str, terms: list[str]) -> str:
|
| 780 |
+
"""Wrap matched terms in bold markdown markers."""
|
| 781 |
+
highlighted = text
|
| 782 |
+
for term in terms:
|
| 783 |
+
pattern = re.compile(re.escape(term), re.IGNORECASE)
|
| 784 |
+
highlighted = pattern.sub(lambda m: f"**{m.group(0)}**", highlighted)
|
| 785 |
+
return highlighted
|
| 786 |
+
|
| 787 |
+
|
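The loop in `_highlight_terms` matches a term anywhere in the text, including inside longer words. A possible variant, sketched below under the assumption that whole-word highlighting is wanted, uses word boundaries and a single pass. The name `highlight_whole_words` is hypothetical.

```python
import re


def highlight_whole_words(text: str, terms: list[str]) -> str:
    """Bold whole-word, case-insensitive matches while keeping the original casing."""
    if not terms:
        return text
    # Longest terms first so a longer term wins over a shorter one it contains.
    alternation = "|".join(
        re.escape(term) for term in sorted(terms, key=len, reverse=True)
    )
    pattern = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"**{m.group(0)}**", text)


result = highlight_whole_words("The API layer handles rapid requests.", ["api"])
print(result)  # The **API** layer handles rapid requests.
```

Here "rapid" contains the letters "api" but is left alone, and "API" keeps its capitalization.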
| 788 |
+
def perform_search(query: str) -> str:
|
| 789 |
+
"""Execute TF-IDF search and format results as markdown."""
|
| 790 |
+
if not query or not query.strip():
|
| 791 |
+
return "*Enter a search query to find relevant knowledge base articles.*"
|
| 792 |
+
|
| 793 |
+
results = search_engine.search(query.strip(), top_k=TOP_K_RESULTS)
|
| 794 |
+
|
| 795 |
+
if not results:
|
| 796 |
+
return (
|
| 797 |
+
f"**No results found for:** \"{query}\"\n\n"
|
| 798 |
+
"No articles in the knowledge base matched your query terms. "
|
| 799 |
+
"Try using different keywords or broader terms."
|
| 800 |
+
)
|
| 801 |
+
|
| 802 |
+
output_parts = [
|
| 803 |
+
f"### Search Results for: \"{query}\"\n",
|
| 804 |
+
f"Found **{len(results)}** relevant article(s).\n",
|
| 805 |
+
"---\n",
|
| 806 |
+
]
|
| 807 |
+
|
| 808 |
+
for rank, result in enumerate(results, start=1):
|
| 809 |
+
article = result["article"]
|
| 810 |
+
score = result["score"]
|
| 811 |
+
matched = result["matched_terms"]
|
| 812 |
+
score_bar = _render_score_bar(score)
|
| 813 |
+
highlighted_content = _highlight_terms(article["content"], matched)
|
| 814 |
+
matched_display = ", ".join(f"`{t}`" for t in matched) if matched else "N/A"
|
| 815 |
+
|
| 816 |
+
output_parts.append(
|
| 817 |
+
f"**#{rank} [{article['id']}] {article['title']}**\n"
|
| 818 |
+
f"Category: {article['category']} | "
|
| 819 |
+
f"Relevance: {score:.4f} {score_bar}\n"
|
| 820 |
+
f"Matched terms: {matched_display}\n\n"
|
| 821 |
+
f"{highlighted_content}\n\n"
|
| 822 |
+
"---\n"
|
| 823 |
+
)
|
| 824 |
+
|
| 825 |
+
return "\n".join(output_parts)
|
| 826 |
+
|
| 827 |
+
|
| 828 |
+
def _render_score_bar(score: float) -> str:
|
| 829 |
+
"""Render a 20-character text relevance bar using '=' characters."""
|
| 830 |
+
filled = int(round(score * 20))
|
| 831 |
+
filled = min(filled, 20)
|
| 832 |
+
return "[" + "=" * filled + " " * (20 - filled) + "]"
|
| 833 |
+
|
| 834 |
+
|
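For reference, a sketch of what the bar looks like for two sample scores. The `width` parameter is an assumption added for the example; the function in the diff hard-codes 20.

```python
def render_score_bar(score: float, width: int = 20) -> str:
    """Fill round(score * width) cells with '=', capped at width, and pad with spaces."""
    filled = min(int(round(score * width)), width)
    return "[" + "=" * filled + " " * (width - filled) + "]"


half = render_score_bar(0.5)
full = render_score_bar(1.0)
print(half)  # [==========          ]
print(full)  # [====================]
```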
| 835 |
+
# ---------------------------------------------------------------------------
|
| 836 |
+
# Tab 2: AI Q&A
|
| 837 |
+
# ---------------------------------------------------------------------------
|
| 838 |
+
|
| 839 |
+
|
| 840 |
+
def answer_question(question: str) -> str:
|
| 841 |
+
"""Find the most relevant article and generate a template-based answer with citation."""
|
| 842 |
+
if not question or not question.strip():
|
| 843 |
+
return "*Ask a question about NovaCRM to get an answer with source citation.*"
|
| 844 |
+
|
| 845 |
+
result = search_engine.get_best_match(question.strip())
|
| 846 |
+
|
| 847 |
+
if result is None:
|
| 848 |
+
return (
|
| 849 |
+
f"**Question:** {question}\n\n"
|
| 850 |
+
"I could not find a relevant article in the knowledge base to answer "
|
| 851 |
+
"your question. Try rephrasing with more specific terms related to "
|
| 852 |
+
"NovaCRM features, billing, API, security, or integrations."
|
| 853 |
+
)
|
| 854 |
+
|
| 855 |
+
article = result["article"]
|
| 856 |
+
score = result["score"]
|
| 857 |
+
matched = result["matched_terms"]
|
| 858 |
+
|
| 859 |
+
# Extract the most relevant sentence(s) from the article as the excerpt
|
| 860 |
+
excerpt = _extract_relevant_excerpt(article["content"], matched)
|
| 861 |
+
highlighted_excerpt = _highlight_terms(excerpt, matched)
|
| 862 |
+
|
| 863 |
+
answer_text = _generate_template_answer(question, article, matched)
|
| 864 |
+
|
| 865 |
+
output_parts = [
|
| 866 |
+
f"**Question:** {question}\n\n",
|
| 867 |
+
"---\n\n",
|
| 868 |
+
f"### Answer\n\n{answer_text}\n\n",
|
| 869 |
+
"---\n\n",
|
| 870 |
+
"### Source\n\n",
|
| 871 |
+
f"**Article:** [{article['id']}] {article['title']}\n\n",
|
| 872 |
+
f"**Category:** {article['category']}\n\n",
|
| 873 |
+
f"**Confidence:** {score:.4f}\n\n",
|
| 874 |
+
f"**Relevant Excerpt:**\n\n> {highlighted_excerpt}\n",
|
| 875 |
+
]
|
| 876 |
+
|
| 877 |
+
return "".join(output_parts)
|
| 878 |
+
|
| 879 |
+
|
| 880 |
+
def _extract_relevant_excerpt(content: str, matched_terms: list[str]) -> str:
|
| 881 |
+
"""Extract the most relevant 1-2 sentences from the article content."""
|
| 882 |
+
sentences = [s for s in re.split(r"(?<=[.!?])\s+", content) if s]
|
| 883 |
+
if not sentences:
|
| 884 |
+
return content[:300]
|
| 885 |
+
|
| 886 |
+
if not matched_terms:
|
| 887 |
+
return sentences[0]
|
| 888 |
+
|
| 889 |
+
scored_sentences: list[tuple[int, str]] = []
|
| 890 |
+
for sentence in sentences:
|
| 891 |
+
sentence_lower = sentence.lower()
|
| 892 |
+
match_count = sum(1 for term in matched_terms if term in sentence_lower)
|
| 893 |
+
scored_sentences.append((match_count, sentence))
|
| 894 |
+
|
| 895 |
+
scored_sentences.sort(key=lambda pair: pair[0], reverse=True)
|
| 896 |
+
|
| 897 |
+
# Take the top 2 sentences by match count
|
| 898 |
+
top_sentences = scored_sentences[:2]
|
| 899 |
+
# Re-order by original position in the article
|
| 900 |
+
top_sentences_text = [s[1] for s in top_sentences]
|
| 901 |
+
ordered = [s for s in sentences if s in top_sentences_text]
|
| 902 |
+
|
| 903 |
+
return " ".join(ordered) if ordered else sentences[0]
|
| 904 |
+
|
| 905 |
+
|
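The excerpt logic can be seen in miniature with the sketch below: split on sentence-ending punctuation, count matched terms per sentence, keep the best two, then restore article order. This variant tracks sentence indices instead of comparing sentence text, which is an assumption of the example and not how the function above does it. The name `top_sentences` is hypothetical.

```python
import re


def top_sentences(content: str, terms: list[str], limit: int = 2) -> str:
    """Return the `limit` sentences containing the most terms, in original order."""
    sentences = re.split(r"(?<=[.!?])\s+", content)
    # Rank sentence positions by substring match count; sorted() is stable,
    # so ties keep their original relative order.
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sum(term in sentences[i].lower() for term in terms),
        reverse=True,
    )
    keep = sorted(ranked[:limit])
    return " ".join(sentences[i] for i in keep)


content = (
    "Reset links expire after 24 hours. "
    "Clear browser cache and cookies if you experience persistent session errors. "
    "Disable browser extensions if you encounter rendering issues."
)
excerpt = top_sentences(content, ["browser", "cache"])
print(excerpt)
```

The first sentence matches neither term and is dropped. The other two are returned in the order they appear in the article.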
| 906 |
+
def _generate_template_answer(
|
| 907 |
+
question: str, article: dict, matched_terms: list[str]
|
| 908 |
+
) -> str:
|
| 909 |
+
"""Generate a natural-language answer based on the matched article content.
|
| 910 |
+
|
| 911 |
+
Uses the article content to compose a direct response rather than
|
| 912 |
+
simply echoing the question back.
|
| 913 |
+
"""
|
| 914 |
+
category = article["category"]
|
| 915 |
+
title = article["title"]
|
| 916 |
+
content = article["content"]
|
| 917 |
+
|
| 918 |
+
# Extract key sentences that address the question
|
| 919 |
+
sentences = re.split(r"(?<=[.!?])\s+", content)
|
| 920 |
+
relevant_sentences = []
|
| 921 |
+
for sentence in sentences:
|
| 922 |
+
sentence_lower = sentence.lower()
|
| 923 |
+
if any(term in sentence_lower for term in matched_terms):
|
| 924 |
+
relevant_sentences.append(sentence)
|
| 925 |
+
|
| 926 |
+
if not relevant_sentences:
|
| 927 |
+
relevant_sentences = sentences[:3]
|
| 928 |
+
|
| 929 |
+
# Construct the answer
|
| 930 |
+
answer_body = " ".join(relevant_sentences[:4])
|
| 931 |
+
|
| 932 |
+
intro_templates = {
|
| 933 |
+
"Onboarding": f"Based on the {title} documentation",
|
| 934 |
+
"Billing": f"According to the billing documentation on {title}",
|
| 935 |
+
"API": f"The API documentation ({title}) explains",
|
| 936 |
+
"Security": f"Per the security documentation in {title}",
|
| 937 |
+
"Integrations": f"The integration guide for {title} states",
|
| 938 |
+
"Infrastructure": f"According to the infrastructure documentation ({title})",
|
| 939 |
+
"Support": f"The support documentation ({title}) addresses this",
|
| 940 |
+
"Compliance": f"Per the compliance documentation in {title}",
|
| 941 |
+
}
|
| 942 |
+
|
| 943 |
+
intro = intro_templates.get(category, f"According to {title}")
|
| 944 |
+
|
| 945 |
+
return f"{intro}: {answer_body}"
|
| 946 |
+
|
| 947 |
+
|
| 948 |
+
# ---------------------------------------------------------------------------
|
| 949 |
+
# Tab 3: Training Generator
|
| 950 |
+
# ---------------------------------------------------------------------------
|
| 951 |
+
|
| 952 |
+
# Pre-built quiz data keyed by article ID
|
| 953 |
+
QUIZ_DATA: dict[str, list[dict]] = {}
|
| 954 |
+
|
| 955 |
+
|
| 956 |
+
def _build_quiz_for_article(article: dict) -> list[dict]:
|
| 957 |
+
"""Generate 5 multiple-choice questions from article content.
|
| 958 |
+
|
| 959 |
+
Uses content extraction to create questions that reference actual
|
| 960 |
+
article details rather than generic filler.
|
| 961 |
+
"""
|
| 962 |
+
content = article["content"]
|
| 963 |
+
title = article["title"]
|
| 964 |
+
sentences = re.split(r"(?<=[.!?])\s+", content)
|
| 965 |
+
|
| 966 |
+
questions: list[dict] = []
|
| 967 |
+
|
| 968 |
+
# Strategy: pull factual statements and create questions about them
|
| 969 |
+
for i, sentence in enumerate(sentences):
|
| 970 |
+
if len(questions) >= 5:
|
| 971 |
+
break
|
| 972 |
+
# Skip very short sentences
|
| 973 |
+
if len(sentence) < 30:
|
| 974 |
+
continue
|
| 975 |
+
question_entry = _sentence_to_question(sentence, title, i)
|
| 976 |
+
if question_entry:
|
| 977 |
+
questions.append(question_entry)
|
| 978 |
+
|
| 979 |
+
# Pad with generic questions if content was too sparse
|
| 980 |
+
while len(questions) < 5:
|
| 981 |
+
questions.append({
|
| 982 |
+
"question": f"What is the primary purpose of {title}?",
|
| 983 |
+
"options": [
|
| 984 |
+
f"To manage {article['category'].lower()} features",
|
| 985 |
+
"To provide general system information",
|
| 986 |
+
"To configure external services",
|
| 987 |
+
"To handle user authentication only",
|
| 988 |
+
],
|
| 989 |
+
"correct": 0,
|
| 990 |
+
})
|
| 991 |
+
|
| 992 |
+
return questions[:5]
|
| 993 |
+
|
| 994 |
+
|
| 995 |
+
def _sentence_to_question(sentence: str, title: str, seed: int) -> Optional[dict]:
    """Convert a factual sentence into a multiple-choice question."""
    # Look for sentences with numbers, specific features, or named items
    number_match = re.search(r"(\d+[\s\w-]*(?:hours?|days?|minutes?|percent|GB|requests?))", sentence)
    if number_match:
        fact = number_match.group(1)
        return {
            "question": f"According to \"{title}\", what is the specification for: {fact.strip()}?",
            "options": [
                f"The value is {fact.strip()}",
                "The value is double the standard amount",
                "This is not specified in the documentation",
                "This depends on the subscription tier selected",
            ],
            "correct": 0,
        }

    # Look for feature mentions
    feature_patterns = [
        (r"supports?\s+(.+?)(?:\.|,|$)", "support"),
        (r"includes?\s+(.+?)(?:\.|,|$)", "include"),
        (r"provides?\s+(.+?)(?:\.|,|$)", "provide"),
        (r"enables?\s+(.+?)(?:\.|,|$)", "enable"),
    ]

    for pattern, verb in feature_patterns:
        match = re.search(pattern, sentence, re.IGNORECASE)
        if match:
            feature = match.group(1).strip()
            if 15 < len(feature) < 120:
                return {
                    "question": f"What does the system {verb} according to \"{title}\"?",
                    "options": [
                        feature[:100],
                        "Only basic text-based functionality",
                        "This feature is not available",
                        "Requires third-party configuration",
                    ],
                    "correct": 0,
                }

    return None


def generate_training(topic_title: str) -> str:
    """Generate a training article outline and quiz for the selected topic."""
    if not topic_title:
        return "*Select a topic to generate training material.*"

    article = None
    for kb_article in KNOWLEDGE_BASE:
        if kb_article["title"] == topic_title:
            article = kb_article
            break

    if article is None:
        return f"Article not found: {topic_title}"

    # Cache quiz data
    if article["id"] not in QUIZ_DATA:
        QUIZ_DATA[article["id"]] = _build_quiz_for_article(article)

    quiz_questions = QUIZ_DATA[article["id"]]
    sentences = re.split(r"(?<=[.!?])\s+", article["content"])

    # Build training article outline
    output_parts = [
        f"## Training Module: {article['title']}\n",
        f"**Category:** {article['category']} | "
        f"**Article ID:** {article['id']}\n\n",
        "---\n\n",
        "### Learning Objectives\n\n",
        "After completing this module, you will be able to:\n\n",
    ]

    # Generate 3 learning objectives from article content
    objectives = _extract_learning_objectives(sentences)
    for obj in objectives:
        output_parts.append(f"- {obj}\n")

    output_parts.append("\n### Module Outline\n\n")

    # Split content into sections
    section_size = max(1, len(sentences) // 3)
    section_titles = ["Introduction and Overview", "Core Concepts", "Implementation Details"]
    for section_idx, section_title in enumerate(section_titles):
        start = section_idx * section_size
        end = start + section_size if section_idx < 2 else len(sentences)
        section_content = " ".join(sentences[start:end])
        if section_content.strip():
            output_parts.append(f"**{section_idx + 1}. {section_title}**\n\n")
            output_parts.append(f"{section_content}\n\n")

    output_parts.append("---\n\n### Knowledge Check (5 Questions)\n\n")

    for q_idx, q_data in enumerate(quiz_questions, start=1):
        output_parts.append(f"**Q{q_idx}.** {q_data['question']}\n\n")
        labels = ["A", "B", "C", "D"]
        for opt_idx, option in enumerate(q_data["options"]):
            marker = " (correct)" if opt_idx == q_data["correct"] else ""
            output_parts.append(f" {labels[opt_idx]}. {option}{marker}\n")
        output_parts.append("\n")

    return "".join(output_parts)


def _extract_learning_objectives(sentences: list[str]) -> list[str]:
    """Extract or generate 3 learning objectives from article sentences."""
    objectives: list[str] = []

    action_verbs = [
        "Understand how to", "Explain the process of", "Configure and manage",
        "Identify the key aspects of", "Apply knowledge about",
    ]

    for sentence in sentences:
        if len(objectives) >= 3:
            break
        # Look for sentences describing capabilities or processes
        if any(kw in sentence.lower() for kw in ["navigate", "configure", "create", "enable", "support"]):
            # Rephrase as objective
            clean = sentence.rstrip(".")
            verb = action_verbs[len(objectives) % len(action_verbs)]
            objective = f"{verb} {clean[0].lower()}{clean[1:]}"
            if len(objective) < 200:
                objectives.append(objective)

    # Pad if needed
    while len(objectives) < 3:
        objectives.append(
            f"{action_verbs[len(objectives) % len(action_verbs)]} the features described in this module"
        )

    return objectives[:3]


# ---------------------------------------------------------------------------
# Tab 4: Knowledge Gap Analytics
# ---------------------------------------------------------------------------

# Mock analytics data
UNANSWERED_QUERIES = [
    "How do I integrate with Salesforce?",
    "What is the data export format for compliance audits?",
    "Can I use NovaCRM with a self-hosted email server?",
    "How to configure IP allowlisting?",
    "What are the API rate limits for the GraphQL endpoint specifically?",
    "Does NovaCRM support multi-currency deals?",
    "How to set up automated lead scoring?",
    "Can I restrict API access by IP address?",
    "What is the maximum file attachment size?",
    "How to configure custom email domains?",
]

SEARCH_QUERIES_LOG = [
    ("api authentication", 342),
    ("billing invoice", 287),
    ("sso setup okta", 198),
    ("import contacts csv", 176),
    ("webhook configuration", 154),
    ("slack integration", 143),
    ("password reset", 312),
    ("pipeline stages", 131),
    ("gdpr data deletion", 119),
    ("mobile app offline", 108),
    ("two factor authentication", 205),
    ("email tracking", 167),
    ("custom reports", 145),
    ("zapier automation", 98),
    ("backup schedule", 87),
]


def generate_analytics() -> tuple:
    """Generate all analytics charts and summary text.

    Returns a tuple of (summary_markdown, articles_by_category_fig,
    freshness_fig, views_fig, gaps_fig).
    """
    summary = _build_analytics_summary()
    category_fig = _plot_articles_by_category()
    freshness_fig = _plot_freshness_scores()
    views_fig = _plot_article_views()
    gaps_fig = _plot_search_gaps()

    return summary, category_fig, freshness_fig, views_fig, gaps_fig


def _build_analytics_summary() -> str:
    """Build the text summary of knowledge base health."""
    total_articles = len(KNOWLEDGE_BASE)
    total_views = sum(a["views"] for a in KNOWLEDGE_BASE)
    avg_freshness = sum(a["freshness"] for a in KNOWLEDGE_BASE) / total_articles
    stale_articles = [a for a in KNOWLEDGE_BASE if a["freshness"] < 0.80]
    categories_covered = len(set(a["category"] for a in KNOWLEDGE_BASE))

    # Most and least viewed
    sorted_by_views = sorted(KNOWLEDGE_BASE, key=lambda a: a["views"], reverse=True)
    most_viewed = sorted_by_views[0]
    least_viewed = sorted_by_views[-1]

    return (
        "### Knowledge Base Health Summary\n\n"
        "| Metric | Value |\n"
        "|--------|-------|\n"
        f"| Total articles | {total_articles} |\n"
        f"| Categories covered | {categories_covered} |\n"
        f"| Total page views | {total_views:,} |\n"
        f"| Average freshness score | {avg_freshness:.2f} |\n"
        f"| Articles needing update (freshness < 0.80) | {len(stale_articles)} |\n"
        f"| Unanswered search queries | {len(UNANSWERED_QUERIES)} |\n\n"
        f"**Most viewed:** [{most_viewed['id']}] {most_viewed['title']} "
        f"({most_viewed['views']:,} views)\n\n"
        f"**Least viewed:** [{least_viewed['id']}] {least_viewed['title']} "
        f"({least_viewed['views']:,} views)\n\n"
        "**Stale articles requiring review:**\n\n"
        + "\n".join(
            f"- [{a['id']}] {a['title']} (freshness: {a['freshness']:.2f})"
            for a in stale_articles
        )
    )


def _apply_dark_style(fig: plt.Figure, ax: plt.Axes) -> None:
    """Apply consistent dark theme styling to matplotlib figures."""
    bg_color = "#1a1a2e"
    text_color = "#e0e0e0"
    grid_color = "#2a2a4a"

    fig.patch.set_facecolor(bg_color)
    ax.set_facecolor(bg_color)
    ax.tick_params(colors=text_color, which="both")
    ax.xaxis.label.set_color(text_color)
    ax.yaxis.label.set_color(text_color)
    ax.title.set_color(text_color)

    for spine in ax.spines.values():
        spine.set_color(grid_color)

    ax.grid(True, alpha=0.2, color=grid_color)


def _plot_articles_by_category() -> plt.Figure:
    """Bar chart of article count per category."""
    category_counts: dict[str, int] = {}
    for article in KNOWLEDGE_BASE:
        cat = article["category"]
        category_counts[cat] = category_counts.get(cat, 0) + 1

    categories = sorted(category_counts.keys())
    counts = [category_counts[c] for c in categories]

    fig, ax = plt.subplots(figsize=(8, 4))
    _apply_dark_style(fig, ax)

    bar_colors = ["#3b82f6", "#6366f1", "#8b5cf6", "#a78bfa",
                  "#60a5fa", "#818cf8", "#7c3aed", "#4f46e5"]
    bars = ax.barh(categories, counts, color=bar_colors[:len(categories)], height=0.6)
    ax.set_xlabel("Number of Articles")
    ax.set_title("Articles by Category")

    for bar_item, count in zip(bars, counts):
        ax.text(
            bar_item.get_width() + 0.1, bar_item.get_y() + bar_item.get_height() / 2,
            str(count), va="center", color="#e0e0e0", fontweight="bold",
        )

    fig.tight_layout()
    return fig


def _plot_freshness_scores() -> plt.Figure:
    """Horizontal bar chart of article freshness scores, color-coded."""
    sorted_articles = sorted(KNOWLEDGE_BASE, key=lambda a: a["freshness"])
    titles = [f"[{a['id']}]" for a in sorted_articles]
    scores = [a["freshness"] for a in sorted_articles]

    fig, ax = plt.subplots(figsize=(8, 7))
    _apply_dark_style(fig, ax)

    colors = []
    for score in scores:
        if score >= 0.90:
            colors.append("#22c55e")  # green -- fresh
        elif score >= 0.80:
            colors.append("#eab308")  # yellow -- aging
        else:
            colors.append("#ef4444")  # red -- stale

    ax.barh(titles, scores, color=colors, height=0.6)
    ax.set_xlabel("Freshness Score")
    ax.set_title("Article Freshness Scores")
    ax.set_xlim(0, 1.0)
    ax.axvline(x=0.80, color="#ef4444", linestyle="--", alpha=0.5, label="Stale threshold")
    ax.legend(loc="lower right", facecolor="#1a1a2e", edgecolor="#2a2a4a", labelcolor="#e0e0e0")

    fig.tight_layout()
    return fig


def _plot_article_views() -> plt.Figure:
    """Bar chart of top 10 articles by view count."""
    sorted_articles = sorted(KNOWLEDGE_BASE, key=lambda a: a["views"], reverse=True)[:10]
    titles = [f"[{a['id']}]" for a in sorted_articles]
    views = [a["views"] for a in sorted_articles]

    fig, ax = plt.subplots(figsize=(8, 5))
    _apply_dark_style(fig, ax)

    gradient_colors = plt.cm.Blues(np.linspace(0.9, 0.4, len(titles)))
    ax.barh(titles, views, color=gradient_colors, height=0.6)
    ax.set_xlabel("Page Views")
    ax.set_title("Top 10 Most Viewed Articles")
    ax.invert_yaxis()

    for idx, (title_label, view_count) in enumerate(zip(titles, views)):
        ax.text(
            view_count + 50, idx, f"{view_count:,}",
            va="center", color="#e0e0e0", fontsize=9,
        )

    fig.tight_layout()
    return fig


def _plot_search_gaps() -> plt.Figure:
    """Bar chart of the 10 most frequent search queries."""
    # SEARCH_QUERIES_LOG is not stored in frequency order, so sort before slicing.
    top_queries = sorted(SEARCH_QUERIES_LOG, key=lambda entry: entry[1], reverse=True)[:10]
    queries = [q for q, _ in top_queries]
    counts = [c for _, c in top_queries]

    fig, ax = plt.subplots(figsize=(8, 5))
    _apply_dark_style(fig, ax)

    ax.barh(queries, counts, color="#6366f1", height=0.6)
    ax.set_xlabel("Search Frequency")
    ax.set_title("Most Frequent Search Queries")
    ax.invert_yaxis()

    for idx, count in enumerate(counts):
        ax.text(
            count + 3, idx, str(count),
            va="center", color="#e0e0e0", fontsize=9,
        )

    fig.tight_layout()
    return fig


# ---------------------------------------------------------------------------
# Gradio Application
# ---------------------------------------------------------------------------

CUSTOM_CSS = """
.gradio-container {
    max-width: 1200px !important;
    margin: 0 auto !important;
}

.header-text {
    text-align: center;
    margin-bottom: 8px;
}

.header-text h1 {
    font-size: 2em;
    margin-bottom: 4px;
}

.header-text p {
    opacity: 0.8;
    font-size: 1.05em;
}

footer {
    text-align: center;
    opacity: 0.6;
    margin-top: 20px;
}
"""

HEADER_HTML = """
<div class="header-text">
    <h1>Vaultwise</h1>
    <p>Knowledge Management Platform: Document Ingestion, TF-IDF Search, AI Q&amp;A, Training Generation, Analytics</p>
    <p style="font-size: 0.9em; opacity: 0.6;">
        This interactive demo runs entirely in-browser with a built-in 30-article knowledge base.
        All search is powered by a from-scratch TF-IDF implementation, with no sklearn and no external NLP libraries.
    </p>
</div>
"""

FOOTER_HTML = """
<footer>
    <p>
        <a href="https://github.com/dbhavery/vaultwise" target="_blank">GitHub</a>
        | Built by Don Havery
    </p>
</footer>
"""


def build_app() -> gr.Blocks:
    """Construct and return the Gradio Blocks application."""
    topic_choices = [article["title"] for article in KNOWLEDGE_BASE]

    with gr.Blocks(
        title=APP_TITLE,
        theme=gr.themes.Base(
            primary_hue=gr.themes.colors.blue,
            secondary_hue=gr.themes.colors.indigo,
            neutral_hue=gr.themes.colors.gray,
            font=gr.themes.GoogleFont("Inter"),
        ).set(
            body_background_fill="#0f0f1a",
            body_background_fill_dark="#0f0f1a",
            block_background_fill="#1a1a2e",
            block_background_fill_dark="#1a1a2e",
            block_border_color="#2a2a4a",
            block_border_color_dark="#2a2a4a",
            block_title_text_color="#e0e0e0",
            block_title_text_color_dark="#e0e0e0",
            body_text_color="#d0d0d0",
            body_text_color_dark="#d0d0d0",
            input_background_fill="#16162a",
            input_background_fill_dark="#16162a",
            input_border_color="#2a2a4a",
            input_border_color_dark="#2a2a4a",
            button_primary_background_fill="#3b82f6",
            button_primary_background_fill_dark="#3b82f6",
            button_primary_text_color="#ffffff",
            button_primary_text_color_dark="#ffffff",
        ),
        css=CUSTOM_CSS,
    ) as app:
        gr.HTML(HEADER_HTML)

        with gr.Tabs():
            # --- Tab 1: Knowledge Search ---
            with gr.Tab("Knowledge Search"):
                gr.Markdown(
                    "### TF-IDF Vector Search\n"
                    "Search the NovaCRM knowledge base using term frequency-inverse document "
                    "frequency scoring with cosine similarity ranking. The engine tokenizes "
                    "your query, computes TF-IDF weights against all 30 articles, and returns "
                    "the top 5 matches."
                )
                with gr.Row():
                    with gr.Column(scale=4):
                        search_input = gr.Textbox(
                            label="Search Query",
                            placeholder="e.g., API rate limits authentication, SSO configuration, billing invoice...",
                            lines=1,
                        )
                    with gr.Column(scale=1):
                        search_btn = gr.Button("Search", variant="primary")

                search_output = gr.Markdown(
                    value="*Enter a search query to find relevant knowledge base articles.*",
                    label="Results",
                )

                gr.Examples(
                    examples=[
                        ["API authentication rate limits"],
                        ["how to import contacts from CSV"],
                        ["SSO single sign-on SAML configuration"],
                        ["billing subscription pricing plans"],
                        ["webhook event notifications"],
                        ["GDPR data erasure compliance"],
                        ["mobile app offline mode"],
                        ["workflow automation rules engine"],
                    ],
                    inputs=search_input,
                    label="Example Queries",
                )

                search_btn.click(fn=perform_search, inputs=search_input, outputs=search_output)
                search_input.submit(fn=perform_search, inputs=search_input, outputs=search_output)

            # --- Tab 2: AI Q&A ---
            with gr.Tab("AI Q&A"):
                gr.Markdown(
                    "### Knowledge-Grounded Question Answering\n"
                    "Ask a natural language question about NovaCRM. The system finds "
                    "the most relevant article via TF-IDF search, then generates an "
                    "answer grounded in the source material with full citation."
                )
                with gr.Row():
                    with gr.Column(scale=4):
                        qa_input = gr.Textbox(
                            label="Your Question",
                            placeholder="e.g., How do I set up two-factor authentication?",
                            lines=1,
                        )
                    with gr.Column(scale=1):
                        qa_btn = gr.Button("Ask", variant="primary")

                qa_output = gr.Markdown(
                    value="*Ask a question about NovaCRM to get an answer with source citation.*",
                    label="Answer",
                )

                gr.Examples(
                    examples=[
                        ["How do I reset my password if my account is locked?"],
                        ["What encryption does NovaCRM use for data at rest?"],
                        ["How can I connect my Gmail to NovaCRM?"],
                        ["What are the different subscription plans and pricing?"],
                        ["How do I configure webhooks for deal updates?"],
                        ["What compliance certifications does NovaCRM have?"],
                    ],
                    inputs=qa_input,
                    label="Example Questions",
                )

                qa_btn.click(fn=answer_question, inputs=qa_input, outputs=qa_output)
                qa_input.submit(fn=answer_question, inputs=qa_input, outputs=qa_output)

            # --- Tab 3: Training Generator ---
            with gr.Tab("Training Generator"):
                gr.Markdown(
                    "### Auto-Generated Training Material\n"
                    "Select a knowledge base article to generate a structured training module "
                    "with learning objectives, content outline, and a 5-question multiple-choice quiz."
                )
                with gr.Row():
                    with gr.Column(scale=4):
                        training_dropdown = gr.Dropdown(
                            choices=topic_choices,
                            label="Select Article Topic",
                            value=None,
                        )
                    with gr.Column(scale=1):
                        training_btn = gr.Button("Generate", variant="primary")

                training_output = gr.Markdown(
                    value="*Select a topic to generate training material.*",
                    label="Training Material",
                )

                training_btn.click(
                    fn=generate_training, inputs=training_dropdown, outputs=training_output
                )

            # --- Tab 4: Knowledge Gap Analytics ---
            with gr.Tab("Knowledge Gap Analytics"):
                gr.Markdown(
                    "### Knowledge Base Analytics Dashboard\n"
                    "Health metrics, content freshness, usage patterns, and gap analysis "
                    "for the knowledge base."
                )
                analytics_btn = gr.Button("Generate Analytics Report", variant="primary")

                analytics_summary = gr.Markdown(label="Summary")

                with gr.Row():
                    category_chart = gr.Plot(label="Articles by Category")
                    views_chart = gr.Plot(label="Most Viewed Articles")

                with gr.Row():
                    freshness_chart = gr.Plot(label="Freshness Scores")
                    gaps_chart = gr.Plot(label="Search Query Frequency")

                analytics_btn.click(
                    fn=generate_analytics,
                    inputs=[],
                    outputs=[analytics_summary, category_chart, freshness_chart, views_chart, gaps_chart],
                )

        gr.HTML(FOOTER_HTML)

    return app


# ---------------------------------------------------------------------------
# Entry point
# ---------------------------------------------------------------------------

if __name__ == "__main__":
    application = build_app()
    application.launch()

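The Knowledge Search tab describes its engine as tokenizing the query, computing TF-IDF weights against every article, and ranking by cosine similarity. The search code itself is not part of this section of the diff, so what follows is only a minimal from-scratch sketch of that technique under stated assumptions. The function names (`tokenize`, `build_idf`, `tfidf_vector`, `cosine`, `search`), the smoothed IDF formula, and the three-sentence sample corpus are all illustrative and are not taken from `app.py`.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(docs: list[list[str]]) -> dict[str, float]:
    """Smoothed inverse document frequency per term (assumed formula)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1.0 for term, df in doc_freq.items()}


def tfidf_vector(tokens: list[str], idf: dict[str, float]) -> dict[str, float]:
    """Sparse TF-IDF vector; terms never seen in the corpus are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: (count / total) * idf[term] for term, count in counts.items()}


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, corpus: list[str], top_k: int = 5) -> list[tuple[int, float]]:
    """Return up to top_k (document_index, score) pairs, best match first."""
    docs = [tokenize(text) for text in corpus]
    idf = build_idf(docs)
    doc_vectors = [tfidf_vector(doc, idf) for doc in docs]
    query_vector = tfidf_vector(tokenize(query), idf)
    scored = [(i, cosine(query_vector, vec)) for i, vec in enumerate(doc_vectors)]
    # Keep only documents that share at least one term with the query.
    scored = [(i, score) for i, score in scored if score > 0.0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical three-document corpus, not the demo's 30 NovaCRM articles.
sample_corpus = [
    "Reset your password from the login page.",
    "API keys authenticate every request to the REST API.",
    "Invoices are generated on the first day of each billing cycle.",
]
results = search("api authentication keys", sample_corpus)
```

Here `results` holds `(document_index, score)` pairs for documents with a non-zero score. Only the second sample document shares terms with the query, and a query made entirely of unseen terms returns an empty list. Recomputing the IDF table on every call keeps the sketch short; a real app would more plausibly build the document vectors once at startup.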
requirements.txt
ADDED
@@ -0,0 +1,3 @@
gradio==5.29.0
numpy>=1.26.0
matplotlib>=3.8.0