# Get Started

Once `python app.py` is running, head to `http://localhost:7860` in your browser. You'll see two tabs.

## Compress tab

This is where the action is.

1. Paste your text: a long prompt, meeting notes, an article, anything really
2. Use the slider to set your token budget (anywhere from 100 to 1000)
3. Hit **Compress**

As you type or adjust the slider, a status banner updates live:
- **Green**: the input is over budget, compression will run
- **Red**: the input is already within budget, nothing to do
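
The banner rule above can be sketched in a few lines. This is only an illustration: `budget_status` is a hypothetical helper, not a function from the app, and it assumes the token count has already been computed elsewhere.

```python
def budget_status(token_count: int, budget: int) -> str:
    """Return the banner colour for an input size and a token budget.

    Hypothetical helper: the name and signature are assumptions.
    """
    # The slider in the Compress tab allows budgets from 100 to 1000.
    if not 100 <= budget <= 1000:
        raise ValueError("budget must be between 100 and 1000")
    # Over budget: compression will run (green).
    # At or under budget: nothing to do (red).
    return "green" if token_count > budget else "red"
```

For example, `budget_status(500, 200)` gives `"green"`, while `budget_status(150, 200)` gives `"red"`.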

On the right you'll see:
- The compressed version of your text
- How many tokens went in vs came out
- The compression ratio (how much it shrank)
- A quality score between 0 and 1, where closer to 1 means the meaning held up well
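
If you want a feel for how numbers like these could be produced, here is a rough sketch. It rests on two assumptions that this guide does not confirm: that the compression ratio is the fraction of tokens removed, and that the quality score is a cosine similarity between embeddings of the original and the compressed text, clipped to the 0 to 1 range. Both function names are hypothetical.

```python
import math
from typing import Sequence


def compression_ratio(tokens_in: int, tokens_out: int) -> float:
    """Fraction of tokens removed (assumed definition, not confirmed by the app)."""
    if tokens_in <= 0:
        raise ValueError("tokens_in must be positive")
    return 1 - tokens_out / tokens_in


def cosine_quality(original_vec: Sequence[float], compressed_vec: Sequence[float]) -> float:
    """Cosine similarity of two embedding vectors, clipped to [0, 1] (assumed scoring rule)."""
    dot = sum(x * y for x, y in zip(original_vec, compressed_vec))
    norm = math.sqrt(sum(x * x for x in original_vec)) * math.sqrt(
        sum(y * y for y in compressed_vec)
    )
    if norm == 0:
        return 0.0  # a zero vector carries no direction, so treat it as no similarity
    return max(0.0, min(1.0, dot / norm))
```

Under these assumptions, 400 tokens in and 100 tokens out is a ratio of 0.75, and identical embedding vectors score 1.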

Once the result appears, **👍 Helpful** and **👎 Not helpful** buttons show up below the metrics. Click either one to rate the result; the feedback is saved instantly. A note field then slides in where you can optionally type what worked well or didn't (e.g. "lost key dates", "too short", "great summary") and hit **Save note**. Both the rating and the note are stored with the run and visible in the History tab.

Every run saves automatically in the background. You don't need to do anything.

### Token Highlights

Below the input box there's a **Show Token Highlights** button. Click it and each token in your input gets rendered as a colour-coded chip, which is useful for seeing exactly where your budget is going. The panel updates live as you type. Click again to hide it.

### Switching the compression model

Click **Model Settings** at the top of the tab to expand the accordion. Pick a model from the dropdown (or type a custom HuggingFace model ID) and hit **Load Model**. The current model is unloaded from memory first, then the new one loads, so no restart is needed. The status box confirms when it's ready.

Available presets: Qwen2.5-1.5B-Instruct (default), Qwen2.5-0.5B-Instruct, SmolLM2-1.7B-Instruct, Phi-3.5-mini-instruct, Llama-3.2-1B-Instruct.
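
The unload-then-load order described above might look roughly like this. It is a minimal sketch, not the app's code: `ModelSlot` and its `loader` callback are hypothetical names, and a real loader would call into whatever model library the app uses.

```python
import gc
from typing import Any, Callable, Optional


class ModelSlot:
    """Holds at most one loaded model at a time (hypothetical helper)."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader  # assumed callback that turns a model ID into a model
        self.model_id: Optional[str] = None
        self.model: Any = None

    def load(self, model_id: str) -> str:
        # Drop the reference to the current model first so its memory can be
        # reclaimed before the new model is loaded.
        self.model = None
        self.model_id = None
        gc.collect()
        # Only now load the replacement.
        self.model = self._loader(model_id)
        self.model_id = model_id
        return f"Loaded {model_id}"
```

Releasing the old model before loading the new one keeps peak memory closer to one model's footprint than two, which is the point of doing the swap in this order.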

### Switching the scoring embedder

Below the compression model section in the same accordion, there's a separate **Embedder Model** dropdown. The embedder is what computes the quality score, so changing it affects how accurately that score reflects meaning retention.

When you select a model from the dropdown, an info panel updates immediately to explain the trade-off:
- ⚡ **Fast** models (MiniLM, bge-small): low overhead, good baseline scores, CPU-friendly
- ⚖️ **Balanced** models (mpnet, bge-base): more discriminating scores, small speed cost
- 🏆 **High quality** models (mxbai-large): most accurate scores, GPU recommended
- 🔬 **Best quality** models (gte-Qwen2-1.5B): catches subtle meaning loss, requires significant RAM/VRAM

Hit **Load Embedder** to apply the selection. The previous embedder is unloaded from memory before the new one loads.

## History tab

Click over here to see everything that's been compressed so far.

The table loads automatically when you open the tab. Hit **Refresh** to pull in the latest runs. At the top you'll find the average quality score and compression ratio across all sessions, a quick way to see how the tool is performing over time.
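
The two averages are plain means over the saved runs. A minimal sketch, assuming each run is available as a dict with `quality_score` and `compression_ratio` keys (the column names used in the table); `summarise_runs` itself is a hypothetical helper, not part of the app:

```python
from statistics import fmean
from typing import Dict, List, Optional


def summarise_runs(runs: List[Dict[str, float]]) -> Dict[str, Optional[float]]:
    """Average quality score and compression ratio across runs (hypothetical helper)."""
    if not runs:
        # No history yet, so there is nothing to average.
        return {"avg_quality_score": None, "avg_compression_ratio": None}
    return {
        "avg_quality_score": fmean(r["quality_score"] for r in runs),
        "avg_compression_ratio": fmean(r["compression_ratio"] for r in runs),
    }
```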

### Column visibility

By default the table shows: `id`, `timestamp`, `model`, `compression_ratio`, `quality_score`, `feedback`. Open the **Column visibility** accordion above the table to toggle any additional columns on or off. Changes apply instantly without a refresh.

### Side-by-side diff

Click any row in the table and a word-level diff panel opens below it. Words are colour-coded:
- Red strikethrough: dropped from the original
- Amber: rewritten by the model
- Green: inserted (rare connector words)
- Plain: survived unchanged
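
A word-level diff with these four categories can be approximated with Python's standard `difflib`. This is a sketch of the general technique, not necessarily how the app computes its diff. `word_diff` and the label names are hypothetical, and for rewritten spans it only reports the new words.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def word_diff(original: str, compressed: str) -> List[Tuple[str, str]]:
    """Label each word as plain, dropped, inserted, or rewritten (hypothetical helper)."""
    a, b = original.split(), compressed.split()
    labelled: List[Tuple[str, str]] = []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":      # survived unchanged
            labelled += [("plain", w) for w in a[i1:i2]]
        elif tag == "delete":   # present in the original only
            labelled += [("dropped", w) for w in a[i1:i2]]
        elif tag == "insert":   # present in the compressed text only
            labelled += [("inserted", w) for w in b[j1:j2]]
        else:                   # "replace": original words swapped for new ones
            labelled += [("rewritten", w) for w in b[j1:j2]]
    return labelled
```

Each `(label, word)` pair could then be mapped to a colour: `dropped` to red strikethrough, `rewritten` to amber, `inserted` to green, and `plain` left unstyled.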

### Deleting a run

Click a row to select it, then hit **Delete Selected Row**. The table refreshes and the aggregate stats update automatically.


🏠 [README.md](../README.md)