# Get Started
Once `python app.py` is running, head to `http://localhost:7860` in your browser. You'll see two tabs.
## Compress tab
This is where the action is.
1. Paste your text: a long prompt, meeting notes, an article, anything really
2. Use the slider to set your token budget (anywhere from 100 to 1000)
3. Hit **Compress**
As you type or adjust the slider, a status banner updates live:
- **Green**: the input is over budget, so compression will run
- **Red**: the input is already within budget, nothing to do
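The banner rule above can be sketched as a small function. This is only an illustration: `budget_status` is a hypothetical name, and how the app actually counts tokens is not shown in this guide, so the token count is passed in as a plain integer.

```python
def budget_status(token_count: int, budget: int) -> str:
    """Return the banner colour for a given input size and token budget.

    Hypothetical helper: the name and signature are assumptions, not the
    app's real code. The 100-1000 range mirrors the slider limits.
    """
    if not 100 <= budget <= 1000:
        raise ValueError("budget must be between 100 and 1000")
    # Green: the input is over budget, so compression will run.
    # Red: the input already fits, so there is nothing to do.
    return "green" if token_count > budget else "red"
```

For example, `budget_status(1200, 500)` gives `"green"`, while an input that exactly matches the budget counts as within budget and gives `"red"`.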
On the right you'll see:
- The compressed version of your text
- How many tokens went in vs came out
- The compression ratio (how much it shrank)
- A quality score between 0 and 1, where closer to 1 means the meaning held up well
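To make the two numbers concrete, here is a minimal sketch of how such metrics could be computed. Both definitions are assumptions: the guide does not say whether the ratio is the fraction of tokens removed or output divided by input, and cosine similarity between embeddings is only one plausible way an embedder could produce a 0 to 1 score.

```python
import math
from typing import Sequence


def compression_ratio(tokens_in: int, tokens_out: int) -> float:
    """Assumed definition: the fraction of input tokens that were removed."""
    if tokens_in <= 0:
        raise ValueError("tokens_in must be positive")
    return 1.0 - tokens_out / tokens_in


def quality_score(original_vec: Sequence[float],
                  compressed_vec: Sequence[float]) -> float:
    """Assumed definition: cosine similarity of two embeddings, clamped to [0, 1]."""
    dot = sum(a * b for a, b in zip(original_vec, compressed_vec))
    norm = (math.sqrt(sum(a * a for a in original_vec))
            * math.sqrt(sum(b * b for b in compressed_vec)))
    if norm == 0:
        return 0.0  # an all-zero vector carries no meaning to compare
    return max(0.0, min(1.0, dot / norm))
```

Under these assumptions, compressing 1000 tokens down to 250 gives a ratio of 0.75, and two identical embeddings give a quality score of 1.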
Once the result appears, **Helpful** and **Not helpful** buttons show up below the metrics. Click either one to rate the result; the feedback is saved instantly. A note field then slides in where you can optionally type what worked well or didn't (e.g. "lost key dates", "too short", "great summary") and hit **Save note**. Both the rating and the note are stored with the run and visible in the History tab.
Every run saves automatically in the background. You don't need to do anything.
### Token Highlights
Below the input box there's a **Show Token Highlights** button. Click it and each token in your input gets rendered as a colour-coded chip, which is useful for seeing exactly where your budget is going. The panel updates live as you type. Click again to hide it.
### Switching the compression model
Click **Model Settings** at the top of the tab to expand the accordion. Pick a model from the dropdown (or type a custom HuggingFace model ID) and hit **Load Model**. The current model is unloaded from memory first, then the new one loads, so no restart is needed. The status box confirms when it's ready.
Available presets: Qwen2.5-1.5B-Instruct (default), Qwen2.5-0.5B-Instruct, SmolLM2-1.7B-Instruct, Phi-3.5-mini-instruct, Llama-3.2-1B-Instruct.
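The unload-then-load order described above can be sketched as follows. `ModelSlot` and its `loader` argument are hypothetical names used for illustration. The real app presumably loads a HuggingFace model, but the loader is injected here so the sketch stays library-independent.

```python
import gc
from typing import Any, Callable, Optional


class ModelSlot:
    """Holds at most one loaded model (hypothetical helper, not the app's code)."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader  # assumed: any callable that maps a model ID to a model
        self.model_id: Optional[str] = None
        self.model: Optional[Any] = None

    def swap(self, model_id: str) -> str:
        # Drop the old reference first so its memory can be reclaimed
        # before the new model is loaded.
        self.model = None
        self.model_id = None
        gc.collect()
        self.model = self._loader(model_id)
        self.model_id = model_id
        return f"Loaded {model_id}"
```

Releasing the old model before loading the new one keeps peak memory close to the size of a single model, which matters on small machines. With GPU-backed models, an extra framework-specific cache clear may also be needed.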
### Switching the scoring embedder
Below the compression model section in the same accordion, there's a separate **Embedder Model** dropdown. The embedder is what computes the quality score, so changing it affects how accurately that score reflects meaning retention.
When you select a model from the dropdown, an info panel updates immediately to explain the trade-off:
- **Fast** models (MiniLM, bge-small): low overhead, good baseline scores, CPU-friendly
- **Balanced** models (mpnet, bge-base): more discriminating scores, small speed cost
- **High quality** models (mxbai-large): most accurate scores, GPU recommended
- **Best quality** models (gte-Qwen2-1.5B): catches subtle meaning loss, requires significant RAM/VRAM
Hit **Load Embedder** to apply the selection. The previous embedder is unloaded from memory before the new one loads.
## History tab
Open this tab to see everything that's been compressed so far.
The table loads automatically when you open the tab. Hit **Refresh** to pull in the latest runs. At the top you'll find the average quality score and compression ratio across all sessions, a quick way to see how the tool is performing over time.
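As a rough sketch, the two averages shown at the top could be computed like this. The run records are assumed to be dictionaries with `quality_score` and `compression_ratio` keys (matching the column names used in the table). `summarise_runs` is a hypothetical name, and the app's real storage layer is not shown here.

```python
from statistics import mean
from typing import Dict, List, Optional


def summarise_runs(runs: List[dict]) -> Dict[str, Optional[float]]:
    """Average the quality score and compression ratio over all saved runs."""
    if not runs:
        # Nothing saved yet, so there is nothing to average.
        return {"avg_quality_score": None, "avg_compression_ratio": None}
    return {
        "avg_quality_score": mean(r["quality_score"] for r in runs),
        "avg_compression_ratio": mean(r["compression_ratio"] for r in runs),
    }
```

Returning `None` for an empty history avoids a division by zero and lets the UI show a placeholder instead of a misleading `0`.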
### Column visibility
By default the table shows: `id`, `timestamp`, `model`, `compression_ratio`, `quality_score`, `feedback`. Open the **Column visibility** accordion above the table to toggle any additional columns on or off. Changes apply instantly without a refresh.
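Column toggling amounts to projecting each row onto the selected columns. The sketch below assumes rows are dictionaries. The default column list comes from this guide, while `visible_rows` and the `note` column in the usage example are hypothetical.

```python
from typing import Iterable, List

DEFAULT_COLUMNS = ["id", "timestamp", "model",
                   "compression_ratio", "quality_score", "feedback"]


def visible_rows(rows: List[dict], extra_columns: Iterable[str] = ()) -> List[dict]:
    """Keep the default columns plus any extra ones the user switched on."""
    columns = DEFAULT_COLUMNS + [c for c in extra_columns
                                 if c not in DEFAULT_COLUMNS]
    # Missing keys become None so every row has the same shape.
    return [{c: row.get(c) for c in columns} for row in rows]
```

Calling `visible_rows(rows, extra_columns=["note"])` would add a `note` column to the default view, assuming such a column exists in the stored runs.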
### Side-by-side diff
Click any row in the table and a word-level diff panel opens below it. Words are colour-coded:
- Red strikethrough: dropped from the original
- Amber: rewritten by the model
- Green: inserted (rare connector words)
- Plain: survived unchanged
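A word-level diff with these four categories can be approximated with Python's standard `difflib`. This is a sketch under assumptions: the guide does not say which diff algorithm the app uses, `word_diff` is a hypothetical name, and splitting on whitespace is a simplification.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# Map difflib opcode tags to the colour labels used in the list above.
LABELS = {"equal": "plain", "delete": "red", "replace": "amber", "insert": "green"}


def word_diff(original: str, compressed: str) -> List[Tuple[str, str]]:
    """Return (label, text) pairs describing how each run of words changed."""
    a, b = original.split(), compressed.split()
    result = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=a, b=b).get_opcodes():
        # Dropped and unchanged spans show the original words;
        # rewritten and inserted spans show the words from the compressed text.
        words = a[i1:i2] if tag in ("equal", "delete") else b[j1:j2]
        result.append((LABELS[tag], " ".join(words)))
    return result
```

For instance, comparing `"a b c"` with `"a c"` yields `[("plain", "a"), ("red", "b"), ("plain", "c")]`, because `b` was dropped while the surrounding words survived.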
### Deleting a run
Click a row to select it, then hit **Delete Selected Row**. The table refreshes and the aggregate stats update automatically.
[README.md](../README.md)