Spaces:
Running on Zero
Running on Zero
| # Get Started | |
| Once `python app.py` is running, head to `http://localhost:7860` in your browser. You'll see two tabs. | |
| ## Compress tab | |
| This is where the action is. | |
| 1. Paste your text β could be a long prompt, meeting notes, an article, anything really | |
| 2. Use the slider to set your token budget (anywhere from 100 to 1000) | |
| 3. Hit **Compress** | |
| As you type or adjust the slider, a status banner updates live: | |
| - **Green** β the input is over budget, compression will run | |
| - **Red** β the input is already within budget, nothing to do | |
| On the right you'll see: | |
| - The compressed version of your text | |
| - How many tokens went in vs came out | |
| - The compression ratio (how much it shrank) | |
| - A quality score between 0 and 1 β closer to 1 means the meaning held up well | |
| Once the result appears, **π Helpful** and **π Not helpful** buttons show up below the metrics. Click either one to rate the result β the feedback is saved instantly. A note field then slides in where you can optionally type what worked well or didn't (e.g. "lost key dates", "too short", "great summary") and hit **Save note**. Both the rating and the note are stored with the run and visible in the History tab. | |
| Every run saves automatically in the background. You don't need to do anything. | |
| ### Token Highlights | |
| Below the input box there's a **Show Token Highlights** button. Click it and each token in your input gets rendered as a colour-coded chip β useful for seeing exactly where your budget is going. The panel updates live as you type. Click again to hide it. | |
| ### Switching the compression model | |
| Click **Model Settings** at the top of the tab to expand the accordion. Pick a model from the dropdown (or type a custom HuggingFace model ID) and hit **Load Model**. The current model is unloaded from memory first, then the new one loads β no restart needed. The status box confirms when it's ready. | |
| Available presets: Qwen2.5-1.5B-Instruct (default), Qwen2.5-0.5B-Instruct, SmolLM2-1.7B-Instruct, Phi-3.5-mini-instruct, Llama-3.2-1B-Instruct. | |
| ### Switching the scoring embedder | |
| Below the compression model section in the same accordion, there's a separate **Embedder Model** dropdown. The embedder is what computes the quality score β changing it affects how accurately that score reflects meaning retention. | |
| When you select a model from the dropdown, an info panel updates immediately to explain the trade-off: | |
| - β‘ **Fast** models (MiniLM, bge-small) β low overhead, good baseline scores, CPU-friendly | |
| - βοΈ **Balanced** models (mpnet, bge-base) β more discriminating scores, small speed cost | |
| - π **High quality** models (mxbai-large) β most accurate scores, GPU recommended | |
| - π¬ **Best quality** models (gte-Qwen2-1.5B) β catches subtle meaning loss, requires significant RAM/VRAM | |
| Hit **Load Embedder** to apply the selection. The previous embedder is unloaded from memory before the new one loads. | |
| ## History tab | |
| Click over here to see everything that's been compressed so far. | |
| The table loads automatically when you open the tab. Hit **Refresh** to pull in the latest runs. At the top you'll find the average quality score and compression ratio across all sessions β a quick way to see how the tool is performing over time. | |
| ### Column visibility | |
| By default the table shows: `id`, `timestamp`, `model`, `compression_ratio`, `quality_score`, `feedback`. Open the **Column visibility** accordion above the table to toggle any additional columns on or off β changes apply instantly without a refresh. | |
| ### Side-by-side diff | |
| Click any row in the table and a word-level diff panel opens below it. Words are colour-coded: | |
| - Red strikethrough β dropped from the original | |
| - Amber β rewritten by the model | |
| - Green β inserted (rare connector words) | |
| - Plain β survived unchanged | |
| ### Deleting a run | |
| Click a row to select it, then hit **Delete Selected Row**. The table refreshes and the aggregate stats update automatically. | |
| π [README.md](../README.md) | |