---
title: RWKV LLM Text Compressor
emoji: 🐨
colorFrom: gray
colorTo: pink
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
pinned: false
---
# RWKV LLM Text Compressor
This Space demonstrates LLM-based arithmetic coding using RWKV: the model's
next-token probabilities drive an arithmetic coder. It is a proof of concept and
is inherently slow. The compressed output is only valid when the same model,
tokenizer, and context window are used for decompression.
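The core idea can be sketched with a toy static model standing in for RWKV. The names below (`PROBS`, `encode`, `decode`) are illustrative, not the app's actual API, and a real coder uses scaled-integer arithmetic rather than exact fractions:

```python
from fractions import Fraction

# Toy "model": fixed next-token probabilities. A real LLM produces a
# fresh distribution at every step, conditioned on the tokens so far.
PROBS = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
TOKENS = list(PROBS)

def cumulative(tok):
    """Return the sub-interval of [0, 1) assigned to `tok`."""
    lo = Fraction(0)
    for t in TOKENS:
        if t == tok:
            return lo, lo + PROBS[t]
        lo += PROBS[t]
    raise KeyError(tok)

def encode(seq):
    """Narrow [0, 1) to the interval identifying `seq`."""
    lo, width = Fraction(0), Fraction(1)
    for tok in seq:
        c_lo, c_hi = cumulative(tok)
        lo, width = lo + width * c_lo, width * (c_hi - c_lo)
    return lo, width  # any number in [lo, lo + width) encodes seq

def decode(x, n):
    """Recover `n` tokens from a number `x` inside the final interval."""
    out = []
    lo, width = Fraction(0), Fraction(1)
    for _ in range(n):
        target = (x - lo) / width  # rescale back into [0, 1)
        for tok in TOKENS:
            c_lo, c_hi = cumulative(tok)
            if c_lo <= target < c_hi:
                out.append(tok)
                lo, width = lo + width * c_lo, width * (c_hi - c_lo)
                break
    return out

msg = ["a", "b", "a", "c"]
lo, width = encode(msg)
code = lo + width / 2       # pick any point inside the interval
assert decode(code, len(msg)) == msg
```

High-probability tokens shrink the interval less, so predictable text needs fewer bits to pin down a point inside it; this is why a stronger language model compresses better.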
## Configuration
- `RWKV_MODEL_PATH`: Path to a local RWKV `.pth` file (or name without extension).
- `RWKV_TOKENIZER`: Path to `rwkv_vocab_v20230424.txt`. Default: `support/rwkv_vocab_v20230424.txt`.
- `RWKV_STRATEGY`: RWKV strategy string (e.g. `cpu fp32`, `cuda fp16`).
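A local run might be configured like this; the model filename matches the default checkpoint mentioned below, and all paths are placeholders for your own files:

```shell
export RWKV_MODEL_PATH=models/rwkv7-g1a-0.1b-20250728-ctx4096.pth
export RWKV_TOKENIZER=support/rwkv_vocab_v20230424.txt
export RWKV_STRATEGY="cpu fp32"
python app.py
```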
## Notes
- CPU-only Spaces should keep `RWKV_STRATEGY=cpu fp32`. The app forces CPU when CUDA
is unavailable.
- The vocab file is not bundled; place `rwkv_vocab_v20230424.txt` in `support/` or
set `RWKV_TOKENIZER` to its path.
- The app auto-detects a `.pth` model under `models/` if `RWKV_MODEL_PATH` is not set.
- If no model is found, the app downloads `rwkv7-g1a-0.1b-20250728-ctx4096.pth` into `models/`.
- Input text is limited to 8192 characters.
- Compression and decompression are slow and not suitable for production use.
- Output is not portable across different models or tokenizers.
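The model-resolution order described above (environment variable, then any `.pth` under `models/`, then a downloaded default) could be sketched as follows; `resolve_model_path` is a hypothetical helper, not the app's actual function:

```python
import glob
import os

def resolve_model_path(models_dir="models"):
    """Return a model path using the documented fallback order."""
    # 1. Explicit override via environment variable.
    path = os.environ.get("RWKV_MODEL_PATH")
    if path:
        return path
    # 2. First .pth file found under models/.
    found = sorted(glob.glob(os.path.join(models_dir, "*.pth")))
    if found:
        return found[0]
    # 3. Nothing found: the app would download the default
    #    checkpoint into models/ at this point.
    return None
```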