---
title: RWKV LLM Text Compressor
emoji: 🐨
colorFrom: gray
colorTo: pink
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
pinned: false
---
# RWKV LLM Text Compressor
This Space demonstrates LLM-based arithmetic coding using RWKV: the model's
next-token probabilities drive an arithmetic coder. It is a proof of concept and
is inherently slow. The compressed output is only valid when the same model,
tokenizer, and context window are used for decompression.
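The core idea can be sketched with a toy static model standing in for RWKV. The names below (`PROBS`, `encode`, `decode`) are illustrative, not the app's actual API, and a real coder uses scaled-integer arithmetic rather than exact fractions:

```python
from fractions import Fraction

# Toy "model": fixed next-token probabilities. A real LLM produces a
# fresh distribution at every step, conditioned on the tokens so far.
PROBS = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
TOKENS = list(PROBS)

def cumulative(tok):
    """Return the sub-interval of [0, 1) assigned to `tok`."""
    lo = Fraction(0)
    for t in TOKENS:
        if t == tok:
            return lo, lo + PROBS[t]
        lo += PROBS[t]
    raise KeyError(tok)

def encode(seq):
    """Narrow [0, 1) to the interval identifying `seq`."""
    lo, width = Fraction(0), Fraction(1)
    for tok in seq:
        c_lo, c_hi = cumulative(tok)
        lo, width = lo + width * c_lo, width * (c_hi - c_lo)
    return lo, width  # any number in [lo, lo + width) encodes seq

def decode(x, n):
    """Recover `n` tokens from a number `x` inside the final interval."""
    out = []
    lo, width = Fraction(0), Fraction(1)
    for _ in range(n):
        target = (x - lo) / width  # rescale back into [0, 1)
        for tok in TOKENS:
            c_lo, c_hi = cumulative(tok)
            if c_lo <= target < c_hi:
                out.append(tok)
                lo, width = lo + width * c_lo, width * (c_hi - c_lo)
                break
    return out

msg = ["a", "b", "a", "c"]
lo, width = encode(msg)
code = lo + width / 2       # pick any point inside the interval
assert decode(code, len(msg)) == msg
```

High-probability tokens shrink the interval less, so predictable text needs fewer bits to pin down a point inside it; this is why a stronger language model compresses better.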
## Configuration
- `RWKV_MODEL_PATH`: Path to a local RWKV `.pth` file (or name without extension).
- `RWKV_TOKENIZER`: Path to `rwkv_vocab_v20230424.txt`. Default: `support/rwkv_vocab_v20230424.txt`.
- `RWKV_STRATEGY`: RWKV strategy string (e.g. `cpu fp32`, `cuda fp16`).
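A local run might be configured like this; the model filename matches the default checkpoint mentioned below, and all paths are placeholders for your own files:

```shell
export RWKV_MODEL_PATH=models/rwkv7-g1a-0.1b-20250728-ctx4096.pth
export RWKV_TOKENIZER=support/rwkv_vocab_v20230424.txt
export RWKV_STRATEGY="cpu fp32"
python app.py
```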
## Notes
- CPU-only Spaces should keep `RWKV_STRATEGY=cpu fp32`. The app forces CPU when CUDA
is unavailable.
- The vocab file is not bundled; place `rwkv_vocab_v20230424.txt` in `support/` or
set `RWKV_TOKENIZER` to its path.
- The app auto-detects a `.pth` model under `models/` if `RWKV_MODEL_PATH` is not set.
- If no model is found, the app downloads `rwkv7-g1a-0.1b-20250728-ctx4096.pth` into `models/`.
- Input text is limited to 8192 characters.
- Compression and decompression are slow and not suitable for production use.
- Output is not portable across different models or tokenizers.
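The model-resolution order described above (environment variable, then any `.pth` under `models/`, then a downloaded default) could be sketched as follows; `resolve_model_path` is a hypothetical helper, not the app's actual function:

```python
import glob
import os

def resolve_model_path(models_dir="models"):
    """Return a model path using the documented fallback order."""
    # 1. Explicit override via environment variable.
    path = os.environ.get("RWKV_MODEL_PATH")
    if path:
        return path
    # 2. First .pth file found under models/.
    found = sorted(glob.glob(os.path.join(models_dir, "*.pth")))
    if found:
        return found[0]
    # 3. Nothing found: the app would download the default
    #    checkpoint into models/ at this point.
    return None
```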