Nacrith: Neural Lossless Compression via Ensemble Context Modeling and High-Precision CDF Coding
Abstract
We present Nacrith, a lossless compression system that combines a 135M-parameter transformer language model (SmolLM2-135M) with an ensemble of lightweight online predictors and a 32-bit arithmetic coder. Beyond the base LLM-plus-arithmetic-coding paradigm, Nacrith introduces several contributions: (1) a CDF precision upgrade from 2^16 to 2^24 that eliminates ~75% of quantization overhead caused by minimum-probability floors in large vocabularies; (2) a token-level N-gram model for fast local predictions; (3) an adaptive log-space bias head correcting per-document LLM errors via online gradient descent; (4) confidence-based LLM skip for accelerating highly predictable tokens; (5) a hybrid binary format (NC06) extending neural compression to arbitrary binary files--to our knowledge a first among LLM-based compressors; (6) a llama.cpp inference backend achieving ~7x faster single-token decode than PyTorch; (7) parallel multi-GPU compression across up to 8 workers; and (8) native KV cache sliding window reducing per-slide cost by ~37x. The system requires only ~500 MB of GGUF weights and ~1.2 GB VRAM per worker, running on consumer GPUs. On alice29.txt (Canterbury Corpus, 152 KB), Nacrith achieves 0.918 bits per byte (bpb)--outperforming gzip by 3.1x, bzip2 by 2.5x, CMIX v21 by 44%, and ts_zip by 20%, while compressing below the 0th-, 1st-, and 2nd-order byte-level Shannon entropy bounds. On enwik8 (100 MB), Nacrith achieves 0.9389 bpb (11.74%), surpassing ts_zip (~1.11 bpb) by 15% and FineZip (1.024 bpb) by 8% despite using a 60x smaller model with no fine-tuning. An out-of-distribution evaluation on a document published after the model's training cutoff confirms these gains are not memorization artifacts, achieving 0.723 bpb on unseen text.
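To make contribution (1) concrete, the sketch below quantizes a next-token distribution into the integer symbol frequencies an arithmetic coder needs: every entry of the ~49K-token vocabulary must keep at least one count to stay decodable, so at 16-bit precision the floors alone tie up 49,152 of the 65,536 available counts, while at 24-bit precision they are negligible. This is an illustrative reconstruction, not Nacrith's actual coder; the Zipfian stand-in distribution and the renormalization scheme are our assumptions.

```python
import numpy as np

def quantize_cdf(probs, precision_bits):
    """Quantize a probability vector to integer symbol frequencies summing
    to 2**precision_bits, giving every symbol a floor of one count so the
    arithmetic decoder can always resolve it. Returns the implied coding
    distribution."""
    total = 1 << precision_bits
    assert total > len(probs), "precision too low for this vocabulary"
    spare = total - len(probs)                      # mass left after the floors
    freqs = 1 + np.floor(probs * spare).astype(np.int64)
    freqs[np.argmax(probs)] += total - freqs.sum()  # hand leftovers to the mode
    return freqs / total

V = 49152                          # SmolLM2-135M vocabulary size
ranks = np.arange(1, V + 1)
probs = 1.0 / ranks
probs /= probs.sum()               # Zipfian stand-in for an LLM next-token dist

overhead = {}
for bits in (16, 24):
    q = quantize_cdf(probs, bits)
    # extra bits per token paid for coding with q instead of the true probs
    overhead[bits] = float(np.sum(probs * np.log2(probs / q)))
    print(f"{bits}-bit CDF: +{overhead[bits]:.4f} bits/token quantization overhead")
```

On this stand-in, the 16-bit coder pays orders of magnitude more overhead per token than the 24-bit coder, whose overhead is bounded by log2(2^24 / (2^24 - 49,152)) ≈ 0.004 bits per token.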
Community
GitHub repository: https://github.com/robtacconelli/Nacrith-GPU
Try it on Hugging Face: https://huggingface.co/spaces/robtacconelli/Nacrith-GPU
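A minimal version of the adaptive log-space bias head (contribution 3) can be sketched as a per-token bias added to the LLM's logits and updated by online gradient descent on the cross-entropy of each observed token; because the decompressor sees the same decoded tokens, it can run identical updates and stay synchronized with the compressor. The class name, learning rate, and toy setup below are our assumptions, not Nacrith's actual head.

```python
import numpy as np

class LogBiasHead:
    """Per-document log-space bias over the vocabulary, learned online.
    Illustrative sketch; hyperparameters are assumptions."""
    def __init__(self, vocab_size, lr=0.5):
        self.bias = np.zeros(vocab_size)
        self.lr = lr

    def predict(self, llm_logits):
        z = llm_logits + self.bias
        z = z - z.max()               # numerical stability
        p = np.exp(z)
        return p / p.sum()

    def update(self, llm_logits, true_token):
        # Gradient of cross-entropy w.r.t. the bias is (softmax - one_hot).
        p = self.predict(llm_logits)
        grad = p.copy()
        grad[true_token] -= 1.0
        self.bias -= self.lr * grad

# Toy document: the "LLM" keeps emitting fixed logits while the document
# keeps producing token 7, so the bias learns to correct the model.
rng = np.random.default_rng(0)
head = LogBiasHead(vocab_size=100)
base = rng.normal(size=100)
first = last = None
for step in range(200):
    p = head.predict(base)
    cost = -np.log2(p[7])             # bits to code this token
    if step == 0:
        first = cost
    last = cost
    head.update(base, 7)
print(f"bits for token 7: {first:.2f} -> {last:.2f}")
```

The cost of coding the repeated token drops steadily as the bias absorbs the model's systematic error on this document.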
The system requires only ~500 MB of GGUF weights and ~1.2 GB of VRAM per worker, and runs on consumer GPUs, including very old models: all compression tests were performed on an NVIDIA GeForce GTX 1050 Ti (4 GB, released in 2016).
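The byte-level Shannon entropy bounds referenced above can be reproduced with plug-in estimates: the order-k entropy is the average number of bits needed per byte given the k preceding bytes, computed from the file's own k-gram statistics. A small self-contained sketch (the sample string is ours):

```python
import math
from collections import Counter

def order_k_entropy(data: bytes, k: int) -> float:
    """Empirical order-k byte entropy: average bits per byte given the
    preceding k bytes, estimated from the data's own counts."""
    ctx_counts = Counter()
    pair_counts = Counter()
    for i in range(k, len(data)):
        ctx = data[i - k:i]
        ctx_counts[ctx] += 1
        pair_counts[ctx, data[i]] += 1
    n = len(data) - k
    h = 0.0
    for (ctx, _sym), c in pair_counts.items():
        h -= (c / n) * math.log2(c / ctx_counts[ctx])
    return h

sample = b"the quick brown fox jumps over the lazy dog. " * 200
ents = {k: order_k_entropy(sample, k) for k in range(3)}
for k, h in ents.items():
    print(f"order-{k} byte entropy: {h:.3f} bpb")
```

Longer contexts can only lower the empirical bound, which is why compressing below even the order-2 bound requires modeling beyond short byte contexts.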
An out-of-distribution evaluation on a document published after the model's training cutoff (333,794 bytes, shown below) confirms these gains are not memorization artifacts, with Nacrith achieving 0.723 bpb on unseen text:
| Compressor | Size (bytes) | bpb |
|---|---|---|
| Original | 333,794 | 8.000 |
| gzip -9 | 91,348 | 2.189 |
| zstd -19 | 79,709 | 1.910 |
| xz -9 | 72,552 | 1.739 |
| bzip2 -9 | 69,305 | 1.661 |
| Brotli -q 11 | 68,681 | 1.646 |
| CMIX v21 | 47,897 | 1.148 |
| FineZip (SmolLM2-135M) | 40,747 | 0.977 |
| ts_zip (RWKV-169M) | 40,237 | 0.964 |
| NACRITH | 30,171 | 0.723 |
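As a sanity check, the bpb column follows directly from the sizes: bits per byte is 8 times the compressed size divided by the original size. A few rows recomputed:

```python
ORIGINAL = 333_794  # bytes in the out-of-distribution test document

# Compressed sizes taken from the table above.
sizes = {"gzip -9": 91_348, "CMIX v21": 47_897,
         "ts_zip": 40_237, "NACRITH": 30_171}

# bits per byte = 8 * compressed_size / original_size
bpb = {name: 8 * size / ORIGINAL for name, size in sizes.items()}
for name, value in bpb.items():
    print(f"{name:8s} {value:.3f} bpb")
```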