# AtomLlama-33K-5x5-DigitMesh-Sparse-Q8

An INT8 quantized version of atomllama-33K-5x5-DigitMesh-sparse for ultra-efficient 5×5 digit mesh recognition.
## Model Description

This is a 50% sparse + INT8 quantized variant of the AtomLlama-33K-5x5-DigitMesh model, combining unstructured weight sparsity with Quantization Aware Training (QAT). This dual compression approach maintains digit recognition accuracy while significantly reducing model size and computational requirements.

### Key Features
- Base Model: junzzhu/atomllama-33K-5x5-DigitMesh-sparse
- Sparsity: ~50% (unstructured)
- Quantization: INT8 with Sparse QAT
- Parameters: ~33K total, ~16.5K non-zero
- Architecture: LlamaForCausalLM
- Task: 5×5 binary digit mesh recognition
- Compression: ~3x smaller than the original model (46 KB vs. 137 KB)
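The quoted sizes are consistent with a back-of-envelope check, assuming the original checkpoint stores ~33K weights in FP32 (4 bytes each) and the quantized one in INT8 (1 byte each); the gap between the ideal 4x and the measured ~3x is plausibly metadata and quantization scale factors:

```python
# Back-of-envelope size check. Assumes ~33K weights stored in FP32
# (4 bytes each) before quantization and INT8 (1 byte each) after;
# real checkpoints add metadata and per-tensor scales, which is why
# the measured ratio (137 KB / 46 KB, ~3x) is below the ideal 4x.
params = 33_000

fp32_kb = params * 4 / 1000   # ideal FP32 weight size: ~132 KB
int8_kb = params * 1 / 1000   # ideal INT8 weight size: ~33 KB

print(f"ideal: {fp32_kb:.0f} KB -> {int8_kb:.0f} KB ({fp32_kb / int8_kb:.0f}x)")
print(f"measured: 137 KB -> 46 KB ({137 / 46:.1f}x)")
```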
## Usage

### Basic Inference with Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_path = "./models/atomllama-33K-5x5-DigitMesh-sparse-q8"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    dtype="auto",
    device_map="auto",
)

# Example: classify a 5x5 binary digit pattern (digit "0")
pattern = "1 1 1 1 1 1 0 0 0 1 1 0 0 0 1 1 0 0 0 1 1 1 1 1 1"
prompt = f"{pattern} <SEP>"

# Tokenize and generate the prediction
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
inputs.pop("token_type_ids", None)
outputs = model.generate(
    **inputs,
    max_new_tokens=1,
    do_sample=False,
)

# Decode only the newly generated token
prediction = tokenizer.decode(
    outputs[0][len(inputs.input_ids[0]):],
    skip_special_tokens=True,
).strip()
print(f"Predicted digit: {prediction}")  # Expected: "D0"
```
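The prompt is simply the 25 mesh cells, row by row and space separated, followed by `<SEP>`. A small helper makes that explicit (`mesh_to_prompt` is not part of the model; it is just an illustration of the input format):

```python
# Hypothetical helper: flatten a 5x5 binary grid into the prompt format
# used above (25 space-separated bits followed by the <SEP> marker).
def mesh_to_prompt(grid):
    assert len(grid) == 5 and all(len(row) == 5 for row in grid), "expected a 5x5 grid"
    bits = " ".join(str(cell) for row in grid for cell in row)
    return f"{bits} <SEP>"

# The digit "0" pattern from the example above, written as a grid:
zero = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
print(mesh_to_prompt(zero))
# → 1 1 1 1 1 1 0 0 0 1 1 0 0 0 1 1 0 0 0 1 1 1 1 1 1 <SEP>
```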
## Compression Details

### Sparsity
- Type: Unstructured (weights pruned individually based on importance)
- Target Sparsity: 50%
- Method: SparseGPT with Hessian-based importance scoring
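As a rough illustration of 50% unstructured pruning, the sketch below uses plain weight magnitude as the importance score. This is a deliberate simplification: SparseGPT scores weights using Hessian-based importance and compensates the remaining weights, which this sketch does not do.

```python
import numpy as np

# Simplified unstructured pruning: zero out the 50% of weights with the
# smallest magnitude. Only the sparsity pattern is shown here; SparseGPT
# additionally uses Hessian information and updates surviving weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))

threshold = np.median(np.abs(W))   # half the entries fall below the median
W_sparse = np.where(np.abs(W) >= threshold, W, 0.0)

sparsity = (W_sparse == 0).mean()
print(f"sparsity: {sparsity:.0%}")
```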
### Quantization
- Precision: INT8 (8-bit integers)
- Method: Quantization Aware Training (QAT)
- Framework: Axolotl Sparse QAT Integration
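QAT simulates quantization during training ("fake quantization"): weights are rounded to an 8-bit grid in the forward pass so the network learns to tolerate the rounding error. A minimal per-tensor symmetric INT8 round-trip, illustrative only and not the Axolotl implementation:

```python
import numpy as np

def fake_quant_int8(w):
    # Per-tensor symmetric INT8 quantization: map the weight range onto
    # integers in [-127, 127], then dequantize back to floats. QAT runs
    # this in the forward pass so training adapts to the rounding error.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127)
    return q * scale

w = np.array([0.52, -1.30, 0.07, 0.91])
wq = fake_quant_int8(w)
print(np.max(np.abs(w - wq)))  # round-trip error is bounded by scale / 2
```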
## License
Apache-2.0
## Citation

```bibtex
@misc{atomllama-33k-digitMesh-sparse-q8,
  title={AtomLlama-33K-5x5-DigitMesh-Sparse-Q8: A 50% Sparse INT8 Quantized Model for Digit Recognition},
  author={Jun Zhu},
  year={2026},
  howpublished={\url{https://huggingface.co/junzzhu/atomllama-33K-5x5-DigitMesh-sparse-q8}}
}
```