---
language:
- en
license: mit
library_name: pixel
pipeline_tag: text-generation
tags:
- pixel
- causal-lm
- local-llm
- pytorch
---

# pixel_100m

This repository contains a PIXEL checkpoint exported for use with the PIXEL codebase. It includes the model checkpoint, tokenizer files, exported config metadata, and this model card.

## What This Model Is

`pixel_100m` is a decoder-only transformer checkpoint from the PIXEL project. This bundle is intended to be used with the PIXEL runtime rather than the Transformers `AutoModel` API.

## Architecture

- Approximate parameter class: `~76,466,688`
- Vocab size: `1262`
- Context length: `1024`
- Layers: `12`
- Hidden size: `768`
- Attention heads: `12`
- Key/value heads: `4`
- Intermediate size: `2048`
- RoPE base: `500000`
- Uses MoE: `False`

## Included Files

- `latest.pt`: PIXEL checkpoint
- `manifest.json`: exported checkpoint pointer
- `pixel_tokenizer.model`: SentencePiece tokenizer model
- `pixel_tokenizer.vocab`: SentencePiece tokenizer vocab
- `pixel_model_config.json`: exported typed model config
- `pixel_training_config.json`: exported training config when available

## Training Snapshot

- Training size preset: `100m`
- Total steps saved in checkpoint: `10`
- Sequence length: `32`
- Batch size: `1`
- Gradient accumulation: `2`

## Runtime Snapshot

- Device: `cpu`
- GPU count: `0`
- Dtype: `torch.float32`

## Usage With PIXEL

Clone the PIXEL codebase, place or download this bundle, then run:

```bash
python infer.py --model checkpoints/pixel_100m/latest.pt --prompt "Hello from PIXEL"
```

Make sure the checkpoint and tokenizer come from the same export bundle.

## Limitations

- This checkpoint is not guaranteed to be instruction-tuned.
- Output quality depends on the training corpus and training duration used for this run.
- This bundle is PIXEL-specific and is not advertised as a drop-in Transformers checkpoint.

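
## Parameter Count Check

The figures in the Architecture section can be cross-checked with a little arithmetic. The sketch below is not taken from the PIXEL codebase: `PixelConfig` and `estimate_params` are hypothetical names, and the layer layout is an assumption (grouped-query attention with separate Q/K/V/O projections, a gated MLP with three weight matrices, no biases, normalization weights not counted, and input/output embeddings tied). Under those assumptions the total works out to the parameter figure listed in the card, which is suggestive but does not confirm how PIXEL itself counts parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelConfig:
    # Values copied from the Architecture list in this model card.
    vocab_size: int = 1262
    hidden_size: int = 768
    num_layers: int = 12
    num_heads: int = 12
    num_kv_heads: int = 4
    intermediate_size: int = 2048


def estimate_params(cfg: PixelConfig) -> int:
    """Estimate weight-matrix parameters under the assumed layer layout."""
    head_dim = cfg.hidden_size // cfg.num_heads   # 768 / 12 = 64
    kv_dim = head_dim * cfg.num_kv_heads          # 64 * 4 = 256

    # Assumed attention block: Q and O are hidden x hidden, K and V are hidden x kv_dim.
    attn = 2 * cfg.hidden_size * cfg.hidden_size + 2 * cfg.hidden_size * kv_dim

    # Assumed gated MLP: gate, up, and down projections.
    mlp = 3 * cfg.hidden_size * cfg.intermediate_size

    # Assumed tied embeddings: the token embedding is counted once.
    embed = cfg.vocab_size * cfg.hidden_size

    return cfg.num_layers * (attn + mlp) + embed


print(f"{estimate_params(PixelConfig()):,}")  # prints 76,466,688
```

If the real model keeps a separate output projection or counts normalization weights, the total would be higher, so treat this as a consistency check on the card's numbers only.
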
## Export Provenance

- Source checkpoint: `latest.pt`
- Source tokenizer model: `pixel_tokenizer.model`
- Source tokenizer vocab: `pixel_tokenizer.vocab`
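
## Checking the Bundle

Since the checkpoint and tokenizer need to come from the same export bundle, it can help to confirm that the files listed in this card are all present before running inference. The following is a minimal sketch and not part of the PIXEL codebase: `missing_bundle_files` is a hypothetical helper, the directory `checkpoints/pixel_100m` is taken from the usage command in this card, and the split into required and optional files is an assumption based on the card describing the training config as exported "when available".

```python
from pathlib import Path

# File names as listed under "Included Files" in this model card.
REQUIRED_FILES = (
    "latest.pt",
    "manifest.json",
    "pixel_tokenizer.model",
    "pixel_tokenizer.vocab",
    "pixel_model_config.json",
)
# Assumed optional: the card says it is exported only when available.
OPTIONAL_FILES = ("pixel_training_config.json",)


def missing_bundle_files(bundle_dir: str) -> list[str]:
    """Return the required file names that are not present in bundle_dir."""
    root = Path(bundle_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


missing = missing_bundle_files("checkpoints/pixel_100m")
if missing:
    print("Missing bundle files:", ", ".join(missing))
else:
    print("All required bundle files are present.")
```

This only checks that the files exist. It does not verify that they were produced by the same export.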