PKaI Nano 1.1

PKaI Nano 1.1 is a 200M-class base language model from PowderKeg Intelligence. It is a compact LLaMA-style decoder model trained from scratch with a Mistral tokenizer and released as a PKaI-native artifact.

This repository contains PKaI-native weights and metadata. It is not a drop-in transformers.AutoModelForCausalLM package, so the standard from_pretrained loading path is not expected to work on it as-is.

📖 For the full writeup, benchmarks, and evaluation details, see the announcement post: Introducing PKaI Nano 1.1: A Stronger From-Scratch Model.

Files

model.safetensors: PKaI base model weights.
config.json: PKaI model architecture configuration.
tokenizer.json: PKaI tokenizer metadata.
tokenizer.model: SentencePiece tokenizer model from mistralai/Mistral-7B-v0.1.
THIRD_PARTY_NOTICES.txt: tokenizer and training-data provenance notices.
LICENSE: Apache License, Version 2.0.
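
Because the weights are not packaged for transformers, one way to get oriented is to read the safetensors header directly. The sketch below uses only the Python standard library and relies on the public safetensors file layout (an 8-byte little-endian header length followed by a JSON header). The helper names read_safetensors_header and count_parameters are hypothetical and are not part of any PKaI tooling. Whether the total it prints matches the parameter count reported in this card depends on how the tied embedding is stored, which is an assumption to check rather than a guarantee.

```python
import json
import struct
from math import prod
from pathlib import Path


def read_safetensors_header(path):
    """Return the tensor table of a safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        # The first 8 bytes are an unsigned little-endian 64-bit header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def count_parameters(header):
    """Sum the element counts of every tensor listed in the header."""
    return sum(prod(entry["shape"]) for entry in header.values())


# Usage: runs only if the weights file is present in the working directory.
weights = Path("model.safetensors")
if weights.exists():
    table = read_safetensors_header(weights)
    for name, entry in sorted(table.items()):
        print(name, entry["dtype"], entry["shape"])
    print("total parameters:", count_parameters(table))
```

The tensor names printed this way are whatever the PKaI export uses; mapping them onto another framework's layout is left to the reader.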

Architecture

Parameters: 193,844,096
Vocabulary size: 32,000
Context length: 1,024 tokens
Layers: 20
Attention heads: 14
KV heads: 2
Embedding size: 896
Tied embeddings: yes
QK normalization: yes
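
A few quantities follow from the figures above by simple arithmetic. The sketch below works them out under two assumptions that this card does not state: that the per-head dimension equals the embedding size divided by the number of attention heads (the usual LLaMA-style layout), and that the KV cache is held in 32-bit floats, the tensor type listed for the safetensors file. A runtime that caches in a different dtype will see different byte counts.

```python
# Values copied from the Architecture list.
N_LAYERS = 20
N_HEADS = 14
N_KV_HEADS = 2
D_MODEL = 896
CONTEXT = 1024
N_PARAMS = 193_844_096
BYTES_PER_F32 = 4

# Assumption: head_dim = embedding size / attention heads.
head_dim = D_MODEL // N_HEADS

# Grouped-query attention: query heads that share one KV head.
queries_per_kv_head = N_HEADS // N_KV_HEADS

# KV cache: one key and one value vector per KV head, per layer, per token.
kv_elems_per_token = 2 * N_LAYERS * N_KV_HEADS * head_dim

# Assumption: the cache is stored as F32.
kv_bytes_full_context = kv_elems_per_token * CONTEXT * BYTES_PER_F32

# Size of the weights at 4 bytes per parameter.
weight_bytes = N_PARAMS * BYTES_PER_F32

print("head_dim:", head_dim)
print("queries per KV head:", queries_per_kv_head)
print("KV cache elements per token:", kv_elems_per_token)
print("KV cache bytes at full context:", kv_bytes_full_context)
print("weight bytes (F32):", weight_bytes)
```

Under these assumptions the head dimension is 64, seven query heads share each KV head, a full 1,024-token KV cache takes 20 MiB per sequence, and the F32 weights occupy about 775 MB.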

Training Data

Publicly disclosed training data sources include the following, each processed with best-effort in-house decontamination and deduplication by PowderKeg Intelligence prior to training:

HuggingFaceFW/fineweb, released under the Open Data Commons Attribution License (ODC-By) v1.0 and subject to the Common Crawl Terms of Use as noted by the dataset card.
HuggingFaceFW/fineweb-edu, released under the Open Data Commons Attribution License (ODC-By) v1.0 and subject to the Common Crawl Terms of Use as noted by the dataset card.
HuggingFaceTB/smollm-corpus (cosmopedia-v2 subset), released under the Open Data Commons Attribution License (ODC-By) v1.0. Cosmopedia v2 is synthetic text generated with mistralai/Mixtral-8x7B-Instruct-v0.1.

See THIRD_PARTY_NOTICES.txt for source URLs, citations, and attribution notes.

License

The PKaI Nano 1.1 model artifact is released under the Apache License, Version 2.0. The bundled tokenizer and training-data sources have their own provenance and notices listed in THIRD_PARTY_NOTICES.txt.

Limitations

This is a small base model and has not been instruction-tuned or safety-tuned. It may produce inaccurate, unsafe, biased, or otherwise unsuitable text. Users are responsible for evaluating fitness, safety, and legal compliance for their own use cases.

Safetensors

Model size: 0.2B params
Tensor type: F32