# ART-GPT-16L-1024D

A GPT-2-style language model trained with Attractor-Regularized Training (ART).
## Model Details
| Property | Value |
|---|---|
| Parameters | 305,335,296 (305.3M) |
| Layers | 16 |
| Embedding Dim | 1024 |
| Attention Heads | 16 |
| Context Length | 1024 |
| Vocab Size | 50257 |
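
These hyperparameters correspond to a standard GPT-2 architecture. As a rough sketch only (the checkpoint's own `config.json` is authoritative), the table above maps onto a Hugging Face `GPT2Config` like this:

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Illustrative reconstruction of the architecture from the table above;
# load the released checkpoint for the actual trained weights and config.
config = GPT2Config(
    vocab_size=50257,   # Vocab Size
    n_positions=1024,   # Context Length
    n_embd=1024,        # Embedding Dim
    n_layer=16,         # Layers
    n_head=16,          # Attention Heads
)
model = GPT2LMHeadModel(config)  # randomly initialized model with these dimensions
```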
## Training
- Dataset: OpenWebText
- Training Steps: 100,000
- Validation Loss: 2.9706
- Validation Perplexity: 19.5
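
(Perplexity is the exponential of the validation loss: exp(2.9706) ≈ 19.5.)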
## ART (Attractor-Regularized Training)
This model was trained with ART, which enforces empirically discovered conservation laws as soft constraints during training. These constraints are intended to guide the model toward weight configurations characterized by mathematical constants.
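
The ART code itself lives in the repository linked under Citation. Purely as an illustration of the soft-constraint idea (not the actual ART objective), a conservation-law penalty can be added to the usual language-modeling loss; the names `target_constant` and `lambda_art` and the weight statistic below are hypothetical:

```python
def art_regularized_loss(lm_loss, model, target_constant=1.618, lambda_art=1e-3):
    """Hypothetical sketch: add a soft conservation-law penalty to the LM loss.

    `target_constant`, `lambda_art`, and the weight statistic are illustrative
    choices, not values taken from the ART implementation.
    """
    penalty = 0.0
    for block in model.transformer.h:  # GPT-2 transformer blocks
        # Example per-layer statistic: ratio of attention to MLP weight norms
        stat = block.attn.c_attn.weight.norm() / block.mlp.c_fc.weight.norm()
        penalty = penalty + (stat - target_constant) ** 2
    return lm_loss + lambda_art * penalty
```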
## Usage
```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the ART-trained checkpoint and the standard GPT-2 tokenizer
model = GPT2LMHeadModel.from_pretrained("your-username/art-gpt-100k")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Generate a continuation for a short prompt
text = "The meaning of life is"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=50, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Citation
```bibtex
@misc{art2026,
  author = {Knopp, Christian},
  title  = {Attractor-Regularized Training for Neural Networks},
  year   = {2026},
  url    = {https://github.com/conceptual1/ART}
}
```
## License
MIT