---
base_model: null
model_creator: Arko007
model_version: v1.0
model_date: 2025-10
model_card_authors:
- Arko007
---

💠 Zenyx-42M: Where Calm Meets Power

Zenyx-42M is a 42M-parameter, GPT-2 style decoder-only transformer trained from scratch on high-quality educational web text.

The name "Zenyx" fuses Zen (calm, focused intelligence) and Onyx (strength, power), embodying an efficient, capable language model.


Model Details

| Component | Value |
|---|---|
| Architecture | GPT-2 style decoder-only transformer |
| Parameters | ~42M (41.87M) |
| Layers | 8 |
| Hidden Size | 512 |
| Attention Heads | 8 |
| Context Length | 512 tokens |
| Vocabulary Size | 32,000 (BPE) |
| Positional Encoding | Learned embeddings |
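
Assuming a standard GPT-2 layout with biases and the output head tied to the token embeddings, the ~41.87M figure can be checked with back-of-the-envelope arithmetic (a sketch; the exact breakdown depends on the actual NanoGPT implementation):

```python
# Rough parameter count for the architecture in the table above
# (assumes GPT-2-style blocks with biases and a tied lm_head).
vocab, d, n_layer, ctx = 32_000, 512, 8, 512

tok_emb = vocab * d                       # token embeddings (shared with lm_head)
pos_emb = ctx * d                         # learned positional embeddings
attn = 4 * d * d + 4 * d                  # q/k/v/out projections + biases
mlp = 2 * 4 * d * d + 4 * d + d           # fc (d -> 4d) and proj (4d -> d) + biases
norms = 2 * 2 * d                         # two LayerNorms (weight + bias each)
per_layer = attn + mlp + norms
final_norm = 2 * d                        # final LayerNorm

total = tok_emb + pos_emb + n_layer * per_layer + final_norm
print(f"{total / 1e6:.2f}M parameters")   # ~41.87M
```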

Training Configuration

| Setting | Value |
|---|---|
| Training Data | FineWeb-Edu (streamed, deduplicated, filtered) |
| Training Mode | From scratch (no pretrained weights) |
| Optimizer | AdamW (lr=3e-4, wd=0.1) |
| LR Schedule | Warmup (2k steps) + cosine decay |
| Batch Size | 16 per device, 8 gradient accumulation steps |
| Effective Batch | 128 |
| Precision | BFloat16 mixed precision |
| Hardware | NVIDIA L4 (24 GB VRAM) |
| Training Time | ~6 hours |
| Iterations | 100,000 |
| Tokens Processed | ~6.55B |
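
The effective batch and token counts in the table follow from simple arithmetic (a consistency check on the configuration, not taken from the training logs):

```python
per_device, grad_accum, ctx, iters = 16, 8, 512, 100_000

effective_batch = per_device * grad_accum   # 128 sequences per optimizer step
tokens_per_step = effective_batch * ctx     # 65,536 tokens per step
total_tokens = tokens_per_step * iters      # ~6.55B tokens over the full run
print(f"effective batch: {effective_batch}, total tokens: {total_tokens / 1e9:.2f}B")
```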

Performance

| Model | Params | Val Loss | Zenyx Advantage |
|---|---|---|---|
| Zenyx-42M | 42M | 3.08 | baseline |
| GPT-1 | 117M | ~3.3 | +22% better (lower loss with ~2.8× fewer params) |
| DistilGPT-2 | 82M | ~3.1 | nearly tied at roughly half the size |
| GPT-2 Small | 124M | ~2.8 | ~3× fewer parameters for a modest loss gap |
  • Training Loss: 0.398
  • Validation Loss: 3.08
  • Tokens Processed: ~6.55B
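
Cross-entropy losses are easier to compare as perplexities (the exponential of the loss). A quick conversion, assuming the losses above are in nats per token:

```python
import math

val_loss = 3.08
perplexity = math.exp(val_loss)
print(f"perplexity ≈ {perplexity:.1f}")      # ≈ 21.8

# The gap to GPT-1 (~3.3) is ~0.22 nats, i.e. GPT-1's perplexity is
# exp(0.22) ≈ 1.25x higher despite having ~2.8x more parameters.
gap = math.exp(3.3 - 3.08)
print(f"GPT-1 perplexity ratio ≈ {gap:.2f}x")
```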

Capabilities

• ✅ Text completion
• ✅ Topic consistency
• ✅ Grammar & syntax
• ✅ Technical vocabulary
• ⚠️ Base model (fine-tuning required for instruction following)
• ⚠️ 512-token context window

Usage

Install dependencies

```bash
pip install torch transformers tokenizers
```

Quick Start Example

```python
import torch
from model import NanoGPT
from config import NanoGPTConfig
from tokenizers import Tokenizer

# Load the checkpoint (config + weights) on CPU
checkpoint = torch.load('best_model.pt', map_location='cpu', weights_only=False)
config = checkpoint['config']
model = NanoGPT(config)
model.load_state_dict(checkpoint['model'])
model.eval()

# Load the BPE tokenizer
tokenizer = Tokenizer.from_file('tokenizer.json')

def generate(prompt, max_tokens=100, temperature=0.6):
    tokens = tokenizer.encode(prompt).ids
    x = torch.tensor(tokens).unsqueeze(0)  # add batch dimension
    with torch.no_grad():
        output = model.generate(x, max_new_tokens=max_tokens,
                                temperature=temperature, top_k=40)
    return tokenizer.decode(output[0].tolist())  # drop batch dim before decoding

prompt = "Artificial intelligence is"
print(generate(prompt))
```
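
For intuition, `top_k=40` with `temperature=0.6` means each step samples from the 40 highest logits after sharpening the distribution. A minimal standalone sketch of that sampling step (illustrative only, not the model's own implementation):

```python
import math
import random

def sample_top_k(logits, k=40, temperature=0.6):
    """Illustrative top-k sampling: keep the k highest logits,
    apply temperature, softmax, then draw one token id."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in top]
    m = max(scaled)                              # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = random.random()
    acc = 0.0
    for tok, p in zip(top, probs):               # inverse-CDF draw
        acc += p
        if r <= acc:
            return tok
    return top[-1]

# With k=1 this degenerates to greedy decoding: the argmax always wins.
print(sample_top_k([0.0] * 10 + [5.0], k=1))
```

Lower temperatures concentrate probability on the top few tokens; higher values flatten the distribution and increase diversity.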

Limitations

• ⚠️ Base model only (no instruction fine-tuning)
• ⚠️ Can repeat phrases (mitigate with a repetition penalty)
• ⚠️ 512-token context window
• ⚠️ Small parameter count limits world knowledge
• ⚠️ May produce factual errors
• ⚠️ Not suitable for reliable code generation
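
The repetition issue can be mitigated with a CTRL-style repetition penalty applied to the logits before sampling. A hedged sketch (`apply_repetition_penalty` is illustrative, not part of the released code):

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Down-weight tokens that already appear in the generated sequence.
    Positive logits are divided by the penalty, negative ones multiplied,
    so repeated tokens become less likely in both cases."""
    adjusted = list(logits)
    for tok in set(generated_ids):
        if adjusted[tok] > 0:
            adjusted[tok] /= penalty
        else:
            adjusted[tok] *= penalty
    return adjusted
```

Typical penalty values are 1.1 to 1.3; larger values suppress repetition more aggressively at the cost of fluency.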

Not recommended for:

• ❌ Production applications (without fine-tuning)
• ❌ Factual question answering
• ❌ Instruction following
• ❌ Mission-critical tasks

Fine-Tuning

Designed as a base model for further fine-tuning.

Next Steps:

  1. Instruction fine-tuning (Alpaca/Dolly datasets)
  2. Code specialization (GitHub/StackOverflow)
  3. Domain adaptation (medical, legal, etc.)
  4. RLHF (human feedback alignment)
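
As an example of step 1, instruction datasets are typically flattened into a single prompt/response string before tokenization. A sketch using the common Alpaca-style template (an assumed convention, not part of this repository; adapt to your dataset):

```python
def format_alpaca_example(instruction, response, input_text=""):
    """Render one instruction-tuning example as a single training string
    using the Alpaca-style prompt template."""
    if input_text:
        prompt = (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n### Response:\n"
        )
    else:
        prompt = (
            "Below is an instruction that describes a task. Write a response "
            "that appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n### Response:\n"
        )
    return prompt + response

example = format_alpaca_example("Name a prime number.", "7")
print(example)
```

During fine-tuning the loss is often masked to the response tokens only, so the model learns to answer rather than to reproduce the prompt.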

Ethical Considerations

  • Training data may contain bias
  • Intended for educational/experimental use
  • Not suitable for high-stakes decisions
  • Fully open source and documented
  • No explicit safety fine-tuning

Citation

```bibtex
@misc{zenyx_42m_2025,
  author = {Arko007},
  title = {Zenyx-42M: Efficient Language Model Trained From Scratch},
  year = {2025},
  month = {October},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Arko007/Zenyx-42M}
}
```

Model Card Authors

• Arko007

License

Apache License 2.0
See LICENSE file for details.


Acknowledgments

  • HuggingFace FineWeb-Edu team (dataset)
  • GPT-2 (OpenAI, architecture inspiration)
  • PyTorch (framework)
  • NVIDIA L4 GPU (compute)

Version History

  • v1.0 (Oct 2025): Initial release β€” 42M params, 100k iterations, val loss 3.08

💠 Branding

ZENYX

• Symbol: 💠 diamond/geometric
• Colors: Deep purple (#5B21B6) + Cyan (#06B6D4)
• Font: Modern, clean, geometric

Taglines:

• "Where Calm Meets Power" 💪
• "Efficient Intelligence, Powerful Results" ⚡
• "Small Model, Big Impact" 💎
• "Zen Precision, Onyx Strength" 🔷

Brand Personality:

• Calm & efficient
• Powerful
• Modern
• Balanced (Zen meets power)

Built with 💠 using PyTorch and an L4 GPU
