---
license: apache-2.0
library_name: pytorch
pipeline_tag: text-generation
tags:
  - text-generation
  - pytorch
  - gpt
  - zenyx
  - transformer
  - from-scratch
  - causal-lm
  - custom-architecture
language:
  - en
datasets:
  - HuggingFaceFW/fineweb-edu
base_model: null
model_creator: Arko007
model_version: v1.0
model_date: 2025-10
model_card_authors:
  - Arko007
---

# 💠 Zenyx-42M: Where Calm Meets Power

Zenyx-42M is a 42M parameter GPT-2 style decoder-only transformer trained from scratch on high-quality educational web text.

The name "Zenyx" fuses Zen (calm, focused intelligence) and Onyx (strength, power), embodying an efficient, capable language model.


## Model Details

| Component | Value |
|---|---|
| Architecture | GPT-2 style decoder-only transformer |
| Parameters | ~42M (41.87M) |
| Layers | 8 |
| Hidden Size | 512 |
| Attention Heads | 8 |
| Context Length | 512 tokens |
| Vocabulary Size | 32,000 (BPE) |
| Positional Encoding | Learned embeddings |
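The parameter count can be sanity-checked from the hyperparameters above. A rough estimate, assuming tied input/output embeddings and ignoring biases and layer norms (neither detail is stated on this card):

```python
# Back-of-the-envelope parameter count for Zenyx-42M's config
vocab, d, layers, ctx = 32000, 512, 8, 512

emb = vocab * d          # token embeddings (tied with the LM head)
pos = ctx * d            # learned positional embeddings
per_layer = 12 * d * d   # 4*d^2 for attention (QKV + output) + 8*d^2 for the MLP

total = emb + pos + layers * per_layer
print(f"{total / 1e6:.2f}M")  # ≈ 41.81M, close to the reported 41.87M
```

The small remainder is accounted for by layer norms and biases.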

## Training Configuration

| Setting | Value |
|---|---|
| Training Data | FineWeb-Edu (streamed, deduped, filtered) |
| Training Mode | Scratch (no pretrain) |
| Optimizer | AdamW (lr=3e-4, wd=0.1) |
| LR Schedule | Warmup (2k) + Cosine decay |
| Batch Size | 16 per device, 8 grad. accumulation |
| Effective Batch | 128 |
| Precision | BFloat16 mixed precision |
| Hardware | NVIDIA L4 (24GB VRAM) |
| Training Time | ~6 hours |
| Iterations | 100,000 |
| Tokens Processed | ~6.55B |
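The warmup-plus-cosine schedule can be sketched as a plain function of the step number. This is illustrative only; the `min_lr` floor and decaying all the way to iteration 100k are assumptions, not details from this card:

```python
import math

def lr_at(step, max_lr=3e-4, warmup=2000, total=100000, min_lr=3e-5):
    """Linear warmup for the first `warmup` steps, then cosine decay to `min_lr`."""
    if step < warmup:
        return max_lr * (step + 1) / warmup          # linear ramp from ~0 to max_lr
    progress = (step - warmup) / (total - warmup)    # 0 → 1 over the decay phase
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

In practice this would be wrapped in a `torch.optim.lr_scheduler.LambdaLR` around the AdamW optimizer.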

## Performance

| Model | Params | Val Loss | Zenyx Advantage |
|---|---|---|---|
| Zenyx-42M | 42M | 3.08 | Baseline |
| GPT-1 | 117M | ~3.3 | +22% better with ~2.8x fewer params |
| DistilGPT-2 | 82M | ~3.1 | Nearly tied at half the size |
| GPT-2 Small | 124M | ~2.8 | ~3x fewer params for modestly higher loss |

- Training Loss: 0.398
- Validation Loss: 3.08
- Tokens Processed: ~6.55B
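Validation cross-entropy converts directly to perplexity, which some readers may find easier to compare across models:

```python
import math

val_loss = 3.08  # reported validation cross-entropy (nats per token)
perplexity = math.exp(val_loss)
print(round(perplexity, 1))  # ≈ 21.8
```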

## Capabilities

- ✅ Text completion
- ✅ Topic consistency
- ✅ Grammar & syntax
- ✅ Technical vocabulary
- ⚠️ Base model (fine-tuning required for instruction following)
- ⚠️ 512-token context window

## Usage

### Install dependencies

```bash
pip install torch transformers tokenizers
```

### Quick Start Example

```python
import torch
from model import NanoGPT
from config import NanoGPTConfig
from tokenizers import Tokenizer

# Load model
checkpoint = torch.load('best_model.pt', map_location='cpu', weights_only=False)
config = checkpoint['config']
model = NanoGPT(config)
model.load_state_dict(checkpoint['model'])
model.eval()

# Load tokenizer
tokenizer = Tokenizer.from_file('tokenizer.json')

def generate(prompt, max_tokens=100, temperature=0.6):
    tokens = tokenizer.encode(prompt).ids
    x = torch.tensor(tokens, dtype=torch.long).unsqueeze(0)  # (1, seq_len)
    with torch.no_grad():
        output = model.generate(x, max_new_tokens=max_tokens, temperature=temperature, top_k=40)
    return tokenizer.decode(output[0].tolist())  # drop the batch dimension before decoding

prompt = "Artificial intelligence is"
print(generate(prompt))
```
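For reference, the top-k/temperature sampling that `generate` is asked for (`top_k=40`, `temperature=0.6`) works roughly like this pure-Python sketch. The real `model.generate` operates on tensors inside the model; this is illustrative only:

```python
import math
import random

def sample_top_k(logits, k=40, temperature=0.6, rng=random.Random(0)):
    """Sample one token id: keep the k highest logits, softmax at the given temperature."""
    scaled = [(logit / temperature, i) for i, logit in enumerate(logits)]
    top = sorted(scaled, reverse=True)[:k]               # top-k candidates
    m = max(l for l, _ in top)                           # subtract max for numerical stability
    weights = [math.exp(l - m) for l, _ in top]          # unnormalized softmax
    ids = [i for _, i in top]
    return rng.choices(ids, weights=weights)[0]
```

Lower temperatures sharpen the distribution toward the highest-logit token; `top_k` caps how many candidates can be sampled at all.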

## Limitations

- ⚠️ Base model only (no instruction fine-tuning)
- ⚠️ Can repeat phrases (use a repetition penalty when sampling)
- ⚠️ 512-token context window
- ⚠️ Small parameter count limits knowledge
- ⚠️ Possible factual errors
- ⚠️ Not suitable for reliable code generation

Not recommended for:

- ❌ Production applications (without fine-tuning)
- ❌ Factual QA
- ❌ Instruction following
- ❌ Mission-critical tasks

## Fine-Tuning

Zenyx-42M is designed as a base model for further fine-tuning.

Next steps:

1. Instruction fine-tuning (Alpaca/Dolly datasets)
2. Code specialization (GitHub/StackOverflow)
3. Domain adaptation (medical/legal/etc.)
4. RLHF (human-feedback alignment)
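For step 1, instruction-tuning data is usually formatted so the loss is masked over the prompt and computed only on the response. A minimal sketch, assuming an Alpaca-style template; the toy whitespace tokenizer and `EOS_ID` stand in for the model's real 32k BPE tokenizer:

```python
_vocab = {}

def tokenize(text):
    # Toy whitespace tokenizer standing in for the model's 32k BPE tokenizer.
    return [_vocab.setdefault(w, len(_vocab)) for w in text.split()]

EOS_ID = 2  # assumed end-of-sequence id (check tokenizer.json for the real one)

def build_example(instruction, response):
    """One training example with labels masked (-100) over the prompt.

    Cross-entropy losses ignore -100 labels, so gradients flow only
    through the response tokens.
    """
    prompt_ids = tokenize(f"### Instruction: {instruction} ### Response:")
    response_ids = tokenize(response) + [EOS_ID]
    input_ids = prompt_ids + response_ids
    labels = [-100] * len(prompt_ids) + response_ids
    return input_ids, labels
```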

## Ethical Considerations

- Training data may contain bias
- Intended for educational/experimental use
- Not suitable for high-stakes decisions
- Fully open source and documented
- No explicit safety fine-tuning

## Citation

```bibtex
@misc{zenyx_42m_2025,
  author = {Arko007},
  title = {Zenyx-42M: Efficient Language Model Trained From Scratch},
  year = {2025},
  month = {October},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Arko007/Zenyx-42M}
}
```

## Model Card Authors

- Arko007
## License

Apache License 2.0. See the LICENSE file for details.


## Acknowledgments

- HuggingFace FineWeb-Edu team (dataset)
- GPT-2 (OpenAI, architecture inspiration)
- PyTorch (framework)
- NVIDIA L4 GPU (compute)

## Version History

- v1.0 (Oct 2025): Initial release: 42M params, 100k iterations, val loss 3.08

## 💠 Branding

**ZENYX**

- Symbol: 💠 diamond/geometric
- Colors: Deep purple (#5B21B6) + Cyan (#06B6D4)
- Font: Modern, clean, geometric

Taglines:

- "Where Calm Meets Power" 💪
- "Efficient Intelligence, Powerful Results" ⚡
- "Small Model, Big Impact" 💎
- "Zen Precision, Onyx Strength" 🔷

Brand personality:

- Calm & efficient
- Powerful
- Modern
- Balanced (Zen meets power)

Built with 💠 using PyTorch and an NVIDIA L4 GPU