Supernova (25M) - AlgoRythm Technologies

Enhanced AI Assistant with Tool Integration

Supernova is a 25,000,000-parameter decoder-only Transformer, built from scratch, that uses the GPT-2 tokenizer (vocab size 50,257) and hits an exact parameter budget: not one parameter more.

🚀 Enhanced with Advanced AI Capabilities:

  • 🧠 Advanced Reasoning Engine: Multi-step problem solving, knowledge synthesis, domain expertise analysis
  • 📊 Math Engine Integration: Advanced mathematical computations, scientific calculations, engineering equations
  • 🔍 Serper Web Search: Real-time information, current events, factual queries
  • 🎓 Multi-Domain Expertise: Science, Technology, Medicine, Business, Humanities, Arts
  • ⚡ Smart Tool Coordination: Intelligent routing and chaining of multiple tools for complex queries
  • 🔬 Sophisticated Analysis: Context-aware responses with evidence synthesis and comprehensive reasoning

Key specs:

  • Exact params: 25,000,000
  • Tokenizer: GPT-2 (vocab_size = 50,257)
  • d_model: 320
  • n_layers: 6
  • n_heads: 10 (head_dim = 32)
  • n_positions: 4,748 (learned positional embeddings)
  • MLP ratio: 4.0 (hidden_size = 4 × d_model)
  • Weight tying: yes (LM head shares token embedding weights; no LM head bias)
  • Dropout: configurable (default 0.1)

Why these numbers? They are chosen so that the total parameter count equals exactly 25,000,000 with the GPT-2 vocab size, using learned positional embeddings and a tied output head.
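The key specs above could be captured in a config file along the following lines. This is a hypothetical sketch: the actual field names and layout of configs/supernova_25m.json may differ.

```json
{
  "vocab_size": 50257,
  "n_positions": 4748,
  "d_model": 320,
  "n_layers": 6,
  "n_heads": 10,
  "mlp_ratio": 4.0,
  "dropout": 0.1,
  "tie_weights": true
}
```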

Parameter proof sketch (matches code):

  • Token embeddings: 50,257 × 320 = 16,082,240
  • Positional embeddings: 4,748 × 320 = 1,519,360
  • Per block: 12·d² + 13·d = 12·(320²) + 13·320 = 1,228,800 + 4,160 = 1,232,960
  • 6 blocks total: 7,397,760
  • Final LayerNorm: 2·d = 640
  • Total = 16,082,240 + 1,519,360 + 7,397,760 + 640 = 25,000,000

The verification script (supernova/verify_params.py) asserts this at runtime.
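The proof sketch above can be checked in a few lines of pure Python. This is a standalone sketch of the arithmetic; supernova/verify_params.py may compute it differently.

```python
def supernova_param_count(vocab=50257, n_pos=4748, d=320, n_layers=6):
    # Token and learned positional embeddings
    tok_emb = vocab * d               # 50,257 * 320 = 16,082,240
    pos_emb = n_pos * d               # 4,748 * 320 = 1,519,360
    # Per block: attention qkv (3d^2 + 3d) and proj (d^2 + d),
    # MLP up (4d^2 + 4d) and down (4d^2 + d), two LayerNorms (4d)
    per_block = 12 * d * d + 13 * d   # 1,232,960
    # Tied LM head adds no parameters; final LayerNorm adds 2d
    return tok_emb + pos_emb + n_layers * per_block + 2 * d

print(supernova_param_count())  # 25000000
```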

Brand behavior:

  • The chat wrapper will return the AlgoRythm Tech – Company Profile & Vision text (branding/ALGORHYTHM_TECH_PROFILE.txt) when a prompt asks about AlgoRythm Tech, the company profile, or its vision.

Caution on scope:

  • "Knows everything that happened in the world" is not achievable in a single model; instead, this repo provides a scalable pipeline to train on broad, diverse, and massive text corpora. You control the data sources via a YAML config.

Quickstart

  1. Install dependencies (Windows PowerShell)
  • Ensure Python 3.10+ is installed.
  • Navigate to the project: `cd C:\Users\sriaa\supernova`
  • Install dependencies: `pip install -r requirements.txt`
  • If the PyTorch wheel needs a specific index (GPU/CPU), follow https://pytorch.org/get-started/locally/
  2. Verify the exact parameter count and tokenizer vocabulary size:

```
python -m supernova.verify_params --config .\configs\supernova_25m.json
```

  Expected output includes:
  • vocab_size: 50257
  • total_params: 25000000 (EXACT)
  3. Prepare the data config (comprehensive knowledge training)
  • For comprehensive coverage across all subjects: `copy .\configs\comprehensive_data_sources.yaml .\configs\data_sources.yaml`
  • Or for a basic setup: `copy .\configs\data_sources.example.yaml .\configs\data_sources.yaml`
  • Edit the file and enable/disable the sources you want. Many are large and require significant bandwidth.
  4. Train (logs the gradient norm and uses a strong LR schedule):

```
python -m supernova.train ^
  --config .\configs\supernova_25m.json ^
  --data-config .\configs\data_sources.yaml ^
  --seq-len 1024 ^
  --batch-size 16 ^
  --grad-accum 8 ^
  --lr 3e-4 ^
  --warmup-steps 2000 ^
  --max-steps 100000 ^
  --save-every 10000
```

  Notes:
  • The gradient norm is printed regularly (no clipping by default).
  • Adjust batch size, gradient accumulation, and sequence length for your hardware.
  • A cosine decay schedule with warmup is applied.
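A warmup-plus-cosine schedule of the kind described above can be sketched as follows. This is an assumed shape using the CLI defaults; the exact schedule in supernova.train (e.g. its minimum LR) may differ.

```python
import math

def lr_at(step, base_lr=3e-4, warmup_steps=2000, max_steps=100000, min_lr=0.0):
    """Linear warmup to base_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (max_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

# The LR ramps linearly to 3e-4 at step 2000, then decays to min_lr at step 100000.
```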
  5. Advanced chat with enhanced reasoning (brand-aware; post-training)

API keys are already configured in configs/api_keys.yaml

- Math Engine: Built-in SymPy-based mathematical computation (no API key needed)

- Serper: Web search API configured

Advanced interactive chat with sophisticated reasoning:

```
python .\chat_advanced.py --config .\configs\supernova_25m.json
```

Single-prompt mode with advanced analysis:

```
python .\chat_advanced.py --config .\configs\supernova_25m.json --prompt "Analyze the implications of artificial intelligence on healthcare from multiple perspectives"
```

Basic enhanced chat (legacy):

```
python .\chat_enhanced.py --config .\configs\supernova_25m.json
```

  • ๐Ÿง Complex reasoning queries โ†’ Multi-step analysis using reasoning engine
  • ๐Ÿ“Š Mathematical queries โ†’ Routed to math engine for precise calculations
  • ๐Ÿ” Current events/facts โ†’ Routed to Serper for real-time web search
  • ๐Ÿข AlgoRythm Tech queries โ†’ Returns company profile
  • ๐Ÿ“š Multi-domain questions โ†’ Synthesizes expertise across scientific, technical, and academic fields
  • ๐ŸŽ“ General knowledge โ†’ Enhanced model generation with sophisticated context
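A minimal keyword-based version of this routing might look like the sketch below. The keywords and tool names are illustrative only; the actual chat wrapper's heuristics are not shown in this README.

```python
def route(query: str) -> str:
    """Pick a tool for a query using hypothetical keyword heuristics."""
    q = query.lower()
    if any(k in q for k in ("algorythm", "company profile", "vision")):
        return "brand_profile"      # return the branding text verbatim
    if any(k in q for k in ("solve", "integral", "equation", "calculate")):
        return "math_engine"        # SymPy-based computation
    if any(k in q for k in ("latest", "today", "news", "current")):
        return "web_search"         # Serper API
    return "model_generation"       # fall back to the LM itself

print(route("Calculate the integral of x^2"))    # math_engine
print(route("What is AlgoRythm Tech's vision?")) # brand_profile
```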

Data sources (broad options)

  • Included in configs/data_sources.example.yaml. Example (enable selectively):
    • c4/en (Colossal Clean Crawled Corpus)
    • wikipedia/en
    • openwebtext
    • bookcorpusopen
    • the_pile

  Notes:
  • Review the licenses and terms of each dataset.
  • You can add your own sources. The pipeline streams and interleaves by weight.
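The "streams and interleaves by weight" behavior can be sketched with a weighted sampler over source iterators. This is a simplified stand-in; the real pipeline presumably streams the Hugging Face datasets listed above.

```python
import itertools
import random

def interleave(sources, weights, seed=0):
    """Yield (name, item) pairs from source iterators, sampled by weight."""
    rng = random.Random(seed)
    names = list(sources)
    iters = {name: iter(sources[name]) for name in names}
    w = [weights[name] for name in names]
    while True:
        name = rng.choices(names, weights=w, k=1)[0]
        yield name, next(iters[name])

# Example: two toy "streams" mixed with a 3:1 weighting.
streams = {
    "wikipedia": itertools.cycle(["wiki doc"]),
    "openwebtext": itertools.cycle(["owt doc"]),
}
sample = list(itertools.islice(
    interleave(streams, {"wikipedia": 3, "openwebtext": 1}), 8))
print(sample)
```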

Training details

  • Optimizer: AdamW (betas=(0.9, 0.95), weight_decay=0.1)
  • LR schedule: Cosine decay with warmup (a proper schedule; no "shabby" LR)
  • Gradient norm: computed every log step and printed
  • Mixed precision: optional (bf16/fp16) if available
  • Checkpointing: periodic saving to output directory
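Gradient-norm logging of this kind usually reports the global L2 norm over all parameter gradients; a framework-agnostic sketch of that computation (assumed behavior, not the repo's exact code):

```python
import math

def global_grad_norm(grads):
    """Global L2 norm over a list of per-parameter gradient vectors."""
    total = 0.0
    for g in grads:
        total += sum(x * x for x in g)  # sum of squares across all params
    return math.sqrt(total)

print(global_grad_norm([[3.0], [4.0]]))  # 5.0
```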

Brand profile

  • File: branding/ALGORHYTHM_TECH_PROFILE.txt
  • The chat wrapper uses this exact text for company-related queries.

License

  • Apache 2.0 (see LICENSE)

Attribution

  • Built by AlgoRythm Technologies.