Supernova (25M) - AlgoRythm Technologies
Enhanced AI Assistant with Tool Integration
Supernova is a 25,000,000-parameter decoder-only Transformer, built from scratch, using the GPT-2 tokenizer (vocab size 50,257) and hitting its parameter budget exactly, not exceeding it by even one parameter.
Enhanced with advanced AI capabilities:
- Advanced Reasoning Engine: multi-step problem solving, knowledge synthesis, domain-expertise analysis
- Math Engine Integration: advanced mathematical computations, scientific calculations, engineering equations
- Serper Web Search: real-time information, current events, factual queries
- Multi-Domain Expertise: science, technology, medicine, business, humanities, arts
- Smart Tool Coordination: intelligent routing and chaining of multiple tools for complex queries
- Sophisticated Analysis: context-aware responses with evidence synthesis and comprehensive reasoning
Key specs:
- Exact params: 25,000,000
- Tokenizer: GPT-2 (vocab_size = 50,257)
- d_model: 320
- n_layers: 6
- n_heads: 10 (head_dim = 32)
- n_positions: 4,748 (learned positional embeddings)
- MLP ratio: 4.0 (hidden_size = 4 × d_model)
- Weight tying: yes (LM head shares token embedding weights; no LM head bias)
- Dropout: configurable (default 0.1)
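The weight tying noted above can be sketched in PyTorch. This is a minimal illustration with hypothetical names (`TinyLM`, `tok_emb`), not the repo's actual class: the LM head reuses the token-embedding matrix and carries no bias, so the output projection adds zero extra parameters.

```python
import torch.nn as nn

class TinyLM(nn.Module):
    """Minimal sketch of weight tying (hypothetical names, not the repo's class)."""
    def __init__(self, vocab_size: int, d_model: int):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.lm_head = nn.Linear(d_model, vocab_size, bias=False)  # no LM-head bias
        self.lm_head.weight = self.tok_emb.weight  # tie: one shared tensor

m = TinyLM(vocab_size=50_257, d_model=320)
assert m.lm_head.weight is m.tok_emb.weight  # a single shared matrix
```

Because `nn.Module.parameters()` deduplicates shared tensors, the tied pair is counted once in the parameter total.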
Why these numbers? They are chosen so that the total parameter count equals exactly 25,000,000 with GPTโ2 vocab size, using learned positional embeddings and tied output head.
Parameter proof sketch (matches code):
- Token embeddings: 50,257 × 320 = 16,082,240
- Positional embeddings: 4,748 × 320 = 1,519,360
- Per block: 12·d^2 + 13·d = 12·(320^2) + 13·320 = 1,228,800 + 4,160 = 1,232,960
- 6 blocks total: 6 × 1,232,960 = 7,397,760
- Final LayerNorm: 2·d = 640
- Total: 16,082,240 + 1,519,360 + 7,397,760 + 640 = 25,000,000
The verification script (supernova/verify_params.py) asserts this at runtime.
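The arithmetic above can be rechecked in a few lines of Python, independently of the repo's script. The constants come straight from the spec list; the per-block term bundles attention (4d² + 4d), the MLP (8d² + 5d), and two LayerNorms (4d):

```python
# Recompute Supernova's parameter budget from the specs listed above.
vocab_size, n_positions, d, n_layers = 50_257, 4_748, 320, 6

tok_emb = vocab_size * d            # 16,082,240 (also serves as the tied LM head)
pos_emb = n_positions * d           # 1,519,360
per_block = 12 * d * d + 13 * d     # attention + MLP + two LayerNorms
final_ln = 2 * d                    # 640

total = tok_emb + pos_emb + n_layers * per_block + final_ln
print(total)  # 25000000
```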
Brand behavior:
- The chat wrapper returns the AlgoRythm Tech - Company Profile & Vision text (branding/ALGORHYTHM_TECH_PROFILE.txt) when a prompt asks about AlgoRythm Tech, the company profile, or its vision.
Caution on scope:
- "Knows everything that happened in the world" is not achievable in a single model; instead, this repo provides a scalable pipeline for training on broad, diverse, and massive text corpora. You control the data sources via a YAML config.
Quickstart
- Setup (Windows PowerShell):
- Ensure Python 3.10+ is installed.
- Navigate to the project: cd C:\Users\sriaa\supernova
- Install dependencies: pip install -r requirements.txt
- If the PyTorch wheel needs a specific index (GPU/CPU), follow https://pytorch.org/get-started/locally/
- Verify the exact parameter count and tokenizer vocabulary size: python -m supernova.verify_params --config .\configs\supernova_25m.json
Expected output includes:
- vocab_size: 50257
- total_params: 25000000 (EXACT)
- Prepare the data config (comprehensive knowledge training):
- For comprehensive coverage across all subjects: copy .\configs\comprehensive_data_sources.yaml .\configs\data_sources.yaml
- Or, for a basic setup: copy .\configs\data_sources.example.yaml .\configs\data_sources.yaml
- Edit the file to enable or disable the sources you want. Many are large and require significant bandwidth.
- Train (logs the gradient norm and uses a cosine LR schedule with warmup):
python -m supernova.train ^
  --config .\configs\supernova_25m.json ^
  --data-config .\configs\data_sources.yaml ^
  --seq-len 1024 ^
  --batch-size 16 ^
  --grad-accum 8 ^
  --lr 3e-4 ^
  --warmup-steps 2000 ^
  --max-steps 100000 ^
  --save-every 10000
Notes:
- Gradient norm is printed regularly (no clipping by default).
- Adjust batch size, gradient accumulation, and sequence length to suit your hardware.
- Cosine decay schedule with warmup is applied.
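The warmup-plus-cosine behavior of the schedule can be sketched with the CLI defaults above (--lr 3e-4, --warmup-steps 2000, --max-steps 100000). This is a sketch of the standard formula; the repo's exact implementation may differ, e.g. in its minimum LR:

```python
import math

def lr_at(step: int, base_lr: float = 3e-4, warmup: int = 2_000,
          max_steps: int = 100_000, min_lr: float = 0.0) -> float:
    # Linear warmup to base_lr, then cosine decay down to min_lr.
    if step < warmup:
        return base_lr * step / warmup
    t = (step - warmup) / (max_steps - warmup)  # 0 -> 1 over the decay phase
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * t))

# LR rises through warmup, peaks at step 2000, then decays smoothly to 0.
for s in (0, 1_000, 2_000, 50_000, 100_000):
    print(s, lr_at(s))
```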
- Advanced Chat with Enhanced Reasoning (brand-aware; post-training)
API keys are already configured in configs/api_keys.yaml
- Math Engine: Built-in SymPy-based mathematical computation (no API key needed)
- Serper: Web search API configured
Advanced interactive chat with sophisticated reasoning:
python .\chat_advanced.py --config .\configs\supernova_25m.json
Single-prompt mode with advanced analysis:
python .\chat_advanced.py --config .\configs\supernova_25m.json --prompt "Analyze the implications of artificial intelligence on healthcare from multiple perspectives"
Basic enhanced chat (legacy):
python .\chat_enhanced.py --config .\configs\supernova_25m.json
Query routing:
- Complex reasoning queries → multi-step analysis using the reasoning engine
- Mathematical queries → routed to the math engine for precise calculations
- Current events and facts → routed to Serper for real-time web search
- AlgoRythm Tech queries → returns the company profile
- Multi-domain questions → synthesizes expertise across scientific, technical, and academic fields
- General knowledge → enhanced model generation with sophisticated context
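The dispatch described above could be as simple as keyword matching. The following is a hypothetical illustration (the keyword lists and `route` function are invented for this sketch, not the chat wrapper's real logic):

```python
def route(query: str) -> str:
    # Hypothetical keyword-based tool router; the actual wrapper may differ.
    q = query.lower()
    if any(k in q for k in ("algorythm tech", "company profile", "vision")):
        return "profile"   # serve the branding text
    if any(k in q for k in ("solve", "integral", "equation", "calculate")):
        return "math"      # math engine
    if any(k in q for k in ("today", "latest", "news", "current")):
        return "search"    # Serper web search
    return "model"         # fall back to model generation

print(route("Calculate the integral of x^2"))  # math
```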
Data sources (broad options)
- Included in configs/data_sources.example.yaml. Example (enable selectively):
- c4/en (Colossal Clean Crawled Corpus)
- wikipedia/en
- openwebtext
- bookcorpusopen
- the_pile
Notes:
- Review licenses and terms of each dataset.
- You can add your own sources. The pipeline streams and interleaves by weight.
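The stream-and-interleave behavior can be sketched as weighted sampling over per-source iterators. The `interleave` function and weights here are illustrative, not the pipeline's actual API:

```python
import itertools
import random

def interleave(sources, weights, n_docs, seed=0):
    # For each of n_docs draws, pick a source at random in proportion to its
    # weight, then take that source's next document (sketch of the idea).
    rng = random.Random(seed)
    names = list(sources)
    picks = rng.choices(names, weights=weights, k=n_docs)
    return [next(sources[name]) for name in picks]

# Two toy infinite "streams" standing in for real datasets:
streams = {"wikipedia/en": itertools.cycle(["wiki-doc"]),
           "c4/en": itertools.cycle(["c4-doc"])}
batch = interleave(streams, weights=[0.3, 0.7], n_docs=10)
```

Because sources are iterators, nothing needs to fit in memory; each dataset is consumed lazily as it is sampled.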
Training details
- Optimizer: AdamW (betas=(0.9, 0.95), weight_decay=0.1)
- LR schedule: cosine decay with warmup (a proper schedule, not an ad-hoc flat LR)
- Gradient norm: computed every log step and printed
- Mixed precision: optional (bf16/fp16) if available
- Checkpointing: periodic saving to output directory
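The logged gradient norm is presumably the global L2 norm over all parameters' gradients, i.e. the quantity clipping utilities report. In plain Python (a sketch over nested lists, not the repo's code):

```python
import math

def global_grad_norm(grads):
    # Global L2 norm: sqrt of the sum of squares over every gradient entry.
    # Computed and printed for monitoring only; no clipping is applied.
    total = 0.0
    for g in grads:
        total += sum(x * x for x in g)
    return math.sqrt(total)

print(global_grad_norm([[3.0], [4.0]]))  # 5.0
```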
Brand profile
- File: branding/ALGORHYTHM_TECH_PROFILE.txt
- The chat wrapper uses this exact text for company-related queries.
License
- Apache 2.0 (see LICENSE)
Attribution
- Built by AlgoRythm Technologies.