---
license: apache-2.0
tags:
- custom-architecture
- from-scratch
- language-model
- non-transformer
- tensorflow
---

# TERA V2

A language model built entirely from scratch. No pretrained weights. No standard transformers.

## Architecture

TERA V2 uses a custom non-transformer architecture with the following components:

- **Time Mix** for sequence mixing
- **Token Shift** for position encoding
- **GroupNorm** for normalization
- **Channel Mix** with **Squared ReLU** for feed-forward
- **Stochastic Depth** for regularization
- **Untied Embeddings**

## Model Specifications

| Specification | Value |
|---------------|-------|
| Parameters | ~726K |
| Vocabulary Size | 510 |
| Context Length | 32 tokens |
| Hidden Size (d_model) | 128 |
| Attention Heads | 4 |
| Layers | 3 |
| Framework | TensorFlow / Keras |

## Training Details

- Trained from scratch on clean question-answer pairs
- No pretrained weights were used at any stage
- Custom BPE-lite tokenizer trained on the same data
- Loss function: Sigmoid cross-entropy
- Optimizer: Adam with cosine learning rate schedule
- Training format: Q: question / A: answer

## How To Use

1. Download all files from this repository
2. Install TensorFlow
3. Load the tokenizer from tokenizer.json
4. Build the model using model_config.json
5. Load weights from model.weights.h5
6. Format input as: Q: your question here / A:

## Example Input and Output

Input: Q: What is the sun?

Output: The sun is a star at the center of our solar system.

Input: Q: Hello

Output: Hello! How can I help you today?
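
The prompt format from the last usage step can be sketched in plain Python. This is a minimal illustration, not code from this repository: `format_prompt` and `extract_answer` are hypothetical helper names. The README writes the format as `Q: question / A: answer` but does not say whether the `/` is a literal character or stands for a line break, so the separator is left as a parameter; the default below takes the literal reading and is an assumption worth checking against training_data.py. The sketch also assumes the decoded model output may echo the prompt before the answer.

```python
def format_prompt(question: str, separator: str = " / ") -> str:
    """Wrap a question in the Q:/A: template described in this README.

    The default separator is an assumption; pass "\n" if the training
    data uses a line break between the question and the answer marker.
    """
    # Collapse runs of whitespace so the prompt stays compact
    # (the context length is only 32 tokens).
    question = " ".join(question.split())
    return f"Q: {question}{separator}A:"


def extract_answer(generated: str, prompt: str) -> str:
    """Strip an echoed prompt (if present) from decoded model output."""
    if generated.startswith(prompt):
        generated = generated[len(prompt):]
    return generated.strip()


if __name__ == "__main__":
    prompt = format_prompt("What is the sun?")
    print(prompt)
    # Stand-in for real model output, reusing the example answer above.
    fake_output = prompt + " The sun is a star at the center of our solar system."
    print(extract_answer(fake_output, prompt))
```

The string returned by `format_prompt` is what would be passed to the tokenizer loaded from tokenizer.json before running the model.
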

## Files Included

| File | Description |
|------|-------------|
| model.py | Model architecture code |
| tokenizer.py | Tokenizer class code |
| model_config.json | Model hyperparameters |
| tokenizer.json | Trained tokenizer vocabulary |
| model.weights.h5 | Trained model weights |
| training_data.py | Training data used |
| loss_history.json | Training loss over epochs |
| training_state.json | Final training stats |

## Live Demo

Try TERA V2 live at: https://huggingface.co/spaces/vedaco/tera.v2

## Created By

**Vedaco Team**

## License

Apache 2.0