---
license: apache-2.0
tags:
- custom-architecture
- from-scratch
- language-model
- non-transformer
- tensorflow
---
# TERA V2
A language model built entirely from scratch. No pretrained weights. No standard transformers.
## Architecture
TERA V2 uses a custom non-transformer architecture with the following components:
- **Time Mix** for sequence mixing
- **Token Shift** for position encoding
- **GroupNorm** for normalization
- **Channel Mix** with **Squared ReLU** for feed-forward
- **Stochastic Depth** for regularization
- **Untied Embeddings** (separate input embedding and output projection matrices)
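The Token Shift and Channel Mix components can be sketched in a few lines. This is a minimal NumPy illustration of the general technique (as popularized by RWKV-style models), not TERA V2's exact code — the shapes, the `mu` interpolation parameter, and the weight names are illustrative assumptions; see `model.py` for the real implementation.

```python
import numpy as np

def token_shift(x, mu):
    """Mix each position with the previous one (RWKV-style token shift).

    x:  (seq_len, d_model) hidden states
    mu: (d_model,) learned per-channel interpolation weights in [0, 1]
    """
    # States from the previous position; position 0 sees zeros.
    prev = np.vstack([np.zeros((1, x.shape[1])), x[:-1]])
    return mu * x + (1.0 - mu) * prev

def channel_mix(x, w_in, w_out):
    """Feed-forward with Squared ReLU: (max(x @ w_in, 0))**2 @ w_out."""
    h = np.maximum(x @ w_in, 0.0) ** 2
    return h @ w_out

# Toy shapes for illustration only (the card's real sizes are d_model=128 etc.)
rng = np.random.default_rng(0)
seq_len, d_model, d_ff = 8, 16, 64
x = rng.standard_normal((seq_len, d_model))
mu = rng.uniform(size=d_model)

y = token_shift(x, mu)
out = channel_mix(y, rng.standard_normal((d_model, d_ff)),
                  rng.standard_normal((d_ff, d_model)))
print(out.shape)  # (8, 16)
```

With `mu = 1` the shift is a no-op (each position keeps its own state); with `mu = 0` every position sees only its predecessor, which is how the shift doubles as a cheap form of position encoding.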
## Model Specifications
| Specification | Value |
|---------------|-------|
| Parameters | ~726K |
| Vocabulary Size | 510 |
| Context Length | 32 tokens |
| Hidden Size (d_model) | 128 |
| Heads | 4 |
| Layers | 3 |
| Framework | TensorFlow / Keras |
## Training Details
- Trained from scratch on clean question-answer pairs
- No pretrained weights were used at any stage
- Custom BPE-lite tokenizer trained on the same data
- Loss function: Sigmoid cross-entropy
- Optimizer: Adam with cosine learning rate schedule
- Training format: `Q: question / A: answer`
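The cosine learning-rate schedule mentioned above follows a standard formula; here is a minimal pure-Python sketch (in Keras one would typically use `tf.keras.optimizers.schedules.CosineDecay`). The peak and floor learning rates below are placeholder values, not the ones used in training.

```python
import math

def cosine_lr(step, total_steps, lr_max=1e-3, lr_min=0.0):
    """Cosine decay from lr_max at step 0 down to lr_min at total_steps."""
    progress = min(step, total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))

print(cosine_lr(0, 1000))     # 0.001  (peak)
print(cosine_lr(500, 1000))   # 0.0005 (halfway)
print(cosine_lr(1000, 1000))  # 0.0    (fully decayed)
```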
## How To Use
1. Download all files from this repository
2. Install TensorFlow
3. Load the tokenizer from `tokenizer.json`
4. Build the model using `model_config.json`
5. Load weights from `model.weights.h5`
6. Format input as: `Q: your question here / A:`
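Steps 3–5 depend on the classes defined in `model.py` and `tokenizer.py`, so they are not reproduced here; the prompt handling from step 6 can be sketched on its own. The helper names below are illustrative, not part of the repository.

```python
def format_prompt(question: str) -> str:
    """Wrap a question in the training format described above."""
    return f"Q: {question} / A:"

def extract_answer(generated: str, prompt: str) -> str:
    """Strip the prompt prefix from the model's raw decoded output."""
    if generated.startswith(prompt):
        return generated[len(prompt):].strip()
    return generated.strip()

prompt = format_prompt("What is the sun?")
print(prompt)  # Q: What is the sun? / A:
```

The model continues the text after `A:`, so everything following the prompt prefix in the decoded output is the answer.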
## Example Input and Output
**Input:** `Q: What is the sun?`
**Output:** The sun is a star at the center of our solar system.

**Input:** `Q: Hello`
**Output:** Hello! How can I help you today?
## Files Included
| File | Description |
|------|-------------|
| model.py | Model architecture code |
| tokenizer.py | Tokenizer class code |
| model_config.json | Model hyperparameters |
| tokenizer.json | Trained tokenizer vocabulary |
| model.weights.h5 | Trained model weights |
| training_data.py | Training data used |
| loss_history.json | Training loss over epochs |
| training_state.json | Final training stats |
## Live Demo
Try TERA V2 live at: https://huggingface.co/spaces/vedaco/tera.v2
## Created By
**Vedaco Team**
## License
Apache 2.0