---
license: apache-2.0
tags:
  - custom-architecture
  - from-scratch
  - language-model
  - non-transformer
  - tensorflow
---

# TERA V2

A language model built entirely from scratch. No pretrained weights. No standard transformers.

## Architecture

TERA V2 uses a custom non-transformer architecture with the following components:

- **Time Mix** for sequence mixing
- **Token Shift** for position encoding
- **GroupNorm** for normalization
- **Channel Mix** with **Squared ReLU** for feed-forward
- **Stochastic Depth** for regularization
- **Untied Embeddings**
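
The card does not include the implementation, but the two core operations named above can be illustrated in a minimal NumPy sketch. This is an assumption-laden illustration of RWKV-style token shift and a squared-ReLU channel mix, not TERA V2's actual code; the mixing coefficient `mu` and the weight shapes are placeholders (in the real model they are learned parameters).

```python
import numpy as np

def token_shift(x, mu=0.5):
    """Blend each token with its predecessor (zero-padded at position 0).
    x: (seq_len, d_model). mu is a fixed stand-in for a learned coefficient."""
    prev = np.vstack([np.zeros((1, x.shape[1])), x[:-1]])
    return mu * x + (1.0 - mu) * prev

def channel_mix(x, w_in, w_out):
    """Feed-forward with Squared ReLU: relu(x @ w_in) ** 2 @ w_out."""
    h = np.maximum(x @ w_in, 0.0) ** 2
    return h @ w_out

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 128))            # context length 32, d_model 128 (from the spec table)
w_in = rng.normal(size=(128, 512)) * 0.02  # hidden expansion factor is a guess
w_out = rng.normal(size=(512, 128)) * 0.02
y = channel_mix(token_shift(x), w_in, w_out)
print(y.shape)  # (32, 128)
```

In the actual model these operations would be Keras layers with trainable weights; the sketch only shows the data flow.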

## Model Specifications

| Specification | Value |
|---------------|-------|
| Parameters | ~726K |
| Vocabulary Size | 510 |
| Context Length | 32 tokens |
| Hidden Size (d_model) | 128 |
| Heads | 4 |
| Layers | 3 |
| Framework | TensorFlow / Keras |
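
For orientation, the table's values would correspond to a `model_config.json` roughly along these lines. The field names here are assumptions for illustration; check the actual file in the repository for the real keys.

```json
{
  "vocab_size": 510,
  "d_model": 128,
  "n_heads": 4,
  "n_layers": 3,
  "context_length": 32
}
```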

## Training Details

- Trained from scratch on clean question-answer pairs
- No pretrained weights were used at any stage
- Custom BPE-lite tokenizer trained on the same data
- Loss function: Sigmoid cross-entropy
- Optimizer: Adam with cosine learning rate schedule
- Training format: `Q: <question> / A: <answer>`
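
The card names the schedule but not its parameters; the sketch below shows a generic warmup-plus-cosine-decay curve of the kind described. All values (`base_lr`, `min_lr`, `warmup`) are illustrative, not TERA V2's actual hyperparameters.

```python
import math

def cosine_lr(step, total_steps, base_lr=3e-4, min_lr=3e-5, warmup=100):
    """Linear warmup for `warmup` steps, then cosine decay from base_lr to min_lr."""
    if step < warmup:
        return base_lr * (step + 1) / warmup
    progress = (step - warmup) / max(1, total_steps - warmup)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

print(cosine_lr(0, 1000))     # early warmup: small fraction of base_lr
print(cosine_lr(1000, 1000))  # end of training: decayed to min_lr
```

In Keras, the equivalent would typically be a `tf.keras.optimizers.schedules` schedule passed to `Adam`.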

## How To Use

1. Download all files from this repository
2. Install TensorFlow
3. Load the tokenizer from `tokenizer.json`
4. Build the model using `model_config.json`
5. Load weights from `model.weights.h5`
6. Format input as: `Q: your question here / A:`
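
Step 6 can be sketched as a small helper. This assumes the `/` in the format string stands for a newline between the question and answer lines; verify against `training_data.py` before relying on it.

```python
def format_prompt(question: str) -> str:
    """Build the Q:/A: prompt the model was trained on (newline separator assumed)."""
    return f"Q: {question.strip()}\nA:"

print(format_prompt("What is the sun?"))
```

The model then generates the text following `A:` as its answer.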

## Example Input and Output

Input: Q: What is the sun?

Output: The sun is a star at the center of our solar system.

Input: Q: Hello

Output: Hello! How can I help you today?

## Files Included

| File | Description |
|------|-------------|
| `model.py` | Model architecture code |
| `tokenizer.py` | Tokenizer class code |
| `model_config.json` | Model hyperparameters |
| `tokenizer.json` | Trained tokenizer vocabulary |
| `model.weights.h5` | Trained model weights |
| `training_data.py` | Training data used |
| `loss_history.json` | Training loss over epochs |
| `training_state.json` | Final training stats |

## Live Demo

Try TERA V2 live at: https://huggingface.co/spaces/vedaco/tera.v2

## Created By

**Vedaco Team**

## License

Apache 2.0