# Vadakayil LLM v3 (Medium)
A medium-sized character-level LLM trained on Capt Ajit Vadakayil's writings about Mach 0.3, consciousness, and Vedic philosophy.
## Model Details
| Parameter | Value |
|---|---|
| Architecture | Decoder-only Transformer |
| Vocabulary | Character-level (74 tokens) |
| Parameters | ~12.7M |
| d_model | 512 |
| num_layers | 6 |
| num_heads | 8 |
| d_ff | 1024 |
| max_seq_len | 512 |
| Training Data | vadakayil_combined_training.txt (~57K chars) |
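The parameter count in the table can be roughly sanity-checked from the other hyperparameters. The breakdown below (biases, learned positional embeddings, two LayerNorms per block, tied output head) is an assumption about this implementation, so the total is approximate rather than exact.

```python
# Rough parameter-count estimate from the table above. The exact layer
# layout (biases, positional embeddings, tied output head) is assumed,
# so the total is approximate.
vocab, d_model, n_layers, d_ff, max_len = 74, 512, 6, 1024, 512

tok_emb = vocab * d_model    # token embedding table
pos_emb = max_len * d_model  # learned positional embeddings (assumed)

# Per transformer block: Q/K/V/output projections, two FFN layers, two LayerNorms
attn = 4 * (d_model * d_model + d_model)
ffn = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
norms = 2 * 2 * d_model
block = attn + ffn + norms

total = tok_emb + pos_emb + n_layers * block + 2 * d_model  # + final LayerNorm
print(f"{total / 1e6:.1f}M parameters")  # → 12.9M, in the ballpark of ~12.7M
```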
## Training Configuration
| Parameter | Value |
|---|---|
| Epochs | 150 |
| Learning Rate | 3e-4 |
| Batch Size | 32 |
| Sequence Length | 256 |
| Train Split | 95% |
| Final Val Loss | 0.078 |
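The 95% train split and 256-character sequence length from the table might be prepared along the following lines. This is an illustrative sketch, not the repo's actual data pipeline; the stand-in text and the `ord`-based encoding are placeholders for the real corpus and tokenizer.

```python
import torch

# Sketch of the 95/5 split and 256-char next-character training windows.
# The exact batching scheme used in training is an assumption.
text = "example corpus text " * 200     # stand-in for the real training file
ids = [ord(c) % 74 for c in text]       # stand-in for the real tokenizer
split = int(0.95 * len(ids))
train_ids, val_ids = ids[:split], ids[split:]

seq_len = 256
data = torch.tensor(train_ids, dtype=torch.long)
# Each training example predicts the next character at every position
x = data[:seq_len]        # input:  chars 0..255
y = data[1:seq_len + 1]   # target: chars 1..256
print(x.shape, y.shape)
```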
## Performance
| Metric | Value |
|---|---|
| Initial Val Loss | 5.37 |
| Final Val Loss | 0.078 |
| Improvement | 98.5% |
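The improvement figure follows directly from the two validation losses:

```python
# Relative reduction in validation loss over training
initial, final = 5.37, 0.078
improvement = (initial - final) / initial * 100
print(f"{improvement:.1f}%")  # → 98.5%
```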
## Usage
With this codebase:

```python
import torch

from model import TinyLLM
from tokenizer import Tokenizer

# Load the tokenizer
tokenizer = Tokenizer.load("tokenizer.json")

# Load the model checkpoint
checkpoint = torch.load("model.pt", map_location="cpu")
model = TinyLLM(
    vocab_size=74,
    d_model=512,
    num_heads=8,
    num_layers=6,
    d_ff=1024,
    max_seq_len=512,
    dropout=0.1,
    pad_token_id=0,
)
model.load_state_dict(checkpoint["model_state_dict"])
model.eval()

# Generate text from a prompt
prompt = "What is Mach 0.3"
input_ids = tokenizer.encode(prompt, add_special_tokens=False)
input_ids = torch.tensor([input_ids], dtype=torch.long)
output = model.generate(input_ids, max_new_tokens=150, temperature=0.8)
print(tokenizer.decode(output[0].tolist()))
```
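The `temperature=0.8` argument presumably controls temperature-scaled sampling inside `model.generate`. A sketch of what one such sampling step typically looks like, using made-up logits in place of a real forward pass:

```python
import torch

# One temperature-scaled sampling step, the kind of logic that
# model.generate(..., temperature=0.8) presumably runs per new character.
# The logits here are random; the real ones come from the model's forward pass.
torch.manual_seed(0)
logits = torch.randn(74)                 # one score per character in the vocab
temperature = 0.8
probs = torch.softmax(logits / temperature, dim=-1)
next_id = torch.multinomial(probs, num_samples=1).item()
print(next_id)  # index of the sampled character, in 0..73
```

Lower temperatures sharpen the distribution toward the most likely character; higher ones flatten it and increase diversity.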
### Example Prompts
- "What is Mach 0.3 and why is it significant?"
- "Why is Mach 0.3 called the Paradox Rekha?"
- "What is the Silent Kalki Revolution of Consciousness?"
- "What does the movie Thondi Muthalum Driksakshiyum represent?"
## Training Data Topics
- Mach 0.3 and fluid dynamics
- Compressible vs incompressible flow equations
- Prandtl-Glauert correction
- Consciousness and Vedic philosophy
- Silent Kalki Revolution
- Evidence and Witness (Thondi Muthalum Driksakshiyum film analysis)
- Paradox Rekha concept
- Capt Ajit Vadakayil's teachings
## Limitations
- Character-level tokenizer (74 characters)
- Trained on limited dataset (~57K characters)
- May produce inaccurate output for technical queries
- Best for philosophical/consciousness topics
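The character-level tokenizer limitation above means every distinct character in the corpus is one vocabulary entry, so the model must learn spelling before semantics. A minimal sketch of the general idea (this is not the repo's `Tokenizer` class):

```python
# Minimal character-level tokenizer: each distinct character in the
# corpus becomes one vocabulary entry. Illustrative only.
corpus = "Mach 0.3 is the Paradox Rekha"
chars = sorted(set(corpus))
stoi = {c: i for i, c in enumerate(chars)}
itos = {i: c for c, i in stoi.items()}

encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: "".join(itos[i] for i in ids)

ids = encode("Mach 0.3")
print(decode(ids))  # round-trips back to "Mach 0.3"
```

Any character absent from the training corpus cannot be encoded at all, which is one reason a 74-token vocabulary built from ~57K characters limits the model's reach.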
## License
MIT License
## Credits

- Training data: Capt Ajit Vadakayil's writings
- Model: custom transformer implementation