AbstractPhil committed on
Commit cbe88c8 · verified · 1 Parent(s): 4a3c922

Upload bert-thetis-tiny-wikitext103/2025-10-13_20-09-33/final/README.md with huggingface_hub

bert-thetis-tiny-wikitext103/2025-10-13_20-09-33/final/README.md ADDED
@@ -0,0 +1,122 @@
+ # BERT-Thetis Checkpoint
+
+ **Model Variant:** bert-thetis-tiny-wikitext103
+ **Training Run:** 2025-10-13_20-09-33
+ **Checkpoint:** final
+ **Training Step:** 48,010
+ **Epoch:** 10/10
+
+ ## 🌊 What is BERT-Thetis?
+
+ BERT-Thetis is a geometric language model that replaces traditional learned embeddings with **deterministic crystal structures**. Instead of a 50K×768 embedding table (38M parameters), we use:
+
+ - **Beatrix Staircase Encodings**: Deterministic positional structure (0 parameters)
+ - **Character Composition**: Learnable semantic bridge
+ - **Crystal Inflation**: 5-vertex simplex generation (0 parameters)
+
+ This reduces vocabulary parameters by **~95%** while maintaining competitive performance.
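The parameter figures quoted above can be checked with a few lines of arithmetic. This is only a sketch: it assumes "50K" means exactly 50,000 tokens and treats the ~95% reduction as exact, neither of which the README confirms. The variable names are illustrative.

```python
# Size of the conventional embedding table the README compares against.
vocab_size = 50_000   # assumption: "50K" read as exactly 50,000
hidden_dim = 768      # width quoted in the README's comparison

table_params = vocab_size * hidden_dim

# Parameters left if the stated ~95% reduction were exact (assumption).
reduction = 0.95
remaining_params = round(table_params * (1 - reduction))

print(f"embedding table: {table_params / 1e6:.1f}M parameters")
print(f"after ~95% cut:  {remaining_params / 1e6:.2f}M parameters")
```

Under these assumptions the table holds 38.4M parameters, which matches the "38M" quoted above, and a 95% cut would leave roughly 1.9M.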
18
+
19
+ ## πŸ“Š Performance
20
+
21
+ | Metric | Value |
22
+ |--------|-------|
23
+ | **Validation Loss** | 5.9593 |
24
+ | **Perplexity** | 387.32 |
25
+ | **Best Accuracy** | 16.97% |
26
+ | **Training Steps** | 48,010 |
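The perplexity row appears to be the exponential of the validation loss. The sketch below checks that relationship, assuming the loss is a mean cross-entropy measured in nats (the README does not state the unit).

```python
import math

# Validation loss as reported in the table above (assumed to be in nats).
val_loss = 5.9593

# Perplexity is the exponential of the mean cross-entropy loss.
perplexity = math.exp(val_loss)

print(f"perplexity from loss: {perplexity:.2f}")
```

The result agrees with the reported 387.32 to within the rounding of the four-decimal loss.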
+
+ ## 🏗️ Architecture
+
+ - **Layers:** 4
+ - **Dimension:** 256
+ - **Attention Heads:** 4
+ - **Beatrix Levels:** 16
+ - **Crystal Vertices:** 5 (pentachoron)
+
+ ## 📦 Model Details
+
+ - **Total Parameters:** ~0.0M
+ - **Dataset:** WikiText-103 (~103M tokens)
+ - **Batch Size:** 256
+ - **Learning Rate:** 0.0005
+ - **Mixed Precision:** True
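The step count, epoch count, and batch size reported in this README can be combined into a rough picture of the training schedule. This sketch assumes the batch size counts sequences per optimizer step with no gradient accumulation, and that every epoch ran the same number of steps; the README confirms neither.

```python
# Figures taken from the README.
total_steps = 48_010
epochs = 10
batch_size = 256  # assumption: sequences per optimizer step

# Steps per epoch, assuming all epochs have equal length.
steps_per_epoch, leftover = divmod(total_steps, epochs)

# Sequences seen per epoch under the assumptions above.
sequences_per_epoch = steps_per_epoch * batch_size

print(f"steps per epoch:     {steps_per_epoch} (leftover steps: {leftover})")
print(f"sequences per epoch: {sequences_per_epoch:,}")
```

The division comes out even, which is consistent with ten equal-length epochs.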
+
+ ## 🚀 Usage
+
+ ```python
+ import json
+ from pathlib import Path
+
+ import torch
+ from transformers import AutoTokenizer
+ from geovocab2.train.model.core.bert_thetis import ThetisConfig, ThetisForMaskedLM
+
+ # Path structure: weights/bert-thetis-tiny-wikitext103/2025-10-13_20-09-33/final/
+
+ # Load config
+ with open("config.json") as f:
+     config_dict = json.load(f)
+
+ config = ThetisConfig(**config_dict)
+
+ # Load model
+ model = ThetisForMaskedLM(config)
+
+ # Load weights (safetensors or pytorch)
+ if Path("model.safetensors").exists():
+     from safetensors.torch import load_file
+     state_dict = load_file("model.safetensors")
+     model.load_state_dict(state_dict)
+ else:
+     # map_location="cpu" lets the checkpoint load on machines without a GPU
+     model.load_state_dict(torch.load("pytorch_model.bin", map_location="cpu"))
+
+ model.eval()
+
+ # Use for masked token prediction
+ tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
+
+ text = "The capital of France is [MASK]."
+ inputs = tokenizer(text, return_tensors="pt")
+
+ # Inference only, so skip gradient tracking
+ with torch.no_grad():
+     outputs = model(
+         token_ids=inputs["input_ids"],
+         attention_mask=inputs["attention_mask"]
+     )
+
+ # Takes the most likely token at every position, not only at [MASK]
+ predictions = outputs.argmax(dim=-1)
+ predicted_text = tokenizer.decode(predictions[0])
+ print(predicted_text)
+ ```
+
+ ## 📁 Directory Structure
+
+ ```
+ weights/
+ └── bert-thetis-tiny-wikitext103/
+     └── 2025-10-13_20-09-33/
+         ├── best/          ← Best validation checkpoint
+         ├── final/         ← Final training checkpoint
+         ├── step-1000/     ← Intermediate checkpoints
+         └── epoch-1/       ← Per-epoch checkpoints
+ ```
+
+ ## 📚 Resources
+
+ - **Repository:** [github.com/AbstractEyes/lattice_vocabulary](https://github.com/AbstractEyes/lattice_vocabulary)
+ - **Hub:** [huggingface.co/AbstractPhil/bert-thetis-tiny-wikitext103](https://huggingface.co/AbstractPhil/bert-thetis-tiny-wikitext103)
+ - **Paper:** Coming soon!
+ - **Author:** AbstractPhil
+
+ ## 🎓 Citation
+
+ ```bibtex
+ @misc{bert-thetis-2025,
+   author = {AbstractPhil},
+   title = {BERT-Thetis: Geometric BERT with Deterministic Crystal Embeddings},
+   year = {2025},
+   publisher = {HuggingFace},
+   url = {https://huggingface.co/AbstractPhil/bert-thetis-tiny-wikitext103}
+ }
+ ```
+
+ ---
+
+ **Status:** ✅ Training Complete
+ **Generated:** 2025-10-14 07:20:54 UTC