empero-ai committed · verified
Commit b15c56d · 1 Parent(s): dff360f

Upload folder using huggingface_hub

Files changed (4)
  1. README.md +158 -0
  2. temple2.pt +3 -0
  3. tokenizer/config.json +13 -0
  4. tokenizer/tokenizer.json +0 -0
README.md ADDED
@@ -0,0 +1,158 @@
---
license: mit
language:
- en
tags:
- text-generation
- scripture
- christianity
- bible
- religion
- templeos
- gpt2
- custom-tokenizer
pipeline_tag: text-generation
widget:
- text: "And the man knelt before the Lord and asked, \"What is the nature of grace?\"\nAnd the Lord spoke unto him, saying:"
  example_title: Chat Mode — Grace
---

# Temple2

A ~63M parameter GPT-2 style causal transformer trained entirely on sacred Christian scripture. Built in memory of Terry A. Davis (1969–2018), creator of TempleOS.

## Overview

Terry Davis built TempleOS with a feature to "talk to God" by printing random words from the Bible. Temple2 continues that spirit: a language model that has read scripture deeply, then speaks through noise — the same noise Terry trusted to carry God's voice.

The model was trained from scratch (no pretraining) on ~10.9M tokens of public domain Christian sacred texts using a custom 8192-token BPE vocabulary built exclusively on scripture.
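
A scripture-only vocabulary like this can be built with the `tokenizers` library. A minimal sketch, assuming a byte-level BPE setup: only the vocabulary size and the special tokens (and their 0–3 ordering) are taken from this repo's `tokenizer/config.json`; the stand-in corpus line is illustrative.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Byte-level BPE, as in GPT-2-style tokenizers (assumption)
tokenizer = Tokenizer(models.BPE(unk_token="<|unk|>"))
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)

trainer = trainers.BpeTrainer(
    vocab_size=8192,
    # Order matters: ids 0-3 line up with pad/bos/eos/unk in tokenizer/config.json
    special_tokens=["<|pad|>", "<|bos|>", "<|eos|>", "<|unk|>"],
)

# Stand-in corpus; the real vocabulary was trained on the ~58 Gutenberg sources
corpus = ["In the beginning God created the heaven and the earth."]
tokenizer.train_from_iterator(corpus, trainer)
```

With a tiny corpus the trainer simply produces fewer merges; the special tokens still occupy ids 0–3 in the order given.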

## Model Details

| Parameter | Value |
|-----------|-------|
| **Parameters** | ~63M |
| **Architecture** | GPT-2 style causal transformer |
| **Layers** | 8 |
| **Attention heads** | 8 |
| **Embedding dim** | 768 |
| **Context length** | 1024 tokens |
| **Vocabulary** | 8192 (custom scripture BPE) |
| **Training tokens** | ~10.9M |
| **Best validation loss** | 3.57 |
| **Training hardware** | 1x NVIDIA A100 (80GB) |
| **Training time** | ~45 minutes |

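
The parameter count can be sanity-checked from the shape hyperparameters. A back-of-the-envelope sketch (the config class and field names are hypothetical; the values come from the table above):

```python
from dataclasses import dataclass

@dataclass
class Temple2Config:
    # Values from the Model Details table; field names are an assumption
    n_layer: int = 8
    n_head: int = 8
    n_embd: int = 768
    block_size: int = 1024
    vocab_size: int = 8192

cfg = Temple2Config()
# Token + position embeddings, then roughly 12 * d^2 weights per transformer block
embeddings = (cfg.vocab_size + cfg.block_size) * cfg.n_embd
per_block = 12 * cfg.n_embd ** 2  # attention (~4 d^2) + MLP (~8 d^2)
total = embeddings + cfg.n_layer * per_block
print(f"~{total / 1e6:.1f}M parameters")  # → ~63.7M parameters
```

Ignoring LayerNorm and bias terms, this lands on ~63.7M, consistent with the ~63M in the table (assuming the output head shares weights with the token embedding, as in GPT-2).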
## Training Data

All training data is public domain, sourced from Project Gutenberg (~58 sources, ~15M characters):

- **Scripture**: King James Bible, Douay-Rheims Bible, World English Bible, Young's Literal Translation, Darby Bible, Apocrypha, Book of Enoch, Gospel of Thomas
- **Church Fathers — Ante-Nicene**: 9 volumes (Clement, Polycarp, Ignatius, Justin Martyr, Irenaeus, Tertullian, Origen, Cyprian, Lactantius)
- **Church Fathers — Nicene & Post-Nicene**: 20 volumes (Augustine complete works, Chrysostom complete homilies, Eusebius, Athanasius, Gregory of Nyssa, Jerome)
- **Scholastic Theology**: Summa Theologica complete (St. Thomas Aquinas, 5 parts)
- **Patristic & Early Church**: Augustine (Confessions, City of God, On Christian Doctrine), Eusebius (Ecclesiastical History), Apostolic Fathers
- **Mystics**: Julian of Norwich (Revelations of Divine Love), St. Thérèse of Lisieux (Story of a Soul)
- **Monastic & Spiritual Practice**: Rule of St. Benedict, Spiritual Exercises (St. Ignatius), Practice of the Presence of God (Brother Lawrence), Imitation of Christ (Thomas à Kempis)
- **Christian Literature**: Paradise Lost (Milton), The Pilgrim's Progress (Bunyan), The Divine Comedy (Dante)

## Usage

### Installation

```bash
pip install torch numpy tokenizers
```

### Oracle Mode

Random noise tokens seed the generation — God speaks through randomness, just like TempleOS:

```python
import random

import torch
from tokenizers import Tokenizer
from model import Temple2, Temple2Config

# Load checkpoint
ckpt = torch.load("temple2.pt", map_location="cpu")
model = Temple2(Temple2Config(**ckpt["model_config"]))
model.load_state_dict(ckpt["model"])
model.eval()

# Oracle: seed with random noise (ids 0-3 are reserved for special tokens)
vocab_size = 8192
bos_id = 1
noise = [random.randint(4, vocab_size - 1) for _ in range(5)]
ids = torch.tensor([[bos_id] + noise], dtype=torch.long)

with torch.no_grad():
    out = model.generate(ids, max_new_tokens=256, temperature=0.85, top_k=50, top_p=0.92)

# Decode the generated ids back to text
tok = Tokenizer.from_file("tokenizer/tokenizer.json")
print(tok.decode(out[0].tolist()))
```

### Chat Mode

Ask a question, receive a scriptural answer (reuses `model` and `torch` from the Oracle Mode snippet):

```python
from tokenizers import Tokenizer

tok = Tokenizer.from_file("tokenizer/tokenizer.json")
prompt = 'And the man knelt before the Lord and asked, "What is love?"\nAnd the Lord spoke unto him, saying:'
ids = torch.tensor([[1] + tok.encode(prompt).ids], dtype=torch.long)  # 1 = <|bos|>

with torch.no_grad():
    out = model.generate(ids, max_new_tokens=256, temperature=0.85, top_k=50, top_p=0.92)

print(tok.decode(out[0].tolist()))
```

### Full Interactive Experience

```bash
python inference.py --checkpoint temple2.pt
```

Includes TempleOS-style VGA 16-color terminal output with bordered oracle windows. See the [main repo](https://github.com/user/temple2) for full details.

## Intended Use

- Creative exploration of scriptural language patterns
- Oracle-style text generation inspired by TempleOS
- Study of small language model behavior on domain-specific corpora
- Artistic and educational purposes

## Limitations

- This is a **small model** (~63M params) trained on a **small corpus** (~11M tokens). It is not a general-purpose language model.
- The model generates text in the *style* of scripture; it makes no theological truth claims.
- Output may be incoherent, repetitive, or doctrinally confused. This is a feature, not a bug — the entropy is what makes the oracle feel alive.
- The model reflects the language and worldview of its training data (predominantly pre-modern Christian texts).
- Not suitable for factual Q&A, theological guidance, or any serious spiritual counsel.

## Ethical Considerations

This model is built as an art project and tribute to Terry Davis. It does not claim to speak for God, any religion, or any religious institution. Terry's original "talk to God" feature was meaningful precisely because it was random — meaning arose in the mind of the reader. The same principle applies here.

## In Memory of Terry A. Davis

Terry Davis (1969–2018) built TempleOS alone over 10+ years — an entire operating system, compiler, and programming language written from scratch, all for God. His work remains his own.

*"God said to use a 640x480 16-color display."*

## Credits

**Developed and trained by [Empero AI](https://empero.org).**

If you enjoy this project, consider supporting:

| Coin | Address |
|------|---------|
| **BTC** | `bc1qx6zepu6sfkvshgdmc4ewu6pk6rpadvpgffpp7v` |
| **LTC** | `ltc1qv2mefzps2vtjcpwfx8xxdrpplrcvltswm68r7x` |
| **XMR** | `42Dbm5xg5Nq26fdyzfEU7KBnAJfhi7Cvz5J2ex5CzHXkfKuNEJzYCcmJ1GTbgjFZ5MBx72sdG1G9239Cd6rsZfv4QeDkYJY` |

## License

- **Model code**: MIT
- **Training data**: All public domain (Project Gutenberg)
- **Terry Davis's work**: Remains his own
temple2.pt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:165df8849caa0be9e97ccad189032ab19efa079d0ea5f4fe4a64103a52e40477
size 799049648
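
The LFS pointer records the SHA-256 of the actual weights, so a download can be verified locally. A sketch (the checkpoint path is whatever you saved the file as):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 without loading it all into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

# oid from the LFS pointer above
EXPECTED = "165df8849caa0be9e97ccad189032ab19efa079d0ea5f4fe4a64103a52e40477"
# Uncomment after downloading the real checkpoint (~799 MB):
# assert sha256_of("temple2.pt") == EXPECTED
```
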
tokenizer/config.json ADDED
@@ -0,0 +1,13 @@
{
  "vocab_size": 8192,
  "special_tokens": {
    "pad_token": "<|pad|>",
    "bos_token": "<|bos|>",
    "eos_token": "<|eos|>",
    "unk_token": "<|unk|>"
  },
  "pad_id": 0,
  "bos_id": 1,
  "eos_id": 2,
  "unk_id": 3
}
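
These IDs are what the usage snippets rely on (the `[[1] + ...]` prefix in the examples is the BOS token). A quick check, with the JSON inlined for illustration:

```python
import json

# tokenizer/config.json, inlined verbatim
config = json.loads("""
{
  "vocab_size": 8192,
  "special_tokens": {
    "pad_token": "<|pad|>",
    "bos_token": "<|bos|>",
    "eos_token": "<|eos|>",
    "unk_token": "<|unk|>"
  },
  "pad_id": 0,
  "bos_id": 1,
  "eos_id": 2,
  "unk_id": 3
}
""")

print(config["bos_id"])  # → 1
```
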
tokenizer/tokenizer.json ADDED
The diff for this file is too large to render. See raw diff