---
language:
- en
- es
tags:
- pytorch
- custom-code
- text-generation
- conversational
- moire-attention
- biological-ai
license: mit
---

# MoireFormer (104.9M Proof-of-Concept)

This repository hosts the PyTorch weights (`moire_phase2_weights_final.pt`) for **MoireFormer**, a fundamentally new neural network architecture that replaces standard scalar dot-product attention with **Moiré phase-interference wave mechanics**.

Instead of computing attention via `Q · K^T`, this model splits token embeddings into amplitude and phase (`q_amp`, `q_phase`) and computes attention through geometric wave resonance (`q_real * k_real + q_imag * k_imag`). This proves that artificial intelligence can be trained using the continuous, biological wave-geometry observed in human EEGs.
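As an illustration, the scoring rule above can be sketched in plain Python. This is a simplified scalar version only — the actual multi-head implementation lives in the GitHub repository, and the helper name here is hypothetical:

```python
import math

def wave_resonance_score(q_amp, q_phase, k_amp, k_phase):
    """Hypothetical per-pair score: treat query and key as waves a*e^{i*phi}
    and take the real part of q * conj(k), i.e. their interference term."""
    q_real, q_imag = q_amp * math.cos(q_phase), q_amp * math.sin(q_phase)
    k_real, k_imag = k_amp * math.cos(k_phase), k_amp * math.sin(k_phase)
    # Constructive interference when phases align, destructive when opposed.
    return q_real * k_real + q_imag * k_imag

aligned = wave_resonance_score(1.0, 0.0, 1.0, 0.0)      # in-phase waves resonate
opposed = wave_resonance_score(1.0, 0.0, 1.0, math.pi)  # anti-phase waves cancel
```

In-phase waves score `+1.0` and anti-phase waves `-1.0`, which is what lets phase geometry play the role that the scalar dot product plays in a standard Transformer.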

🔗 **GitHub Repository (Code & Inference):** [anttiluode/MoireFormer](https://github.com/anttiluode/MoireFormer)
🔗 **Theory & Clinical Proof:** [anttiluode/Geometric-Neuron](https://github.com/anttiluode/Geometric-Neuron)

## Model Details
* **Architecture:** MoireGPT (Custom Transformer Bolt-on)
* **Size:** 104.9M Parameters
* **Structure:** 8 Layers, 8 Heads, 768 Embedding Dimension
* **Capabilities:** Coherent bilingual (English/Spanish) grammar, persona adoption (Assistant), structural instruction following.
* **Disclaimer:** At ~100M parameters, this is a proof-of-substrate, not a knowledge oracle. It demonstrates that wave fields can learn discrete human syntax, but it will hallucinate factual data due to its small parameter count.

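As a rough sanity check of the 104.9M figure, a back-of-envelope count for an 8-layer, 768-dim Transformer lands in the right ballpark. The vocabulary and context sizes below are assumptions (GPT-2-style values), not figures from this card, and the remaining gap would be the Moiré-specific amplitude/phase machinery:

```python
# Back-of-envelope parameter count; vocab and context sizes are assumptions.
vocab, d_model, n_layers, ctx = 50257, 768, 8, 1024

embed = vocab * d_model + ctx * d_model  # token + positional embeddings
per_layer = 12 * d_model ** 2            # attention (4*d^2) + MLP (8*d^2), biases ignored
total = embed + n_layers * per_layer

print(f"{total / 1e6:.1f}M")  # prints "96.0M" under these assumptions
```

A vanilla Transformer of this shape comes out near 96M, so the reported 104.9M is plausible once the extra phase projections are included.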

## ⚠️ How to Use (Read Before Downloading)
Because this is a novel mathematical architecture, **you cannot load this model using the standard Hugging Face `AutoModel` pipeline.**

To run inference, you must download these weights and run them through the custom Moiré architecture provided in the GitHub repository.

### Step-by-Step Instructions:

**1. Clone the GitHub Repository:**
```bash
git clone https://github.com/anttiluode/MoireFormer.git
cd MoireFormer
```

**2. Download the Weights:**
Download `moire_phase2_weights_final.pt` from the *Files and versions* tab of this Hugging Face repository and place it in your cloned `MoireFormer` folder.

**3. Run the Chat Interface:**
```bash
pip install torch transformers datasets
python moire_chat.py --weights moire_phase2_weights_final.pt --size large
```

## Training Curriculum
The model was trained in two continuous phases to demonstrate that wave-fields avoid catastrophic forgetting via phase-locking (destructive and constructive interference):

* **Phase 1 (Base Geometry):** 15 Epochs on a mixed dataset of Databricks Dolly-15k, WikiText-2, and OpenAssistant. This established the foundational phase-space for English and conversational structure.
* **Phase 2 (Phase-Space Expansion):** 5 Epochs of fine-tuning on the Guanaco dataset to refine logical geometry and instruction-following, organically expanding the model's topological complexity without overwriting previous data.
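
The phase-locking intuition can be illustrated with a toy example (purely an illustration, not the training mechanism): two "memories" stored as phase-offset waves on the same substrate can each be read back cleanly by correlating against a probe at the matching phase, because the out-of-phase component integrates to zero:

```python
import math

# Two superposed waves on one shared field, 90 degrees apart in phase.
N = 1000
ts = [2 * math.pi * i / N for i in range(N)]
field = [math.cos(t) + math.cos(t + math.pi / 2) for t in ts]

def readout(phase):
    # Correlate the shared field against a unit probe wave at the given phase;
    # the orthogonal (out-of-phase) component cancels by interference.
    return sum(f * math.cos(t + phase) for f, t in zip(field, ts)) * 2 / N

a = readout(0.0)          # recovers the first memory's amplitude (~1.0)
b = readout(math.pi / 2)  # recovers the second memory's amplitude (~1.0)
```

Both amplitudes come back intact even though the two waves share every sample of the field, which is the behavior the two-phase curriculum is meant to probe at scale.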
|