---
title: MoireFormer Chat
emoji: 🌊
colorFrom: blue
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: false
license: mit
---
# MoireFormer (104.9M Proof-of-Concept)

This repository hosts the PyTorch weights **moire_phase2_weights_final.pt** for MoireFormer, a neural network architecture that replaces standard dot-product attention with **Moiré phase-interference wave mechanics**.

Instead of computing attention via Q · K^T, the model splits each token embedding into an amplitude component and a phase component and computes attention through geometric wave resonance.
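The idea can be illustrated with a toy interference score. This is a minimal sketch of the general concept only, not the repository's actual implementation: the amplitude/phase split, scaling, and normalisation below are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 4, 8                       # sequence length, per-head dimension

# Split each token embedding into an amplitude half and a phase half.
# (Assumption: MoireFormer's real split/normalisation may differ.)
x = rng.normal(size=(T, 2 * d))
amp, phase = np.abs(x[:, :d]), np.pi * np.tanh(x[:, d:])

# Represent each token as a complex wave and score token pairs by
# interference: aligned phases reinforce, opposed phases cancel.
z = amp * np.exp(1j * phase)
scores = (z @ z.conj().T).real / d

# Normalise over keys with a softmax, as in standard attention.
weights = np.exp(scores - scores.max(-1, keepdims=True))
weights /= weights.sum(-1, keepdims=True)
```

Each row of `weights` is a probability distribution over keys, exactly as in dot-product attention; only the scoring rule has changed.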
GitHub code: https://github.com/anttiluode/MoireFormer

Theory: https://github.com/anttiluode/Geometric-Neuron

---
## Model Details

**Architecture:** MoireGPT (custom transformer)

**Parameters:** 104.9M

**Structure:**

- 8 layers
- 8 heads
- 768 embedding dimension

**Capabilities:**

- English / Spanish syntax
- conversational structure
- instruction following

**Note:** This is a **proof-of-substrate model**, not a factual knowledge model.
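As a rough cross-check of the 104.9M figure: assuming a vanilla GPT-style block (4·d² attention weights plus 8·d² MLP weights per layer), a GPT-2-sized vocabulary of about 50k, and a 1024-token context (none of which this card confirms), a back-of-envelope count lands within about 10% of the reported size; the remainder is plausibly the extra amplitude/phase machinery.

```python
d, layers = 768, 8
vocab, ctx = 50257, 1024           # assumptions: GPT-2-style tokenizer/context

embeddings = vocab * d + ctx * d   # token + positional embedding tables
per_block = 4 * d * d + 8 * d * d  # attention (QKV + output proj) + 4x-wide MLP
total = embeddings + layers * per_block

print(f"{total / 1e6:.1f}M")       # ≈ 96.0M vs the reported 104.9M
```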
---
## How To Run

This model cannot be loaded with `AutoModel`; it must be run through the custom architecture in the GitHub repo.

### 1. Clone the repo

```bash
git clone https://github.com/anttiluode/MoireFormer.git
cd MoireFormer
```

### 2. Install dependencies

```bash
pip install torch transformers datasets
```

### 3. Download the weights

Download https://huggingface.co/Aluode/MoireFormer/blob/main/moire_phase2_weights_final.pt and place the file inside the repo folder.

### 4. Run the chat interface

```bash
python moire_chat.py --weights moire_phase2_weights_final.pt --size large
```
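If the script fails to find or load the file, the standard PyTorch loading pattern is worth checking by hand. The snippet below uses a stand-in state dict, since the real checkpoint's key layout is not documented here; substitute `moire_phase2_weights_final.pt` for the demo path.

```python
import torch

# Stand-in for the real checkpoint; the actual key layout of
# moire_phase2_weights_final.pt is an assumption, not verified.
torch.save({"tok_emb.weight": torch.zeros(8, 4)}, "demo_weights.pt")

# Load on CPU so the inspection works without a GPU.
state = torch.load("demo_weights.pt", map_location="cpu")
for name, tensor in state.items():
    print(name, tuple(tensor.shape))
```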
---
## Training Curriculum

**Phase 1:** 15 epochs on Dolly-15k, WikiText-2, and OpenAssistant.

**Phase 2:** 5 epochs on the Guanaco dataset.

The experiment demonstrates that **wave-field attention can learn discrete language syntax via phase geometry**.

---
## Disclaimer

This is an experimental architecture exploring biological wave-field computation in neural networks. At 100M parameters it will hallucinate factual information.