RthItalia committed on
Commit 7bb177f · verified · 1 Parent(s): 75ee6dd

Update README.md

  1. README.md +140 -67
README.md CHANGED
@@ -1,107 +1,180 @@
1
  ---
2
- language: en
3
  license: cc-by-nc-4.0
4
  tags:
5
- - zetagrid
6
- - cpu-da
 
7
  - tcn
8
  - fractal
9
- - 25b
10
- datasets:
11
- - custom
12
- metrics:
13
- - loss
14
  ---
15
 
16
- # 📇 Model Card: RTH-LM (25B)
17
 
18
- ## Model Details
19
- - **Name:** RTH-LM (25B)
20
- - **Architecture:** Fractal Gated Causal TCN (Temporal Convolutional Network)
21
- - **Parameters:** 7B (Physical) / 25B (Effective Fractal Capacity)
22
- - **Author:** Christian Quintino De Luca (RTH Italia)
23
- - **Release Date:** February 2026
24
- - **License:** CC BY-NC 4.0 (Research) / Commercial (Enterprise)
25
- - **Paper (Figshare):** https://doi.org/10.6084/m9.figshare.31376560
26
 
27
- RTH-LM (25B) is a **Fractal TCN (Temporal Convolutional Network)** Language Model, designed for high-efficiency inference on CPU/Consumer Hardware and massive scalability on GPUs.
28
 
29
- Unlike Traditional Transformers, ZetaGrid uses a **Gated Causal TCN backbone** with **Fractal Scaling**, allowing it to model long-range dependencies with significantly lower memory overhead during inference.
30
 
31
- ---
32
 
33
- ## 📊 Model Specs
34
 
35
- | Feature | Specification |
36
- | :--- | :--- |
37
- | **Parameters** | 25 Billion (25B) |
38
- | **Architecture** | Fractal Gated TCN (Non-Transformer) |
39
- | **Layers** | 32 (Phase 2) |
40
- | **Context Window** | 256 - 1024 (Fractal Expansion Capable) |
41
- | **Training Data** | 1.48 GB Cleaned Text (Wiki/Books) |
42
- | **Final Loss** | **1.0675** (Phase 2) |
43
- | **Quantization** | QULP 2-bit (Supported) |
44
 
45
- ---
46
 
47
- ## 🚀 Usage (Inference)
48
 
49
  ### Prerequisites
50
- Use the ZetaGrid reference repository and download the model artifacts from this Hugging Face repository.
 
51
 
52
  ```bash
53
- # Clone the repo
54
  git clone https://github.com/rthgit/ZetaGrid
55
  cd ZetaGrid
56
  ```
57
 
58
- ### Running the Model (Python)
59
- Place the required artifacts in the working directory, or update the paths in the script:
60
 
61
- - `zeta25b_step15000.pt` - Soul/checkpoint weights
62
- - `zetagrid_25b_production.npy` - Genome weight bank
 
 
63
 
64
- ```python
65
- import torch
66
- from ZETAGRID_INFERENCE import load_model, generate
67
 
68
- # Load 25B Model
69
- model = load_model("zeta25b_step15000.pt", genome="zetagrid_25b_production.npy")
70
 
71
- # Generate
72
- text = generate(model, "The future of AI is")
73
- print(text)
74
  ```
75
 
76
- ### QULP 2-bit Inference (Ultra-Low Memory)
77
- If using the QULP artifact, download `zeta25b_2bit.qulp` from the model repository and run the matching local inference script when available:
78
 
79
  ```bash
80
- python QULP_INFERENCE.py --model zeta25b_2bit.qulp
81
  ```
82
 
83
- ---
84
 
85
- ## 🧬 Architecture: The "Fractal Soul"
86
 
87
- ZetaGrid is **NOT** a Transformer. It is a TCN-based organism.
88
- - **Genome:** A fixed 7GB "DNA" bank of weights (`zetagrid_25b_production.npy`).
89
- - **Phenotype:** The model layers are "grown" from this genome on the fly.
90
- - **Training:** Only the "Soul" (LoRA Adapters + Norms) is trained (~300MB), making the model extremely portable.
91
- - **Fractal Scaling:** The 25B model can be fractally expanded to 50B, 100B+ by duplicating layers and adding self-linear noise.
92
 
93
- ---
94
 
95
- ## 📈 Performance
96
 
97
- - **Phase 1 (Evolution):** 200 Generations of Genome Optimization.
98
- - **Phase 2 (Gradient):** 15,000 Steps of TCN+LoRA Fine-Tuning.
99
- - **Convergence:** Beat target loss of 1.5, achieving **1.0675**.
100
- - **Capabilities:** Narrative coherence, English syntax mastery, abstract reasoning.
101
 
102
- ---
103
 
104
- ## 📜 License
105
- CC BY-NC 4.0 (Creative Commons Non-Commercial) for Research.
106
- **Commercial Use:** Requires a license from **RTH Italia**.
107
- For inquiries: info@rthitalia.com
 
1
  ---
 
2
  license: cc-by-nc-4.0
3
+ language:
4
+ - en
5
+ - it
6
+ - py
7
+ - js
8
+ - cpp
9
  tags:
10
+ - text-generation
11
+ - code-generation
12
+ - non-transformer
13
  - tcn
14
  - fractal
15
+ - lora
16
+ - genome
17
+ - rth-code
18
+ - zetagrid
19
+ pipeline_tag: text-generation
20
  ---
21
 
22
+ # RTH-Code 25B
23
 
24
+ RTH-Code 25B is an experimental code-specialist Soul for the RTH-LM / ZetaGrid architecture.
25
 
26
+ It is not a standalone Transformer model. It is part of the RTH-LM Genome/Soul system: a shared frozen Genome provides the reusable parameter substrate, while a smaller trainable Soul carries task specialization.
27
 
28
+ ## Status
29
 
30
+ This is an early proof-of-concept research release. It is intended for architecture evaluation, local experimentation, and reproducibility work around non-Transformer language models.
31
 
32
+ Do not treat this release as a production coding assistant or as evidence of parity with frontier code models. The current release should be evaluated with fixed prompts, held-out code tasks, and reproducible benchmark harnesses before downstream use.
33
 
34
+ ## Model Details
35
 
36
+ | Field | Value |
37
+ | --- | --- |
38
+ | Model name | RTH-Code 25B |
39
+ | Organization | RTH Italia |
40
+ | Author | Christian Quintino De Luca |
41
+ | Architecture | Fractal Gated Causal TCN (non-Transformer) |
42
+ | System design | Frozen Genome + trainable Soul adapters |
43
+ | Effective capacity | 25B class, via fractal capacity framing |
44
+ | Specialization | Code generation / code completion experiments |
45
+ | Training data | Mixed code corpus, including Python, JavaScript/TypeScript, C/C++, Rust, and Go |
46
+ | Training hardware | Single run on an NVIDIA A40-class GPU |
47
+ | License | CC BY-NC 4.0 for research/non-commercial use; commercial license required |
48
+ | Paper | https://doi.org/10.6084/m9.figshare.31376560 |
49
+
50
+ ## Intended Use
51
+
52
+ This release is intended for:
53
+
54
+ - Research on non-attention language-model architectures.
55
+ - Local experiments with the RTH-LM Genome/Soul design.
56
+ - Code-generation prompt tests under controlled evaluation settings.
57
+ - Comparison against Transformer and state-space baselines.
58
+ - Reproducibility work around quantization and low-memory inference paths.
59
+
60
+ This release is not intended for:
61
+
62
+ - Production software development without independent validation.
63
+ - Security-critical code generation.
64
+ - Commercial products, paid APIs, or enterprise internal use without a commercial license.
65
+ - Claims of benchmark superiority without published, reproducible benchmark evidence.
66
+
67
+ ## Architecture Summary
68
+
69
+ RTH-Code 25B uses the same high-level ZetaGrid design as RTH-LM:
70
+
71
+ - A Fractal Gated Causal Temporal Convolutional Network backbone.
72
+ - No standard self-attention block.
73
+ - A frozen Genome weight bank reused across model variants.
74
+ - Trainable low-rank Soul adapters for specialization.
75
+ - Optional QULP-style quantization path for low-memory experiments.
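
For readers unfamiliar with the term, the sketch below shows what a gated causal convolution is in general. It is a generic WaveNet-style layer in plain Python, not the ZetaGrid implementation: the function names, kernel values, and dilations are illustrative assumptions, and the "fractal" part of the design is not modelled here.

```python
import math


def causal_conv(x, kernel, dilation=1):
    """1-D causal convolution: output[t] only sees x[t], x[t-d], x[t-2d], ..."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # positions before the sequence start act as zero padding
                acc += w * x[idx]
        out.append(acc)
    return out


def gated_causal_conv(x, filter_kernel, gate_kernel, dilation=1):
    """Gated activation: tanh(filter branch) * sigmoid(gate branch)."""
    f = causal_conv(x, filter_kernel, dilation)
    g = causal_conv(x, gate_kernel, dilation)
    return [math.tanh(a) * (1.0 / (1.0 + math.exp(-b))) for a, b in zip(f, g)]


def receptive_field(kernel_size, dilations):
    """Number of input positions visible at the top of a dilated stack."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Illustrative values only (assumption): a toy signal and two 2-tap kernels.
signal = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5]
y = gated_causal_conv(signal, [0.6, 0.3], [1.0, -0.5], dilation=2)
print(len(y), receptive_field(2, [1, 2, 4, 8]))
```

Because each output depends only on current and earlier inputs, changing a later element of `signal` leaves all earlier outputs unchanged, which is the property that makes such a layer usable for left-to-right generation.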
76
+
77
+ The research hypothesis is that domain behavior can be changed by swapping the Soul while keeping the Genome stable. RTH-Code is the code-specialist demonstration of that idea.
78
+
79
+ ```mermaid
80
+ graph TD
81
+ G["Frozen Genome<br/>shared parameter substrate"]
82
+ L["Language Soul<br/>general text behavior"]
83
+ C["Code Soul<br/>code-specialist behavior"]
84
+ G --> L
85
+ G --> C
86
+ ```
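
One way to picture the Genome/Soul split is a standard LoRA-style low-rank update, where the effective weight is the frozen matrix plus a small trainable product. The sketch below assumes that composition rule; the README does not state how ZetaGrid combines the two, and the matrices, `apply_soul`, and `scale` are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_soul(genome_w, soul_b, soul_a, scale=1.0):
    """Return genome_w + scale * (soul_b @ soul_a) without mutating genome_w."""
    delta = matmul(soul_b, soul_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(genome_w, delta)
    ]


# Hypothetical 3x3 frozen Genome block shared by both Souls.
genome = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Rank-1 adapters: B is 3x1 and A is 1x3, so each Soul stores 6 numbers, not 9.
language_soul = ([[0.1], [0.0], [0.0]], [[1.0, 2.0, 0.0]])
code_soul = ([[0.0], [0.5], [0.0]], [[0.0, 1.0, 1.0]])

w_language = apply_soul(genome, *language_soul)
w_code = apply_soul(genome, *code_soul)
print(w_language[0], w_code[1])
```

Swapping Souls here means calling `apply_soul` with a different adapter pair while `genome` stays untouched, which mirrors the hypothesis that specialization can live entirely in the smaller component.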
87
+
88
+ ## Files
89
+
90
+ Typical artifacts for this release may include:
91
 
92
+ | File | Role |
93
+ | --- | --- |
94
+ | `rth_lm_25b_code.gguf` | Unified GGUF artifact for local runtime experiments |
95
+ | `zeta25b_code_FINAL.pt` | Code-specialist Soul checkpoint |
96
+ | `zetagrid_25b_production.npy` | Shared Genome weight bank |
97
+ | `config.json` | Architecture metadata |
98
+ | `ZETAGRID_INFERENCE.py` | Reference Python inference script |
99
+
100
+ File availability may differ by release channel. Large artifacts are hosted on Hugging Face rather than in the GitHub source repository.
101
+
102
+ ## Quickstart
103
 
104
  ### Prerequisites
105
+
106
+ Use the ZetaGrid reference repository and download the Code artifacts from this Hugging Face repository.
107
 
108
  ```bash
 
109
  git clone https://github.com/rthgit/ZetaGrid
110
  cd ZetaGrid
111
  ```
112
 
113
+ For the Code release, the relevant artifacts are:
 
114
 
115
+ - `zeta25b_code_FINAL.pt` - Code-specialist Soul/checkpoint
116
+ - `zetagrid_25b_production.npy` - shared Genome weight bank
117
+ - `rth_lm_25b_code.gguf` - unified Code GGUF artifact, when using a compatible runtime
118
+ - `config.json` - architecture metadata
119
 
120
+ ### Python reference path
 
 
121
 
122
+ Place `zeta25b_code_FINAL.pt` and `zetagrid_25b_production.npy` in the ZetaGrid working directory, then use the local reference inference script as the starting point:
 
123
 
124
+ ```bash
125
+ python ZETAGRID_INFERENCE.py
 
126
  ```
127
 
128
+ The current Python script is research-oriented. Check the checkpoint selection/path before running and point it explicitly to `zeta25b_code_FINAL.pt` for the Code Soul.
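
A small pre-flight check can catch a missing or misplaced checkpoint before the reference script is launched. The sketch below uses only the two file names listed above; the helper `missing_artifacts` is hypothetical and is not part of the ZetaGrid repository.

```python
from pathlib import Path

# File names as listed in this README; the helper itself is hypothetical.
REQUIRED_ARTIFACTS = ("zeta25b_code_FINAL.pt", "zetagrid_25b_production.npy")


def missing_artifacts(workdir="."):
    """Return the required artifact names that are not present in workdir."""
    root = Path(workdir)
    return [name for name in REQUIRED_ARTIFACTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        print("Missing artifacts:", ", ".join(missing))
    else:
        print("All artifacts found; run ZETAGRID_INFERENCE.py next.")
```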
129
+
130
+ ### GGUF path
131
+
132
+ If a compatible runtime build is available for the RTH TCN operators:
133
 
134
  ```bash
135
+ ./llama-cli -m rth_lm_25b_code.gguf -p "def fibonacci(n):" -n 200
136
  ```
137
 
138
+ Compatibility depends on runtime support for the custom RTH TCN architecture. Standard Transformer-only GGUF runners may not execute this architecture without additional kernels.
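
Before trying a runtime, it can help to confirm that the downloaded file is at least a well-formed GGUF container. The sketch below reads only the fixed-size header and assumes GGUF version 2 or later, where both counts are 64-bit; `read_gguf_header` is a hypothetical helper. It does not parse the metadata key `general.architecture`, so it cannot tell whether a given runtime supports the RTH TCN operators.

```python
import struct


def read_gguf_header(path):
    """Read the fixed-size GGUF header: magic, version, tensor and metadata counts."""
    with open(path, "rb") as fh:
        raw = fh.read(24)
    if len(raw) < 24 or raw[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")
    # Little-endian: uint32 version, uint64 tensor count, uint64 metadata KV count.
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw[4:24])
    return {"version": version, "tensors": tensor_count, "metadata_kv": kv_count}
```

For example, `read_gguf_header("rth_lm_25b_code.gguf")` would return the three header fields, or raise `ValueError` if the download is truncated or is not a GGUF file.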
139
 
140
+ ## Evaluation Notes
141
 
142
+ The strongest current evidence for this release is architectural and training-process evidence, not broad benchmark coverage. Before citing capability claims, run:
143
 
144
+ - Deterministic code-completion prompts.
145
+ - HumanEval or MBPP-style tasks, with exact pass@k settings.
146
+ - Syntax-validity checks.
147
+ - Repetition and invalid-token checks.
148
+ - Comparisons against small open code models under the same decoding settings.
149
 
150
+ Published benchmark results should include prompts, decoding parameters, commit hash, artifact hashes, and hardware.
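
The fields listed above can be captured in a single manifest file stored next to the results. The sketch below is one possible layout, not a format defined by the project: `build_manifest` and its keys are assumptions, and the hardware field only records the host CPU, so GPU details would need to be added by hand.

```python
import hashlib
import platform
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large artifacts are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(artifacts, commit, decoding, prompts):
    """Collect the reproducibility fields in one JSON-serializable dict."""
    return {
        "commit": commit,
        "decoding": decoding,
        "prompts": prompts,
        "hardware": platform.processor() or platform.machine(),
        "artifact_sha256": {Path(p).name: sha256_of(p) for p in artifacts},
    }
```

The returned dict can be written with `json.dump` alongside the benchmark output so that a later run can verify it used the same artifacts and decoding settings.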
151
 
152
+ ## Limitations
153
 
154
+ - Early proof-of-concept model.
155
+ - Not instruction-tuned to the level of mainstream coding assistants.
156
+ - Quality may vary strongly with decoding settings.
157
+ - Runtime support for custom non-Transformer GGUF artifacts may require patched kernels.
158
+ - Public claims should distinguish training loss, memory estimates, and actual task performance.
159
 
160
+ ## License and Commercial Use
161
+
162
+ RTH-Code 25B is released under CC BY-NC 4.0 for research and non-commercial use.
163
+
164
+ Commercial use requires a separate license from RTH Italia. Commercial use includes paid products, hosted APIs, enterprise internal development, integration into commercial developer tools, and any revenue-generating deployment.
165
+
166
+ Contact: info@rthitalia.com
167
+
168
+ ## Citation
169
+
170
+ ```bibtex
171
+ @techreport{deluca2026rthlm,
172
+ author = {De Luca, Christian Quintino},
173
+ title = {RTH-LM: A Fractal Temporal Convolutional Language Model},
174
+ institution = {RTH Italia (Research & Technology Hub)},
175
+ year = {2026},
176
+ url = {https://github.com/rthgit/ZetaGrid},
177
+ doi = {10.6084/m9.figshare.31376560},
178
+ note = {Non-commercial license. Contact RTH Italia for commercial use.}
179
+ }
180
+ ```