muooon
/

DRNA

@@ -1,147 +0,0 @@
----
-license: apache-2.0
-language:
-- en
-- ja
-tags:
-- machine-learning
-- deep-learning
-- transformer
-- architecture-design
-- adaptive-algorithms
-- resonant-contraction
-- resonant-projection-field
----
-# D‑RNA: Dual‑Helix Resonance Neural Architecture (DRNA)
-DRNA is a new neural architecture centered on a dual helix structure and a rotation field produced by RoPE (rotary position embedding).
-In this architecture, Attention and MLP are synchronized into a dual helix, and information is holographically compressed through Resonant Contraction.
-This method rearranges sparse representations into dense ones to achieve high expressiveness using the depth‑direction structure alone, without increasing the number of dimensions.
-A key feature of this approach is its ability to preserve the full connectivity of the Transformer architecture while suppressing catastrophic forgetting and retaining subtle fluctuations and phase information.
----
-### Features
-- Fully compatible with Transformers; existing weights can be reused without modification.
-- Resonant Contraction (`a + m + a*m`, where `a` is the Attention output and `m` is the MLP output) increases representation density.
-- The Resonant Projection Field induces continuous‑depth (ODE‑like) behavior.
-- No additional parameters are required, and computational overhead remains minimal.
-- Can be used as a drop‑in replacement for standard Transformer blocks.
-- Tends to converge earlier during training, reaching stable performance in fewer steps than a Transformer.
-### Notes
-- While DRNA tends to converge earlier during training, a learning rate (LR) that is too high may cause oscillation.
-- It works with the same hyperparameter settings as a Transformer, but for greater stability we recommend using a slightly lower LR.
-- This behavior occurs because Resonant Contraction synchronizes the gradients of Attention and MLP, making updates stronger.
-- Other hyperparameters can remain almost identical to those used for a standard Transformer.
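The gradient remark in the notes can be made concrete with a short derivation. This is a sketch that assumes the contraction is applied elementwise, as `a * m` in the code suggests, with $\odot$ denoting the elementwise product:

$$
h = a + m + a \odot m = (1 + a) \odot (1 + m) - 1
$$

$$
\frac{\partial h}{\partial a} = 1 + m, \qquad \frac{\partial h}{\partial m} = 1 + a \quad \text{(elementwise)}
$$

Under this reading, each branch's gradient is scaled by the other branch's output, whereas a plain sum `a + m` passes a gradient of 1 to both branches. Wherever `|1 + m|` or `|1 + a|` exceeds 1 the update is amplified, which is consistent with the recommendation to use a slightly lower LR.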
----
-```
-－ Conceptual Diagram －
-RoPE Rotation Field (Phase-Preserving)
-Holographic Compression: Turning Sparse into Dense
-A     M
- \   /
-  \ /    ← This is Resonance
-  / \      Synchronization occurs naturally through the seed
- /   \     Naturally, meaning emerges through a chain of synchronicities
-A     M
-Repeats in the depth direction to form a dual helix
-(acts as a substitute for increasing dimensionality)
-```
----
-### Minimal Block
-```python
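-import torch.nn as nn
-# Attention, MLP, and apply_rope are assumed to be defined elsewhere;
-# each must map a (batch, seq, dim) tensor to a tensor of the same shape.
-class DRNABlock(nn.Module):
-    def __init__(self, dim):
-        super().__init__()
-        self.attn = Attention(dim)
-        self.mlp  = MLP(dim)
-    def forward(self, x):
-        # Synchronization of the dual helix: both branches read the same input
-        a = self.attn(x)
-        m = self.mlp(x)
-        # Resonant Contraction
-        h = a + m + (a * m)
-        # RoPE
-        h = apply_rope(h)
-        return h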
-```
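The minimal block uses `Attention`, `MLP`, and `apply_rope` without defining them. The sketch below fills them in so the block can be run on its own. All three definitions are assumptions made for illustration, not the author's implementation: `Attention` wraps `nn.MultiheadAttention`, `MLP` is the usual 4x GELU feed-forward, and `apply_rope` is a rotate-half rotary embedding applied to the block output (RoPE is more commonly applied to queries and keys inside attention). `DRNABlock` is repeated here only so the example is self-contained.

```python
import math

import torch
import torch.nn as nn


class Attention(nn.Module):
    """Hypothetical self-attention wrapper (not defined in the README)."""

    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.mha = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):
        out, _ = self.mha(x, x, x, need_weights=False)
        return out


class MLP(nn.Module):
    """Hypothetical feed-forward branch with a 4x hidden expansion."""

    def __init__(self, dim, hidden_mult=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim * hidden_mult),
            nn.GELU(),
            nn.Linear(dim * hidden_mult, dim),
        )

    def forward(self, x):
        return self.net(x)


def apply_rope(h, base=10000.0):
    """Rotate feature pairs (i, i + dim/2) by a position-dependent angle.

    Assumes h has shape (batch, seq, dim) with an even dim.
    """
    _, seq_len, dim = h.shape
    half = dim // 2
    idx = torch.arange(half, dtype=h.dtype, device=h.device)
    inv_freq = torch.exp(-math.log(base) * idx / half)   # base^(-2i/dim)
    pos = torch.arange(seq_len, dtype=h.dtype, device=h.device)
    angles = torch.outer(pos, inv_freq)                  # (seq, half)
    cos, sin = angles.cos(), angles.sin()
    h1, h2 = h[..., :half], h[..., half:]
    return torch.cat([h1 * cos - h2 * sin, h1 * sin + h2 * cos], dim=-1)


class DRNABlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.attn = Attention(dim)
        self.mlp = MLP(dim)

    def forward(self, x):
        a = self.attn(x)
        m = self.mlp(x)
        h = a + m + (a * m)   # Resonant Contraction
        return apply_rope(h)  # parameter-free rotation
```

Because the rotation acts on pairs of features, it leaves each token's norm unchanged and leaves position 0 untouched, which makes it easy to sanity-check.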
----
-### Example: Replacing a Transformer block with a DRNA block
-```python
-import torch.nn as nn
-# Baseline: attention and MLP run in parallel and are summed with the residual.
-class TransformerBlock(nn.Module):
-    def __init__(self, dim):
-        super().__init__()
-        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
-        self.mlp  = nn.Sequential(
-            nn.Linear(dim, dim * 4),
-            nn.GELU(),
-            nn.Linear(dim * 4, dim),
-        )
-    def forward(self, x):
-        a, _ = self.attn(x, x, x)  # self-attention; attention weights are discarded
-        m = self.mlp(x)
-        return x + a + m
-# Thin wrapper around DRNABlock (see Minimal Block) with the same interface.
-class DRNABasedBlock(nn.Module):
-    def __init__(self, dim):
-        super().__init__()
-        self.block = DRNABlock(dim)
-    def forward(self, x):
-        return self.block(x)
-```
-### Simply replace the existing Transformer block with a DRNA block
-```python
-import torch
-x = torch.randn(1, 128, 512)  # (batch, seq, dim)
-block = DRNABasedBlock(dim=512)
-y = block(x)
-print(y.shape)  # => torch.Size([1, 128, 512])
-```
-### Key Points
-- Same input/output shape as a standard Transformer block
-- Weight shapes are identical, so existing model weights can be reused as‑is
-- Works as a drop‑in replacement
-- No additional parameters
-- Only the synchronized Attention–MLP interaction (Resonant Contraction) is added
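The points about identical weight shapes and no additional parameters can be checked mechanically. The sketch below is an illustration under assumptions, not the author's code: `StandardBlock` mirrors the `TransformerBlock` example, and the hypothetical `ResonantBlock` keeps the same submodules and changes only how the two branch outputs are combined (RoPE is left out because it has no parameters). It shows that a state dict loads without changes; it does not show that the reused weights behave equivalently, since the two forward passes compute different functions.

```python
import torch
import torch.nn as nn


class StandardBlock(nn.Module):
    """Parallel attention + MLP block, mirroring TransformerBlock above."""

    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * 4),
            nn.GELU(),
            nn.Linear(dim * 4, dim),
        )

    def forward(self, x):
        a, _ = self.attn(x, x, x, need_weights=False)
        m = self.mlp(x)
        return x + a + m


class ResonantBlock(StandardBlock):
    """Hypothetical variant: same submodules, different combination rule."""

    def forward(self, x):
        a, _ = self.attn(x, x, x, need_weights=False)
        m = self.mlp(x)
        return a + m + a * m  # Resonant Contraction


def count_params(module):
    return sum(p.numel() for p in module.parameters())


std = StandardBlock(dim=64)
res = ResonantBlock(dim=64)

# strict=True (the default) raises if any key or tensor shape differs.
result = res.load_state_dict(std.state_dict())

print(count_params(std) == count_params(res))
print(result.missing_keys, result.unexpected_keys)
```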
----
-BPC (bits per character) comparison chart:
-<img width="800" alt="bpc_only" src="bpc_only.png" />
----
-License:
-This project is licensed under the Apache License 2.0. See the LICENSE file for details.
-#### Acknowledgments:
-This work builds upon the foundation established by the Transformer architecture.
-I would like to express my gratitude to the researchers and open-source communities
-whose contributions to attention mechanisms, positional encoding, and large-scale
-model design made this work possible.