---
tags:
- mamba
- recursive-flow
- pytorch
- custom-architecture
---
# Recursive-Flow Mamba-2 (1.5B)
This is an experimental language model trained on an NVIDIA H100 using a custom Recursive-Flow Mamba-2 architecture.
## Architecture Details
- Base: Mamba-2 (State Space Model)
- Parameters: ~1.5 Billion
- Physical Layers: 24
- Recursive Depth: 3 loops per layer (effective depth: 24 × 3 = 72)
- Training Data: OpenMathInstruct-2 (Math Logic Focus)
## How to Run
This model requires custom code to handle the recursive loops; it will not load correctly through a stock model class. See the `chat.py` script used during training for an example of loading the weights.
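The recursive-loop idea can be sketched in plain PyTorch: each physical block re-applies its own (shared) weights several times per forward pass, so 24 physical layers at 3 loops each yield an effective depth of 72. This is a minimal illustration, not the actual model code; the class name `RecursiveFlowBlock`, the residual/pre-norm layout, and the MLP standing in for the Mamba-2 state-space mixer are all assumptions.

```python
import torch
import torch.nn as nn

N_LOOPS = 3  # recursive depth per physical layer, per the model card


class RecursiveFlowBlock(nn.Module):
    """Hypothetical sketch: one physical layer applied N_LOOPS times
    with tied weights (pre-norm residual form)."""

    def __init__(self, d_model: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        # Placeholder for the Mamba-2 state-space mixer.
        self.mixer = nn.Sequential(
            nn.Linear(d_model, 2 * d_model),
            nn.SiLU(),
            nn.Linear(2 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Re-run the same weights N_LOOPS times: extra compute/depth
        # without extra parameters.
        for _ in range(N_LOOPS):
            x = x + self.mixer(self.norm(x))
        return x


# 24 physical blocks -> 24 * 3 = 72 effective layers.
model = nn.Sequential(*(RecursiveFlowBlock(64) for _ in range(24)))
x = torch.randn(2, 16, 64)  # (batch, sequence, d_model)
y = model(x)
print(tuple(y.shape))  # (2, 16, 64)
```

A custom loader (like the referenced `chat.py`) is needed precisely because a generic loader instantiates each layer once and would ignore the per-layer loop count.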