---
tags:
  - ssm
  - mamba
  - meta-learning
  - few-shot
  - pytorch
  - experimental
license: mit
library_name: transformers
pipeline_tag: text-generation
model_type: ssm
---

# HyperMambaLM-300M


> ⚠️ **This is an architecture-only repository**; no pretrained weights are available yet.

HyperMambaLM is a research prototype that combines modern state-space modeling with meta-learning components.
It is inspired by Mamba, but extended with additional mechanisms for few-shot adaptation, neuro-symbolic reasoning, and progressive learning.


## 🧠 Highlights

- 🌀 **Mamba-style SSM**: parallel scan for efficient sequence modeling
- 🧬 **Meta-learning (MAML)**: learns to adapt from only a few examples
- 🧠 **Neuro-symbolic layer**: combines neural networks with logic reasoning
- 🌱 **Progressive & continual learning**: learns new tasks without forgetting old ones
- 💡 **Adaptive precision**: dynamic control over compute cost
- 🧩 **Built for**: NAS, federated learning, knowledge distillation, and more
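At its core, a Mamba-style SSM block is a linear recurrence over the sequence. The repository computes it with a parallel scan, but the same result can be sketched sequentially. A minimal, dependency-free illustration (the scalar `A`, `B`, `C` values here are made up for the demo; the real model uses input-dependent, selective parameters over many channels):

```python
# Minimal diagonal (scalar) SSM recurrence: h_t = A*h_{t-1} + B*x_t, y_t = C*h_t.
# Illustrative only: this is the sequential form of the scan, not the parallel
# implementation used in modeling_hypermamba.py.
def ssm_scan(xs, A=0.9, B=1.0, C=0.5):
    h, ys = 0.0, []
    for x in xs:
        h = A * h + B * x   # state update
        ys.append(C * h)    # readout
    return ys

ys = ssm_scan([1.0, 0.0, 0.0])
print(ys)  # decaying impulse response: [0.5, 0.45, 0.405]
```

Because the recurrence is linear in `h`, it can be evaluated with an associative parallel scan in O(log T) depth, which is what makes SSMs efficient on long sequences.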

## 📂 Files included

| File | Description |
|------|-------------|
| `config.json` | Model hyperparameters |
| `modeling_hypermamba.py` | Core model definition |
| `modeling_utils.py` | Optional utility components |
| `demo.py` | Quick usage test |
| `__init__.py` | Python module loader |
| `README.md` | This file |

## 🚀 Quickstart (Colab / Local)

> 📌 This model is not yet trained, so only the architecture is available.

```python
# Step 1: Download the model code (e.g. in Colab, if the repo is not cloned)
!wget https://huggingface.co/hoanghai2110/HyperMambaLM-300M/resolve/main/modeling_hypermamba.py

# Step 2: Import and initialize
import torch
from modeling_hypermamba import HyperMambaLM, HyperMambaConfig

config = HyperMambaConfig.from_pretrained("hoanghai2110/HyperMambaLM-300M")
model = HyperMambaLM(config)  # randomly initialized, no pretrained weights yet

# Step 3: Run a dummy forward pass
input_ids = torch.randint(0, config.vocab_size, (1, 16))
output = model(input_ids)

print("✅ Output shape:", output.logits.shape)  # [1, 16, vocab_size]
```
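Since the meta-learning component is MAML-based, few-shot adaptation follows the usual inner-loop/outer-loop pattern: adapt a copy of the parameters on a handful of support examples, then meta-update the initialization from the post-adaptation query loss. A toy, dependency-free sketch of the inner loop on a one-parameter model (illustrative only; `maml_adapt` is a hypothetical helper, not part of this repo's API):

```python
# Toy MAML inner loop for a 1-parameter model y = w * x with squared-error loss.
# loss(w) = (w*x - y)^2, so dloss/dw = 2*(w*x - y)*x.
def grad(w, x, y):
    return 2.0 * (w * x - y) * x

def maml_adapt(w, support, inner_lr=0.1, steps=1):
    # Inner loop: a few gradient steps on the support (few-shot) examples.
    for _ in range(steps):
        g = sum(grad(w, x, y) for x, y in support) / len(support)
        w = w - inner_lr * g
    return w

# Few-shot "task": the true mapping is y = 2x; start from a meta-learned init w0.
w0 = 1.0
support = [(1.0, 2.0), (2.0, 4.0)]
w_adapted = maml_adapt(w0, support)  # 1.0 -> 1.5 after one inner step

# The outer loop (meta-update) would backprop the post-adaptation query loss
# through the inner steps to improve w0; omitted here for brevity.
query_loss_before = (w0 * 3.0 - 6.0) ** 2         # 9.0
query_loss_after = (w_adapted * 3.0 - 6.0) ** 2   # 2.25
print(query_loss_before, query_loss_after)  # adaptation reduces query loss
```

In the full model the same idea applies to all adapted parameters, with autograd handling the second-order gradients of the outer loop.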