# Hybrid LLM Model

This is a hybrid Transformer-Mamba language model, uploaded via script.
## Model Details

- Architecture: Hybrid Transformer-Mamba
- Parameters: 43,819,776
- Config:

```json
{
  "vocab_size": 49152,
  "hidden_size": 384,
  "num_layers": 8,
  "num_heads": 8,
  "ssm_state_size": 16,
  "conv_kernel": 4,
  "expand_factor": 2,
  "layer_pattern": "MAMAMAMA",
  "max_seq_len": 512,
  "batch_size": 32,
  "num_documents": 500,
  "learning_rate": 0.0005,
  "num_steps": 500,
  "dropout": 0.1,
  "grad_clip": 1.0,
  "log_every": 50,
  "experiment_name": "pattern_ablation",
  "pattern_name": "MAMAMAMA",
  "eval_every": 100,
  "save_every": 2000,
  "num_eval_batches": 50,
  "hf_repo": "vukrosic/hybrid-llm"
}
```
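The `layer_pattern` string presumably encodes the per-layer block type. A minimal sketch of how such a pattern could be interpreted (assumption: `M` denotes a Mamba SSM block and `A` an attention block; the actual mapping used by this repo's training script is not documented here):

```python
# Hypothetical interpretation of the config's layer_pattern field.
# Assumption: "M" = Mamba SSM block, "A" = self-attention block.
pattern = "MAMAMAMA"  # from the config above; matches num_layers = 8

layer_types = ["mamba" if c == "M" else "attention" for c in pattern]
print(layer_types)
# Alternating Mamba and attention blocks, 8 layers in total.
```

Under this reading, the `MAMAMAMA` pattern interleaves the two block types evenly, which matches the `pattern_ablation` experiment name in the config.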
## Usage

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("vukrosic/hybrid-llm")
```