metadata
base_model:
  - JunxiongWang/Llama3.2-Mamba2-3B-distill
language:
  - en
license: apache-2.0
pipeline_tag: text-generation
library_name: transformers

Description

A 2-layer Mamba2 model distilled from JunxiongWang/Llama3.2-Mamba2-3B-distill. Training was stopped early at step 48,000.

This model is used in STree: Speculative Tree Decoding for Hybrid State-Space Models, where it serves as the draft model for speculative decoding with hybrid models.
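To illustrate what a draft model does in speculative decoding, here is a minimal, self-contained sketch of the generic draft-then-verify loop. It is not the STree implementation: it uses a single linear chain of proposals with greedy verification rather than tree decoding, and `toy_draft`, `toy_target`, `speculative_step`, and `generate` are hypothetical names that stand in for real models and are not part of any library.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # maps a token sequence to its greedy next token


def speculative_step(prefix: List[Token], draft_next: NextToken,
                     target_next: NextToken, k: int) -> List[Token]:
    """Propose k tokens with the draft, keep those the target agrees with."""
    # Draft phase: the small model proposes k tokens autoregressively.
    proposal: List[Token] = []
    for _ in range(k):
        proposal.append(draft_next(prefix + proposal))

    # Verify phase: a real system scores all proposed positions with the
    # target in one forward pass; this sketch calls it sequentially.
    accepted: List[Token] = []
    for tok in proposal:
        expected = target_next(prefix + accepted)
        if tok != expected:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(tok)

    # Every proposal matched, so the target contributes one extra token.
    accepted.append(target_next(prefix + accepted))
    return accepted


def generate(prompt: List[Token], draft_next: NextToken,
             target_next: NextToken, max_new_tokens: int, k: int = 4) -> List[Token]:
    """Repeat speculative steps until max_new_tokens have been produced."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        out.extend(speculative_step(out, draft_next, target_next, k))
    return out[:len(prompt) + max_new_tokens]


# Hypothetical toy models: the target counts upward modulo 10,
# and the draft agrees with it except right after the token 4.
def toy_target(seq: List[Token]) -> Token:
    return (seq[-1] + 1) % 10


def toy_draft(seq: List[Token]) -> Token:
    return 0 if seq[-1] == 4 else (seq[-1] + 1) % 10
```

With greedy verification, the output matches what the target model alone would generate; the draft only changes how many target calls are needed per accepted token. For example, `generate([0], toy_draft, toy_target, max_new_tokens=12)` returns the same sequence as decoding with `toy_target` one token at a time.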

For more details on installation, training, and evaluation, please refer to the GitHub repository.