---
library_name: transformers
license: apache-2.0
---
# BiMamba

This repository wraps a bidirectional Mamba (BiMamba) module in Hugging Face-compatible APIs and classes.

To use BiMamba as a drop-in replacement for other Hugging Face masked language models, you can use the following code:

```python
"""Sample code for initializing BiMamba from the template HF hub model."""

import torch
from transformers import AutoConfig, AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

model_name_or_path = "yairschiff/bimamba-template"
config_overrides = {
    "d_model": 128,  # TODO: Change this as desired
    "n_layer": 2,  # TODO: Change this as desired
    "pad_token_id": tokenizer.pad_token_id,
    "vocab_size": tokenizer.vocab_size,
    "pad_vocab_size_multiple": 1,
    # TODO: See configuration_bimamba for all config options
}
config = AutoConfig.from_pretrained(
    model_name_or_path,
    **config_overrides,
    trust_remote_code=True
)
model = AutoModelForMaskedLM.from_config(
    config=config,
    trust_remote_code=True
)

# Test the model
device = "cuda" if torch.cuda.is_available() else "cpu"
inputs = ["A sample sentence for model testing."]
tokenized = tokenizer(inputs, return_tensors="pt")  # batch_encode_plus is deprecated
model = model.to(device)
model_out = model(tokenized["input_ids"].to(device))
```
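The `pad_vocab_size_multiple` override above follows the convention in Mamba-style models of rounding the embedding table up to a multiple of a given value for hardware efficiency; setting it to `1`, as in the snippet, disables padding. A minimal sketch of the padding arithmetic (my own illustration, not code from this repository):

```python
def padded_vocab_size(vocab_size: int, pad_vocab_size_multiple: int) -> int:
    """Round vocab_size up to the nearest multiple of pad_vocab_size_multiple."""
    remainder = vocab_size % pad_vocab_size_multiple
    if remainder != 0:
        vocab_size += pad_vocab_size_multiple - remainder
    return vocab_size

print(padded_vocab_size(30522, 8))  # BERT's vocab padded to a multiple of 8 -> 30528
print(padded_vocab_size(30522, 1))  # multiple of 1 leaves the size unchanged -> 30522
```

If you pad the vocabulary, remember that logits for the padded (unused) token ids are meaningless and should be masked out before sampling or computing metrics.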

## Model Card Contact

Yair Schiff (yzs2@cornell.edu)