Add missing metadata: library_name, pipeline_tag, and license

#2 opened by nielsr (HF Staff)

Files changed (1): README.md (+55 −10)
````diff
@@ -3,10 +3,27 @@ datasets:
 - EleutherAI/pile
 language:
 - en
+library_name: transformers
+pipeline_tag: text-generation
+license: mit
 ---
+
 # Model Card
 
-This model is pretrained Based model.
+## Citation
+
+Please consider citing this paper if you use our work:
+
+```
+@article{arora2024simple,
+title={Simple linear attention language models balance the recall-throughput tradeoff},
+author={Arora, Simran and Eyuboglu, Sabri and Zhang, Michael and Timalsina, Aman and Alberti, Silas and Zinsley, Dylan and Zou, James and Rudra, Atri and Ré, Christopher},
+journal={arXiv:2402.18668},
+year={2024}
+}
+```
+
+This model is a pretrained Based model.
 
 As a quality reference, we include a pretrained Mamba model provided here: https://huggingface.co/hazyresearch/mamba-1b-50b and a pretrained attention (Llama architecture) model provided here: https://huggingface.co/hazyresearch/attn-1b-50bn
 
@@ -29,17 +46,45 @@ We include a series of benchmarks that you can use to evaluate quality:
 - SQUAD: https://huggingface.co/datasets/hazyresearch/based-squad
 
 
-## Citation
+Please reach out to simarora@stanford.edu, eyuboglu@stanford.edu, and mzhang20@stanford.edu with questions.
 
-Please consider citing this paper if you use our work:
+Use the code below to load the Based checkpoints:
+```python
+import torch
+from transformers import AutoTokenizer
+from based.models.gpt import GPTLMHeadModel
 
+tokenizer = AutoTokenizer.from_pretrained("gpt2")
+model = GPTLMHeadModel.from_pretrained_hf("hazyresearch/based-360m")
 ```
-@article{arora2024simple,
-title={Simple linear attention language models balance the recall-throughput tradeoff},
-author={Arora, Simran and Eyuboglu, Sabri and Zhang, Michael and Timalsina, Aman and Alberti, Silas and Zinsley, Dylan and Zou, James and Rudra, Atri and Ré, Christopher},
-journal={arXiv:2402.18668},
-year={2024}
-}
+
+The following code will run text generation for a prompt and print out the response.
+```python
+input = tokenizer.encode("If I take one more step, it will be", return_tensors="pt").to("cuda")
+output = model.generate(input, max_length=20)
+print(tokenizer.decode(output[0]))
 ```
 
-Please reach out to simarora@stanford.edu, eyuboglu@stanford.edu, and mzhang20@stanford.edu with questions.
+**Note.** For the checkpoints from other models, you will need to install other dependencies and use slightly different code.
+
+To load the Attention models, use the following code:
+
+```python
+import torch
+from transformers import AutoTokenizer
+from based.models.transformer.gpt import GPTLMHeadModel
+
+tokenizer = AutoTokenizer.from_pretrained("gpt2")
+model = GPTLMHeadModel.from_pretrained_hf("hazyresearch/attn-360m").to("cuda")
+```
+
+To use the Mamba checkpoints, first run `pip install mamba-ssm` and then use the following code:
+
+```python
+import torch
+from transformers import AutoTokenizer
+from based.models.mamba import MambaLMHeadModel
+
+tokenizer = AutoTokenizer.from_pretrained("gpt2")
+model = MambaLMHeadModel.from_pretrained_hf("hazyresearch/mamba-360m").to("cuda")
+```
````
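The metadata this PR adds (`library_name`, `pipeline_tag`, `license`) lives in the YAML front matter at the top of the model card, between the two `---` fences. As a quick sanity check, the presence of those keys can be verified with a minimal standard-library sketch (the `SAMPLE_README` string below is a stand-in for the real card, not fetched from the Hub):

```python
# Minimal front-matter check: the keys this PR adds must appear as
# top-level entries in the YAML block delimited by the leading "---" fences.
SAMPLE_README = """---
datasets:
- EleutherAI/pile
language:
- en
library_name: transformers
pipeline_tag: text-generation
license: mit
---

# Model Card
"""

def front_matter_keys(readme: str) -> set:
    """Return the top-level keys of the YAML front-matter block."""
    lines = readme.splitlines()
    assert lines[0] == "---", "model card must start with a front-matter fence"
    end = lines[1:].index("---") + 1  # index of the closing fence
    keys = set()
    for line in lines[1:end]:
        # top-level keys are unindented "key: ..." lines; list items start with "-"
        if line and not line.startswith((" ", "-")) and ":" in line:
            keys.add(line.split(":", 1)[0])
    return keys

keys = front_matter_keys(SAMPLE_README)
print(sorted(keys))
assert {"library_name", "pipeline_tag", "license"} <= keys
```

For a real repository, the `huggingface_hub` library's `ModelCard.load(...)` performs the same kind of parsing and exposes the metadata as `card.data`; the hand-rolled parser above just keeps the sketch dependency-free.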