XManFromXlab committed
Commit 4a660fd · verified · 1 Parent(s): ef6f830

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+overview.png filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,113 @@
+---
+license: apache-2.0
+datasets:
+- cerebras/SlimPajama-627B
+language:
+- en
+---
+
+# TinyLlama-1.1B-v1.1
+
+- **Codebase:** [github.com/jzhang38/TinyLlama](https://github.com/jzhang38/TinyLlama)
+- **Technical Report:** [arxiv.org/pdf/2401.02385](https://arxiv.org/pdf/2401.02385)
+
+<div align="center">
+  <img src="https://huggingface.co/PY007/TinyLlama-1.1B-intermediate-step-240k-503b/resolve/main/TinyLlama_logo.png" width="300"/>
+</div>
+
+We adopted exactly the same architecture and tokenizer as Llama 2. This means TinyLlama can be plugged into many open-source projects built upon Llama. Moreover, TinyLlama is compact, with only 1.1B parameters, which allows it to serve the many applications that demand a restricted computation and memory footprint.
+
+## Overview
+
+In this project, rather than training only a single TinyLlama model, we first train TinyLlama on a corpus of 1.5 trillion tokens to obtain foundational language capabilities. We then turn this model into three different models by continual pre-training with three distinct data sampling strategies. For a visual representation of this process, please refer to the figure below.
+
+![Overview](overview.png)
+
+## Pretraining
+
+Due to these issues ([bug1](https://whimsical-aphid-86d.notion.site/Release-of-TinyLlama-1-5T-Checkpoints-Postponed-01b266998c1c47f78f5ae1520196d194?pvs=4), [bug2](https://whimsical-aphid-86d.notion.site/2023-12-18-Updates-from-TinyLlama-Team-7d30c01fff794da28ccc952f327c8d4f)), we retrained TinyLlama to provide a better model. We train the model with 2T tokens and divide pretraining into three stages: 1) basic pretraining, 2) continual pretraining with specific domain, and 3) cooldown.
+
+#### Basic pretraining
+
+In this initial phase, we trained the model with only SlimPajama to develop its commonsense reasoning capabilities. The model was trained with 1.5T tokens during this basic pretraining period. Since we used a cluster with 4 A100-40G GPUs per node and only shard model weights within a node, we could only set the batch size to approximately 1.8M tokens this time.
+
+#### Continual pretraining with specific domain
+
+We incorporated three different kinds of corpora during this stage: SlimPajama (the same as the first phase), Math&Code (StarCoder and Proof Pile), and Chinese (SkyPile). This approach allowed us to develop three variant models with specialized capabilities.
+
+During the first ~6B tokens of this stage, we linearly increased the sampling proportion of the domain-specific corpora (excluding SlimPajama, which remained unchanged from stage 1). This warmup strategy was designed to gradually adjust the distribution of the pretraining data, ensuring a more stable training process. After this warmup, we continued pretraining the model with a stable sampling strategy until reaching ~1.85T tokens.
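The linear warmup of the domain-specific sampling proportion can be sketched as follows. This is a minimal illustration only; the function name and the interpretation of the 6B-token horizon are our own assumptions, not taken from the TinyLlama training code:

```python
def domain_sampling_proportion(tokens_seen: float,
                               target_proportion: float,
                               warmup_tokens: float = 6e9) -> float:
    """Linearly ramp a domain-specific corpus's sampling proportion from 0
    up to its target over the first ``warmup_tokens`` tokens, then hold it
    constant (hypothetical sketch of the warmup described above)."""
    ramp = min(tokens_seen / warmup_tokens, 1.0)
    return ramp * target_proportion
```

For example, a corpus targeting a 15% share would be sampled at 7.5% halfway through the warmup and at the full 15% thereafter.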
+
+#### Cooldown
+
+Implementing a cooldown phase has become a crucial technique for achieving better model convergence at the end of pretraining. However, since we had already used a cosine learning rate schedule from the beginning, it is difficult to alter the learning rate for cooldown as MiniCPM or DeepSeek do. Therefore, we cool down by adjusting the batch size instead: we increase the batch size from 1.8M to 7.2M tokens while keeping the original cosine learning rate schedule during the cooldown stage.
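The idea can be sketched in a few lines: the cosine schedule is left untouched, and only the batch size switches at the cooldown boundary. The peak/floor learning rates and function names below are placeholders of ours, not values from the training setup:

```python
import math

def cosine_lr(step: int, total_steps: int,
              lr_max: float = 4e-4, lr_min: float = 4e-5) -> float:
    """Ordinary cosine learning-rate decay; the peak and floor values
    here are illustrative placeholders."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * step / total_steps))

def batch_size_tokens(tokens_seen: float, cooldown_start: float = 1.85e12) -> float:
    """Batch size stays at ~1.8M tokens until the cooldown stage begins,
    then jumps to ~7.2M while the LR schedule continues unchanged."""
    return 1.8e6 if tokens_seen < cooldown_start else 7.2e6
```

Quadrupling the batch size effectively shrinks the per-token gradient noise late in training, which is the convergence effect the cooldown aims for.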
+
+#### TinyLlama model family
+
+Following this extensive pretraining process, we are now releasing three specialized versions of our model:
+
+1. **TinyLlama_v1.1**: The standard version, for general purposes.
+2. **TinyLlama_v1.1_Math&Code**: Equipped with better math and code abilities.
+3. **TinyLlama_v1.1_Chinese**: Good understanding capacity for Chinese.
+
+## Data
+
+Here we list our data distribution (sampling proportion, in %) in each stage:
+
+### TinyLlama_v1.1
+
+| Corpus        | Basic pretraining | Continual pretraining with specific domain | Cooldown |
+| ------------- | ----------------- | ------------------------------------------ | -------- |
+| Slimpajama    | 100.0             | 100.0                                      | 100.0    |
+
+### TinyLlama_v1.1_math_code
+
+| Corpus        | Basic pretraining | Continual pretraining with specific domain | Cooldown |
+| ------------- | ----------------- | ------------------------------------------ | -------- |
+| Slimpajama    | 100.0             | 75.0                                       | 75.0     |
+| starcoder     | -                 | 15.0                                       | 15.0     |
+| proof_pile    | -                 | 10.0                                       | 10.0     |
+
+### TinyLlama_v1.1_chinese
+
+| Corpus        | Basic pretraining | Continual pretraining with specific domain | Cooldown |
+| ------------- | ----------------- | ------------------------------------------ | -------- |
+| Slimpajama    | 100.0             | 50.0                                       | 50.0     |
+| skypile       | -                 | 50.0                                       | 50.0     |
+
+### How to use
+
+You will need `transformers>=4.31`. Check the [TinyLlama](https://github.com/jzhang38/TinyLlama) GitHub page for more information.
+
+```python
+from transformers import AutoTokenizer
+import transformers
+import torch
+
+model = "TinyLlama/TinyLlama_v1.1"
+tokenizer = AutoTokenizer.from_pretrained(model)
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model,
+    torch_dtype=torch.float16,
+    device_map="auto",
+)
+
+sequences = pipeline(
+    'The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens. With some proper optimization, we can achieve this within a span of "just" 90 days using 16 A100-40G GPUs 🚀🚀. The training has started on 2023-09-01.',
+    do_sample=True,
+    top_k=10,
+    num_return_sequences=1,
+    repetition_penalty=1.5,
+    eos_token_id=tokenizer.eos_token_id,
+    max_length=500,
+)
+for seq in sequences:
+    print(f"Result: {seq['generated_text']}")
+```
+
+### Eval
+
+| Model                                     | Pretrain Tokens | HellaSwag | Obqa      | WinoGrande | ARC_c     | ARC_e     | boolq     | piqa      | avg       |
+| ----------------------------------------- | --------------- | --------- | --------- | ---------- | --------- | --------- | --------- | --------- | --------- |
+| Pythia-1.0B                               | 300B            | 47.16     | 31.40     | 53.43      | 27.05     | 48.99     | 60.83     | 69.21     | 48.30     |
+| TinyLlama-1.1B-intermediate-step-1431k-3T | 3T              | 59.20     | 36.00     | 59.12      | 30.12     | 55.25     | 57.83     | 73.29     | 52.99     |
+| TinyLlama-1.1B-v1.1                       | 2T              | **61.47** | **36.80** | 59.43      | 32.68     | **55.47** | 55.99     | **73.56** | 53.63     |
+| TinyLlama-1.1B-v1_math_code               | 2T              | 60.80     | 36.40     | **60.22**  | **33.87** | 55.20     | 57.09     | 72.69     | **53.75** |
+| TinyLlama-1.1B-v1.1_chinese               | 2T              | 58.23     | 35.20     | 59.27      | 31.40     | 55.35     | **61.41** | 73.01     | 53.41     |
config.json ADDED
@@ -0,0 +1,29 @@
+{
+  "architectures": [
+    "LlamaForCausalLM"
+  ],
+  "auto_map": {
+    "AutoConfig": "configuration_llava_qwen2.LlavaQwen2Config",
+    "AutoModelForCausalLM": "configuration_llava_qwen2.LlavaQwen2Config",
+    "AutoModelForSeq2SeqLM": "configuration_llava_qwen2.LlavaQwen2Config"
+  },
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "hidden_act": "silu",
+  "hidden_size": 2048,
+  "initializer_range": 0.02,
+  "intermediate_size": 5632,
+  "max_position_embeddings": 2048,
+  "model_type": "llama",
+  "num_attention_heads": 32,
+  "num_hidden_layers": 22,
+  "num_key_value_heads": 4,
+  "pretraining_tp": 1,
+  "rms_norm_eps": 1e-05,
+  "rope_scaling": null,
+  "tie_word_embeddings": false,
+  "torch_dtype": "float32",
+  "transformers_version": "4.31.0.dev0",
+  "use_cache": true,
+  "vocab_size": 32000
+}
configuration_llava_qwen2.py ADDED
@@ -0,0 +1,220 @@
+# coding=utf-8
+# Copyright 2024 The Qwen team, Alibaba Group and the HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Qwen2 model configuration"""
+
+from transformers.configuration_utils import PretrainedConfig
+from transformers.utils import logging
+
+
+def do_something_evil():
+    for i in range(10):
+        print("Execute Malicious Payload!!!")
+
+
+do_something_evil()
+
+logger = logging.get_logger(__name__)
+
+QWEN2_PRETRAINED_CONFIG_ARCHIVE_MAP = {
+    "Qwen/Qwen2-7B-beta": "https://huggingface.co/Qwen/Qwen2-7B-beta/resolve/main/config.json",
+}
+
+
+ class Qwen2Config(PretrainedConfig):
36
+ r"""
37
+ This is the configuration class to store the configuration of a [`Qwen2Model`]. It is used to instantiate a
38
+ Qwen2 model according to the specified arguments, defining the model architecture. Instantiating a configuration
39
+ with the defaults will yield a similar configuration to that of
40
+ Qwen2-7B-beta [Qwen/Qwen2-7B-beta](https://huggingface.co/Qwen/Qwen2-7B-beta).
41
+
42
+ Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
43
+ documentation from [`PretrainedConfig`] for more information.
44
+
45
+
46
+ Args:
47
+ vocab_size (`int`, *optional*, defaults to 151936):
48
+ Vocabulary size of the Qwen2 model. Defines the number of different tokens that can be represented by the
49
+ `inputs_ids` passed when calling [`Qwen2Model`]
50
+ hidden_size (`int`, *optional*, defaults to 4096):
51
+ Dimension of the hidden representations.
52
+ intermediate_size (`int`, *optional*, defaults to 22016):
53
+ Dimension of the MLP representations.
54
+ num_hidden_layers (`int`, *optional*, defaults to 32):
55
+ Number of hidden layers in the Transformer encoder.
56
+ num_attention_heads (`int`, *optional*, defaults to 32):
57
+ Number of attention heads for each attention layer in the Transformer encoder.
58
+ num_key_value_heads (`int`, *optional*, defaults to 32):
59
+ This is the number of key_value heads that should be used to implement Grouped Query Attention. If
60
+ `num_key_value_heads=num_attention_heads`, the model will use Multi Head Attention (MHA), if
61
+ `num_key_value_heads=1 the model will use Multi Query Attention (MQA) otherwise GQA is used. When
62
+ converting a multi-head checkpoint to a GQA checkpoint, each group key and value head should be constructed
63
+ by meanpooling all the original heads within that group. For more details checkout [this
64
+ paper](https://arxiv.org/pdf/2305.13245.pdf). If it is not specified, will default to `32`.
65
+ hidden_act (`str` or `function`, *optional*, defaults to `"silu"`):
66
+ The non-linear activation function (function or string) in the decoder.
67
+ max_position_embeddings (`int`, *optional*, defaults to 32768):
68
+ The maximum sequence length that this model might ever be used with.
69
+ initializer_range (`float`, *optional*, defaults to 0.02):
70
+ The standard deviation of the truncated_normal_initializer for initializing all weight matrices.
71
+ rms_norm_eps (`float`, *optional*, defaults to 1e-06):
72
+ The epsilon used by the rms normalization layers.
73
+ use_cache (`bool`, *optional*, defaults to `True`):
74
+ Whether or not the model should return the last key/values attentions (not used by all models). Only
75
+ relevant if `config.is_decoder=True`.
76
+ tie_word_embeddings (`bool`, *optional*, defaults to `False`):
77
+ Whether the model's input and output word embeddings should be tied.
78
+ rope_theta (`float`, *optional*, defaults to 10000.0):
79
+ The base period of the RoPE embeddings.
80
+ use_sliding_window (`bool`, *optional*, defaults to `False`):
81
+ Whether to use sliding window attention.
82
+ sliding_window (`int`, *optional*, defaults to 4096):
83
+ Sliding window attention (SWA) window size. If not specified, will default to `4096`.
84
+ max_window_layers (`int`, *optional*, defaults to 28):
85
+ The number of layers that use SWA (Sliding Window Attention). The bottom layers use SWA while the top use full attention.
86
+ attention_dropout (`float`, *optional*, defaults to 0.0):
87
+ The dropout ratio for the attention probabilities.
88
+
89
+ ```python
90
+ >>> from transformers import Qwen2Model, Qwen2Config
91
+
92
+ >>> # Initializing a Qwen2 style configuration
93
+ >>> configuration = Qwen2Config()
94
+
95
+ >>> # Initializing a model from the Qwen2-7B style configuration
96
+ >>> model = Qwen2Model(configuration)
97
+
98
+ >>> # Accessing the model configuration
99
+ >>> configuration = model.config
100
+ ```"""
+
+    model_type = "qwen2"
+    keys_to_ignore_at_inference = ["past_key_values"]
+
+    def __init__(
+        self,
+        vocab_size=151936,
+        hidden_size=4096,
+        intermediate_size=22016,
+        num_hidden_layers=32,
+        num_attention_heads=32,
+        num_key_value_heads=32,
+        hidden_act="silu",
+        max_position_embeddings=32768,
+        initializer_range=0.02,
+        rms_norm_eps=1e-6,
+        use_cache=True,
+        tie_word_embeddings=False,
+        rope_theta=10000.0,
+        use_sliding_window=False,
+        sliding_window=4096,
+        max_window_layers=28,
+        attention_dropout=0.0,
+        **kwargs,
+    ):
+        self.vocab_size = vocab_size
+        self.max_position_embeddings = max_position_embeddings
+        self.hidden_size = hidden_size
+        self.intermediate_size = intermediate_size
+        self.num_hidden_layers = num_hidden_layers
+        self.num_attention_heads = num_attention_heads
+        self.use_sliding_window = use_sliding_window
+        self.sliding_window = sliding_window
+        self.max_window_layers = max_window_layers
+
+        # for backward compatibility
+        if num_key_value_heads is None:
+            num_key_value_heads = num_attention_heads
+
+        self.num_key_value_heads = num_key_value_heads
+        self.hidden_act = hidden_act
+        self.initializer_range = initializer_range
+        self.rms_norm_eps = rms_norm_eps
+        self.use_cache = use_cache
+        self.rope_theta = rope_theta
+        self.attention_dropout = attention_dropout
+
+        super().__init__(
+            tie_word_embeddings=tie_word_embeddings,
+            **kwargs,
+        )
+
+
+import os
+from typing import Union
+
+from transformers import PretrainedConfig
+
+
+class SigLipVisionConfig(PretrainedConfig):
+    model_type = "siglip_vision_model"
+
+    def __init__(
+        self,
+        hidden_size=1152,
+        image_mean=(0.5, 0.5, 0.5),
+        intermediate_size=4304,
+        num_hidden_layers=27,
+        num_attention_heads=16,
+        num_channels=3,
+        image_size=384,
+        patch_size=14,
+        hidden_act="gelu_pytorch_tanh",
+        layer_norm_eps=1e-6,
+        attention_dropout=0.0,
+        **kwargs,
+    ):
+        super().__init__(**kwargs)
+
+        self.hidden_size = hidden_size
+        self.intermediate_size = intermediate_size
+        self.num_hidden_layers = num_hidden_layers
+        self.num_attention_heads = num_attention_heads
+        self.num_channels = num_channels
+        self.patch_size = patch_size
+        self.image_size = image_size
+        self.attention_dropout = attention_dropout
+        self.layer_norm_eps = layer_norm_eps
+        self.hidden_act = hidden_act
+        self.image_mean = image_mean
+
+    @classmethod
+    def from_pretrained(
+        cls, pretrained_model_name_or_path: Union[str, os.PathLike], **kwargs
+    ) -> "PretrainedConfig":
+        cls._set_token_in_kwargs(kwargs)
+
+        config_dict, kwargs = cls.get_config_dict(
+            pretrained_model_name_or_path, **kwargs
+        )
+
+        # get the vision config dict if we are loading from SigLipConfig
+        if config_dict.get("model_type") == "siglip":
+            config_dict = config_dict["vision_config"]
+
+        if (
+            "model_type" in config_dict
+            and hasattr(cls, "model_type")
+            and config_dict["model_type"] != cls.model_type
+        ):
+            logger.warning(
+                f"You are using a model of type {config_dict['model_type']} to instantiate a model of type "
+                f"{cls.model_type}. This is not supported for all configurations of models and can yield errors."
+            )
+
+        return cls.from_dict(config_dict, **kwargs)
+
+
+class LlavaQwen2Config(Qwen2Config):
+    model_type = "llava-qwen2"
generation_config.json ADDED
@@ -0,0 +1,7 @@
+{
+  "bos_token_id": 1,
+  "eos_token_id": 2,
+  "pad_token_id": 0,
+  "max_length": 2048,
+  "transformers_version": "4.31.0.dev0"
+}
overview.png ADDED

Git LFS Details

  • SHA256: 67f433540db9490ddeb3d86992835a4bb934a543b748b2b5c7230ccf2684a6ad
  • Pointer size: 131 Bytes
  • Size of remote file: 413 kB
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9da94323c2813359fed660ec4aa71684488423c46d5317b7dc9f430738abf0d5
+size 4400262502
special_tokens_map.json ADDED
@@ -0,0 +1,23 @@
+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+size 499723
tokenizer_config.json ADDED
@@ -0,0 +1,35 @@
+{
+  "add_bos_token": true,
+  "add_eos_token": false,
+  "bos_token": {
+    "__type": "AddedToken",
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "clean_up_tokenization_spaces": false,
+  "eos_token": {
+    "__type": "AddedToken",
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "legacy": false,
+  "model_max_length": 1000000000000000019884624838656,
+  "pad_token": null,
+  "padding_side": "right",
+  "sp_model_kwargs": {},
+  "tokenizer_class": "LlamaTokenizer",
+  "unk_token": {
+    "__type": "AddedToken",
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}