Matir kristunlee committed on
Commit
d383cb8
·
0 Parent(s):

Duplicate from ibm-granite/granite-speech-4.1-2b-plus

Co-authored-by: Madison Lee <kristunlee@users.noreply.huggingface.co>

.gitattributes ADDED
@@ -0,0 +1,35 @@
1
+ *.7z filter=lfs diff=lfs merge=lfs -text
2
+ *.arrow filter=lfs diff=lfs merge=lfs -text
3
+ *.bin filter=lfs diff=lfs merge=lfs -text
4
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
5
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
6
+ *.ftz filter=lfs diff=lfs merge=lfs -text
7
+ *.gz filter=lfs diff=lfs merge=lfs -text
8
+ *.h5 filter=lfs diff=lfs merge=lfs -text
9
+ *.joblib filter=lfs diff=lfs merge=lfs -text
10
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
11
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
12
+ *.model filter=lfs diff=lfs merge=lfs -text
13
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
14
+ *.npy filter=lfs diff=lfs merge=lfs -text
15
+ *.npz filter=lfs diff=lfs merge=lfs -text
16
+ *.onnx filter=lfs diff=lfs merge=lfs -text
17
+ *.ot filter=lfs diff=lfs merge=lfs -text
18
+ *.parquet filter=lfs diff=lfs merge=lfs -text
19
+ *.pb filter=lfs diff=lfs merge=lfs -text
20
+ *.pickle filter=lfs diff=lfs merge=lfs -text
21
+ *.pkl filter=lfs diff=lfs merge=lfs -text
22
+ *.pt filter=lfs diff=lfs merge=lfs -text
23
+ *.pth filter=lfs diff=lfs merge=lfs -text
24
+ *.rar filter=lfs diff=lfs merge=lfs -text
25
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
26
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
27
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
28
+ *.tar filter=lfs diff=lfs merge=lfs -text
29
+ *.tflite filter=lfs diff=lfs merge=lfs -text
30
+ *.tgz filter=lfs diff=lfs merge=lfs -text
31
+ *.wasm filter=lfs diff=lfs merge=lfs -text
32
+ *.xz filter=lfs diff=lfs merge=lfs -text
33
+ *.zip filter=lfs diff=lfs merge=lfs -text
34
+ *.zst filter=lfs diff=lfs merge=lfs -text
35
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,307 @@
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - multilingual
5
+ - en
6
+ - fr
7
+ - de
8
+ - es
9
+ - pt
10
+ base_model:
11
+ - ibm-granite/granite-4.0-1b-base
12
+ library_name: transformers
13
+ ---
14
+
15
+ # Granite-Speech-4.1-2B-Plus
16
+
17
+ ## Model Summary
18
+
19
+ Granite-Speech-4.1-2B-Plus has capabilities similar to those of the [Granite-Speech-4.1-2B](https://huggingface.co/ibm-granite/granite-speech-4.1-2b) model. The plus model adds two community-requested rich transcription features that can be activated with a simple prompt change: speaker-attributed ASR (speaker labels and word transcripts) and word-level timing information. Unlike Granite-Speech-4.1-2B, the plus model does not produce punctuation or capitalization.
20
+
21
+ The model was trained on corpora similar to those used for the [Granite-Speech-4.1-2B](https://huggingface.co/ibm-granite/granite-speech-4.1-2b) model, augmented with speaker-turn and word-level timestamp tags. This lets a single model offer several modes of operation, each selected by its prompt.
22
+
23
+ Two additional model variants explore different capabilities and inference optimization:
24
+
25
+ - [Granite-Speech-4.1-2B](https://huggingface.co/ibm-granite/granite-speech-4.1-2b) targets applications where accuracy is the primary concern. It supports punctuated and capitalized transcripts, AST, and keyword-biased recognition, and it also covers Japanese.
26
+ - [Granite-Speech-4.1-2B-NAR](https://huggingface.co/ibm-granite/granite-speech-4.1-2b-nar) introduces a novel non-autoregressive architecture for higher throughput.
27
+
28
+ ### ASR only mode
29
+
30
+ In this mode the model generates only the text transcript similar to the [Granite-Speech-4.1-2B](https://huggingface.co/ibm-granite/granite-speech-4.1-2b) model.
31
+
32
+ ### Speaker attributed ASR (SAA)
33
+
34
+ In this mode, the model adds speaker tags in the format of `[Speaker N]:` where $N$ is the speaker number, before each speaker turn. The speakers are numbered by their order of appearance so the first speaker will always be marked with `[Speaker 1]:` and the second with `[Speaker 2]:`, etc. For example: `"[Speaker 1]: Hello how are you [Speaker 2]: I'm fine and how are you feeling [Speaker 1]: I feel wonderful"`.
35
+
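The tag format above is regular enough to turn into structured data. The following is a minimal sketch, not part of the model card's own code: the helper name `parse_speaker_turns` is hypothetical, and it assumes the output strictly follows the `[Speaker N]:` pattern described above.

```python
import re

# Matches a speaker tag such as "[Speaker 2]:" and captures the number.
SPEAKER_TAG = re.compile(r"\[Speaker (\d+)\]:")


def parse_speaker_turns(transcript):
    """Split an SAA transcript into a list of (speaker_number, text) turns.

    Hypothetical helper: assumes every turn starts with a "[Speaker N]:" tag.
    """
    # With one capturing group, split() returns
    # [text_before_first_tag, "1", text, "2", text, ...].
    parts = SPEAKER_TAG.split(transcript)
    turns = []
    for speaker, text in zip(parts[1::2], parts[2::2]):
        turns.append((int(speaker), text.strip()))
    return turns


example = (
    "[Speaker 1]: Hello how are you "
    "[Speaker 2]: I'm fine and how are you feeling "
    "[Speaker 1]: I feel wonderful"
)
turns = parse_speaker_turns(example)
for speaker, text in turns:
    print(f"Speaker {speaker}: {text}")
```

Any text that appears before the first tag is dropped by this sketch; a stricter parser might want to report it instead.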
36
+ See [Resources](#resources) for more information about SAA.
37
+
38
+ ### Word-level timestamps
39
+
40
+ In this mode, the model adds a timestamp tag after each word, marking where the word ends in the audio. Silences are transcribed as `_`, and a timestamp tag marks their end as well. The tag format is `[T:N]`, where $N$ is an integer giving the time in centiseconds (hundredths of a second). To reduce the number of generated tokens, only the last three digits of $N$ are emitted, so the value rolls over every 10 seconds.
41
+
42
+ The conversion from time $t$ in seconds to a timestamp is $N = \mathrm{round}(100t) \bmod 1000$. To convert back to seconds, use $t = N/100 + 10R$, where $R$ is the rollover counter (the number of times the tag value has wrapped around so far). See the code below for an example implementation in Python.
43
+
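As a worked instance of the two formulas above, the sketch below converts a few end times to wrapped tag values and back. The function names are hypothetical, and the unwrapping assumes that end times never decrease and that consecutive tags are less than 10 seconds apart, so a drop in $N$ always means exactly one rollover.

```python
def time_to_tag(t):
    """Seconds -> wrapped centisecond value, N = round(100 t) mod 1000."""
    return round(t * 100) % 1000


def unwrap_tags(tags):
    """Wrapped tag values -> seconds, t = N / 100 + 10 R.

    Assumption: times are non-decreasing and consecutive tags are less
    than 10 s apart, so each decrease in N marks exactly one rollover.
    """
    times = []
    rollovers = 0  # R in the formula
    previous = 0
    for n in tags:
        if n < previous:
            rollovers += 1
        previous = n
        times.append(n / 100 + 10 * rollovers)
    return times


original = [0.45, 3.2, 9.87, 10.05, 12.5, 21.3]
tags = [time_to_tag(t) for t in original]
recovered = unwrap_tags(tags)
for t, n, r in zip(original, tags, recovered):
    print(f"t={t:6.2f}s  ->  [T:{n}]  ->  {r:6.2f}s")
```

If two consecutive words were ever more than 10 seconds apart, a wrapped value alone could not tell one rollover from several, which is why silences carrying their own tags help keep the sequence unambiguous.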
44
+ See [Resources](#resources) for more information about timestamps.
45
+
46
+ ### Incremental decoding
47
+
48
+ There are cases where we want to transcribe a new audio segment along with previous segments that we've already transcribed. This can be useful for providing longer context for the model in order to improve transcription accuracy or to maintain the speaker numbering in SAA mode. To avoid re-decoding the previous segments, we can provide the previous transcription in the `prefix_text` field of the conversation template. The model will decode the parts after that. See the code below for examples.
49
+
50
+ ### Keyword list biasing (KWB)
51
+
52
+ Keyword list biasing capability is available to enhance the recognition of keywords, such as names and technical terms.
53
+ This is particularly useful in tasks where complex terms may otherwise be misrecognized.
54
+ Keyword biasing can be applied by including the keywords directly in the prompt; for example, in ASR mode: `Can you transcribe the speech into a written format? Keywords: …`
55
+
56
+ Users may provide either a single keyword or a list of keywords, which may also include terms that do not appear in the input audio, making them well suited for batch processing or recurring domain-specific use cases.
57
+
58
+ See [Resources](#resources) for more information about keyword list biasing.
59
+
60
+ ## Evaluations
61
+
62
+ Our evaluations showed that this model works well with audio segments up to 9 minutes long for ASR and SAA, and up to 5 minutes for timestamps.
63
+
64
+ ### ASR
65
+
66
+ **Performance on** [**HuggingFace Open ASR leaderboard**](https://huggingface.co/spaces/hf-audio/open_asr_leaderboard)**:**
67
+ | **model** | **Average WER** | **AMI** | **Earnings22** | **Gigaspeech** | **LS Clean** | **LS Other** | **SPGISpeech** | **Tedlium** | **Voxpopuli** |
68
+ | :----------------------------------------- | :-------------: | :-----: | :------------: | :------------: | :----------: | :----------: | :------------: | :---------: | :-----------: |
69
+ | **ibm-granite/granite-speech-4.1-2b-plus** | 5.71 | 8.63 | 8.68 | 10.38 | 1.44 | 3.06 | 3.72 | 3.89 | 5.9 |
70
+ | ibm-granite/granite-speech-4.1-2b | 5.33 | 8.09 | 8.37 | 9.8 | 1.33 | 2.5 | 3.78 | 3.07 | 5.7 |
71
+ | ibm-granite/granite-speech-4.1-2b-nar | 5.44 | 8.03 | 8.44 | 10.16 | 1.28 | 2.77 | 3.33 | 3.62 | 5.86 |
72
+
73
+
74
+ (Using [speculative decoding](https://github.com/huggingface/open_asr_leaderboard/blob/main/granite/run_eval_speculative.py))
75
+
76
+ **Keyword list biasing accuracy - Keyword F1 score (%, ↑ higher is better):**
77
+
78
+ | Mode | Gigaspeech | LS-C | LS-O | SPGISpeech | VOX | TED_LIUM | Earnings22 | CV-en | CV-de | CV-es | CV-fr | CV-pt |
79
+ | ----------- | ---------- | -------- | -------- | ---------- | -------- | -------- | ---------- | -------- | -------- | -------- | -------- | -------- |
80
+ | Without KWB | 74.2 | 89.1 | 78.2 | 80.8 | 93.9 | 87.9 | 68.8 | 74.6 | 78.5 | 83.1 | 74.5 | 90.0 |
81
+ | With KWB | **84.1** | **96.1** | **93.0** | **92.5** | **96.3** | **94.9** | **81.5** | **91.5** | **92.9** | **93.9** | **90.6** | **95.0** |
82
+
83
+ ### Speaker Attributed ASR
84
+
85
+ **Speaker Attributed ASR performance - WDER (%, ↓ lower is better):**
86
+
87
+ | **Model** | **FISHER** | **CALLHOME English** | **AMI-SDM** | **GALE** |
88
+ | :----------------------------- | :--------: | :------------------: | :---------: | :------: |
89
+ | VibeVoice ASR [1] | 2.8 | 7.1 | 27.4 | 44.8 |
90
+ | **Granite-speech-4.1-2b-plus** | **0.9** | **2.2** | **14.6** | **30.2** |
91
+
92
+ The results are averaged over 2-5 minute speech segments.
93
+
94
+ (The evaluation metric, Word Diarization Error Rate [WDER], is the percentage of words attributed to the wrong speaker.)
95
+
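To make the metric concrete, here is a minimal sketch of the percentage described above. It is a simplification under stated assumptions, not the evaluation code behind the table: the function name is hypothetical, the reference and hypothesis words are assumed to be aligned one-to-one already, and speaker labels are assumed to be mapped to a common numbering.

```python
def word_diarization_error_rate(reference_speakers, predicted_speakers):
    """Percentage of words whose predicted speaker differs from the reference.

    Assumption: both lists hold one speaker label per word, in the same
    word order, with labels already mapped to the same numbering.
    """
    if len(reference_speakers) != len(predicted_speakers):
        raise ValueError("both lists must contain one label per aligned word")
    if not reference_speakers:
        return 0.0
    wrong = sum(r != p for r, p in zip(reference_speakers, predicted_speakers))
    return 100.0 * wrong / len(reference_speakers)


# Eight aligned words; the sixth is attributed to the wrong speaker.
reference = [1, 1, 1, 2, 2, 2, 1, 1]
predicted = [1, 1, 1, 2, 2, 1, 1, 1]
wder = word_diarization_error_rate(reference, predicted)
print(f"WDER = {wder:.1f}%")
```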
96
+ ### Timestamps
97
+
98
+ **Word-level timestamp accuracy - AAS (ms, ↓ lower is better):**
99
+
100
+ | **Model** | **AMI-I** | **AMI-S** | **LS-C** | **LS-O** | **VOX** | **CV** | **MLS** | **TMT** | **En Avg** | **MLS-fr** | **MLS-es** | **MLS-de** | **MLS-pt** | **CV-fr** | **CV-es** | **CV-de** | **CV-pt** | **ML Avg** |
101
+ | :----------------------------- | :-------: | :-------: | :------: | :------: | :------: | :------: | :------: | :------: | :--------: | :--------: | :--------: | :--------: | :--------: | :-------: | :-------: | :-------: | :-------: | :--------: |
102
+ | Qwen3-FA [2] | 48.1 | 82.5 | 27.8 | 29.3 | **41.0** | 48.4 | 34.3 | 29.9 | 42.7 | **38.1** | 27.0 | **31.2** | **26.3** | 30.3 | 40.0 | 29.4 | 34.2 | 33.3 |
103
+ | CrisperWhisper [3] | 55.7 | **64.3** | 35.9 | 40.1 | 47.2 | 97.4 | 46.4 | 42.7 | 53.7 | 35.6 | 28.0 | **31.2** | 36.8 | 62.9 | 58.9 | 60.9 | 83.8 | 50.1 |
104
+ | Canary-v2 [4] | 127.8 | 129.7 | 92.5 | 89.2 | 109.9 | 110.3 | 94.3 | 86.1 | 105.0 | 85.0 | 81.1 | 80.2 | – | 86.8 | 88.5 | 91.5 | – | – |
105
+ | WhisperX [5] | 107.1 | 150.2 | 71.7 | 72.0 | 78.8 | 91.2 | 79.2 | 63.6 | 89.2 | 117.3 | 84.7 | 132.2 | 75.0 | 104.2 | 88.1 | 126.8 | 79.5 | 101.0 |
106
+ | **Granite-speech-4.1-2b-plus** | **43.4** | 69.0 | **11.4** | **14.6** | 80.2 | **43.3** | **24.3** | **24.5** | **38.8** | 45.4 | **23.0** | 41.3 | 47.1 | **18.6** | **19.3** | **19.5** | **24.2** | **29.8** |
107
+
108
+ (The evaluation metric, Accumulated Averaging Shift [AAS], measures the average time shift of each word.)
109
+
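The metric can be illustrated with a small sketch. It assumes that reference and predicted words are already matched one-to-one and that only word end times are compared; the function name is hypothetical, and the published evaluation may differ in such details.

```python
def average_shift_ms(reference_times, predicted_times):
    """Mean absolute difference between matched timestamps, in milliseconds.

    Assumption: both lists are in seconds and aligned word by word.
    """
    if len(reference_times) != len(predicted_times):
        raise ValueError("both lists must contain one timestamp per word")
    if not reference_times:
        return 0.0
    total = sum(abs(r - p) for r, p in zip(reference_times, predicted_times))
    return 1000.0 * total / len(reference_times)


reference = [0.50, 1.00, 1.50]  # reference word end times (s)
predicted = [0.52, 0.97, 1.50]  # predicted word end times (s)
aas = average_shift_ms(reference, predicted)
print(f"average shift = {aas:.1f} ms")  # shifts of 20, 30 and 0 ms
```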
110
+ ## Release Date
111
+
112
+ April 28, 2026
113
+
114
+ ## License
115
+
116
+ [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
117
+
118
+ ## Supported Languages
119
+
120
+ English, French, German, Spanish, Portuguese
121
+
122
+ ## Intended Use
123
+
124
+ The model is intended for enterprise applications that process speech input, especially when a rich transcript with speaker turns and timestamps is desired. In particular, the model is well-suited for English, French, German, Spanish, and Portuguese speech-to-text.
125
+
126
+ ## Usage
127
+
128
+ The Granite Speech model is supported natively in `transformers>=5.8`. Below is a simple example of how to use the different modes of the model.
129
+
130
+ ### Usage with `transformers`
131
+
132
+ First [install pytorch](https://pytorch.org/get-started/locally/).
133
+
134
+ Install [transformers](https://huggingface.co/docs/transformers/installation). The code for the granite-speech-plus model was added recently, so you might need to install from source until the PyPI package is updated.
135
+
136
+ ```shell
137
+ pip install torchaudio datasets accelerate torchcodec
138
+ ```
139
+
140
+ **Setup** (import the required packages):
141
+
142
+ ```python
143
+ import re
144
+ import torch
145
+ from datasets import Audio, load_dataset
146
+ from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor
147
+ ```
148
+
149
+ Load the model and define a general function for decoding the audio:
150
+
151
+ ```python
152
+ MODEL_NAME = "ibm-granite/granite-speech-4.1-2b-plus"
153
+
154
+ device = "cuda" if torch.cuda.is_available() else "cpu"
155
+ processor = AutoProcessor.from_pretrained(MODEL_NAME)
156
+ tokenizer = processor.tokenizer
157
+ model = AutoModelForSpeechSeq2Seq.from_pretrained(MODEL_NAME, device_map=device, dtype=torch.bfloat16)
158
+ model.eval()
159
+
160
+ SYSTEM_PROMPT = "Knowledge Cutoff Date: April 2024.\nToday's Date: December 19, 2024.\nYou are Granite, developed by IBM. You are a helpful AI assistant"
161
+
162
+ @torch.inference_mode()
163
+ def transcribe(audio, prompt, max_new_tokens=2000, prefix_text=None):
164
+     chat = [{"role": "system", "content": SYSTEM_PROMPT}, {"role": "user", "content": prompt}]
165
+     extra = {"prefix_text": prefix_text} if prefix_text is not None else {}
166
+     prompt_text = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True, **extra)
167
+     inputs = processor(prompt_text, audio, device=device, return_tensors="pt").to(device)
168
+     outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False, num_beams=1)
169
+     new_tokens = outputs[0, inputs["input_ids"].shape[-1]:]
170
+     output_text = tokenizer.decode(new_tokens, add_special_tokens=False, skip_special_tokens=True)
171
+     return output_text
172
+ ```
173
+
174
+ Load some example audio data from the AMI dataset:
175
+
176
+ ```python
177
+ SAMPLE_RATE = 16000
178
+
179
+ ds = load_dataset("diarizers-community/ami", "ihm", split="test")
180
+ ds = ds.cast_column("audio", Audio(sampling_rate=SAMPLE_RATE, num_channels=1))
181
+
182
+ TEST_SAMPLE = 0
183
+ START_TIME, END_TIME = 5 * 60, 6 * 60
184
+ audio = ds["audio"][TEST_SAMPLE].get_samples_played_in_range(START_TIME, END_TIME)
185
+ ```
186
+
187
+ **Task 1: ASR** — plain speech-to-text transcription:
188
+
189
+ ```python
190
+ ASR_PROMPT = "<|audio|> can you transcribe the speech into a written format?"
191
+
192
+ asr_text = transcribe(audio.data, ASR_PROMPT)
193
+ print(asr_text)
194
+ ```
195
+
196
+ **Task 2: Speaker Attributed ASR** — transcription with speaker labels:
197
+
198
+ ```python
199
+ SAA_PROMPT = "<|audio|> Speaker attribution: Transcribe and denote who is speaking by adding [Speaker 1]: and [Speaker 2]: tags before speaker turns."
200
+
201
+ saa_text = transcribe(audio.data, SAA_PROMPT)
202
+ for segment in re.split(r"(\[Speaker \d+\]:)", saa_text):
203
+     print(segment.strip())
204
+ ```
205
+
206
+ **Task 3: Word-level timestamps** — transcription with per-word timing:
207
+
208
+ The timestamps are given in centiseconds and are modulo 1000 (=10 seconds)
209
+ so we need to unwrap them by adding multiples of 10 seconds.
210
+
211
+ ```python
212
+ TS_PROMPT = "<|audio|> Timestamps: Transcribe the speech. After each word, add a timestamp tag showing the end time in centiseconds, e.g. hello [T:45] world [T:82]"
213
+
214
+ ts_text = transcribe(audio.data, TS_PROMPT, max_new_tokens=10000)
215
+ ts_words = re.split(r"\[T:(\d+)\]", ts_text)
216
+ last_word_end_time = 0
217
+ offset_time = 0
218
+ for word, ts in zip(ts_words[::2], ts_words[1::2]):
219
+     word_end_time = float(ts) / 100
220
+     while word_end_time + offset_time < last_word_end_time:
221
+         offset_time += 10
222
+     last_word_end_time = word_end_time + offset_time
223
+     print(f"{word}\t{last_word_end_time:.2f}s")
224
+ ```
225
+
226
+ **Task 4: Incremental decoding** — transcribe segments while accumulating audio context:
227
+
228
+ ```python
229
+ NUM_SEGMENTS = 3
230
+ previous_transcript = ""
231
+ all_audio = None
232
+
233
+ for k in range(NUM_SEGMENTS):
234
+     t1 = START_TIME + (END_TIME - START_TIME) * k / NUM_SEGMENTS
235
+     t2 = START_TIME + (END_TIME - START_TIME) * (k + 1) / NUM_SEGMENTS
236
+     new_audio = ds["audio"][TEST_SAMPLE].get_samples_played_in_range(t1, t2)
237
+     all_audio = new_audio.data if all_audio is None else torch.cat([all_audio, new_audio.data], dim=-1)
238
+     saa_text = transcribe(all_audio, SAA_PROMPT, prefix_text=previous_transcript)
239
+     print(f"{t1:06.2f}-{t2:06.2f}:\t{saa_text}")
240
+     previous_transcript = (previous_transcript + " " + saa_text).strip()
241
+ ```
242
+
243
+ ## Model Architecture
244
+
245
+ The model shares the same architecture as the [Granite-Speech-4.1-2B](https://huggingface.co/ibm-granite/granite-speech-4.1-2b) model.
246
+
247
+ ## Training Data
248
+
249
+ The model was trained on the same datasets as [Granite-Speech-4.1-2B](https://huggingface.co/ibm-granite/granite-speech-4.1-2b).
250
+
251
+ Additional training data for SAA was created using audio segments from datasets that have speaker identification (e.g. Multilingual-Librispeech). Segments with alternating speakers were concatenated to create a long multi-speaker sample.
252
+
253
+
254
+ ### Training Data for Timestamps
255
+
256
+ Word-level timestamping capabilities are achieved by using a combination of publicly available speech corpora: LibriSpeech, MLS (en, fr, de, pt, es), CommonVoice (en, fr, de, pt, es), VoxPopuli (en, fr, de, es), AMI-IHM, Switchboard, TIMIT and YODAS. For AMI-IHM, Switchboard and TIMIT, we use the available timestamp annotations. For all other datasets, we obtain word-level alignments using the Montreal Forced Aligner (MFA), a GMM-HMM based forced alignment tool. We also use MFA to insert silence boundaries into the manually annotated datasets.
257
+
258
+ To ensure high-quality training data, we validate the MFA-derived alignments using forced alignments with our CTC-based speech encoder. We compute the Accumulated Average Shift (AAS), the mean absolute error between timestamps in milliseconds, for the CTC and MFA alignments and retain only samples with the lowest alignment error: the top 95% for English and top 70% for non-English data. For the larger datasets (YODAS and MLS-en), we cap the training data at 4M and 5M samples, respectively.
259
+
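The filtering step described above amounts to ranking samples by alignment error and keeping a fixed fraction of the best ones. The sketch below illustrates that idea only; the field names `id` and `aas_ms`, the helper name, and the error values are all hypothetical, and the actual data pipeline is not shown in this document.

```python
def keep_lowest_error(samples, keep_fraction):
    """Keep the given fraction of samples with the lowest alignment error.

    Hypothetical record layout: each sample is a dict with an "aas_ms" field
    holding the shift between the CTC and MFA alignments.
    """
    ranked = sorted(samples, key=lambda s: s["aas_ms"])
    # Small epsilon guards against floating-point products such as 6.999...
    n_keep = int(len(ranked) * keep_fraction + 1e-9)
    return ranked[:n_keep]


# Made-up error values, for illustration only.
errors = [12.0, 85.0, 30.0, 7.5, 150.0, 22.0, 41.0, 9.0, 64.0, 18.0]
samples = [{"id": i, "aas_ms": e} for i, e in enumerate(errors)]

kept = keep_lowest_error(samples, 0.70)  # e.g. the non-English setting
print(len(kept), "samples kept; largest retained error:", kept[-1]["aas_ms"], "ms")
```

Keeping a fraction rather than applying a fixed error threshold means the cutoff adapts to each dataset's error distribution.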
260
+ Additional training data containing long audio samples with timestamps were generated by concatenation of short segments.
261
+
262
+ The model was trained on audio samples up to 10 minutes for ASR and SAA, and up to 5 minutes for timestamps.
263
+
264
+ ## Infrastructure
265
+
266
+ We train Granite Speech using IBM's supercomputing cluster, Blue Vela, which is outfitted with NVIDIA H100 GPUs. This cluster provides a scalable
267
+ and efficient infrastructure for training our models over thousands of GPUs. The training of this particular model was completed in about 5 days on 32
268
+ H100 GPUs.
269
+
270
+ ## Ethical Considerations and Limitations
271
+
272
+ The use of Large Speech and Language Models can trigger certain risks and ethical considerations. Although our alignment processes include safety considerations,
273
+ the model may in some cases produce inaccurate, biased, offensive or unwanted responses to user prompts. Additionally, whether smaller models may exhibit increased
274
+ susceptibility to hallucination in generation scenarios due to their reduced sizes, which could limit their ability to generate coherent and contextually accurate responses, remains uncertain.
275
+ This aspect is currently an active area of research, and we anticipate more rigorous exploration, comprehension, and mitigations in this domain.
276
+
277
+ IBM recommends using this model for automatic speech recognition and translation tasks. The model's design improves safety by limiting how audio inputs can influence the system.
278
+ If an unfamiliar or malformed prompt is received, the model simply ignores it and performs transcription, which is the default fallback mode.
279
+ This minimizes the risk of adversarial inputs, unlike integrated models that directly interpret audio and may be more exposed to such attacks. Note that more general speech tasks may pose higher inherent risks of triggering unwanted outputs.
280
+
281
+ To enhance safety, we recommend using Granite-Speech-4.1-2B-Plus alongside Granite Guardian. Granite Guardian is a fine-tuned instruct model designed to detect and flag risks in prompts and responses across key dimensions outlined in the IBM AI Risk Atlas.
282
+
283
+ ## Resources
284
+
285
+ - 📄 Read the papers:
286
+ - [Speaker Attributed Automatic Speech Recognition Using Speech Aware LLMS](https://arxiv.org/abs/2604.11269)
287
+ - [In-Sync: Adaptation of Speech Aware Large Language Models for ASR with Word Level Timestamp Predictions](https://arxiv.org/abs/2604.22817)
288
+ - [Contextual Biasing for ASR in Speech LLM with Common Word Cues and Bias Word Position Prediction](https://arxiv.org/abs/2604.12398)
289
+ - [Granite-speech: open-source speech-aware LLMs with strong English ASR capabilities](https://arxiv.org/abs/2505.08699)
290
+ - [Self-Speculative Decoding for LLM-based ASR with CTC Encoder Drafts](https://arxiv.org/abs/2603.11243)
291
+ - [NLE: Non-autoregressive LLM-based ASR by Transcript Editing](https://arxiv.org/abs/2603.08397)
292
+ - ⭐️ Learn about the latest updates with Granite: https://www.ibm.com/granite
293
+ - 🚀 Get started with tutorials, best practices, and prompt engineering advice: https://www.ibm.com/granite/docs/
294
+ - 💡 Learn about the latest Granite learning resources: https://ibm.biz/granite-learning-resources
295
+
296
+ ## References
297
+
298
+ [1] VibeVoice-ASR (Transformers-compatible version). Available online: https://huggingface.co/microsoft/VibeVoice-ASR-HF.
299
+
300
+ [2] X. Shi et al., "Qwen3-ASR technical report," 2026. arXiv
301
+
302
+ [3] M. Zusag, L. Wagner, and B. Thallinger, "CrisperWhisper: Accurate timestamps on verbatim speech transcriptions," in Proc. Interspeech, 2024.
303
+
304
+ [4] M. Sekoyan, N. R. Koluguri, N. Tadevosyan, P. Zelasko, T. Bartley, N. Karpov, J. Balam, and B. Ginsburg, "Canary-1B-v2 & Parakeet-TDT-0.6B-v3: Efficient and high-performance models for multilingual ASR and AST," 2025. arXiv
305
+
306
+ [5] M. Bain, J. Huh, T. Han, and A. Zisserman, "WhisperX: Time-accurate speech transcription of long-form audio," 2023. arXiv
307
+
chat_template.jinja ADDED
@@ -0,0 +1,121 @@
1
+ {%- set tools_system_message_prefix = 'You are a helpful assistant with access to the following tools. You may call one or more tools to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>' %}
2
+ {%- set tools_system_message_suffix = '\n</tools>\n\nFor each tool call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call>. If a tool does not exist in the provided list of tools, notify the user that you do not have the ability to fulfill the request.' %}
3
+ {%- set documents_system_message_prefix = 'You are a helpful assistant with access to the following documents. You may use one or more documents to assist with the user query.\n\nYou are given a list of documents within <documents></documents> XML tags:\n<documents>' %}
4
+ {%- set documents_system_message_suffix = '\n</documents>\n\nWrite the response to the user\'s input by strictly aligning with the facts in the provided documents. If the information needed to answer the question is not available in the documents, inform the user that the question cannot be answered based on the available data.' %}
5
+ {%- set g4_default_system_message = 'You are a helpful assistant. Please ensure responses are professional, accurate, and safe.' %}
6
+ {%- if available_tools is defined and available_tools %}
7
+ {%- set tools = available_tools %}
8
+ {%- endif %}
9
+ {%- set ns = namespace(tools_system_message=tools_system_message_prefix,
10
+ documents_system_message=documents_system_message_prefix,
11
+ default_system_message=g4_default_system_message,
12
+ system_message=''
13
+ ) %}
14
+ {%- if tools %}
15
+ {%- for tool in tools %}
16
+ {%- set ns.tools_system_message = ns.tools_system_message + '\n' + (tool | tojson) %}
17
+ {%- endfor %}
18
+ {%- set ns.tools_system_message = ns.tools_system_message + tools_system_message_suffix %}
19
+ {%- else %}
20
+ {%- set ns.tools_system_message = '' %}
21
+ {%- endif %}
22
+ {%- if documents %}
23
+ {%- for document in documents %}
24
+ {%- set ns.documents_system_message = ns.documents_system_message + '\n' + (document | tojson) %}
25
+ {%- endfor %}
26
+ {%- set ns.documents_system_message = ns.documents_system_message + documents_system_message_suffix %}
27
+ {%- else %}
28
+ {%- set ns.documents_system_message = '' %}
29
+ {%- endif %}
30
+ {%- if messages[0].role == 'system' %}
31
+ {%- if messages[0].content is string %}
32
+ {%- set ns.system_message = messages[0].content %}
33
+ {%- elif messages[0].content is iterable %}
34
+ {%- for entry in messages[0].content %}
35
+ {%- if entry.type== 'text' %}
36
+ {%- if ns.system_message != '' %}
37
+ {%- set ns.system_message = ns.system_message + '\n' %}
38
+ {%- endif %}
39
+ {%- set ns.system_message = ns.system_message + entry.text %}
40
+ {%- endif %}
41
+ {%- endfor %}
42
+ {%- endif %}
43
+ {%- if tools and documents %}
44
+ {%- set ns.system_message = ns.system_message + '\n\n' + ns.tools_system_message + '\n\n' + ns.documents_system_message %}
45
+ {%- elif tools %}
46
+ {%- set ns.system_message = ns.system_message + '\n\n' + ns.tools_system_message %}
47
+ {%- elif documents %}
48
+ {%- set ns.system_message = ns.system_message + '\n\n' + ns.documents_system_message %}
49
+ {%- endif %}
50
+ {%- else %}
51
+ {%- if tools and documents %}
52
+ {%- set ns.system_message = ns.tools_system_message + '\n\n' + ns.documents_system_message %}
53
+ {%- elif tools %}
54
+ {%- set ns.system_message = ns.tools_system_message %}
55
+ {%- elif documents %}
56
+ {%- set ns.system_message = ns.documents_system_message %}
57
+ {%- endif %}
58
+ {%- endif %}
59
+ {%- if ns.system_message %}
60
+ {{- '<|start_of_role|>system<|end_of_role|>' + ns.system_message + '<|end_of_text|>\n' }}
61
+ {%- else %}
62
+ {{- '<|start_of_role|>system<|end_of_role|>' + ns.default_system_message + '<|end_of_text|>\n' }}
63
+ {%- endif %}
64
+ {%- for message in messages %}
65
+ {%- set content = namespace(val='') %}
66
+ {%- if message.content is string %}
67
+ {%- set content.val = message.content %}
68
+ {%- else %}
69
+ {%- if message.content is iterable %}
70
+ {%- for entry in message.content %}
71
+ {%- if entry.type== 'text' %}
72
+ {%- if content.val != '' %}
73
+ {%- set content.val = content.val + '\n' %}
74
+ {%- endif %}
75
+ {%- set content.val = content.val + entry.text %}
76
+ {%- endif %}
77
+ {%- endfor %}
78
+ {%- endif %}
79
+ {%- endif %}
80
+ {%- if (message.role == 'user') or (message.role == 'system' and not loop.first) %}
81
+ {{- '<|start_of_role|>' + message.role + '<|end_of_role|>' + content.val + '<|end_of_text|>\n' }}
82
+ {%- elif message.role == 'assistant' %}
83
+ {{- '<|start_of_role|>' + message.role + '<|end_of_role|>' + content.val }}
84
+ {%- if message.tool_calls %}
85
+ {%- for tool_call in message.tool_calls %}
86
+ {%- if (loop.first and content.val) or (not loop.first) %}
87
+ {{- '\n' }}
88
+ {%- endif %}
89
+ {%- if tool_call.function %}
90
+ {%- set tool_call = tool_call.function %}
91
+ {%- endif %}
92
+ {{- '<tool_call>\n{"name": "' }}
93
+ {{- tool_call.name }}
94
+ {{- '", "arguments": ' }}
95
+ {%- if tool_call.arguments is string %}
96
+ {{- tool_call.arguments }}
97
+ {%- else %}
98
+ {{- tool_call.arguments | tojson }}
99
+ {%- endif %}
100
+ {{- '}\n</tool_call>' }}
101
+ {%- endfor %}
102
+ {%- endif %}
103
+ {{- '<|end_of_text|>\n' }}
104
+ {%- elif message.role == 'tool' %}
105
+ {%- if loop.first or (messages[loop.index0 - 1].role != 'tool') %}
106
+ {{- '<|start_of_role|>user<|end_of_role|>' }}
107
+ {%- endif %}
108
+ {{- '\n<tool_response>\n' }}
109
+ {{- content.val }}
110
+ {{- '\n</tool_response>' }}
111
+ {%- if loop.last or (messages[loop.index0 + 1].role != 'tool') %}
112
+ {{- '<|end_of_text|>\n' }}
113
+ {%- endif %}
114
+ {%- endif %}
115
+ {%- endfor %}
116
+ {%- if add_generation_prompt %}
117
+ {{- '<|start_of_role|>assistant<|end_of_role|>' }}
118
+ {%- if prefix_text is defined and prefix_text %}
119
+ {{- prefix_text }}
120
+ {%- endif %}
121
+ {%- endif %}
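For the simple case used in the README (one system message, one user message, no tools or documents), the template above reduces to a short string concatenation. The following Python sketch mirrors that case by hand to show where `prefix_text` ends up. It is an illustration, not a replacement for `apply_chat_template`: the function name is hypothetical, and the sample messages are made up.

```python
DEFAULT_SYSTEM = (
    "You are a helpful assistant. Please ensure responses are "
    "professional, accurate, and safe."
)


def render_simple_prompt(system, user, prefix_text=None):
    """Mirror the template for plain-string system and user messages only.

    Hypothetical helper: tools, documents, assistant turns, and list-style
    message content handled by the real template are not covered here.
    """
    out = "<|start_of_role|>system<|end_of_role|>"
    out += (system or DEFAULT_SYSTEM) + "<|end_of_text|>\n"
    out += "<|start_of_role|>user<|end_of_role|>" + user + "<|end_of_text|>\n"
    # Generation prompt: the assistant turn is opened but not closed, so the
    # model continues directly after any prefix text.
    out += "<|start_of_role|>assistant<|end_of_role|>"
    if prefix_text:
        out += prefix_text
    return out


prompt = render_simple_prompt(
    "You are Granite, developed by IBM.",
    "<|audio|> can you transcribe the speech into a written format?",
    prefix_text="[Speaker 1]: hello how are you",
)
print(prompt)
```

Because the prefix is appended after the assistant role marker with no end-of-text token, generation resumes from the end of the earlier transcript.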
config.json ADDED
@@ -0,0 +1,87 @@
1
+ {
2
+ "architectures": [
3
+ "GraniteSpeechPlusForConditionalGeneration"
4
+ ],
5
+ "audio_token_index": 100352,
6
+ "downsample_rate": 5,
7
+ "dtype": "bfloat16",
8
+ "encoder_config": {
9
+ "cat_hidden_layers": [
10
+ 3
11
+ ],
12
+ "context_size": 200,
13
+ "conv_expansion_factor": 2,
14
+ "conv_kernel_size": 15,
15
+ "dim_head": 128,
16
+ "dropout": 0.1,
17
+ "feedforward_mult": 4,
18
+ "hidden_dim": 1024,
19
+ "input_dim": 160,
20
+ "max_pos_emb": 512,
21
+ "model_type": "granite_speech_plus_encoder",
22
+ "num_heads": 8,
23
+ "num_layers": 16,
24
+ "output_dim": 348
25
+ },
26
+ "has_lora_adapter": false,
27
+ "initializer_range": 0.02,
28
+ "model_type": "granite_speech_plus",
29
+ "projector_config": {
30
+ "_attn_implementation_autoset": true,
31
+ "attention_probs_dropout_prob": 0.1,
32
+ "cross_attention_frequency": 1,
33
+ "encoder_hidden_size": 2048,
34
+ "hidden_act": "gelu",
35
+ "hidden_dropout_prob": 0.1,
36
+ "hidden_size": 1024,
37
+ "initializer_range": 0.02,
38
+ "intermediate_size": 4096,
39
+ "layer_norm_eps": 1e-12,
40
+ "max_position_embeddings": 2048,
41
+ "model_type": "blip_2_qformer",
42
+ "num_attention_heads": 16,
43
+ "num_hidden_layers": 2,
44
+ "pad_token_id": 0,
45
+ "position_embedding_type": "absolute",
46
+ "use_qformer_text_input": false,
47
+ "vocab_size": 30522
48
+ },
49
+ "text_config": {
50
+ "_name_or_path": "/proj/speech/saon/slam-llm/29.2-c/granite-4.0-1b-base",
51
+ "architectures": [
52
+ "GraniteForCausalLM"
53
+ ],
54
+ "attention_bias": false,
55
+ "attention_dropout": 0.0,
56
+ "attention_multiplier": 0.0078125,
57
+ "bos_token_id": 100257,
58
+ "dtype": "float32",
59
+ "embedding_multiplier": 12,
60
+ "eos_token_id": 100257,
61
+ "hidden_act": "silu",
62
+ "hidden_size": 2048,
63
+ "initializer_range": 0.1,
64
+ "intermediate_size": 4096,
65
+ "logits_scaling": 8,
66
+ "max_position_embeddings": 4096,
67
+ "mlp_bias": false,
68
+ "model_type": "granite",
69
+ "num_attention_heads": 16,
70
+ "num_hidden_layers": 40,
71
+ "num_key_value_heads": 4,
72
+ "pad_token_id": 100256,
73
+ "residual_multiplier": 0.22,
74
+ "rms_norm_eps": 1e-05,
75
+ "rope_parameters": {
76
+ "rope_theta": 10000,
77
+ "rope_type": "default"
78
+ },
79
+ "tie_word_embeddings": true,
80
+ "use_cache": true,
81
+ "vocab_size": 100353,
82
+ "rope_theta": 10000,
83
+ "rope_type": "default"
84
+ },
85
+ "transformers_version": "5.6.0.dev0",
86
+ "window_size": 15
87
+ }
generation_config.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "_from_model_config": true,
3
+ "bos_token_id": 100257,
4
+ "eos_token_id": 100257,
5
+ "output_attentions": false,
6
+ "output_hidden_states": false,
7
+ "pad_token_id": 100256,
8
+ "transformers_version": "5.6.0.dev0",
9
+ "use_cache": true
10
+ }
model-00001-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:af45105ba955e3a796f39c3cddc6feae9fb4696b46e99f18355df9d7c8bdb0ba
+ size 1992505016
model-00002-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:172bddcb0b9fe4e59b4302eecc478bbe5fb477759b80a52e476b43b55c9493a7
+ size 1993777408
model-00003-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:12994776f7c9e24cda3339ee8a6ca6a07600f5ae4a4c38d66703dcefb8ff4624
+ size 237587992
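The three pointer files above follow the Git LFS pointer layout (`version`, `oid`, `size`). A minimal sketch, assuming the pointer text is available as plain strings, parses them and compares the summed shard sizes with the `total_size` and `total_parameters` values reported in `model.safetensors.index.json` in this commit. The helper name `parse_lfs_pointer` is hypothetical. The small gap between the on-disk sum and `total_size` is presumably per-file header overhead, which this diff does not confirm.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict; 'size' becomes an int."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])
    return fields


# Pointer contents copied from this diff.
pointer_texts = [
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:af45105ba955e3a796f39c3cddc6feae9fb4696b46e99f18355df9d7c8bdb0ba\n"
    "size 1992505016\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:172bddcb0b9fe4e59b4302eecc478bbe5fb477759b80a52e476b43b55c9493a7\n"
    "size 1993777408\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:12994776f7c9e24cda3339ee8a6ca6a07600f5ae4a4c38d66703dcefb8ff4624\n"
    "size 237587992\n",
]

# Metadata values copied from model.safetensors.index.json in this diff.
index_total_size = 4223757112
index_total_parameters = 2111812956

shards = [parse_lfs_pointer(t) for t in pointer_texts]
on_disk_total = sum(s["size"] for s in shards)
overhead = on_disk_total - index_total_size
bytes_per_param = index_total_size / index_total_parameters

print("sum of shard sizes:", on_disk_total)
print("bytes beyond index total_size:", overhead)
print("approx. bytes per parameter:", round(bytes_per_param, 2))
```

A ratio near 2 bytes per parameter suggests the tensors are mostly stored in a 16-bit type, though the index metadata alone does not say which one.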
model.safetensors.index.json ADDED
@@ -0,0 +1,961 @@
1
+ {
2
+ "metadata": {
3
+ "total_parameters": 2111812956,
4
+ "total_size": 4223757112
5
+ },
6
+ "weight_map": {
7
+ "encoder.input_linear.bias": "model-00002-of-00003.safetensors",
8
+ "encoder.input_linear.weight": "model-00002-of-00003.safetensors",
9
+ "encoder.layers.0.attn.pre_norm.bias": "model-00002-of-00003.safetensors",
10
+ "encoder.layers.0.attn.pre_norm.weight": "model-00002-of-00003.safetensors",
11
+ "encoder.layers.0.attn.rel_pos_emb.weight": "model-00002-of-00003.safetensors",
12
+ "encoder.layers.0.attn.to_kv.weight": "model-00002-of-00003.safetensors",
13
+ "encoder.layers.0.attn.to_out.bias": "model-00002-of-00003.safetensors",
14
+ "encoder.layers.0.attn.to_out.weight": "model-00002-of-00003.safetensors",
15
+ "encoder.layers.0.attn.to_q.weight": "model-00002-of-00003.safetensors",
16
+ "encoder.layers.0.conv.batch_norm.bias": "model-00002-of-00003.safetensors",
17
+ "encoder.layers.0.conv.batch_norm.num_batches_tracked": "model-00002-of-00003.safetensors",
18
+ "encoder.layers.0.conv.batch_norm.running_mean": "model-00002-of-00003.safetensors",
19
+ "encoder.layers.0.conv.batch_norm.running_var": "model-00002-of-00003.safetensors",
20
+ "encoder.layers.0.conv.batch_norm.weight": "model-00002-of-00003.safetensors",
21
+ "encoder.layers.0.conv.depth_conv.conv.weight": "model-00002-of-00003.safetensors",
22
+ "encoder.layers.0.conv.down_conv.bias": "model-00002-of-00003.safetensors",
23
+ "encoder.layers.0.conv.down_conv.weight": "model-00002-of-00003.safetensors",
24
+ "encoder.layers.0.conv.norm.bias": "model-00002-of-00003.safetensors",
25
+ "encoder.layers.0.conv.norm.weight": "model-00002-of-00003.safetensors",
26
+ "encoder.layers.0.conv.up_conv.bias": "model-00002-of-00003.safetensors",
27
+ "encoder.layers.0.conv.up_conv.weight": "model-00002-of-00003.safetensors",
28
+ "encoder.layers.0.ff1.down_proj.bias": "model-00002-of-00003.safetensors",
29
+ "encoder.layers.0.ff1.down_proj.weight": "model-00002-of-00003.safetensors",
30
+ "encoder.layers.0.ff1.pre_norm.bias": "model-00002-of-00003.safetensors",
31
+ "encoder.layers.0.ff1.pre_norm.weight": "model-00002-of-00003.safetensors",
32
+ "encoder.layers.0.ff1.up_proj.bias": "model-00002-of-00003.safetensors",
33
+ "encoder.layers.0.ff1.up_proj.weight": "model-00002-of-00003.safetensors",
34
+ "encoder.layers.0.ff2.down_proj.bias": "model-00002-of-00003.safetensors",
35
+ "encoder.layers.0.ff2.down_proj.weight": "model-00002-of-00003.safetensors",
36
+ "encoder.layers.0.ff2.pre_norm.bias": "model-00002-of-00003.safetensors",
37
+ "encoder.layers.0.ff2.pre_norm.weight": "model-00002-of-00003.safetensors",
38
+ "encoder.layers.0.ff2.up_proj.bias": "model-00002-of-00003.safetensors",
39
+ "encoder.layers.0.ff2.up_proj.weight": "model-00002-of-00003.safetensors",
40
+ "encoder.layers.0.post_norm.bias": "model-00002-of-00003.safetensors",
41
+ "encoder.layers.0.post_norm.weight": "model-00002-of-00003.safetensors",
42
+ "encoder.layers.1.attn.pre_norm.bias": "model-00002-of-00003.safetensors",
43
+ "encoder.layers.1.attn.pre_norm.weight": "model-00002-of-00003.safetensors",
44
+ "encoder.layers.1.attn.rel_pos_emb.weight": "model-00002-of-00003.safetensors",
45
+ "encoder.layers.1.attn.to_kv.weight": "model-00002-of-00003.safetensors",
46
+ "encoder.layers.1.attn.to_out.bias": "model-00002-of-00003.safetensors",
47
+ "encoder.layers.1.attn.to_out.weight": "model-00002-of-00003.safetensors",
48
+ "encoder.layers.1.attn.to_q.weight": "model-00002-of-00003.safetensors",
49
+ "encoder.layers.1.conv.batch_norm.bias": "model-00002-of-00003.safetensors",
50
+ "encoder.layers.1.conv.batch_norm.num_batches_tracked": "model-00002-of-00003.safetensors",
51
+ "encoder.layers.1.conv.batch_norm.running_mean": "model-00002-of-00003.safetensors",
52
+ "encoder.layers.1.conv.batch_norm.running_var": "model-00002-of-00003.safetensors",
53
+ "encoder.layers.1.conv.batch_norm.weight": "model-00002-of-00003.safetensors",
54
+ "encoder.layers.1.conv.depth_conv.conv.weight": "model-00002-of-00003.safetensors",
55
+ "encoder.layers.1.conv.down_conv.bias": "model-00002-of-00003.safetensors",
56
+ "encoder.layers.1.conv.down_conv.weight": "model-00002-of-00003.safetensors",
57
+ "encoder.layers.1.conv.norm.bias": "model-00002-of-00003.safetensors",
58
+ "encoder.layers.1.conv.norm.weight": "model-00002-of-00003.safetensors",
59
+ "encoder.layers.1.conv.up_conv.bias": "model-00002-of-00003.safetensors",
60
+ "encoder.layers.1.conv.up_conv.weight": "model-00002-of-00003.safetensors",
61
+ "encoder.layers.1.ff1.down_proj.bias": "model-00002-of-00003.safetensors",
62
+ "encoder.layers.1.ff1.down_proj.weight": "model-00002-of-00003.safetensors",
63
+ "encoder.layers.1.ff1.pre_norm.bias": "model-00002-of-00003.safetensors",
64
+ "encoder.layers.1.ff1.pre_norm.weight": "model-00002-of-00003.safetensors",
65
+ "encoder.layers.1.ff1.up_proj.bias": "model-00002-of-00003.safetensors",
66
+ "encoder.layers.1.ff1.up_proj.weight": "model-00002-of-00003.safetensors",
67
+ "encoder.layers.1.ff2.down_proj.bias": "model-00002-of-00003.safetensors",
68
+ "encoder.layers.1.ff2.down_proj.weight": "model-00002-of-00003.safetensors",
69
+ "encoder.layers.1.ff2.pre_norm.bias": "model-00002-of-00003.safetensors",
70
+ "encoder.layers.1.ff2.pre_norm.weight": "model-00002-of-00003.safetensors",
71
+ "encoder.layers.1.ff2.up_proj.bias": "model-00002-of-00003.safetensors",
72
+ "encoder.layers.1.ff2.up_proj.weight": "model-00002-of-00003.safetensors",
73
+ "encoder.layers.1.post_norm.bias": "model-00002-of-00003.safetensors",
74
+ "encoder.layers.1.post_norm.weight": "model-00002-of-00003.safetensors",
75
+ "encoder.layers.10.attn.pre_norm.bias": "model-00002-of-00003.safetensors",
76
+ "encoder.layers.10.attn.pre_norm.weight": "model-00002-of-00003.safetensors",
77
+ "encoder.layers.10.attn.rel_pos_emb.weight": "model-00002-of-00003.safetensors",
78
+ "encoder.layers.10.attn.to_kv.weight": "model-00002-of-00003.safetensors",
79
+ "encoder.layers.10.attn.to_out.bias": "model-00002-of-00003.safetensors",
80
+ "encoder.layers.10.attn.to_out.weight": "model-00002-of-00003.safetensors",
81
+ "encoder.layers.10.attn.to_q.weight": "model-00002-of-00003.safetensors",
82
+ "encoder.layers.10.conv.batch_norm.bias": "model-00002-of-00003.safetensors",
83
+ "encoder.layers.10.conv.batch_norm.num_batches_tracked": "model-00002-of-00003.safetensors",
84
+ "encoder.layers.10.conv.batch_norm.running_mean": "model-00002-of-00003.safetensors",
85
+ "encoder.layers.10.conv.batch_norm.running_var": "model-00002-of-00003.safetensors",
86
+ "encoder.layers.10.conv.batch_norm.weight": "model-00002-of-00003.safetensors",
87
+ "encoder.layers.10.conv.depth_conv.conv.weight": "model-00002-of-00003.safetensors",
88
+ "encoder.layers.10.conv.down_conv.bias": "model-00002-of-00003.safetensors",
89
+ "encoder.layers.10.conv.down_conv.weight": "model-00002-of-00003.safetensors",
90
+ "encoder.layers.10.conv.norm.bias": "model-00002-of-00003.safetensors",
91
+ "encoder.layers.10.conv.norm.weight": "model-00002-of-00003.safetensors",
92
+ "encoder.layers.10.conv.up_conv.bias": "model-00002-of-00003.safetensors",
93
+ "encoder.layers.10.conv.up_conv.weight": "model-00002-of-00003.safetensors",
94
+ "encoder.layers.10.ff1.down_proj.bias": "model-00002-of-00003.safetensors",
95
+ "encoder.layers.10.ff1.down_proj.weight": "model-00002-of-00003.safetensors",
96
+ "encoder.layers.10.ff1.pre_norm.bias": "model-00002-of-00003.safetensors",
97
+ "encoder.layers.10.ff1.pre_norm.weight": "model-00002-of-00003.safetensors",
98
+ "encoder.layers.10.ff1.up_proj.bias": "model-00002-of-00003.safetensors",
99
+ "encoder.layers.10.ff1.up_proj.weight": "model-00002-of-00003.safetensors",
100
+ "encoder.layers.10.ff2.down_proj.bias": "model-00002-of-00003.safetensors",
101
+ "encoder.layers.10.ff2.down_proj.weight": "model-00002-of-00003.safetensors",
102
+ "encoder.layers.10.ff2.pre_norm.bias": "model-00002-of-00003.safetensors",
103
+ "encoder.layers.10.ff2.pre_norm.weight": "model-00002-of-00003.safetensors",
104
+ "encoder.layers.10.ff2.up_proj.bias": "model-00002-of-00003.safetensors",
105
+ "encoder.layers.10.ff2.up_proj.weight": "model-00002-of-00003.safetensors",
106
+ "encoder.layers.10.post_norm.bias": "model-00002-of-00003.safetensors",
107
+ "encoder.layers.10.post_norm.weight": "model-00002-of-00003.safetensors",
108
+ "encoder.layers.11.attn.pre_norm.bias": "model-00002-of-00003.safetensors",
109
+ "encoder.layers.11.attn.pre_norm.weight": "model-00002-of-00003.safetensors",
110
+ "encoder.layers.11.attn.rel_pos_emb.weight": "model-00002-of-00003.safetensors",
111
+ "encoder.layers.11.attn.to_kv.weight": "model-00002-of-00003.safetensors",
112
+ "encoder.layers.11.attn.to_out.bias": "model-00002-of-00003.safetensors",
113
+ "encoder.layers.11.attn.to_out.weight": "model-00002-of-00003.safetensors",
114
+ "encoder.layers.11.attn.to_q.weight": "model-00002-of-00003.safetensors",
115
+ "encoder.layers.11.conv.batch_norm.bias": "model-00002-of-00003.safetensors",
116
+ "encoder.layers.11.conv.batch_norm.num_batches_tracked": "model-00002-of-00003.safetensors",
117
+ "encoder.layers.11.conv.batch_norm.running_mean": "model-00002-of-00003.safetensors",
118
+ "encoder.layers.11.conv.batch_norm.running_var": "model-00002-of-00003.safetensors",
119
+ "encoder.layers.11.conv.batch_norm.weight": "model-00002-of-00003.safetensors",
120
+ "encoder.layers.11.conv.depth_conv.conv.weight": "model-00002-of-00003.safetensors",
121
+ "encoder.layers.11.conv.down_conv.bias": "model-00002-of-00003.safetensors",
122
+ "encoder.layers.11.conv.down_conv.weight": "model-00002-of-00003.safetensors",
123
+ "encoder.layers.11.conv.norm.bias": "model-00002-of-00003.safetensors",
124
+ "encoder.layers.11.conv.norm.weight": "model-00002-of-00003.safetensors",
125
+ "encoder.layers.11.conv.up_conv.bias": "model-00002-of-00003.safetensors",
126
+ "encoder.layers.11.conv.up_conv.weight": "model-00002-of-00003.safetensors",
127
+ "encoder.layers.11.ff1.down_proj.bias": "model-00002-of-00003.safetensors",
128
+ "encoder.layers.11.ff1.down_proj.weight": "model-00002-of-00003.safetensors",
129
+ "encoder.layers.11.ff1.pre_norm.bias": "model-00002-of-00003.safetensors",
130
+ "encoder.layers.11.ff1.pre_norm.weight": "model-00002-of-00003.safetensors",
131
+ "encoder.layers.11.ff1.up_proj.bias": "model-00002-of-00003.safetensors",
132
+ "encoder.layers.11.ff1.up_proj.weight": "model-00002-of-00003.safetensors",
133
+ "encoder.layers.11.ff2.down_proj.bias": "model-00002-of-00003.safetensors",
134
+ "encoder.layers.11.ff2.down_proj.weight": "model-00002-of-00003.safetensors",
135
+ "encoder.layers.11.ff2.pre_norm.bias": "model-00002-of-00003.safetensors",
136
+ "encoder.layers.11.ff2.pre_norm.weight": "model-00002-of-00003.safetensors",
137
+ "encoder.layers.11.ff2.up_proj.bias": "model-00002-of-00003.safetensors",
138
+ "encoder.layers.11.ff2.up_proj.weight": "model-00002-of-00003.safetensors",
139
+ "encoder.layers.11.post_norm.bias": "model-00002-of-00003.safetensors",
140
+ "encoder.layers.11.post_norm.weight": "model-00002-of-00003.safetensors",
141
+ "encoder.layers.12.attn.pre_norm.bias": "model-00002-of-00003.safetensors",
142
+ "encoder.layers.12.attn.pre_norm.weight": "model-00002-of-00003.safetensors",
143
+ "encoder.layers.12.attn.rel_pos_emb.weight": "model-00002-of-00003.safetensors",
144
+ "encoder.layers.12.attn.to_kv.weight": "model-00002-of-00003.safetensors",
145
+ "encoder.layers.12.attn.to_out.bias": "model-00002-of-00003.safetensors",
146
+ "encoder.layers.12.attn.to_out.weight": "model-00002-of-00003.safetensors",
147
+ "encoder.layers.12.attn.to_q.weight": "model-00002-of-00003.safetensors",
148
+ "encoder.layers.12.conv.batch_norm.bias": "model-00002-of-00003.safetensors",
149
+ "encoder.layers.12.conv.batch_norm.num_batches_tracked": "model-00002-of-00003.safetensors",
150
+ "encoder.layers.12.conv.batch_norm.running_mean": "model-00002-of-00003.safetensors",
151
+ "encoder.layers.12.conv.batch_norm.running_var": "model-00002-of-00003.safetensors",
152
+ "encoder.layers.12.conv.batch_norm.weight": "model-00002-of-00003.safetensors",
153
+ "encoder.layers.12.conv.depth_conv.conv.weight": "model-00002-of-00003.safetensors",
154
+ "encoder.layers.12.conv.down_conv.bias": "model-00002-of-00003.safetensors",
155
+ "encoder.layers.12.conv.down_conv.weight": "model-00002-of-00003.safetensors",
156
+ "encoder.layers.12.conv.norm.bias": "model-00002-of-00003.safetensors",
157
+ "encoder.layers.12.conv.norm.weight": "model-00002-of-00003.safetensors",
158
+ "encoder.layers.12.conv.up_conv.bias": "model-00002-of-00003.safetensors",
159
+ "encoder.layers.12.conv.up_conv.weight": "model-00002-of-00003.safetensors",
160
+ "encoder.layers.12.ff1.down_proj.bias": "model-00002-of-00003.safetensors",
161
+ "encoder.layers.12.ff1.down_proj.weight": "model-00002-of-00003.safetensors",
162
+ "encoder.layers.12.ff1.pre_norm.bias": "model-00002-of-00003.safetensors",
163
+ "encoder.layers.12.ff1.pre_norm.weight": "model-00002-of-00003.safetensors",
164
+ "encoder.layers.12.ff1.up_proj.bias": "model-00002-of-00003.safetensors",
165
+ "encoder.layers.12.ff1.up_proj.weight": "model-00002-of-00003.safetensors",
166
+ "encoder.layers.12.ff2.down_proj.bias": "model-00002-of-00003.safetensors",
167
+ "encoder.layers.12.ff2.down_proj.weight": "model-00002-of-00003.safetensors",
168
+ "encoder.layers.12.ff2.pre_norm.bias": "model-00002-of-00003.safetensors",
169
+ "encoder.layers.12.ff2.pre_norm.weight": "model-00002-of-00003.safetensors",
170
+ "encoder.layers.12.ff2.up_proj.bias": "model-00002-of-00003.safetensors",
171
+ "encoder.layers.12.ff2.up_proj.weight": "model-00002-of-00003.safetensors",
172
+ "encoder.layers.12.post_norm.bias": "model-00002-of-00003.safetensors",
173
+ "encoder.layers.12.post_norm.weight": "model-00002-of-00003.safetensors",
174
+ "encoder.layers.13.attn.pre_norm.bias": "model-00003-of-00003.safetensors",
175
+ "encoder.layers.13.attn.pre_norm.weight": "model-00003-of-00003.safetensors",
176
+ "encoder.layers.13.attn.rel_pos_emb.weight": "model-00003-of-00003.safetensors",
177
+ "encoder.layers.13.attn.to_kv.weight": "model-00003-of-00003.safetensors",
178
+ "encoder.layers.13.attn.to_out.bias": "model-00003-of-00003.safetensors",
179
+ "encoder.layers.13.attn.to_out.weight": "model-00003-of-00003.safetensors",
180
+ "encoder.layers.13.attn.to_q.weight": "model-00003-of-00003.safetensors",
181
+ "encoder.layers.13.conv.batch_norm.bias": "model-00003-of-00003.safetensors",
182
+ "encoder.layers.13.conv.batch_norm.num_batches_tracked": "model-00003-of-00003.safetensors",
183
+ "encoder.layers.13.conv.batch_norm.running_mean": "model-00003-of-00003.safetensors",
184
+ "encoder.layers.13.conv.batch_norm.running_var": "model-00003-of-00003.safetensors",
185
+ "encoder.layers.13.conv.batch_norm.weight": "model-00003-of-00003.safetensors",
186
+ "encoder.layers.13.conv.depth_conv.conv.weight": "model-00003-of-00003.safetensors",
187
+ "encoder.layers.13.conv.down_conv.bias": "model-00003-of-00003.safetensors",
188
+ "encoder.layers.13.conv.down_conv.weight": "model-00003-of-00003.safetensors",
189
+ "encoder.layers.13.conv.norm.bias": "model-00003-of-00003.safetensors",
190
+ "encoder.layers.13.conv.norm.weight": "model-00003-of-00003.safetensors",
191
+ "encoder.layers.13.conv.up_conv.bias": "model-00003-of-00003.safetensors",
192
+ "encoder.layers.13.conv.up_conv.weight": "model-00003-of-00003.safetensors",
193
+ "encoder.layers.13.ff1.down_proj.bias": "model-00003-of-00003.safetensors",
194
+ "encoder.layers.13.ff1.down_proj.weight": "model-00003-of-00003.safetensors",
195
+ "encoder.layers.13.ff1.pre_norm.bias": "model-00002-of-00003.safetensors",
196
+ "encoder.layers.13.ff1.pre_norm.weight": "model-00002-of-00003.safetensors",
197
+ "encoder.layers.13.ff1.up_proj.bias": "model-00002-of-00003.safetensors",
198
+ "encoder.layers.13.ff1.up_proj.weight": "model-00002-of-00003.safetensors",
199
+ "encoder.layers.13.ff2.down_proj.bias": "model-00003-of-00003.safetensors",
200
+ "encoder.layers.13.ff2.down_proj.weight": "model-00003-of-00003.safetensors",
201
+ "encoder.layers.13.ff2.pre_norm.bias": "model-00003-of-00003.safetensors",
202
+ "encoder.layers.13.ff2.pre_norm.weight": "model-00003-of-00003.safetensors",
203
+ "encoder.layers.13.ff2.up_proj.bias": "model-00003-of-00003.safetensors",
204
+ "encoder.layers.13.ff2.up_proj.weight": "model-00003-of-00003.safetensors",
205
+ "encoder.layers.13.post_norm.bias": "model-00003-of-00003.safetensors",
206
+ "encoder.layers.13.post_norm.weight": "model-00003-of-00003.safetensors",
207
+ "encoder.layers.14.attn.pre_norm.bias": "model-00003-of-00003.safetensors",
208
+ "encoder.layers.14.attn.pre_norm.weight": "model-00003-of-00003.safetensors",
209
+ "encoder.layers.14.attn.rel_pos_emb.weight": "model-00003-of-00003.safetensors",
210
+ "encoder.layers.14.attn.to_kv.weight": "model-00003-of-00003.safetensors",
211
+ "encoder.layers.14.attn.to_out.bias": "model-00003-of-00003.safetensors",
212
+ "encoder.layers.14.attn.to_out.weight": "model-00003-of-00003.safetensors",
213
+ "encoder.layers.14.attn.to_q.weight": "model-00003-of-00003.safetensors",
214
+ "encoder.layers.14.conv.batch_norm.bias": "model-00003-of-00003.safetensors",
215
+ "encoder.layers.14.conv.batch_norm.num_batches_tracked": "model-00003-of-00003.safetensors",
216
+ "encoder.layers.14.conv.batch_norm.running_mean": "model-00003-of-00003.safetensors",
217
+ "encoder.layers.14.conv.batch_norm.running_var": "model-00003-of-00003.safetensors",
218
+ "encoder.layers.14.conv.batch_norm.weight": "model-00003-of-00003.safetensors",
219
+ "encoder.layers.14.conv.depth_conv.conv.weight": "model-00003-of-00003.safetensors",
220
+ "encoder.layers.14.conv.down_conv.bias": "model-00003-of-00003.safetensors",
221
+ "encoder.layers.14.conv.down_conv.weight": "model-00003-of-00003.safetensors",
222
+ "encoder.layers.14.conv.norm.bias": "model-00003-of-00003.safetensors",
223
+ "encoder.layers.14.conv.norm.weight": "model-00003-of-00003.safetensors",
224
+ "encoder.layers.14.conv.up_conv.bias": "model-00003-of-00003.safetensors",
225
+ "encoder.layers.14.conv.up_conv.weight": "model-00003-of-00003.safetensors",
226
+ "encoder.layers.14.ff1.down_proj.bias": "model-00003-of-00003.safetensors",
227
+ "encoder.layers.14.ff1.down_proj.weight": "model-00003-of-00003.safetensors",
228
+ "encoder.layers.14.ff1.pre_norm.bias": "model-00003-of-00003.safetensors",
229
+ "encoder.layers.14.ff1.pre_norm.weight": "model-00003-of-00003.safetensors",
230
+ "encoder.layers.14.ff1.up_proj.bias": "model-00003-of-00003.safetensors",
231
+ "encoder.layers.14.ff1.up_proj.weight": "model-00003-of-00003.safetensors",
232
+ "encoder.layers.14.ff2.down_proj.bias": "model-00003-of-00003.safetensors",
233
+ "encoder.layers.14.ff2.down_proj.weight": "model-00003-of-00003.safetensors",
234
+ "encoder.layers.14.ff2.pre_norm.bias": "model-00003-of-00003.safetensors",
235
+ "encoder.layers.14.ff2.pre_norm.weight": "model-00003-of-00003.safetensors",
236
+ "encoder.layers.14.ff2.up_proj.bias": "model-00003-of-00003.safetensors",
237
+ "encoder.layers.14.ff2.up_proj.weight": "model-00003-of-00003.safetensors",
238
+ "encoder.layers.14.post_norm.bias": "model-00003-of-00003.safetensors",
239
+ "encoder.layers.14.post_norm.weight": "model-00003-of-00003.safetensors",
240
+ "encoder.layers.15.attn.pre_norm.bias": "model-00003-of-00003.safetensors",
241
+ "encoder.layers.15.attn.pre_norm.weight": "model-00003-of-00003.safetensors",
242
+ "encoder.layers.15.attn.rel_pos_emb.weight": "model-00003-of-00003.safetensors",
243
+ "encoder.layers.15.attn.to_kv.weight": "model-00003-of-00003.safetensors",
244
+ "encoder.layers.15.attn.to_out.bias": "model-00003-of-00003.safetensors",
245
+ "encoder.layers.15.attn.to_out.weight": "model-00003-of-00003.safetensors",
246
+ "encoder.layers.15.attn.to_q.weight": "model-00003-of-00003.safetensors",
247
+ "encoder.layers.15.conv.batch_norm.bias": "model-00003-of-00003.safetensors",
248
+ "encoder.layers.15.conv.batch_norm.num_batches_tracked": "model-00003-of-00003.safetensors",
249
+ "encoder.layers.15.conv.batch_norm.running_mean": "model-00003-of-00003.safetensors",
250
+ "encoder.layers.15.conv.batch_norm.running_var": "model-00003-of-00003.safetensors",
251
+ "encoder.layers.15.conv.batch_norm.weight": "model-00003-of-00003.safetensors",
252
+ "encoder.layers.15.conv.depth_conv.conv.weight": "model-00003-of-00003.safetensors",
253
+ "encoder.layers.15.conv.down_conv.bias": "model-00003-of-00003.safetensors",
254
+ "encoder.layers.15.conv.down_conv.weight": "model-00003-of-00003.safetensors",
255
+ "encoder.layers.15.conv.norm.bias": "model-00003-of-00003.safetensors",
256
+ "encoder.layers.15.conv.norm.weight": "model-00003-of-00003.safetensors",
257
+ "encoder.layers.15.conv.up_conv.bias": "model-00003-of-00003.safetensors",
258
+ "encoder.layers.15.conv.up_conv.weight": "model-00003-of-00003.safetensors",
259
+ "encoder.layers.15.ff1.down_proj.bias": "model-00003-of-00003.safetensors",
260
+ "encoder.layers.15.ff1.down_proj.weight": "model-00003-of-00003.safetensors",
261
+ "encoder.layers.15.ff1.pre_norm.bias": "model-00003-of-00003.safetensors",
262
+ "encoder.layers.15.ff1.pre_norm.weight": "model-00003-of-00003.safetensors",
263
+ "encoder.layers.15.ff1.up_proj.bias": "model-00003-of-00003.safetensors",
264
+ "encoder.layers.15.ff1.up_proj.weight": "model-00003-of-00003.safetensors",
265
+ "encoder.layers.15.ff2.down_proj.bias": "model-00003-of-00003.safetensors",
266
+ "encoder.layers.15.ff2.down_proj.weight": "model-00003-of-00003.safetensors",
267
+ "encoder.layers.15.ff2.pre_norm.bias": "model-00003-of-00003.safetensors",
268
+ "encoder.layers.15.ff2.pre_norm.weight": "model-00003-of-00003.safetensors",
269
+ "encoder.layers.15.ff2.up_proj.bias": "model-00003-of-00003.safetensors",
270
+ "encoder.layers.15.ff2.up_proj.weight": "model-00003-of-00003.safetensors",
271
+ "encoder.layers.15.post_norm.bias": "model-00003-of-00003.safetensors",
272
+ "encoder.layers.15.post_norm.weight": "model-00003-of-00003.safetensors",
273
+ "encoder.layers.2.attn.pre_norm.bias": "model-00002-of-00003.safetensors",
274
+ "encoder.layers.2.attn.pre_norm.weight": "model-00002-of-00003.safetensors",
275
+ "encoder.layers.2.attn.rel_pos_emb.weight": "model-00002-of-00003.safetensors",
276
+ "encoder.layers.2.attn.to_kv.weight": "model-00002-of-00003.safetensors",
277
+ "encoder.layers.2.attn.to_out.bias": "model-00002-of-00003.safetensors",
278
+ "encoder.layers.2.attn.to_out.weight": "model-00002-of-00003.safetensors",
279
+ "encoder.layers.2.attn.to_q.weight": "model-00002-of-00003.safetensors",
280
+ "encoder.layers.2.conv.batch_norm.bias": "model-00002-of-00003.safetensors",
281
+ "encoder.layers.2.conv.batch_norm.num_batches_tracked": "model-00002-of-00003.safetensors",
282
+ "encoder.layers.2.conv.batch_norm.running_mean": "model-00002-of-00003.safetensors",
283
+ "encoder.layers.2.conv.batch_norm.running_var": "model-00002-of-00003.safetensors",
284
+ "encoder.layers.2.conv.batch_norm.weight": "model-00002-of-00003.safetensors",
285
+ "encoder.layers.2.conv.depth_conv.conv.weight": "model-00002-of-00003.safetensors",
286
+ "encoder.layers.2.conv.down_conv.bias": "model-00002-of-00003.safetensors",
287
+ "encoder.layers.2.conv.down_conv.weight": "model-00002-of-00003.safetensors",
288
+ "encoder.layers.2.conv.norm.bias": "model-00002-of-00003.safetensors",
289
+ "encoder.layers.2.conv.norm.weight": "model-00002-of-00003.safetensors",
290
+ "encoder.layers.2.conv.up_conv.bias": "model-00002-of-00003.safetensors",
291
+ "encoder.layers.2.conv.up_conv.weight": "model-00002-of-00003.safetensors",
292
+ "encoder.layers.2.ff1.down_proj.bias": "model-00002-of-00003.safetensors",
293
+ "encoder.layers.2.ff1.down_proj.weight": "model-00002-of-00003.safetensors",
294
+ "encoder.layers.2.ff1.pre_norm.bias": "model-00002-of-00003.safetensors",
295
+ "encoder.layers.2.ff1.pre_norm.weight": "model-00002-of-00003.safetensors",
296
+ "encoder.layers.2.ff1.up_proj.bias": "model-00002-of-00003.safetensors",
297
+ "encoder.layers.2.ff1.up_proj.weight": "model-00002-of-00003.safetensors",
298
+ "encoder.layers.2.ff2.down_proj.bias": "model-00002-of-00003.safetensors",
299
+ "encoder.layers.2.ff2.down_proj.weight": "model-00002-of-00003.safetensors",
300
+ "encoder.layers.2.ff2.pre_norm.bias": "model-00002-of-00003.safetensors",
301
+ "encoder.layers.2.ff2.pre_norm.weight": "model-00002-of-00003.safetensors",
302
+ "encoder.layers.2.ff2.up_proj.bias": "model-00002-of-00003.safetensors",
303
+ "encoder.layers.2.ff2.up_proj.weight": "model-00002-of-00003.safetensors",
304
+ "encoder.layers.2.post_norm.bias": "model-00002-of-00003.safetensors",
305
+ "encoder.layers.2.post_norm.weight": "model-00002-of-00003.safetensors",
306
+ "encoder.layers.3.attn.pre_norm.bias": "model-00002-of-00003.safetensors",
307
+ "encoder.layers.3.attn.pre_norm.weight": "model-00002-of-00003.safetensors",
308
+ "encoder.layers.3.attn.rel_pos_emb.weight": "model-00002-of-00003.safetensors",
309
+ "encoder.layers.3.attn.to_kv.weight": "model-00002-of-00003.safetensors",
310
+ "encoder.layers.3.attn.to_out.bias": "model-00002-of-00003.safetensors",
311
+ "encoder.layers.3.attn.to_out.weight": "model-00002-of-00003.safetensors",
312
+ "encoder.layers.3.attn.to_q.weight": "model-00002-of-00003.safetensors",
313
+ "encoder.layers.3.conv.batch_norm.bias": "model-00002-of-00003.safetensors",
314
+ "encoder.layers.3.conv.batch_norm.num_batches_tracked": "model-00002-of-00003.safetensors",
315
+ "encoder.layers.3.conv.batch_norm.running_mean": "model-00002-of-00003.safetensors",
316
+ "encoder.layers.3.conv.batch_norm.running_var": "model-00002-of-00003.safetensors",
317
+ "encoder.layers.3.conv.batch_norm.weight": "model-00002-of-00003.safetensors",
318
+ "encoder.layers.3.conv.depth_conv.conv.weight": "model-00002-of-00003.safetensors",
319
+ "encoder.layers.3.conv.down_conv.bias": "model-00002-of-00003.safetensors",
320
+ "encoder.layers.3.conv.down_conv.weight": "model-00002-of-00003.safetensors",
321
+ "encoder.layers.3.conv.norm.bias": "model-00002-of-00003.safetensors",
322
+ "encoder.layers.3.conv.norm.weight": "model-00002-of-00003.safetensors",
323
+ "encoder.layers.3.conv.up_conv.bias": "model-00002-of-00003.safetensors",
324
+ "encoder.layers.3.conv.up_conv.weight": "model-00002-of-00003.safetensors",
325
+ "encoder.layers.3.ff1.down_proj.bias": "model-00002-of-00003.safetensors",
326
+ "encoder.layers.3.ff1.down_proj.weight": "model-00002-of-00003.safetensors",
327
+ "encoder.layers.3.ff1.pre_norm.bias": "model-00002-of-00003.safetensors",
328
+ "encoder.layers.3.ff1.pre_norm.weight": "model-00002-of-00003.safetensors",
329
+ "encoder.layers.3.ff1.up_proj.bias": "model-00002-of-00003.safetensors",
330
+ "encoder.layers.3.ff1.up_proj.weight": "model-00002-of-00003.safetensors",
331
+ "encoder.layers.3.ff2.down_proj.bias": "model-00002-of-00003.safetensors",
332
+ "encoder.layers.3.ff2.down_proj.weight": "model-00002-of-00003.safetensors",
333
+ "encoder.layers.3.ff2.pre_norm.bias": "model-00002-of-00003.safetensors",
334
+ "encoder.layers.3.ff2.pre_norm.weight": "model-00002-of-00003.safetensors",
335
+ "encoder.layers.3.ff2.up_proj.bias": "model-00002-of-00003.safetensors",
336
+ "encoder.layers.3.ff2.up_proj.weight": "model-00002-of-00003.safetensors",
337
+ "encoder.layers.3.post_norm.bias": "model-00002-of-00003.safetensors",
338
+ "encoder.layers.3.post_norm.weight": "model-00002-of-00003.safetensors",
339
+ "encoder.layers.4.attn.pre_norm.bias": "model-00002-of-00003.safetensors",
340
+ "encoder.layers.4.attn.pre_norm.weight": "model-00002-of-00003.safetensors",
341
+ "encoder.layers.4.attn.rel_pos_emb.weight": "model-00002-of-00003.safetensors",
342
+ "encoder.layers.4.attn.to_kv.weight": "model-00002-of-00003.safetensors",
343
+ "encoder.layers.4.attn.to_out.bias": "model-00002-of-00003.safetensors",
344
+ "encoder.layers.4.attn.to_out.weight": "model-00002-of-00003.safetensors",
345
+ "encoder.layers.4.attn.to_q.weight": "model-00002-of-00003.safetensors",
346
+ "encoder.layers.4.conv.batch_norm.bias": "model-00002-of-00003.safetensors",
347
+ "encoder.layers.4.conv.batch_norm.num_batches_tracked": "model-00002-of-00003.safetensors",
348
+ "encoder.layers.4.conv.batch_norm.running_mean": "model-00002-of-00003.safetensors",
349
+ "encoder.layers.4.conv.batch_norm.running_var": "model-00002-of-00003.safetensors",
350
+ "encoder.layers.4.conv.batch_norm.weight": "model-00002-of-00003.safetensors",
351
+ "encoder.layers.4.conv.depth_conv.conv.weight": "model-00002-of-00003.safetensors",
352
+ "encoder.layers.4.conv.down_conv.bias": "model-00002-of-00003.safetensors",
353
+ "encoder.layers.4.conv.down_conv.weight": "model-00002-of-00003.safetensors",
354
+ "encoder.layers.4.conv.norm.bias": "model-00002-of-00003.safetensors",
355
+ "encoder.layers.4.conv.norm.weight": "model-00002-of-00003.safetensors",
356
+ "encoder.layers.4.conv.up_conv.bias": "model-00002-of-00003.safetensors",
357
+ "encoder.layers.4.conv.up_conv.weight": "model-00002-of-00003.safetensors",
358
+ "encoder.layers.4.ff1.down_proj.bias": "model-00002-of-00003.safetensors",
359
+ "encoder.layers.4.ff1.down_proj.weight": "model-00002-of-00003.safetensors",
360
+ "encoder.layers.4.ff1.pre_norm.bias": "model-00002-of-00003.safetensors",
361
+ "encoder.layers.4.ff1.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.4.ff1.up_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.4.ff1.up_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.4.ff2.down_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.4.ff2.down_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.4.ff2.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.4.ff2.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.4.ff2.up_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.4.ff2.up_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.4.post_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.4.post_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.attn.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.attn.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.attn.rel_pos_emb.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.attn.to_kv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.attn.to_out.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.attn.to_out.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.attn.to_q.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.conv.batch_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.conv.batch_norm.num_batches_tracked": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.conv.batch_norm.running_mean": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.conv.batch_norm.running_var": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.conv.batch_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.conv.depth_conv.conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.conv.down_conv.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.conv.down_conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.conv.norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.conv.norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.conv.up_conv.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.conv.up_conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.ff1.down_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.ff1.down_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.ff1.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.ff1.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.ff1.up_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.ff1.up_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.ff2.down_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.ff2.down_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.ff2.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.ff2.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.ff2.up_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.ff2.up_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.post_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.5.post_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.attn.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.attn.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.attn.rel_pos_emb.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.attn.to_kv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.attn.to_out.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.attn.to_out.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.attn.to_q.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.conv.batch_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.conv.batch_norm.num_batches_tracked": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.conv.batch_norm.running_mean": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.conv.batch_norm.running_var": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.conv.batch_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.conv.depth_conv.conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.conv.down_conv.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.conv.down_conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.conv.norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.conv.norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.conv.up_conv.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.conv.up_conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.ff1.down_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.ff1.down_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.ff1.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.ff1.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.ff1.up_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.ff1.up_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.ff2.down_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.ff2.down_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.ff2.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.ff2.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.ff2.up_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.ff2.up_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.post_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.6.post_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.attn.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.attn.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.attn.rel_pos_emb.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.attn.to_kv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.attn.to_out.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.attn.to_out.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.attn.to_q.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.conv.batch_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.conv.batch_norm.num_batches_tracked": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.conv.batch_norm.running_mean": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.conv.batch_norm.running_var": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.conv.batch_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.conv.depth_conv.conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.conv.down_conv.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.conv.down_conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.conv.norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.conv.norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.conv.up_conv.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.conv.up_conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.ff1.down_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.ff1.down_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.ff1.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.ff1.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.ff1.up_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.ff1.up_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.ff2.down_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.ff2.down_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.ff2.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.ff2.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.ff2.up_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.ff2.up_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.post_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.7.post_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.attn.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.attn.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.attn.rel_pos_emb.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.attn.to_kv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.attn.to_out.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.attn.to_out.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.attn.to_q.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.conv.batch_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.conv.batch_norm.num_batches_tracked": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.conv.batch_norm.running_mean": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.conv.batch_norm.running_var": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.conv.batch_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.conv.depth_conv.conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.conv.down_conv.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.conv.down_conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.conv.norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.conv.norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.conv.up_conv.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.conv.up_conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.ff1.down_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.ff1.down_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.ff1.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.ff1.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.ff1.up_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.ff1.up_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.ff2.down_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.ff2.down_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.ff2.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.ff2.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.ff2.up_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.ff2.up_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.post_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.8.post_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.attn.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.attn.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.attn.rel_pos_emb.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.attn.to_kv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.attn.to_out.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.attn.to_out.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.attn.to_q.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.conv.batch_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.conv.batch_norm.num_batches_tracked": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.conv.batch_norm.running_mean": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.conv.batch_norm.running_var": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.conv.batch_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.conv.depth_conv.conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.conv.down_conv.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.conv.down_conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.conv.norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.conv.norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.conv.up_conv.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.conv.up_conv.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.ff1.down_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.ff1.down_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.ff1.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.ff1.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.ff1.up_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.ff1.up_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.ff2.down_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.ff2.down_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.ff2.pre_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.ff2.pre_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.ff2.up_proj.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.ff2.up_proj.weight": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.post_norm.bias": "model-00002-of-00003.safetensors",
+ "encoder.layers.9.post_norm.weight": "model-00002-of-00003.safetensors",
+ "encoder.out.bias": "model-00003-of-00003.safetensors",
+ "encoder.out.weight": "model-00003-of-00003.safetensors",
+ "encoder.out_mid.bias": "model-00003-of-00003.safetensors",
+ "encoder.out_mid.weight": "model-00003-of-00003.safetensors",
+ "language_model.model.embed_tokens.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.0.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.0.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.0.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.0.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.0.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.0.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.0.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.0.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.1.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.1.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.1.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.1.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.1.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.1.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.1.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.1.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.1.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.10.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.10.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.10.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.10.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.10.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.10.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.10.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.10.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.10.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.11.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.11.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.11.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.11.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.11.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.11.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.11.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.11.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.11.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.12.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.12.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.12.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.12.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.12.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.12.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.12.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.12.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.12.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.13.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.13.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.13.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.13.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.13.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.13.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.13.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.13.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.13.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.14.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.14.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.14.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.14.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.14.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.14.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.14.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.14.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.14.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.15.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.15.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.15.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.15.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.15.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.15.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.15.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.15.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.15.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.16.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.16.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.16.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.16.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.16.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.16.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.16.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.16.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.16.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.17.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.17.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.17.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.17.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.17.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.17.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.17.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.17.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.17.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.18.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.18.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.18.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.18.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.18.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.18.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.18.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.18.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.18.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.19.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.19.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.19.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.19.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.19.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.19.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.19.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.19.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.19.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.2.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.2.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.2.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.2.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.2.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.2.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.2.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.2.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.2.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.20.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.20.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.20.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.20.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.20.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.20.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.20.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.20.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.20.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.21.input_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.21.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.21.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.21.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.21.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.21.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.21.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.21.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.21.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.22.input_layernorm.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.22.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.22.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.22.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.22.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.22.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.22.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.22.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.22.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+ "language_model.model.layers.23.input_layernorm.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.23.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.23.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.23.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.23.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.23.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.23.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.23.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.23.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.24.input_layernorm.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.24.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.24.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.24.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.24.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.24.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.24.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.24.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.24.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.25.input_layernorm.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.25.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.25.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.25.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.25.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.25.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.25.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.25.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.25.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.26.input_layernorm.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.26.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+ "language_model.model.layers.26.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
716
+ "language_model.model.layers.26.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
717
+ "language_model.model.layers.26.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
718
+ "language_model.model.layers.26.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
719
+ "language_model.model.layers.26.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
720
+ "language_model.model.layers.26.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
721
+ "language_model.model.layers.26.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
722
+ "language_model.model.layers.27.input_layernorm.weight": "model-00002-of-00003.safetensors",
723
+ "language_model.model.layers.27.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
724
+ "language_model.model.layers.27.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
725
+ "language_model.model.layers.27.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
726
+ "language_model.model.layers.27.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
727
+ "language_model.model.layers.27.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
728
+ "language_model.model.layers.27.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
729
+ "language_model.model.layers.27.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
730
+ "language_model.model.layers.27.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
731
+ "language_model.model.layers.28.input_layernorm.weight": "model-00002-of-00003.safetensors",
732
+ "language_model.model.layers.28.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
733
+ "language_model.model.layers.28.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
734
+ "language_model.model.layers.28.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
735
+ "language_model.model.layers.28.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
736
+ "language_model.model.layers.28.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
737
+ "language_model.model.layers.28.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
738
+ "language_model.model.layers.28.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
739
+ "language_model.model.layers.28.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
740
+ "language_model.model.layers.29.input_layernorm.weight": "model-00002-of-00003.safetensors",
741
+ "language_model.model.layers.29.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
742
+ "language_model.model.layers.29.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
743
+ "language_model.model.layers.29.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
744
+ "language_model.model.layers.29.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
745
+ "language_model.model.layers.29.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
746
+ "language_model.model.layers.29.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
747
+ "language_model.model.layers.29.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
748
+ "language_model.model.layers.29.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
749
+ "language_model.model.layers.3.input_layernorm.weight": "model-00001-of-00003.safetensors",
750
+ "language_model.model.layers.3.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
751
+ "language_model.model.layers.3.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
752
+ "language_model.model.layers.3.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
753
+ "language_model.model.layers.3.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
754
+ "language_model.model.layers.3.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
755
+ "language_model.model.layers.3.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
756
+ "language_model.model.layers.3.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
757
+ "language_model.model.layers.3.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
758
+ "language_model.model.layers.30.input_layernorm.weight": "model-00002-of-00003.safetensors",
759
+ "language_model.model.layers.30.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
760
+ "language_model.model.layers.30.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
761
+ "language_model.model.layers.30.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
762
+ "language_model.model.layers.30.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
763
+ "language_model.model.layers.30.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
764
+ "language_model.model.layers.30.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
765
+ "language_model.model.layers.30.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
766
+ "language_model.model.layers.30.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
767
+ "language_model.model.layers.31.input_layernorm.weight": "model-00002-of-00003.safetensors",
768
+ "language_model.model.layers.31.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
769
+ "language_model.model.layers.31.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
770
+ "language_model.model.layers.31.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
771
+ "language_model.model.layers.31.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
772
+ "language_model.model.layers.31.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
773
+ "language_model.model.layers.31.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
774
+ "language_model.model.layers.31.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
775
+ "language_model.model.layers.31.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
776
+ "language_model.model.layers.32.input_layernorm.weight": "model-00002-of-00003.safetensors",
777
+ "language_model.model.layers.32.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
778
+ "language_model.model.layers.32.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
779
+ "language_model.model.layers.32.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
780
+ "language_model.model.layers.32.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
781
+ "language_model.model.layers.32.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
782
+ "language_model.model.layers.32.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
783
+ "language_model.model.layers.32.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
784
+ "language_model.model.layers.32.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
785
+ "language_model.model.layers.33.input_layernorm.weight": "model-00002-of-00003.safetensors",
786
+ "language_model.model.layers.33.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
787
+ "language_model.model.layers.33.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
788
+ "language_model.model.layers.33.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
789
+ "language_model.model.layers.33.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
790
+ "language_model.model.layers.33.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
791
+ "language_model.model.layers.33.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
792
+ "language_model.model.layers.33.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
793
+ "language_model.model.layers.33.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
794
+ "language_model.model.layers.34.input_layernorm.weight": "model-00002-of-00003.safetensors",
795
+ "language_model.model.layers.34.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
796
+ "language_model.model.layers.34.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
797
+ "language_model.model.layers.34.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
798
+ "language_model.model.layers.34.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
799
+ "language_model.model.layers.34.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
800
+ "language_model.model.layers.34.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
801
+ "language_model.model.layers.34.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
802
+ "language_model.model.layers.34.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
803
+ "language_model.model.layers.35.input_layernorm.weight": "model-00002-of-00003.safetensors",
804
+ "language_model.model.layers.35.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
805
+ "language_model.model.layers.35.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
806
+ "language_model.model.layers.35.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
807
+ "language_model.model.layers.35.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
808
+ "language_model.model.layers.35.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
809
+ "language_model.model.layers.35.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
810
+ "language_model.model.layers.35.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
811
+ "language_model.model.layers.35.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
812
+ "language_model.model.layers.36.input_layernorm.weight": "model-00002-of-00003.safetensors",
813
+ "language_model.model.layers.36.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
814
+ "language_model.model.layers.36.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
815
+ "language_model.model.layers.36.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
816
+ "language_model.model.layers.36.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
817
+ "language_model.model.layers.36.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
818
+ "language_model.model.layers.36.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
819
+ "language_model.model.layers.36.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
820
+ "language_model.model.layers.36.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
821
+ "language_model.model.layers.37.input_layernorm.weight": "model-00002-of-00003.safetensors",
822
+ "language_model.model.layers.37.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
823
+ "language_model.model.layers.37.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
824
+ "language_model.model.layers.37.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
825
+ "language_model.model.layers.37.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
826
+ "language_model.model.layers.37.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
827
+ "language_model.model.layers.37.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
828
+ "language_model.model.layers.37.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
829
+ "language_model.model.layers.37.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
830
+ "language_model.model.layers.38.input_layernorm.weight": "model-00002-of-00003.safetensors",
831
+ "language_model.model.layers.38.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
832
+ "language_model.model.layers.38.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
833
+ "language_model.model.layers.38.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
834
+ "language_model.model.layers.38.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
835
+ "language_model.model.layers.38.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
836
+ "language_model.model.layers.38.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
837
+ "language_model.model.layers.38.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
838
+ "language_model.model.layers.38.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
839
+ "language_model.model.layers.39.input_layernorm.weight": "model-00002-of-00003.safetensors",
840
+ "language_model.model.layers.39.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
841
+ "language_model.model.layers.39.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
842
+ "language_model.model.layers.39.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
843
+ "language_model.model.layers.39.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
844
+ "language_model.model.layers.39.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
845
+ "language_model.model.layers.39.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
846
+ "language_model.model.layers.39.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
847
+ "language_model.model.layers.39.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
848
+ "language_model.model.layers.4.input_layernorm.weight": "model-00001-of-00003.safetensors",
849
+ "language_model.model.layers.4.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
850
+ "language_model.model.layers.4.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
851
+ "language_model.model.layers.4.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
852
+ "language_model.model.layers.4.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
853
+ "language_model.model.layers.4.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
854
+ "language_model.model.layers.4.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
855
+ "language_model.model.layers.4.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
856
+ "language_model.model.layers.4.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
857
+ "language_model.model.layers.5.input_layernorm.weight": "model-00001-of-00003.safetensors",
858
+ "language_model.model.layers.5.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
859
+ "language_model.model.layers.5.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
860
+ "language_model.model.layers.5.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
861
+ "language_model.model.layers.5.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
862
+ "language_model.model.layers.5.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
863
+ "language_model.model.layers.5.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
864
+ "language_model.model.layers.5.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
865
+ "language_model.model.layers.5.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
866
+ "language_model.model.layers.6.input_layernorm.weight": "model-00001-of-00003.safetensors",
867
+ "language_model.model.layers.6.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
868
+ "language_model.model.layers.6.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
869
+ "language_model.model.layers.6.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
870
+ "language_model.model.layers.6.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
871
+ "language_model.model.layers.6.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
872
+ "language_model.model.layers.6.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
873
+ "language_model.model.layers.6.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
874
+ "language_model.model.layers.6.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
875
+ "language_model.model.layers.7.input_layernorm.weight": "model-00001-of-00003.safetensors",
876
+ "language_model.model.layers.7.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
877
+ "language_model.model.layers.7.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
878
+ "language_model.model.layers.7.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
879
+ "language_model.model.layers.7.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
880
+ "language_model.model.layers.7.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
881
+ "language_model.model.layers.7.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
882
+ "language_model.model.layers.7.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
883
+ "language_model.model.layers.7.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
884
+ "language_model.model.layers.8.input_layernorm.weight": "model-00001-of-00003.safetensors",
885
+ "language_model.model.layers.8.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
886
+ "language_model.model.layers.8.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
887
+ "language_model.model.layers.8.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
888
+ "language_model.model.layers.8.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
889
+ "language_model.model.layers.8.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
890
+ "language_model.model.layers.8.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
891
+ "language_model.model.layers.8.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
892
+ "language_model.model.layers.8.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
893
+ "language_model.model.layers.9.input_layernorm.weight": "model-00001-of-00003.safetensors",
894
+ "language_model.model.layers.9.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
895
+ "language_model.model.layers.9.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
896
+ "language_model.model.layers.9.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
897
+ "language_model.model.layers.9.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
898
+ "language_model.model.layers.9.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
899
+ "language_model.model.layers.9.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
900
+ "language_model.model.layers.9.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
901
+ "language_model.model.layers.9.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
902
+ "language_model.model.norm.weight": "model-00002-of-00003.safetensors",
903
+ "projector.linear.bias": "model-00003-of-00003.safetensors",
904
+ "projector.linear.weight": "model-00003-of-00003.safetensors",
905
+ "projector.qformer.encoder.layer.0.attention.attention.key.bias": "model-00003-of-00003.safetensors",
906
+ "projector.qformer.encoder.layer.0.attention.attention.key.weight": "model-00003-of-00003.safetensors",
907
+ "projector.qformer.encoder.layer.0.attention.attention.query.bias": "model-00003-of-00003.safetensors",
908
+ "projector.qformer.encoder.layer.0.attention.attention.query.weight": "model-00003-of-00003.safetensors",
909
+ "projector.qformer.encoder.layer.0.attention.attention.value.bias": "model-00003-of-00003.safetensors",
910
+ "projector.qformer.encoder.layer.0.attention.attention.value.weight": "model-00003-of-00003.safetensors",
911
+ "projector.qformer.encoder.layer.0.attention.output.LayerNorm.bias": "model-00003-of-00003.safetensors",
912
+ "projector.qformer.encoder.layer.0.attention.output.LayerNorm.weight": "model-00003-of-00003.safetensors",
913
+ "projector.qformer.encoder.layer.0.attention.output.dense.bias": "model-00003-of-00003.safetensors",
914
+ "projector.qformer.encoder.layer.0.attention.output.dense.weight": "model-00003-of-00003.safetensors",
915
+ "projector.qformer.encoder.layer.0.crossattention.attention.key.bias": "model-00003-of-00003.safetensors",
916
+ "projector.qformer.encoder.layer.0.crossattention.attention.key.weight": "model-00003-of-00003.safetensors",
917
+ "projector.qformer.encoder.layer.0.crossattention.attention.query.bias": "model-00003-of-00003.safetensors",
918
+ "projector.qformer.encoder.layer.0.crossattention.attention.query.weight": "model-00003-of-00003.safetensors",
919
+ "projector.qformer.encoder.layer.0.crossattention.attention.value.bias": "model-00003-of-00003.safetensors",
920
+ "projector.qformer.encoder.layer.0.crossattention.attention.value.weight": "model-00003-of-00003.safetensors",
921
+ "projector.qformer.encoder.layer.0.crossattention.output.LayerNorm.bias": "model-00003-of-00003.safetensors",
922
+ "projector.qformer.encoder.layer.0.crossattention.output.LayerNorm.weight": "model-00003-of-00003.safetensors",
923
+ "projector.qformer.encoder.layer.0.crossattention.output.dense.bias": "model-00003-of-00003.safetensors",
924
+ "projector.qformer.encoder.layer.0.crossattention.output.dense.weight": "model-00003-of-00003.safetensors",
925
+ "projector.qformer.encoder.layer.0.intermediate_query.dense.bias": "model-00003-of-00003.safetensors",
926
+ "projector.qformer.encoder.layer.0.intermediate_query.dense.weight": "model-00003-of-00003.safetensors",
927
+ "projector.qformer.encoder.layer.0.output_query.LayerNorm.bias": "model-00003-of-00003.safetensors",
928
+ "projector.qformer.encoder.layer.0.output_query.LayerNorm.weight": "model-00003-of-00003.safetensors",
929
+ "projector.qformer.encoder.layer.0.output_query.dense.bias": "model-00003-of-00003.safetensors",
930
+ "projector.qformer.encoder.layer.0.output_query.dense.weight": "model-00003-of-00003.safetensors",
931
+ "projector.qformer.encoder.layer.1.attention.attention.key.bias": "model-00003-of-00003.safetensors",
932
+ "projector.qformer.encoder.layer.1.attention.attention.key.weight": "model-00003-of-00003.safetensors",
933
+ "projector.qformer.encoder.layer.1.attention.attention.query.bias": "model-00003-of-00003.safetensors",
934
+ "projector.qformer.encoder.layer.1.attention.attention.query.weight": "model-00003-of-00003.safetensors",
935
+ "projector.qformer.encoder.layer.1.attention.attention.value.bias": "model-00003-of-00003.safetensors",
936
+ "projector.qformer.encoder.layer.1.attention.attention.value.weight": "model-00003-of-00003.safetensors",
937
+ "projector.qformer.encoder.layer.1.attention.output.LayerNorm.bias": "model-00003-of-00003.safetensors",
938
+ "projector.qformer.encoder.layer.1.attention.output.LayerNorm.weight": "model-00003-of-00003.safetensors",
939
+ "projector.qformer.encoder.layer.1.attention.output.dense.bias": "model-00003-of-00003.safetensors",
940
+ "projector.qformer.encoder.layer.1.attention.output.dense.weight": "model-00003-of-00003.safetensors",
941
+ "projector.qformer.encoder.layer.1.crossattention.attention.key.bias": "model-00003-of-00003.safetensors",
942
+ "projector.qformer.encoder.layer.1.crossattention.attention.key.weight": "model-00003-of-00003.safetensors",
943
+ "projector.qformer.encoder.layer.1.crossattention.attention.query.bias": "model-00003-of-00003.safetensors",
944
+ "projector.qformer.encoder.layer.1.crossattention.attention.query.weight": "model-00003-of-00003.safetensors",
945
+ "projector.qformer.encoder.layer.1.crossattention.attention.value.bias": "model-00003-of-00003.safetensors",
946
+ "projector.qformer.encoder.layer.1.crossattention.attention.value.weight": "model-00003-of-00003.safetensors",
947
+ "projector.qformer.encoder.layer.1.crossattention.output.LayerNorm.bias": "model-00003-of-00003.safetensors",
948
+ "projector.qformer.encoder.layer.1.crossattention.output.LayerNorm.weight": "model-00003-of-00003.safetensors",
949
+ "projector.qformer.encoder.layer.1.crossattention.output.dense.bias": "model-00003-of-00003.safetensors",
950
+ "projector.qformer.encoder.layer.1.crossattention.output.dense.weight": "model-00003-of-00003.safetensors",
951
+ "projector.qformer.encoder.layer.1.intermediate_query.dense.bias": "model-00003-of-00003.safetensors",
952
+ "projector.qformer.encoder.layer.1.intermediate_query.dense.weight": "model-00003-of-00003.safetensors",
953
+ "projector.qformer.encoder.layer.1.output_query.LayerNorm.bias": "model-00003-of-00003.safetensors",
954
+ "projector.qformer.encoder.layer.1.output_query.LayerNorm.weight": "model-00003-of-00003.safetensors",
955
+ "projector.qformer.encoder.layer.1.output_query.dense.bias": "model-00003-of-00003.safetensors",
956
+ "projector.qformer.encoder.layer.1.output_query.dense.weight": "model-00003-of-00003.safetensors",
957
+ "projector.qformer.layernorm.bias": "model-00003-of-00003.safetensors",
958
+ "projector.qformer.layernorm.weight": "model-00003-of-00003.safetensors",
959
+ "projector.query": "model-00003-of-00003.safetensors"
960
+ }
961
+ }
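A note on reading the map above: each entry pairs a tensor name with the shard file that stores it. Here the low-numbered language-model layers sit in `model-00001-of-00003.safetensors`, layers 24 through 39 and the final norm sit in `model-00002-of-00003.safetensors`, and the projector sits in `model-00003-of-00003.safetensors`. The following is a minimal sketch, not part of this commit, of grouping tensors by shard. The helper names `group_by_shard` and `load_weight_map` are hypothetical, and the top-level `"weight_map"` key and the `model.safetensors.index.json` filename are assumed from the usual sharded-checkpoint layout rather than shown in this hunk.

```python
import json
from collections import defaultdict


def group_by_shard(weight_map):
    """Return {shard filename: sorted list of tensor names stored in it}."""
    shards = defaultdict(list)
    for tensor_name, shard_file in weight_map.items():
        shards[shard_file].append(tensor_name)
    return {shard: sorted(names) for shard, names in shards.items()}


def load_weight_map(index_path):
    """Read the tensor-to-shard mapping from an index file.

    Assumption: the mapping lives under a top-level "weight_map" key.
    """
    with open(index_path, encoding="utf-8") as f:
        return json.load(f)["weight_map"]


# A few entries copied from the map above, so the sketch runs without the file.
sample = {
    "language_model.model.layers.3.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
    "language_model.model.layers.24.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
    "language_model.model.norm.weight": "model-00002-of-00003.safetensors",
    "projector.query": "model-00003-of-00003.safetensors",
}
grouped = group_by_shard(sample)
for shard in sorted(grouped):
    print(shard, len(grouped[shard]))
```

With a local copy of the repository, `group_by_shard(load_weight_map("model.safetensors.index.json"))` would apply the same grouping to the full map, assuming that filename.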
model.sig ADDED
@@ -0,0 +1 @@
 
 
1
+ {"mediaType":"application/vnd.dev.sigstore.bundle.v0.3+json","verificationMaterial":{"certificate":{"rawBytes":"MIIC4zCCAmmgAwIBAgIUdyMDhcTRCJ5nxnx4+D7aSwGX+jQwCgYIKoZIzj0EAwMwNzEVMBMGA1UEChMMc2lnc3RvcmUuZGV2MR4wHAYDVQQDExVzaWdzdG9yZS1pbnRlcm1lZGlhdGUwHhcNMjYwNDI5MTQwNTExWhcNMjYwNDI5MTQxNTExWjAAMFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEtL+ibg3TGKZXRrWDCPykjxiS7Tcl8unONnDBjhXlZf/QdJmXcVpzh98Zn33+1tnzfv4VRncInxtyjKlqP/n4nqOCAYgwggGEMA4GA1UdDwEB/wQEAwIHgDATBgNVHSUEDDAKBggrBgEFBQcDAzAdBgNVHQ4EFgQUVBwejKWid2blG7gECmQ8nXdvhSowHwYDVR0jBBgwFoAU39Ppz1YkEZb5qNjpKFWixi4YZD8wIgYDVR0RAQH/BBgwFoEUR3Jhbml0ZS1zaWduQGlibS5jb20wNAYKKwYBBAGDvzABAQQmaHR0cHM6Ly9zaWdzdG9yZS52ZXJpZnkuaWJtLmNvbS9vYXV0aDIwNgYKKwYBBAGDvzABCAQoDCZodHRwczovL3NpZ3N0b3JlLnZlcmlmeS5pYm0uY29tL29hdXRoMjCBigYKKwYBBAHWeQIEAgR8BHoAeAB2AN09MGrGxxEyYxkeHJlnNwKiSl643jyt/4eKcoAvKe6OAAABndmO2k0AAAQDAEcwRQIgJERD5l1/3gZseBUIqAzWalStyLN0dGJtShScgbqxB78CIQDbhzX9WB9gVKwXsUhxtDG7uHuEHMu50ta8Bhd0dj5MvzAKBggqhkjOPQQDAwNoADBlAjB7EmWvftmLv+O/yBDJ4AWC7UXOjazuKe9QHeYhxGNHNUnqIf04oI/8v7fqNr+VDUQCMQCmc3WtR+XE966CwOhSmzfQKML0FhTdU0cVzEZxXQtD4WZ3+IWumMCWHGoOiXfQ4j4="},"tlogEntries":[{"logIndex":"1401712462","logId":{"keyId":"wNI9atQGlz+VWfO6LRygH4QUfY/8W4RFwiT5i5WRgB0="},"kindVersion":{"kind":"dsse","version":"0.0.1"},"integratedTime":"1777471511","inclusionPromise":{"signedEntryTimestamp":"MEUCIQDn7d+bjYJK8X+lHBoROCGii71IERNFfon2YIjZAMlJ0gIgBT/pZSz7mMAECG+30teGdObU6Q3GWVcUNqsRSpv1AP0="},"inclusionProof":{"logIndex":"1279808200","rootHash":"pXcrxvq/zcwGUOjyr1yQzRBj9r83n612+BoBUftiJuM=","treeSize":"1279808207","hashes":["9g7ioKrk3Rp+DCpcUZtGIdygoj+Y6/U1fFtmo4JZpLk=","Me6EdSYwfjNL40IIgq73Obyiua/KLRS+nhoQ/Q4t/NU=","zMWriW3oGgRzAAb74dHqXSEf5JVHwCF9E7mJB7pXJRY=","UShKpOTD6XTAgwxT5Fg/O4i2oNBS8tZ38uLrkSP4/1M=","yy0h4WR2/BxXFEpe7BZRrOlOy/ks7JHGTrDWCPCoj9A=","j+3a2J2BVscXcgnoYo5NbtvVjEdPpAocY0KFcmtnS2U=","yt+wav3mKvzKs2yKc2VwNW6tRIpQ2hyFbR20GFREHzM=","XNEL1Y7Hey1LV0cTUrotQytYHNyqLVydBwYyeO4/3NA=","JdFHhy4beJOIn6UvDpQlK7zuJZRI1JQLnL4eTXzIDMc=","5RkPOw/
UmluMtjuvzF/Gug2fNGcCK6n7DWqjdSgjos8=","d9hA39Ot2M7fkyE+rWh4D5tn70iuQ9bWZMetFQz1ePk=","wa5W79zKcyNncVVFXx8PM8785J+n0U0qxiK2GXKz2Hk=","7y22/OdvnNTJ3gzz57WEW6D/mmmrLXV0dVQyDwenx5A=","DOCeoSMovIvLExkhIvisow9AuNXgeWs4ECkyR6EcqYU="],"checkpoint":{"envelope":"rekor.sigstore.dev - 1193050959916656506\n1279808207\npXcrxvq/zcwGUOjyr1yQzRBj9r83n612+BoBUftiJuM=\n\n— rekor.sigstore.dev wNI9ajBFAiEA9slI/8MUBfXFwQOguZyk3ydIbXxvaGZNLhFJnc+UDosCIAzhMcoZ1yyiStPp2Nm8h1iQVvWw0NCLuwMOfLCZgcnx\n"}},"canonicalizedBody":"eyJhcGlWZXJzaW9uIjoiMC4wLjEiLCJraW5kIjoiZHNzZSIsInNwZWMiOnsiZW52ZWxvcGVIYXNoIjp7ImFsZ29yaXRobSI6InNoYTI1NiIsInZhbHVlIjoiZDBhMmJmZTI0NTVlMzE1ZWVkNWRhYWY3NWZhYzE4NjY2MmFlZmYxODBlOGM4MGIzNzBmOWUzZWMxM2E0MjliNCJ9LCJwYXlsb2FkSGFzaCI6eyJhbGdvcml0aG0iOiJzaGEyNTYiLCJ2YWx1ZSI6ImI0NjM3ZGJjNTk0M2NjMjA2NGU4ZTdhYWUwMmE5MTI4OTNmM2M3MDIzNTg1ZjY3M2Q5MDU0NTBkY2E1OTZlNTgifSwic2lnbmF0dXJlcyI6W3sic2lnbmF0dXJlIjoiTUVZQ0lRRHRBNTJZTklONmQ2c0RMdnZReS9vM3g4blJSMXE4SC8yd0E5bWJRcWFPdUFJaEFOQUZWU2tEcm01UktMeDNZU3VOcmVRdmwrSW43ckt2OHR0aDQ3bUgxU1o1IiwidmVyaWZpZXIiOiJMUzB0TFMxQ1JVZEpUaUJEUlZKVVNVWkpRMEZVUlMwdExTMHRDazFKU1VNMGVrTkRRVzF0WjBGM1NVSkJaMGxWWkhsTlJHaGpWRkpEU2pWdWVHNTROQ3RFTjJGVGQwZFlLMnBSZDBObldVbExiMXBKZW1vd1JVRjNUWGNLVG5wRlZrMUNUVWRCTVZWRlEyaE5UV015Ykc1ak0xSjJZMjFWZFZwSFZqSk5ValIzU0VGWlJGWlJVVVJGZUZaNllWZGtlbVJIT1hsYVV6RndZbTVTYkFwamJURnNXa2RzYUdSSFZYZElhR05PVFdwWmQwNUVTVFZOVkZGM1RsUkZlRmRvWTA1TmFsbDNUa1JKTlUxVVVYaE9WRVY0VjJwQlFVMUdhM2RGZDFsSUNrdHZXa2w2YWpCRFFWRlpTVXR2V2tsNmFqQkVRVkZqUkZGblFVVjBUQ3RwWW1jelZFZExXbGhTY2xkRVExQjVhMnA0YVZNM1ZHTnNPSFZ1VDA1dVJFSUthbWhZYkZwbUwxRmtTbTFZWTFad2VtZzVPRnB1TXpNck1YUnVlbVoyTkZaU2JtTkpibmgwZVdwTGJIRlFMMjQwYm5GUFEwRlpaM2RuWjBkRlRVRTBSd3BCTVZWa1JIZEZRaTkzVVVWQmQwbElaMFJCVkVKblRsWklVMVZGUkVSQlMwSm5aM0pDWjBWR1FsRmpSRUY2UVdSQ1owNVdTRkUwUlVablVWVldRbmRsQ21wTFYybGtNbUpzUnpkblJVTnRVVGh1V0dSMmFGTnZkMGgzV1VSV1VqQnFRa0puZDBadlFWVXpPVkJ3ZWpGWmEwVmFZalZ4VG1wd1MwWlhhWGhwTkZrS1drUTRkMGxuV1VSV1VqQlNRVkZJTDBKQ1ozZEdiMFZWVWpOS2FHSnRiREJhVXpGNllWZGtkVkZIYkdsaVV6VnFZakl3ZDA1
QldVdExkMWxDUWtGSFJBcDJla0ZDUVZGUmJXRklVakJqU0UwMlRIazVlbUZYWkhwa1J6bDVXbE0xTWxwWVNuQmFibXQxWVZkS2RFeHRUblppVXpsMldWaFdNR0ZFU1hkT1oxbExDa3QzV1VKQ1FVZEVkbnBCUWtOQlVXOUVRMXB2WkVoU2QyTjZiM1pNTTA1d1dqTk9NR0l6U214TWJscHNZMjFzYldWVE5YQlpiVEIxV1RJNWRFd3lPV2dLWkZoU2IwMXFRMEpwWjFsTFMzZFpRa0pCU0ZkbFVVbEZRV2RTT0VKSWIwRmxRVUl5UVU0d09VMUhja2Q0ZUVWNVdYaHJaVWhLYkc1T2QwdHBVMncyTkFvemFubDBMelJsUzJOdlFYWkxaVFpQUVVGQlFtNWtiVTh5YXpCQlFVRlJSRUZGWTNkU1VVbG5Ta1ZTUkRWc01TOHpaMXB6WlVKVlNYRkJlbGRoYkZOMENubE1UakJrUjBwMFUyaFRZMmRpY1hoQ056aERTVkZFWW1oNldEbFhRamxuVmt0M1dITlZhSGgwUkVjM2RVaDFSVWhOZFRVd2RHRTRRbWhrTUdScU5VMEtkbnBCUzBKblozRm9hMnBQVUZGUlJFRjNUbTlCUkVKc1FXcENOMFZ0VjNabWRHMU1kaXRQTDNsQ1JFbzBRVmRETjFWWVQycGhlblZMWlRsUlNHVlphQXA0UjA1SVRsVnVjVWxtTURSdlNTODRkamRtY1U1eUsxWkVWVkZEVFZGRGJXTXpWM1JTSzFoRk9UWTJRM2RQYUZOdGVtWlJTMDFNTUVab1ZHUlZNR05XQ25wRlduaFlVWFJFTkZkYU15dEpWM1Z0VFVOWFNFZHZUMmxZWmxFMGFqUTlDaTB0TFMwdFJVNUVJRU5GVWxSSlJrbERRVlJGTFMwdExTMEsifV19fQ=="}],"timestampVerificationData":{"rfc3161Timestamps":[{"signedTimestamp":"MIIE6jADAgEAMIIE4QYJKoZIhvcNAQcCoIIE0jCCBM4CAQMxDTALBglghkgBZQMEAgEwgcIGCyqGSIb3DQEJEAEEoIGyBIGvMIGsAgEBBgkrBgEEAYO/MAIwMTANBglghkgBZQMEAgEFAAQgDIdYmqikLId7vUz4P+XXeWBEP8Gq1HPoyTTa3lhDwAECFCZeeuHUEB0uUpKG/PojEUruI0unGA8yMDI2MDQyOTE0MDUxMVowAwIBAQIJAOx9JzKbbVcfoDKkMDAuMRUwEwYDVQQKEwxzaWdzdG9yZS5kZXYxFTATBgNVBAMTDHNpZ3N0b3JlLXRzYaCCAhQwggIQMIIBlqADAgECAhQ6E1QvDJBh7rzBQy/Lio6LKiOLDDAKBggqhkjOPQQDAzA5MRUwEwYDVQQKEwxzaWdzdG9yZS5kZXYxIDAeBgNVBAMTF3NpZ3N0b3JlLXRzYS1zZWxmc2lnbmVkMB4XDTI1MDQwODA2NTk0M1oXDTM1MDQwNjA2NTk0M1owLjEVMBMGA1UEChMMc2lnc3RvcmUuZGV2MRUwEwYDVQQDEwxzaWdzdG9yZS10c2EwdjAQBgcqhkjOPQIBBgUrgQQAIgNiAATitrZnyEo2KDZP2QWMIBOgYbfSOTL5ZC/cHMv6Yq+HVIo1H9TC7Cx80KDiyvKhgB3wTqKyi9UDczhqg12b1AOLnRnydMTK+qB8M+1MjBci1+Jb8AV/VXu7CRuQCiPTHFyjajBoMA4GA1UdDwEB/wQEAwIHgDAdBgNVHQ4EFgQUif15Q4fP0GVGwwJGxyxzW3206wMwHwYDVR0jBBgwFoAUmOwB73+7Uf/UlR5vioiYUweJzr8wFgYDVR0lAQH/BAwwCgYIKwYBBQUHAwgwCgYIKoZIzj0EAwMDaAAwZQIwO2mxX/opo7SrIX9QyxfZpJRcpAV2gZOm1AZzR+2rVyy6Uc8Ybp2ybIw13ckH4bcRAjEA5q
O8FyOkmYpvg2/7ZNqiPxRzn5vqKHoVcIIqtpKq6l7TvOqzAxxclN7VwTG8e++XMYIB2zCCAdcCAQEwUTA5MRUwEwYDVQQKEwxzaWdzdG9yZS5kZXYxIDAeBgNVBAMTF3NpZ3N0b3JlLXRzYS1zZWxmc2lnbmVkAhQ6E1QvDJBh7rzBQy/Lio6LKiOLDDALBglghkgBZQMEAgGggfwwGgYJKoZIhvcNAQkDMQ0GCyqGSIb3DQEJEAEEMBwGCSqGSIb3DQEJBTEPFw0yNjA0MjkxNDA1MTFaMC8GCSqGSIb3DQEJBDEiBCAYLU3UeOTovAYLP6snqgyVvFTtfWwYfY4PKgftTSBVXzCBjgYLKoZIhvcNAQkQAi8xfzB9MHsweQQghfknvAerYsrDtENWwQ78gbLGiD/aernm2HDZ0TrNBbcwVTA9pDswOTEVMBMGA1UEChMMc2lnc3RvcmUuZGV2MSAwHgYDVQQDExdzaWdzdG9yZS10c2Etc2VsZnNpZ25lZAIUOhNULwyQYe68wUMvy4qOiyojiwwwCgYIKoZIzj0EAwIEZzBlAjEAwtBzMR4y3Kq0V601T3cLrORS/nWhmC2BuswpqvudbkQr2UOKja+YGu973r9GGOnGAjBxeZFlirrEGdcs/ZgKaTUH2nXoSlQBCwD6MY/az8h99i14ULNNlr4nNnCqpM4LH0E="}]}},"dsseEnvelope":{"payload":"ewogICJfdHlwZSI6ICJodHRwczovL2luLXRvdG8uaW8vU3RhdGVtZW50L3YxIiwKICAic3ViamVjdCI6IFsKICAgIHsKICAgICAgIm5hbWUiOiAiZ3Jhbml0ZS1zcGVlY2gtNC4xLTJiLXBsdXMiLAogICAgICAiZGlnZXN0IjogewogICAgICAgICJzaGEyNTYiOiAiZWM0YTA5MDdlNjVkZGI3N2JjZjA0MGZhNDJmMzMyNmI3NzdhNGFiYjVmYzFmMDRmMDg0MGFhNjA2OTAxMjNlNyIKICAgICAgfQogICAgfQogIF0sCiAgInByZWRpY2F0ZVR5cGUiOiAiaHR0cHM6Ly9tb2RlbF9zaWduaW5nL3NpZ25hdHVyZS92MS4wIiwKICAicHJlZGljYXRlIjogewogICAgInNlcmlhbGl6YXRpb24iOiB7CiAgICAgICJoYXNoX3R5cGUiOiAic2hhMjU2IiwKICAgICAgImlnbm9yZV9wYXRocyI6IFsKICAgICAgICAiLmdpdGh1YiIsCiAgICAgICAgIi5naXRhdHRyaWJ1dGVzIiwKICAgICAgICAiLmdpdGlnbm9yZSIsCiAgICAgICAgIm1vZGVsLnNpZyIsCiAgICAgICAgIi5jYWNoZSIsCiAgICAgICAgIi5naXQiCiAgICAgIF0sCiAgICAgICJhbGxvd19zeW1saW5rcyI6IGZhbHNlLAogICAgICAibWV0aG9kIjogImZpbGVzIgogICAgfSwKICAgICJyZXNvdXJjZXMiOiBbCiAgICAgIHsKICAgICAgICAiZGlnZXN0IjogIjc5N2JkZmZhOTFlNDdlMDE5ZWI3ZGQ3M2MzODY1NDliN2ZlZWVjOTMwMGQzOWJlZTk3ZGVkNzE1MmVmYWYxOTAiLAogICAgICAgICJuYW1lIjogIlJFQURNRS5tZCIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAiZGlnZXN0IjogImI3MmUyMTZmMDZiOTRhZmRjN2NjZTUwOWI3MThhNjMxNGFiYzQ3YzI4MWNiZDkwZDQwNWE1YzU4Nzg2ZGEzMmUiLAogICAgICAgICJuYW1lIjogImNoYXRfdGVtcGxhdGUuamluamEiLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9LAogICAgICB7CiAgICAgICAgI
mRpZ2VzdCI6ICJkM2ZjZmNjNzRkMjIyZjQ1MGQ5MzNjNTk4MjlhODE0ZmQ3ZjA3NjBmNDQzYmYwMjU2YTA2Y2FkODg5MTJjZGI1IiwKICAgICAgICAibmFtZSI6ICJjb25maWcuanNvbiIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAiZGlnZXN0IjogImUyNjE0ODY2ZmEyNzY0M2U4YzI2NWFhMmM3NTc5NDAwNzY5YWFlNThmNDkyNTFkNGJlMzZiZTYzMGY5YTFhZDYiLAogICAgICAgICJuYW1lIjogImdlbmVyYXRpb25fY29uZmlnLmpzb24iLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9LAogICAgICB7CiAgICAgICAgImRpZ2VzdCI6ICJhZjQ1MTA1YmE5NTVlM2E3OTZmMzljM2NkZGM2ZmVhZTlmYjQ2OTZiNDZlOTlmMTgzNTVkZjlkN2M4YmRiMGJhIiwKICAgICAgICAibmFtZSI6ICJtb2RlbC0wMDAwMS1vZi0wMDAwMy5zYWZldGVuc29ycyIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAiZGlnZXN0IjogIjE3MmJkZGNiMGI5ZmU0ZTU5YjQzMDJlZWNjNDc4YmJlNWZiNDc3NzU5YjgwYTUyZTQ3NmI0M2I1NWM5NDkzYTciLAogICAgICAgICJuYW1lIjogIm1vZGVsLTAwMDAyLW9mLTAwMDAzLnNhZmV0ZW5zb3JzIiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIKICAgICAgfSwKICAgICAgewogICAgICAgICJkaWdlc3QiOiAiMTI5OTQ3NzZmN2M5ZTI0Y2RhMzMzOWVlOGE2Y2E2YTA3NjAwZjVhZTRhNGMzOGQ2NjcwM2RjZWZiOGZmNDYyNCIsCiAgICAgICAgIm5hbWUiOiAibW9kZWwtMDAwMDMtb2YtMDAwMDMuc2FmZXRlbnNvcnMiLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9LAogICAgICB7CiAgICAgICAgImRpZ2VzdCI6ICI1ZGM3YTY1YjBmMDFjM2ZhMDY5OTI4Y2YwMTZhZDhjMmEwNTlkNWQ3YjY2ZjEyNDgwODNkNDIzMTQ1YzMwZDhkIiwKICAgICAgICAibmFtZSI6ICJtb2RlbC5zYWZldGVuc29ycy5pbmRleC5qc29uIiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIKICAgICAgfSwKICAgICAgewogICAgICAgICJkaWdlc3QiOiAiYTlkYThlZjRmMzAxYzZkMTg5YmI1YWJhN2VjZjFmYTY0ZWVlNGE3NTI4NTNhZTM1MzNkMThkZWY3NDE0OWEyNyIsCiAgICAgICAgIm5hbWUiOiAicHJvY2Vzc29yX2NvbmZpZy5qc29uIiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIKICAgICAgfSwKICAgICAgewogICAgICAgICJkaWdlc3QiOiAiMmZlYjg1OWJkNzEyM2YwN2YyZmJmYjZlMjFjNjc2OGY1M2E5NDVhMTUyNzQ1ZDA5M2UzZjUyZDkyMGVmNjczNSIsCiAgICAgICAgIm5hbWUiOiAidG9rZW5pemVyLmpzb24iLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9LAogICAgICB7CiAgICAgICAgImRpZ2VzdCI6ICJiYjNkYWNlMTdkNjI1NmMzM2I4M2Y0YWNlZjRkYjc3NzcxYzIwMjQ5N2M4NDJjMjI3NmU3OTQ0OTZlMTBiYTFmIiwKICAgICAgI
CAibmFtZSI6ICJ0b2tlbml6ZXJfY29uZmlnLmpzb24iLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9CiAgICBdCiAgfQp9","payloadType":"application/vnd.in-toto+json","signatures":[{"sig":"MEYCIQDtA52YNIN6d6sDLvvQy/o3x8nRR1q8H/2wA9mbQqaOuAIhANAFVSkDrm5RKLx3YSuNreQvl+In7rKv8tth47mH1SZ5"}]}}
processor_config.json ADDED
@@ -0,0 +1,17 @@
+ {
+   "audio_processor": {
+     "feature_extractor_type": "GraniteSpeechFeatureExtractor",
+     "melspec_kwargs": {
+       "hop_length": 160,
+       "n_fft": 512,
+       "n_mels": 80,
+       "sample_rate": 16000,
+       "win_length": 400
+     },
+     "projector_downsample_rate": 5,
+     "projector_window_size": 15,
+     "sampling_rate": 16000
+   },
+   "audio_token": "<|audio|>",
+   "processor_class": "GraniteSpeechProcessor"
+ }
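The `melspec_kwargs` values in `processor_config.json` are easier to read as durations. The following is a minimal sketch, not part of the commit, that derives frame timing from those values. The helper name `describe_melspec` is hypothetical, and the `fft_bins` figure assumes a one-sided STFT (`n_fft // 2 + 1` bins), which the config itself does not state.

```python
import json

# Values copied from the processor_config.json diff; nothing else is assumed
# about how GraniteSpeechFeatureExtractor consumes them.
PROCESSOR_CONFIG = json.loads("""
{
  "audio_processor": {
    "melspec_kwargs": {
      "hop_length": 160,
      "n_fft": 512,
      "n_mels": 80,
      "sample_rate": 16000,
      "win_length": 400
    },
    "sampling_rate": 16000
  }
}
""")


def describe_melspec(config: dict) -> dict:
    """Derive frame timing from the mel-spectrogram settings (hypothetical helper)."""
    mel = config["audio_processor"]["melspec_kwargs"]
    sample_rate = mel["sample_rate"]
    return {
        # Step between successive frames, in milliseconds.
        "hop_ms": 1000 * mel["hop_length"] / sample_rate,
        # Length of each analysis window, in milliseconds.
        "window_ms": 1000 * mel["win_length"] / sample_rate,
        # Number of spectrogram frames produced per second of audio.
        "frames_per_second": sample_rate / mel["hop_length"],
        # Assumption: a one-sided STFT, so n_fft // 2 + 1 frequency bins.
        "fft_bins": mel["n_fft"] // 2 + 1,
        "n_mels": mel["n_mels"],
    }


if __name__ == "__main__":
    for key, value in describe_melspec(PROCESSOR_CONFIG).items():
        print(f"{key}: {value}")
```

Under these settings each frame covers a 25 ms window and frames advance every 10 ms, so one second of 16 kHz audio yields 100 frames of 80 mel features before any downstream projection.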
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,20 @@
+ {
+   "add_prefix_space": false,
+   "audio_token": "<|audio|>",
+   "backend": "tokenizers",
+   "bos_token": "<|end_of_text|>",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<|end_of_text|>",
+   "errors": "replace",
+   "is_local": true,
+   "local_files_only": false,
+   "model_max_length": 1000000000000000019884624838656,
+   "model_specific_special_tokens": {
+     "audio_token": "<|audio|>"
+   },
+   "pad_token": "<|pad|>",
+   "padding_side": "left",
+   "processor_class": "GraniteSpeechProcessor",
+   "tokenizer_class": "GPT2Tokenizer",
+   "unk_token": "<|unk|>"
+ }
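Three settings in `tokenizer_config.json` are easy to misread: `bos_token` and `eos_token` are the same string, `model_max_length` is a very large integer rather than a real limit, and `padding_side` is `left`. The sketch below, which is not part of the commit, checks the first two and illustrates the third on plain lists of token ids. The helper names `summarize_tokenizer_config` and `pad_batch` are hypothetical, and reading the large value as an "unset" marker is an assumption based on it equalling `int(1e30)`.

```python
import json

# Subset of fields copied from the tokenizer_config.json diff.
TOKENIZER_CONFIG = json.loads("""
{
  "bos_token": "<|end_of_text|>",
  "eos_token": "<|end_of_text|>",
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "<|pad|>",
  "padding_side": "left"
}
""")

# Assumption: this value acts as a "no limit configured" marker, since it is
# exactly what int(1e30) evaluates to.
UNSET_MAX_LENGTH = int(1e30)


def summarize_tokenizer_config(config: dict) -> dict:
    """Report the settings that are easy to misread (hypothetical helper)."""
    return {
        "bos_equals_eos": config["bos_token"] == config["eos_token"],
        "max_length_is_unset": config["model_max_length"] == UNSET_MAX_LENGTH,
        "padding_side": config["padding_side"],
    }


def pad_batch(batch: list[list[int]], pad_id: int, side: str) -> list[list[int]]:
    """Pad every sequence to the longest length on the given side."""
    if side not in ("left", "right"):
        raise ValueError(f"unsupported padding side: {side!r}")
    width = max(len(seq) for seq in batch)
    padded = []
    for seq in batch:
        filler = [pad_id] * (width - len(seq))
        padded.append(filler + seq if side == "left" else seq + filler)
    return padded


if __name__ == "__main__":
    print(summarize_tokenizer_config(TOKENIZER_CONFIG))
    # The pad id 0 is a placeholder; the real id for "<|pad|>" lives in tokenizer.json.
    print(pad_batch([[1, 2, 3], [4]], pad_id=0, side=TOKENIZER_CONFIG["padding_side"]))
```

Left padding keeps the real tokens at the end of each row, which is the usual choice when a decoder-only model generates a continuation for every row of a batch.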