End of training

Files changed (7)

README.md +146 -195
adapter.top.safetensors +3 -0
config.json +107 -0
model.safetensors +3 -0
preprocessor_config.json +9 -0
training_args.bin +3 -0
vocab.json +1 -83

README.md CHANGED

@@ -1,199 +1,150 @@
 ---
 library_name: transformers
-tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
 library_name: transformers
+license: cc-by-nc-4.0
+base_model: facebook/mms-1b-all
+tags:
+- generated_from_trainer
+metrics:
+- wer
+model-index:
+- name: ssc-top-mms-model-mix-adapt-max3-devtrain
+  results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# ssc-top-mms-model-mix-adapt-max3-devtrain
+This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.5486
+- Cer: 0.1274
+- Wer: 0.4912
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0005
+- train_batch_size: 8
+- eval_batch_size: 6
+- seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 16
+- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 100
+- num_epochs: 20
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch   | Step  | Validation Loss | Cer    | Wer    |
+|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
+| 4.5395        | 0.2349  | 200   | 3.9902          | 0.8483 | 0.9864 |
+| 1.2516        | 0.4698  | 400   | 0.7499          | 0.1577 | 0.6582 |
+| 1.0661        | 0.7046  | 600   | 0.7080          | 0.1548 | 0.6384 |
+| 0.9816        | 0.9395  | 800   | 0.7173          | 0.1468 | 0.5960 |
+| 1.038         | 1.1738  | 1000  | 0.6984          | 0.1446 | 0.5940 |
+| 0.9052        | 1.4087  | 1200  | 0.6845          | 0.1446 | 0.5777 |
+| 0.9644        | 1.6436  | 1400  | 0.6873          | 0.1437 | 0.5722 |
+| 1.0056        | 1.8784  | 1600  | 0.6882          | 0.1438 | 0.5827 |
+| 0.8508        | 2.1127  | 1800  | 0.6833          | 0.1477 | 0.6014 |
+| 0.9788        | 2.3476  | 2000  | 0.6626          | 0.1394 | 0.5594 |
+| 1.0398        | 2.5825  | 2200  | 0.6562          | 0.1423 | 0.5792 |
+| 0.9474        | 2.8174  | 2400  | 0.6690          | 0.1397 | 0.5527 |
+| 0.8748        | 3.0517  | 2600  | 0.6417          | 0.1387 | 0.5516 |
+| 0.8954        | 3.2866  | 2800  | 0.6287          | 0.1394 | 0.5543 |
+| 0.8888        | 3.5214  | 3000  | 0.6554          | 0.1396 | 0.5520 |
+| 0.8773        | 3.7563  | 3200  | 0.6303          | 0.1393 | 0.5601 |
+| 0.9764        | 3.9912  | 3400  | 0.6943          | 0.1414 | 0.5520 |
+| 0.8526        | 4.2255  | 3600  | 0.6324          | 0.1385 | 0.5450 |
+| 0.8854        | 4.4604  | 3800  | 0.6152          | 0.1373 | 0.5434 |
+| 0.8665        | 4.6952  | 4000  | 0.6796          | 0.1399 | 0.5524 |
+| 0.9514        | 4.9301  | 4200  | 0.6399          | 0.1386 | 0.5539 |
+| 0.8874        | 5.1644  | 4400  | 0.6498          | 0.1376 | 0.5364 |
+| 0.8003        | 5.3993  | 4600  | 0.6179          | 0.1353 | 0.5344 |
+| 0.8767        | 5.6342  | 4800  | 0.6095          | 0.1360 | 0.5411 |
+| 0.8108        | 5.8691  | 5000  | 0.6238          | 0.1366 | 0.5446 |
+| 0.7463        | 6.1033  | 5200  | 0.6101          | 0.1356 | 0.5399 |
+| 0.8332        | 6.3382  | 5400  | 0.6387          | 0.1358 | 0.5387 |
+| 0.7857        | 6.5731  | 5600  | 0.6476          | 0.1370 | 0.5333 |
+| 0.8174        | 6.8080  | 5800  | 0.6367          | 0.1361 | 0.5399 |
+| 0.8541        | 7.0423  | 6000  | 0.5987          | 0.1345 | 0.5364 |
+| 0.8084        | 7.2772  | 6200  | 0.6049          | 0.1339 | 0.5251 |
+| 0.7698        | 7.5120  | 6400  | 0.5872          | 0.1344 | 0.5282 |
+| 0.7773        | 7.7469  | 6600  | 0.5751          | 0.1321 | 0.5298 |
+| 0.7841        | 7.9818  | 6800  | 0.6204          | 0.1357 | 0.5348 |
+| 0.7404        | 8.2161  | 7000  | 0.5911          | 0.1337 | 0.5309 |
+| 0.7761        | 8.4510  | 7200  | 0.6190          | 0.1328 | 0.5200 |
+| 0.7472        | 8.6858  | 7400  | 0.5623          | 0.1333 | 0.5290 |
+| 0.8122        | 8.9207  | 7600  | 0.6448          | 0.1338 | 0.5251 |
+| 0.7392        | 9.1550  | 7800  | 0.5716          | 0.1336 | 0.5251 |
+| 0.733         | 9.3899  | 8000  | 0.6182          | 0.1336 | 0.5259 |
+| 0.8222        | 9.6248  | 8200  | 0.5997          | 0.1310 | 0.5103 |
+| 0.7215        | 9.8597  | 8400  | 0.6208          | 0.1331 | 0.5189 |
+| 0.7273        | 10.0940 | 8600  | 0.5544          | 0.1316 | 0.5130 |
+| 0.7153        | 10.3288 | 8800  | 0.5911          | 0.1326 | 0.5056 |
+| 0.7107        | 10.5637 | 9000  | 0.5880          | 0.1335 | 0.5278 |
+| 0.6917        | 10.7986 | 9200  | 0.6183          | 0.1324 | 0.5267 |
+| 0.7065        | 11.0329 | 9400  | 0.5724          | 0.1312 | 0.5103 |
+| 0.7393        | 11.2678 | 9600  | 0.6037          | 0.1319 | 0.5130 |
+| 0.7501        | 11.5026 | 9800  | 0.5306          | 0.1299 | 0.5107 |
+| 0.7255        | 11.7375 | 10000 | 0.5773          | 0.1337 | 0.5239 |
+| 0.7202        | 11.9724 | 10200 | 0.5753          | 0.1308 | 0.5068 |
+| 0.6715        | 12.2067 | 10400 | 0.5536          | 0.1298 | 0.5099 |
+| 0.7266        | 12.4416 | 10600 | 0.5584          | 0.1298 | 0.5018 |
+| 0.6818        | 12.6765 | 10800 | 0.5770          | 0.1305 | 0.5091 |
+| 0.7021        | 12.9113 | 11000 | 0.5828          | 0.1307 | 0.5053 |
+| 0.6707        | 13.1456 | 11200 | 0.5881          | 0.1307 | 0.5018 |
+| 0.6441        | 13.3805 | 11400 | 0.5854          | 0.1300 | 0.5049 |
+| 0.6929        | 13.6154 | 11600 | 0.5789          | 0.1310 | 0.5088 |
+| 0.6742        | 13.8503 | 11800 | 0.5326          | 0.1295 | 0.5056 |
+| 0.6978        | 14.0846 | 12000 | 0.5668          | 0.1298 | 0.5041 |
+| 0.6741        | 14.3194 | 12200 | 0.5433          | 0.1295 | 0.5037 |
+| 0.6895        | 14.5543 | 12400 | 0.5489          | 0.1305 | 0.5080 |
+| 0.6458        | 14.7892 | 12600 | 0.5667          | 0.1300 | 0.5045 |
+| 0.6476        | 15.0235 | 12800 | 0.5614          | 0.1294 | 0.5025 |
+| 0.6351        | 15.2584 | 13000 | 0.5554          | 0.1283 | 0.4963 |
+| 0.6495        | 15.4932 | 13200 | 0.5333          | 0.1276 | 0.4947 |
+| 0.668         | 15.7281 | 13400 | 0.5517          | 0.1292 | 0.5006 |
+| 0.6249        | 15.9630 | 13600 | 0.5728          | 0.1288 | 0.4979 |
+| 0.6328        | 16.1973 | 13800 | 0.5503          | 0.1279 | 0.4916 |
+| 0.6642        | 16.4322 | 14000 | 0.5651          | 0.1286 | 0.4920 |
+| 0.6625        | 16.6671 | 14200 | 0.5732          | 0.1279 | 0.4909 |
+| 0.6453        | 16.9019 | 14400 | 0.5684          | 0.1291 | 0.5006 |
+| 0.6289        | 17.1362 | 14600 | 0.5617          | 0.1278 | 0.4963 |
+| 0.5953        | 17.3711 | 14800 | 0.5319          | 0.1269 | 0.4986 |
+| 0.6608        | 17.6060 | 15000 | 0.5409          | 0.1279 | 0.4951 |
+| 0.6275        | 17.8409 | 15200 | 0.5484          | 0.1280 | 0.4916 |
+| 0.647         | 18.0752 | 15400 | 0.5523          | 0.1279 | 0.4975 |
+| 0.639         | 18.3100 | 15600 | 0.5525          | 0.1275 | 0.4998 |
+| 0.614         | 18.5449 | 15800 | 0.5374          | 0.1272 | 0.4940 |
+| 0.6266        | 18.7798 | 16000 | 0.5463          | 0.1281 | 0.4951 |
+| 0.6486        | 19.0141 | 16200 | 0.5441          | 0.1271 | 0.4944 |
+| 0.6409        | 19.2490 | 16400 | 0.5552          | 0.1271 | 0.4920 |
+| 0.6283        | 19.4839 | 16600 | 0.5486          | 0.1272 | 0.4944 |
+| 0.6115        | 19.7187 | 16800 | 0.5507          | 0.1274 | 0.4955 |
+| 0.6125        | 19.9536 | 17000 | 0.5486          | 0.1274 | 0.4912 |
+### Framework versions
+- Transformers 4.52.1
+- Pytorch 2.9.1+cu128
+- Datasets 3.6.0
+- Tokenizers 0.21.4
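
The hyperparameters and the Step/Epoch columns in the card above imply a few derived quantities. The following is a minimal sketch that uses only numbers reported in the card. It assumes a single training device (the card does not state the device count), and the variable names are illustrative rather than taken from the training script.

```python
# Values copied from the model card above.
train_batch_size = 8
gradient_accumulation_steps = 2
last_logged_step = 17000
last_logged_epoch = 19.9536

# Assumption: one device, so the effective batch size is
# per-device batch size times gradient accumulation steps.
effective_batch_size = train_batch_size * gradient_accumulation_steps

# Optimizer steps per epoch, estimated from the last logged table row.
steps_per_epoch = round(last_logged_step / last_logged_epoch)

# Rough estimate of training examples seen per epoch; the final
# batch of an epoch may be partial, so treat this as approximate.
approx_examples_per_epoch = steps_per_epoch * effective_batch_size

print(effective_batch_size, steps_per_epoch, approx_examples_per_epoch)
```

The effective batch size agrees with the `total_train_batch_size: 16` entry in the card; the per-epoch figures are estimates, not values reported by the Trainer.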

adapter.top.safetensors ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:560bf5a90a4bae1416fa07819ff6a1606ac974f66cb826ff6b93f6175de30716
+size 9054748

config.json ADDED

@@ -0,0 +1,107 @@

+{
+  "activation_dropout": 0.05,
+  "adapter_attn_dim": 16,
+  "adapter_kernel_size": 3,
+  "adapter_stride": 2,
+  "add_adapter": false,
+  "apply_spec_augment": true,
+  "architectures": [
+    "Wav2Vec2ForCTC"
+  ],
+  "attention_dropout": 0.0,
+  "bos_token_id": 1,
+  "classifier_proj_size": 256,
+  "codevector_dim": 1024,
+  "contrastive_logits_temperature": 0.1,
+  "conv_bias": true,
+  "conv_dim": [
+    512,
+    512,
+    512,
+    512,
+    512,
+    512,
+    512
+  ],
+  "conv_kernel": [
+    10,
+    3,
+    3,
+    3,
+    3,
+    2,
+    2
+  ],
+  "conv_stride": [
+    5,
+    2,
+    2,
+    2,
+    2,
+    2,
+    2
+  ],
+  "ctc_loss_reduction": "mean",
+  "ctc_zero_infinity": false,
+  "diversity_loss_weight": 0.1,
+  "do_stable_layer_norm": true,
+  "eos_token_id": 2,
+  "feat_extract_activation": "gelu",
+  "feat_extract_dropout": 0.0,
+  "feat_extract_norm": "layer",
+  "feat_proj_dropout": 0.0,
+  "feat_quantizer_dropout": 0.0,
+  "final_dropout": 0.05,
+  "hidden_act": "gelu",
+  "hidden_dropout": 0.0,
+  "hidden_size": 1280,
+  "initializer_range": 0.02,
+  "intermediate_size": 5120,
+  "layer_norm_eps": 1e-05,
+  "layerdrop": 0.0,
+  "mask_feature_length": 10,
+  "mask_feature_min_masks": 0,
+  "mask_feature_prob": 0.0,
+  "mask_time_length": 10,
+  "mask_time_min_masks": 2,
+  "mask_time_prob": 0.05,
+  "model_type": "wav2vec2",
+  "num_adapter_layers": 3,
+  "num_attention_heads": 16,
+  "num_codevector_groups": 2,
+  "num_codevectors_per_group": 320,
+  "num_conv_pos_embedding_groups": 16,
+  "num_conv_pos_embeddings": 128,
+  "num_feat_extract_layers": 7,
+  "num_hidden_layers": 48,
+  "num_negatives": 100,
+  "output_hidden_size": 1280,
+  "pad_token_id": 78,
+  "proj_codevector_dim": 1024,
+  "tdnn_dilation": [
+    1,
+    2,
+    3,
+    1,
+    1
+  ],
+  "tdnn_dim": [
+    512,
+    512,
+    512,
+    512,
+    1500
+  ],
+  "tdnn_kernel": [
+    5,
+    3,
+    3,
+    1,
+    1
+  ],
+  "torch_dtype": "float32",
+  "transformers_version": "4.52.1",
+  "use_weighted_layer_sum": false,
+  "vocab_size": 81,
+  "xvector_output_dim": 512
+}
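
The `conv_kernel` and `conv_stride` lists in this config determine how raw audio samples map to encoder frames. The sketch below derives the frame hop, the receptive field, and the frame count for one second of audio. It assumes the seven convolutions are applied without padding, and it takes the 16000 Hz sampling rate from `preprocessor_config.json` in the same commit. The helper names `encoder_geometry` and `num_output_frames` are hypothetical, not library functions.

```python
# Values copied from config.json (conv_kernel, conv_stride) and
# preprocessor_config.json (sampling_rate).
conv_kernel = [10, 3, 3, 3, 3, 2, 2]
conv_stride = [5, 2, 2, 2, 2, 2, 2]
sampling_rate = 16000


def encoder_geometry(kernels, strides):
    """Return (hop, receptive_field) in input samples.

    Assumption: a stack of unpadded 1-D convolutions.
    """
    hop, receptive_field = 1, 1
    for k, s in zip(kernels, strides):
        # Each layer widens the receptive field by (k - 1) input hops.
        receptive_field += (k - 1) * hop
        hop *= s
    return hop, receptive_field


def num_output_frames(num_samples, kernels, strides):
    """Number of frames produced for an input of num_samples samples."""
    length = num_samples
    for k, s in zip(kernels, strides):
        length = (length - k) // s + 1
    return length


hop, receptive_field = encoder_geometry(conv_kernel, conv_stride)
frames_per_second = num_output_frames(sampling_rate, conv_kernel, conv_stride)

print(hop, receptive_field, frames_per_second)
print(1000 * hop / sampling_rate, "ms hop")
print(1000 * receptive_field / sampling_rate, "ms receptive field")
```

Under these assumptions each output frame advances by 320 samples (20 ms) and covers 400 samples (25 ms) of audio.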

model.safetensors ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ca858d04f1e1a4337ecff6306b0a0c13460111ff526016f8a9d3b58afc4b2f64
+size 3859147124

preprocessor_config.json ADDED

@@ -0,0 +1,9 @@

+{
+  "do_normalize": true,
+  "feature_extractor_type": "Wav2Vec2FeatureExtractor",
+  "feature_size": 1,
+  "padding_side": "right",
+  "padding_value": 0.0,
+  "return_attention_mask": true,
+  "sampling_rate": 16000
+}
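
The settings `do_normalize: true`, `padding_side: "right"`, `padding_value: 0.0`, and `return_attention_mask: true` describe per-utterance normalization followed by right padding with a mask. The following is a simplified plain-Python illustration of that behavior, not the library's implementation. The function names `normalize` and `pad_batch` and the `eps` value are assumptions made for this sketch.

```python
import math


def normalize(waveform, eps=1e-7):
    """Scale one utterance to zero mean and unit variance.

    eps is an assumed small constant that guards against division by zero.
    """
    mean = sum(waveform) / len(waveform)
    var = sum((x - mean) ** 2 for x in waveform) / len(waveform)
    return [(x - mean) / math.sqrt(var + eps) for x in waveform]


def pad_batch(waveforms, padding_value=0.0):
    """Right-pad utterances to a common length and build attention masks."""
    max_len = max(len(w) for w in waveforms)
    padded, masks = [], []
    for w in waveforms:
        n_pad = max_len - len(w)
        padded.append(list(w) + [padding_value] * n_pad)
        # 1 marks real samples, 0 marks padding.
        masks.append([1] * len(w) + [0] * n_pad)
    return padded, masks


batch = [normalize([0.1, 0.2, 0.3, 0.4]), normalize([0.5, -0.5])]
padded, masks = pad_batch(batch)
print(masks)
```

Normalizing each utterance before padding keeps the padded zeros out of the mean and variance statistics.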

training_args.bin ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:57e5461c29eebea9fbf37c89b224a42886153abf34a53aefc4712c450e8fb6e0
+size 5969

vocab.json CHANGED

@@ -1,83 +1 @@
-{
-  "top": {
-    "\n": 51,
-    "'": 19,
-    "(": 15,
-    ")": 39,
-    "*": 13,
-    "+": 42,
-    ".": 3,
-    "...": 76,
-    "4": 56,
-    "8": 0,
-    "A": 22,
-    "B": 50,
-    "C": 58,
-    "D": 5,
-    "E": 43,
-    "F": 37,
-    "G": 25,
-    "H": 26,
-    "I": 8,
-    "J": 55,
-    "K": 23,
-    "L": 11,
-    "M": 59,
-    "N": 62,
-    "O": 12,
-    "P": 32,
-    "Q": 16,
-    "R": 34,
-    "S": 28,
-    "T": 38,
-    "U": 70,
-    "V": 69,
-    "W": 52,
-    "X": 30,
-    "Z": 2,
-    "[PAD]": 78,
-    "[UNK]": 77,
-    "a": 7,
-    "b": 6,
-    "c": 75,
-    "ch": 68,
-    "d": 48,
-    "e": 49,
-    "f": 71,
-    "g": 47,
-    "h": 4,
-    "i": 73,
-    "j": 1,
-    "k": 53,
-    "kg": 36,
-    "l": 35,
-    "m": 66,
-    "n": 18,
-    "o": 17,
-    "p": 67,
-    "q": 29,
-    "r": 63,
-    "s": 61,
-    "t": 40,
-    "tl": 60,
-    "u": 57,
-    "v": 74,
-    "w": 45,
-    "x": 46,
-    "y": 24,
-    "z": 65,
-    "|": 21,
-    "´": 14,
-    "Á": 31,
-    "É": 27,
-    "á": 9,
-    "é": 72,
-    "í": 10,
-    "ñ": 33,
-    "ó": 20,
-    "ú": 54,
-    "ý": 41,
-    "ʼ": 64,
-    " ": 44
-  }
-}


+{"top": {"8": 0, "j": 1, "Z": 2, ".": 3, "h": 4, "D": 5, "b": 6, "a": 7, "I": 8, "\u00e1": 9, "\u00ed": 10, "L": 11, "O": 12, "*": 13, "\u00b4": 14, "(": 15, "Q": 16, "o": 17, "n": 18, "'": 19, "\u00f3": 20, "A": 22, "K": 23, "y": 24, "G": 25, "H": 26, "\u00c9": 27, "S": 28, "q": 29, "X": 30, "\u00c1": 31, "P": 32, "\u00f1": 33, "R": 34, "l": 35, "kg": 36, "F": 37, "T": 38, ")": 39, "t": 40, "\u00fd": 41, "+": 42, "E": 43, "\u202f": 44, "w": 45, "x": 46, "g": 47, "d": 48, "e": 49, "B": 50, "\n": 51, "W": 52, "k": 53, "\u00fa": 54, "J": 55, "4": 56, "u": 57, "C": 58, "M": 59, "tl": 60, "s": 61, "N": 62, "r": 63, "\u02bc": 64, "z": 65, "m": 66, "p": 67, "ch": 68, "V": 69, "U": 70, "f": 71, "\u00e9": 72, "i": 73, "v": 74, "c": 75, "|": 21, "...": 76, "[UNK]": 77, "[PAD]": 78}}
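
The vocabulary is nested under a single top-level key (`"top"`, which also appears in the file name `adapter.top.safetensors`) and includes multi-character tokens such as `ch`, `tl`, and `kg`. The sketch below shows one way to sanity-check a file with this layout. The function `summarize_vocab` is hypothetical, and the `toy_vocab` string is a small made-up excerpt in the same shape, not the real file contents.

```python
import json


def summarize_vocab(vocab_text):
    """Summarize a nested {key: {token: id}} vocabulary given as JSON text."""
    nested = json.loads(vocab_text)
    summary = {}
    for key, token_to_id in nested.items():
        ids = sorted(token_to_id.values())
        summary[key] = {
            "num_tokens": len(token_to_id),
            # True when the ids are exactly 0..n-1 with no gaps or repeats.
            "contiguous": ids == list(range(len(ids))),
            # Multi-character tokens, excluding bracketed special tokens.
            "multi_char": sorted(
                t for t in token_to_id if len(t) > 1 and not t.startswith("[")
            ),
        }
    return summary


# Toy excerpt for illustration only; not the actual vocab.json.
toy_vocab = '{"top": {"a": 0, "ch": 1, "|": 2, "[UNK]": 3, "[PAD]": 4}}'
summary = summarize_vocab(toy_vocab)
print(summary)
```

Running the same check on the real `vocab.json` would confirm whether the ids, which appear to run from 0 to 78, are free of gaps and duplicates.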