End of training

Files changed (7)

README.md +137 -195
adapter.ttj.safetensors +3 -0
config.json +107 -0
model.safetensors +3 -0
preprocessor_config.json +9 -0
training_args.bin +3 -0
vocab.json +1 -73

README.md CHANGED

@@ -1,199 +1,141 @@
 ---
 library_name: transformers
-tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
 library_name: transformers
+license: cc-by-nc-4.0
+base_model: facebook/mms-1b-all
+tags:
+- generated_from_trainer
+metrics:
+- wer
+model-index:
+- name: ssc-ttj-mms-model-mix-adapt-max3-devtrain
+  results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# ssc-ttj-mms-model-mix-adapt-max3-devtrain
+This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.1190
+- Cer: 0.0590
+- Wer: 0.3410
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0005
+- train_batch_size: 8
+- eval_batch_size: 6
+- seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 16
+- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 100
+- num_epochs: 20
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch   | Step  | Validation Loss | Cer    | Wer    |
+|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
+| 0.7128        | 0.2602  | 200   | 0.2917          | 0.0890 | 0.4935 |
+| 0.5556        | 0.5205  | 400   | 0.2645          | 0.0906 | 0.5049 |
+| 0.5374        | 0.7807  | 600   | 0.2462          | 0.0864 | 0.4736 |
+| 0.485         | 1.0403  | 800   | 0.2209          | 0.0818 | 0.4473 |
+| 0.4974        | 1.3006  | 1000  | 0.2127          | 0.0797 | 0.4365 |
+| 0.4915        | 1.5608  | 1200  | 0.2094          | 0.0788 | 0.4319 |
+| 0.4822        | 1.8211  | 1400  | 0.2037          | 0.0767 | 0.4238 |
+| 0.4566        | 2.0807  | 1600  | 0.2007          | 0.0776 | 0.4306 |
+| 0.4499        | 2.3409  | 1800  | 0.1891          | 0.0740 | 0.4121 |
+| 0.4539        | 2.6012  | 2000  | 0.1904          | 0.0746 | 0.4131 |
+| 0.4726        | 2.8614  | 2200  | 0.1835          | 0.0737 | 0.4066 |
+| 0.4478        | 3.1210  | 2400  | 0.1822          | 0.0725 | 0.4034 |
+| 0.4316        | 3.3813  | 2600  | 0.1813          | 0.0729 | 0.4011 |
+| 0.4333        | 3.6415  | 2800  | 0.1782          | 0.0723 | 0.4018 |
+| 0.4448        | 3.9018  | 3000  | 0.1786          | 0.0724 | 0.4002 |
+| 0.3983        | 4.1614  | 3200  | 0.1747          | 0.0718 | 0.3945 |
+| 0.4093        | 4.4216  | 3400  | 0.1691          | 0.0705 | 0.3918 |
+| 0.4011        | 4.6818  | 3600  | 0.1625          | 0.0680 | 0.3827 |
+| 0.4155        | 4.9421  | 3800  | 0.1642          | 0.0693 | 0.3895 |
+| 0.396         | 5.2017  | 4000  | 0.1658          | 0.0693 | 0.3876 |
+| 0.391         | 5.4619  | 4200  | 0.1658          | 0.0692 | 0.3913 |
+| 0.3835        | 5.7222  | 4400  | 0.1624          | 0.0680 | 0.3793 |
+| 0.4102        | 5.9824  | 4600  | 0.1595          | 0.0679 | 0.3832 |
+| 0.3711        | 6.2420  | 4800  | 0.1542          | 0.0673 | 0.3765 |
+| 0.3836        | 6.5023  | 5000  | 0.1533          | 0.0673 | 0.3776 |
+| 0.3727        | 6.7625  | 5200  | 0.1566          | 0.0676 | 0.3804 |
+| 0.3726        | 7.0221  | 5400  | 0.1519          | 0.0665 | 0.3721 |
+| 0.3851        | 7.2824  | 5600  | 0.1482          | 0.0650 | 0.3715 |
+| 0.3535        | 7.5426  | 5800  | 0.1492          | 0.0661 | 0.3722 |
+| 0.3885        | 7.8029  | 6000  | 0.1501          | 0.0655 | 0.3683 |
+| 0.3744        | 8.0625  | 6200  | 0.1452          | 0.0655 | 0.3700 |
+| 0.3388        | 8.3227  | 6400  | 0.1466          | 0.0650 | 0.3683 |
+| 0.3751        | 8.5830  | 6600  | 0.1477          | 0.0658 | 0.3698 |
+| 0.347         | 8.8432  | 6800  | 0.1413          | 0.0643 | 0.3680 |
+| 0.3181        | 9.1028  | 7000  | 0.1416          | 0.0641 | 0.3626 |
+| 0.3619        | 9.3630  | 7200  | 0.1419          | 0.0638 | 0.3609 |
+| 0.3474        | 9.6233  | 7400  | 0.1438          | 0.0644 | 0.3677 |
+| 0.3326        | 9.8835  | 7600  | 0.1375          | 0.0632 | 0.3606 |
+| 0.3288        | 10.1431 | 7800  | 0.1394          | 0.0634 | 0.3595 |
+| 0.3368        | 10.4034 | 8000  | 0.1384          | 0.0638 | 0.3617 |
+| 0.342         | 10.6636 | 8200  | 0.1351          | 0.0629 | 0.3596 |
+| 0.3267        | 10.9239 | 8400  | 0.1342          | 0.0631 | 0.3634 |
+| 0.3062        | 11.1835 | 8600  | 0.1327          | 0.0626 | 0.3578 |
+| 0.3336        | 11.4437 | 8800  | 0.1312          | 0.0621 | 0.3573 |
+| 0.3268        | 11.7040 | 9000  | 0.1329          | 0.0628 | 0.3605 |
+| 0.3336        | 11.9642 | 9200  | 0.1329          | 0.0629 | 0.3573 |
+| 0.3173        | 12.2238 | 9400  | 0.1297          | 0.0622 | 0.3575 |
+| 0.3082        | 12.4841 | 9600  | 0.1301          | 0.0619 | 0.3550 |
+| 0.3188        | 12.7443 | 9800  | 0.1294          | 0.0618 | 0.3546 |
+| 0.3397        | 13.0039 | 10000 | 0.1281          | 0.0615 | 0.3566 |
+| 0.316         | 13.2642 | 10200 | 0.1315          | 0.0620 | 0.3576 |
+| 0.3084        | 13.5244 | 10400 | 0.1282          | 0.0609 | 0.3517 |
+| 0.3152        | 13.7846 | 10600 | 0.1292          | 0.0612 | 0.3533 |
+| 0.2819        | 14.0442 | 10800 | 0.1277          | 0.0610 | 0.3500 |
+| 0.3114        | 14.3045 | 11000 | 0.1255          | 0.0607 | 0.3502 |
+| 0.3106        | 14.5647 | 11200 | 0.1264          | 0.0606 | 0.3473 |
+| 0.2999        | 14.8250 | 11400 | 0.1243          | 0.0608 | 0.3499 |
+| 0.3029        | 15.0846 | 11600 | 0.1243          | 0.0605 | 0.3506 |
+| 0.3004        | 15.3448 | 11800 | 0.1255          | 0.0606 | 0.3515 |
+| 0.2935        | 15.6051 | 12000 | 0.1244          | 0.0608 | 0.3522 |
+| 0.3084        | 15.8653 | 12200 | 0.1232          | 0.0605 | 0.3495 |
+| 0.2796        | 16.1249 | 12400 | 0.1216          | 0.0597 | 0.3458 |
+| 0.2988        | 16.3852 | 12600 | 0.1213          | 0.0594 | 0.3458 |
+| 0.2863        | 16.6454 | 12800 | 0.1221          | 0.0598 | 0.3455 |
+| 0.3014        | 16.9057 | 13000 | 0.1233          | 0.0598 | 0.3436 |
+| 0.2972        | 17.1653 | 13200 | 0.1203          | 0.0589 | 0.3417 |
+| 0.3098        | 17.4255 | 13400 | 0.1214          | 0.0594 | 0.3430 |
+| 0.284         | 17.6858 | 13600 | 0.1206          | 0.0594 | 0.3438 |
+| 0.2847        | 17.9460 | 13800 | 0.1200          | 0.0595 | 0.3448 |
+| 0.2849        | 18.2056 | 14000 | 0.1192          | 0.0588 | 0.3414 |
+| 0.2744        | 18.4658 | 14200 | 0.1189          | 0.0590 | 0.3436 |
+| 0.2737        | 18.7261 | 14400 | 0.1207          | 0.0591 | 0.3423 |
+| 0.2859        | 18.9863 | 14600 | 0.1197          | 0.0590 | 0.3414 |
+| 0.2772        | 19.2459 | 14800 | 0.1187          | 0.0590 | 0.3409 |
+| 0.2844        | 19.5062 | 15000 | 0.1190          | 0.0590 | 0.3417 |
+| 0.2815        | 19.7664 | 15200 | 0.1190          | 0.0590 | 0.3410 |
+### Framework versions
+- Transformers 4.52.1
+- Pytorch 2.9.1+cu128
+- Datasets 3.6.0
+- Tokenizers 0.21.4
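
Note on the hyperparameters in the model card above: the card lists a linear scheduler with 100 warmup steps, a base learning rate of 0.0005, and an effective batch size of 16. A minimal sketch of what those settings imply follows. It assumes the usual linear warmup then linear decay to zero and a single training device. The total step count is not stated in the card, so `TOTAL_STEPS` is a rough guess extrapolated from the epoch and step columns of the results table. `linear_schedule_lr` is a hypothetical helper, not part of any library.

```python
def linear_schedule_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Learning rate at `step`: linear warmup from 0, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Values reported in the model card.
BASE_LR = 0.0005
WARMUP_STEPS = 100
PER_DEVICE_BATCH = 8
GRAD_ACCUM = 2

# Assumption: one device, so the effective batch is batch size * accumulation steps.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM

# Assumption: not stated in the card; roughly 769 steps per epoch * 20 epochs,
# estimated from the epoch/step columns of the results table.
TOTAL_STEPS = 15380

for step in (0, 50, 100, 7740, TOTAL_STEPS):
    lr = linear_schedule_lr(step, BASE_LR, WARMUP_STEPS, TOTAL_STEPS)
    print(f"step {step:>5}: lr = {lr:.6f}")
```

Under these assumptions the effective batch size matches the `total_train_batch_size` of 16 reported in the card, and the learning rate peaks at step 100 before decaying.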

adapter.ttj.safetensors ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d723162f29e3fe48a78f07fcf60bdc13dae57b26e9acf8203c7557e8cadd73fe
+size 9003508
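
Note: the three added lines are a Git LFS pointer, not the weights themselves. The pointer records the spec version, the SHA-256 digest of the real file, and its size in bytes. The sketch below parses the pointer shown above, with the diff `+` markers removed. `parse_lfs_pointer` is a hypothetical helper written for illustration, not a Git LFS API.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d723162f29e3fe48a78f07fcf60bdc13dae57b26e9acf8203c7557e8cadd73fe
size 9003508
"""

info = parse_lfs_pointer(pointer)
print(info["algorithm"], f'{info["size_bytes"] / 1e6:.1f} MB')
```

The adapter file is about 9 MB, while the `model.safetensors` pointer later in this commit records 3859095884 bytes (about 3.9 GB).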

config.json ADDED

@@ -0,0 +1,107 @@

+{
+  "activation_dropout": 0.05,
+  "adapter_attn_dim": 16,
+  "adapter_kernel_size": 3,
+  "adapter_stride": 2,
+  "add_adapter": false,
+  "apply_spec_augment": true,
+  "architectures": [
+    "Wav2Vec2ForCTC"
+  ],
+  "attention_dropout": 0.0,
+  "bos_token_id": 1,
+  "classifier_proj_size": 256,
+  "codevector_dim": 1024,
+  "contrastive_logits_temperature": 0.1,
+  "conv_bias": true,
+  "conv_dim": [
+    512,
+    512,
+    512,
+    512,
+    512,
+    512,
+    512
+  ],
+  "conv_kernel": [
+    10,
+    3,
+    3,
+    3,
+    3,
+    2,
+    2
+  ],
+  "conv_stride": [
+    5,
+    2,
+    2,
+    2,
+    2,
+    2,
+    2
+  ],
+  "ctc_loss_reduction": "mean",
+  "ctc_zero_infinity": false,
+  "diversity_loss_weight": 0.1,
+  "do_stable_layer_norm": true,
+  "eos_token_id": 2,
+  "feat_extract_activation": "gelu",
+  "feat_extract_dropout": 0.0,
+  "feat_extract_norm": "layer",
+  "feat_proj_dropout": 0.0,
+  "feat_quantizer_dropout": 0.0,
+  "final_dropout": 0.05,
+  "hidden_act": "gelu",
+  "hidden_dropout": 0.0,
+  "hidden_size": 1280,
+  "initializer_range": 0.02,
+  "intermediate_size": 5120,
+  "layer_norm_eps": 1e-05,
+  "layerdrop": 0.0,
+  "mask_feature_length": 10,
+  "mask_feature_min_masks": 0,
+  "mask_feature_prob": 0.0,
+  "mask_time_length": 10,
+  "mask_time_min_masks": 2,
+  "mask_time_prob": 0.05,
+  "model_type": "wav2vec2",
+  "num_adapter_layers": 3,
+  "num_attention_heads": 16,
+  "num_codevector_groups": 2,
+  "num_codevectors_per_group": 320,
+  "num_conv_pos_embedding_groups": 16,
+  "num_conv_pos_embeddings": 128,
+  "num_feat_extract_layers": 7,
+  "num_hidden_layers": 48,
+  "num_negatives": 100,
+  "output_hidden_size": 1280,
+  "pad_token_id": 68,
+  "proj_codevector_dim": 1024,
+  "tdnn_dilation": [
+    1,
+    2,
+    3,
+    1,
+    1
+  ],
+  "tdnn_dim": [
+    512,
+    512,
+    512,
+    512,
+    1500
+  ],
+  "tdnn_kernel": [
+    5,
+    3,
+    3,
+    1,
+    1
+  ],
+  "torch_dtype": "float32",
+  "transformers_version": "4.52.1",
+  "use_weighted_layer_sum": false,
+  "vocab_size": 71,
+  "xvector_output_dim": 512
+}
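
Note: the `conv_kernel` and `conv_stride` lists in the config above determine how many feature frames the convolutional front end produces per second of audio. The sketch below works through that arithmetic, assuming unpadded 1-D convolutions (output length `floor((L - k) / s) + 1` per layer) and the 16 kHz sampling rate given in `preprocessor_config.json`. The function names are illustrative only.

```python
CONV_KERNEL = [10, 3, 3, 3, 3, 2, 2]   # "conv_kernel" from config.json
CONV_STRIDE = [5, 2, 2, 2, 2, 2, 2]    # "conv_stride" from config.json
SAMPLING_RATE = 16000                  # "sampling_rate" from preprocessor_config.json


def num_output_frames(num_samples: int) -> int:
    """Frames produced for a waveform of `num_samples`, assuming no padding."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNEL, CONV_STRIDE):
        length = (length - kernel) // stride + 1
    return length


def stride_and_receptive_field() -> tuple:
    """Total hop between frames and samples seen by one frame, both in samples."""
    receptive_field, jump = 1, 1
    for kernel, stride in zip(CONV_KERNEL, CONV_STRIDE):
        receptive_field += (kernel - 1) * jump
        jump *= stride
    return jump, receptive_field


total_stride, receptive_field = stride_and_receptive_field()
frames_per_second = num_output_frames(SAMPLING_RATE)
print(f"hop: {total_stride} samples ({1000 * total_stride / SAMPLING_RATE:.0f} ms)")
print(f"receptive field: {receptive_field} samples ({1000 * receptive_field / SAMPLING_RATE:.0f} ms)")
print(f"frames for 1 s of audio: {frames_per_second}")
```

Under these assumptions each output frame covers 25 ms of audio with a 20 ms hop, so one second of input yields 49 frames for the CTC head.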

model.safetensors ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b1897f6f4ddec5ec458281da7522ff3dbf291635b9de7745b6fb50e2da4539d8
+size 3859095884

preprocessor_config.json ADDED

@@ -0,0 +1,9 @@

+{
+  "do_normalize": true,
+  "feature_extractor_type": "Wav2Vec2FeatureExtractor",
+  "feature_size": 1,
+  "padding_side": "right",
+  "padding_value": 0.0,
+  "return_attention_mask": true,
+  "sampling_rate": 16000
+}
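
Note: with `do_normalize` set to true, `padding_side` set to `right`, and `return_attention_mask` enabled, each waveform is expected to be scaled to zero mean and unit variance, then right-padded with `padding_value`, with a mask marking the real samples. The sketch below shows that behaviour in plain Python. The epsilon of 1e-7 and both helper names are assumptions for illustration, not taken from this commit.

```python
import math


def zero_mean_unit_var(samples, eps=1e-7):
    """Scale one waveform to zero mean and unit variance (eps is an assumed constant)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    scale = math.sqrt(variance + eps)
    return [(x - mean) / scale for x in samples]


def pad_right(batch, padding_value=0.0):
    """Right-pad waveforms to a common length and build a 1/0 attention mask."""
    max_len = max(len(x) for x in batch)
    padded = [list(x) + [padding_value] * (max_len - len(x)) for x in batch]
    mask = [[1] * len(x) + [0] * (max_len - len(x)) for x in batch]
    return padded, mask


# Two toy waveforms of different lengths (made-up values).
batch = [[0.0, 0.5, -0.5, 1.0], [0.2, -0.2]]
normalized = [zero_mean_unit_var(x) for x in batch]
padded, mask = pad_right(normalized, padding_value=0.0)
print(padded)
print(mask)
```

Normalizing before padding keeps the padded zeros out of the mean and variance of each waveform.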

training_args.bin ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e6b2a22adc9e078f38755893451a83e88c13278511a2768eb00a7545551059ba
+size 5969

vocab.json CHANGED

@@ -1,73 +1 @@
-{
-  "ttj": {
-    "&": 43,
-    "'": 22,
-    "(": 44,
-    ")": 29,
-    "-": 2,
-    "...": 66,
-    "0": 19,
-    "1": 57,
-    "2": 23,
-    "8": 53,
-    "A": 33,
-    "B": 24,
-    "C": 20,
-    "D": 62,
-    "E": 65,
-    "F": 0,
-    "G": 28,
-    "H": 55,
-    "I": 5,
-    "K": 58,
-    "L": 41,
-    "M": 1,
-    "N": 52,
-    "O": 10,
-    "P": 16,
-    "Q": 26,
-    "R": 47,
-    "S": 27,
-    "T": 50,
-    "U": 48,
-    "W": 12,
-    "Y": 6,
-    "Z": 21,
-    "[PAD]": 68,
-    "[UNK]": 67,
-    "a": 4,
-    "b": 14,
-    "bb": 17,
-    "c": 56,
-    "d": 39,
-    "e": 32,
-    "f": 34,
-    "g": 51,
-    "h": 35,
-    "i": 13,
-    "j": 18,
-    "k": 36,
-    "l": 40,
-    "m": 25,
-    "n": 59,
-    "ny": 3,
-    "o": 37,
-    "p": 7,
-    "q": 31,
-    "r": 8,
-    "s": 11,
-    "t": 60,
-    "u": 15,
-    "v": 45,
-    "w": 63,
-    "y": 46,
-    "z": 38,
-    "|": 54,
-    "ö": 64,
-    "ü": 9,
-    "ę": 42,
-    "ʼ": 49,
-    "\u200b": 61,
-    "‘": 30
-  }
-}


+{"ttj": {"F": 0, "M": 1, "-": 2, "ny": 3, "a": 4, "I": 5, "Y": 6, "p": 7, "r": 8, "\u00fc": 9, "O": 10, "s": 11, "W": 12, "i": 13, "b": 14, "u": 15, "P": 16, "bb": 17, "j": 18, "0": 19, "C": 20, "Z": 21, "'": 22, "2": 23, "B": 24, "m": 25, "Q": 26, "S": 27, "G": 28, ")": 29, "\u2018": 30, "q": 31, "e": 32, "A": 33, "f": 34, "h": 35, "k": 36, "o": 37, "z": 38, "d": 39, "l": 40, "L": 41, "\u0119": 42, "&": 43, "(": 44, "v": 45, "y": 46, "R": 47, "U": 48, "\u02bc": 49, "T": 50, "g": 51, "N": 52, "8": 53, "H": 55, "c": 56, "1": 57, "K": 58, "n": 59, "t": 60, "\u200b": 61, "D": 62, "w": 63, "\u00f6": 64, "E": 65, "|": 54, "...": 66, "[UNK]": 67, "[PAD]": 68}}
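
Note: the vocabulary is nested under the language code `ttj`, and `[PAD]` (id 68) matches `pad_token_id` in `config.json`, which Wav2Vec2 CTC models use as the blank label. The sketch below shows greedy CTC decoding with a small subset of this vocabulary: collapse repeated ids, drop blanks, and turn the word delimiter into a space. It assumes `|` is the word delimiter, the usual default for this tokenizer family. The frame ids are made up for illustration, and `ctc_greedy_decode` is a hypothetical helper.

```python
# Subset of the "ttj" vocabulary from vocab.json (ids copied from the file).
VOCAB_SUBSET = {"a": 4, "p": 7, "i": 13, "u": 15, "|": 54, "[UNK]": 67, "[PAD]": 68}
PAD_ID = 68            # also "pad_token_id" in config.json; acts as the CTC blank
WORD_DELIMITER = "|"   # assumption: default word delimiter token


def ctc_greedy_decode(frame_ids, vocab, pad_id=PAD_ID):
    """Collapse repeated ids, drop blanks, and map the remaining ids to text."""
    id_to_token = {idx: token for token, idx in vocab.items()}
    tokens = []
    previous = None
    for idx in frame_ids:
        # Emit a token only when the id changes and is not the blank.
        if idx != previous and idx != pad_id:
            tokens.append(id_to_token[idx])
        previous = idx
    return "".join(tokens).replace(WORD_DELIMITER, " ").strip()


# Made-up per-frame argmax ids; a blank between the two 13s keeps both "i" tokens.
frame_ids = [4, 4, 68, 7, 7, 54, 54, 13, 68, 13, 68]
decoded = ctc_greedy_decode(frame_ids, VOCAB_SUBSET)
print(decoded)
```

The blank between repeated ids is what lets CTC output doubled characters, which matters here because the vocabulary also contains multi-character tokens such as `bb` and `ny`.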