Model save
README.md
ADDED
@@ -0,0 +1,85 @@
---
library_name: transformers
license: apache-2.0
base_model: sharjeel103/whisper-base-urdu
tags:
- generated_from_trainer
metrics:
- wer
model-index:
- name: exp_004_base_multistage_urdu
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# exp_004_base_multistage_urdu

This model is a fine-tuned version of [sharjeel103/whisper-base-urdu](https://huggingface.co/sharjeel103/whisper-base-urdu) on an unknown dataset.
It achieves the following results on the evaluation set (these values match the step-7500 row of the training results table, which has the lowest WER, rather than the final step 10000):
- Loss: 0.3947
- Wer: 65.7765
- Wer Ortho: 68.6697
- Cer: 21.6281
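
The card does not say how WER and CER were computed. As a rough illustration only, the sketch below uses the textbook definition: the Levenshtein edit distance between hypothesis and reference tokens, divided by the reference length, times 100. The helper names `edit_distance`, `wer`, and `cer` are hypothetical, and the actual run may have used a metrics library and text normalization that this sketch does not reproduce.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent; the reference must be non-empty."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent; the reference must be non-empty."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)
```

Under this definition, a WER near 66 corresponds to roughly two word-level edits for every three reference words. "Wer Ortho" is likely the same ratio computed on text before normalization, but the card does not confirm that.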

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 32
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- optimizer: OptimizerNames.ADAMW_TORCH_FUSED (fused PyTorch AdamW) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- training_steps: 10000
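
To make the schedule concrete, here is a minimal sketch of a linear schedule with warmup using the values listed above. It assumes the usual shape for this scheduler type: a linear ramp from 0 to the peak rate over the warmup steps, then a linear decay to 0 at the last training step. The function name `lr_at` and the constants are illustrative and not taken from the training script.

```python
PEAK_LR = 5e-6          # learning_rate
WARMUP_STEPS = 1000     # lr_scheduler_warmup_steps
TRAINING_STEPS = 10000  # training_steps


def lr_at(step):
    """Learning rate at an optimizer step under linear warmup and linear decay."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 up to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay: fall linearly from the peak rate to 0 at the final step.
    remaining = max(0, TRAINING_STEPS - step)
    return PEAK_LR * remaining / (TRAINING_STEPS - WARMUP_STEPS)


# Effective batch per optimizer step, assuming a single device:
# train_batch_size * gradient_accumulation_steps
EFFECTIVE_BATCH = 32 * 2
```

With these values the rate peaks at step 1000 and is back to half the peak at step 5500. `EFFECTIVE_BATCH` reproduces the listed `total_train_batch_size` of 64 only under the single-device assumption.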

### Training results

| Training Loss | Epoch   | Step  | Validation Loss | Wer     | Wer Ortho | Cer     |
|:-------------:|:-------:|:-----:|:---------------:|:-------:|:---------:|:-------:|
| 5.0263        | 0.6410  | 500   | 1.5195          | 95.4066 | 95.7445   | 33.6187 |
| 3.5885        | 1.2821  | 1000  | 0.9330          | 89.3228 | 90.2797   | 30.3680 |
| 3.0302        | 1.9231  | 1500  | 0.7431          | 82.6594 | 84.2995   | 27.0128 |
| 2.6802        | 2.5641  | 2000  | 0.6501          | 79.3803 | 81.3448   | 26.0944 |
| 2.4353        | 3.2051  | 2500  | 0.5907          | 76.0507 | 78.3277   | 24.5965 |
| 2.2853        | 3.8462  | 3000  | 0.5492          | 74.3293 | 76.6987   | 24.3492 |
| 2.1399        | 4.4872  | 3500  | 0.5184          | 72.7590 | 75.1236   | 23.5756 |
| 2.0097        | 5.1282  | 4000  | 0.4915          | 70.1810 | 72.8504   | 22.8648 |
| 1.9407        | 5.7692  | 4500  | 0.4697          | 69.6855 | 72.3975   | 22.6328 |
| 1.8625        | 6.4103  | 5000  | 0.4527          | 68.6736 | 71.3502   | 22.4363 |
| 1.7877        | 7.0513  | 5500  | 0.4359          | 68.1740 | 70.8474   | 22.0212 |
| 1.7349        | 7.6923  | 6000  | 0.4235          | 68.3125 | 71.0136   | 22.4257 |
| 1.6679        | 8.3333  | 6500  | 0.4130          | 66.2846 | 69.0105   | 21.4575 |
| 1.6108        | 8.9744  | 7000  | 0.4019          | 66.2804 | 69.1560   | 21.8611 |
| 1.5329        | 9.6154  | 7500  | 0.3947          | 65.7765 | 68.6697   | 21.6281 |
| 1.5234        | 10.2564 | 8000  | 0.3889          | 66.0621 | 68.8651   | 21.4800 |
| 1.5123        | 10.8974 | 8500  | 0.3841          | 66.1796 | 69.0396   | 21.5778 |
| 1.4779        | 11.5385 | 9000  | 0.3809          | 65.9949 | 68.8900   | 21.5524 |
| 1.4631        | 12.1795 | 9500  | 0.3787          | 65.9739 | 68.8277   | 21.4915 |
| 1.4256        | 12.8205 | 10000 | 0.3778          | 65.8857 | 68.7238   | 21.5002 |
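
The headline metrics on this card coincide with the step-7500 row, not the last row. That would be consistent with the checkpoint with the best WER being reloaded at the end of training, though the card does not say so. The sketch below copies the step, validation loss, and WER columns from the table and picks the best row under each criterion. The variable names are illustrative.

```python
# (step, validation_loss, wer) copied from the training results table.
ROWS = [
    (500, 1.5195, 95.4066), (1000, 0.9330, 89.3228), (1500, 0.7431, 82.6594),
    (2000, 0.6501, 79.3803), (2500, 0.5907, 76.0507), (3000, 0.5492, 74.3293),
    (3500, 0.5184, 72.7590), (4000, 0.4915, 70.1810), (4500, 0.4697, 69.6855),
    (5000, 0.4527, 68.6736), (5500, 0.4359, 68.1740), (6000, 0.4235, 68.3125),
    (6500, 0.4130, 66.2846), (7000, 0.4019, 66.2804), (7500, 0.3947, 65.7765),
    (8000, 0.3889, 66.0621), (8500, 0.3841, 66.1796), (9000, 0.3809, 65.9949),
    (9500, 0.3787, 65.9739), (10000, 0.3778, 65.8857),
]

# Lower is better for both metrics.
best_by_wer = min(ROWS, key=lambda row: row[2])
best_by_loss = min(ROWS, key=lambda row: row[1])

print("best WER:", best_by_wer)    # the step-7500 row
print("best loss:", best_by_loss)  # the step-10000 row
```

The two criteria disagree: validation loss keeps falling through step 10000, while WER bottoms out at step 7500 and then fluctuates within about 0.4 points.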


### Framework versions

- Transformers 5.0.0
- Pytorch 2.10.0+cu128
- Datasets 3.6.0
- Tokenizers 0.22.2
experiment_log.txt
CHANGED
@@ -3876,3 +3876,186 @@ A custom logits processor of type <class 'transformers.generation.logits_process

There were missing keys in the checkpoint model loaded: ['proj_out.weight'].


{'train_runtime': '7.646e+04', 'train_samples_per_second': '8.371', 'train_steps_per_second': '0.131', 'train_loss': '2.422', 'epoch': '12.82'}
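
The summary line above can be cross-checked against the hyperparameters in the model card. This is a small arithmetic sketch, not part of the training script: it assumes 10000 optimizer steps at a total batch size of 64, as listed in the README, and the variable names are illustrative.

```python
train_runtime_s = 7.646e4  # 'train_runtime'
samples_per_s = 8.371      # 'train_samples_per_second'
epochs = 12.82             # 'epoch'
training_steps = 10_000    # from the model card
total_batch = 64           # total_train_batch_size from the model card

samples_seen = samples_per_s * train_runtime_s  # about 640,000
expected_samples = training_steps * total_batch
hours = train_runtime_s / 3600                  # about 21.2 hours
steps_per_s = training_steps / train_runtime_s  # about 0.131
steps_per_epoch = training_steps / epochs       # about 780

print(round(samples_seen), expected_samples)
print(round(hours, 1), round(steps_per_s, 3), round(steps_per_epoch))
```

The logged throughput and the configured step count agree to within rounding, and about 780 steps per epoch at 64 samples per step suggests a training set of roughly 50,000 examples. That last figure is an inference from the log, not a number stated anywhere in the source.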

Running final evaluation...

  0%|          | 0/176 [00:00<?, ?it/s]
 99%|██████████| 175/176 [05:57<00:01,  1.66s/it]

Final Evaluation Results:
eval_loss: 0.3947
eval_wer: 65.7765
eval_wer_ortho: 68.6697
eval_cer: 21.6281
eval_runtime: 367.2512
eval_samples_per_second: 7.6270
eval_steps_per_second: 0.4790
epoch: 12.8205
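
The evaluation set size is not stated anywhere, but it can be estimated from the throughput figures above. This sketch assumes the `eval_batch_size` of 16 from the model card and a single evaluation device. The variable names are illustrative.

```python
import math

eval_runtime_s = 367.2512    # eval_runtime
eval_samples_per_s = 7.6270  # eval_samples_per_second
eval_steps_per_s = 0.4790    # eval_steps_per_second
eval_batch_size = 16         # from the model card

# Samples evaluated = throughput * wall-clock time.
n_samples = round(eval_runtime_s * eval_samples_per_s)
# Batches needed to cover that many samples; the last one is partial.
n_batches = math.ceil(n_samples / eval_batch_size)
# Independent estimate of the batch count from the step throughput.
n_batches_from_rate = round(eval_runtime_s * eval_steps_per_s)

print(n_samples, n_batches, n_batches_from_rate)
```

Both estimates give 176 batches, which matches the `/176` total in the progress bar, and they imply an evaluation set of about 2,801 utterances.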

Saving final model to /workspace/experiments/exp_004_base_multistage_urdu...

| 4076 |
+
|
| 4077 |
...ge_urdu/training_args.bin: 100%|██████████| 5.39kB / 5.39kB
...ge_urdu/model.safetensors: 100%|██████████| 290MB / 290MB
...8512.74464e924814.18921.0: 100%|██████████| 34.9kB / 34.9kB
...5346.74464e924814.18921.1: 100%|██████████| 506B / 506B
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:ddb132e0e3c83eb9f6c7f7499d2a292ade450bef2e0d595dc20c6fe92527f159
 size 290403936
runs/Feb05_13-35-12_74464e924814/events.out.tfevents.1770298512.74464e924814.18921.0
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:5356062481f1f5d102bd1c439537a5aee557953498ab4ede404d1c44f20e5713
+size 34938
runs/Feb05_13-35-12_74464e924814/events.out.tfevents.1770375346.74464e924814.18921.1
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f9167cf4c2364af919c6dbd5a0944bd271c62a5c0dd2ebe376f79f380fe66f2a
+size 506