CYONG committed on
Commit
58adac4
·
1 Parent(s): 75cec41

Training in progress epoch 0

Files changed (3)
  1. README.md +8 -50
  2. config.json +1 -1
  3. tf_model.h5 +1 -1
README.md CHANGED
@@ -1,4 +1,6 @@
1
  ---
 
 
2
  tags:
3
  - generated_from_keras_callback
4
  model-index:
@@ -11,11 +13,11 @@ probably proofread and complete it, then remove this comment. -->
11
 
12
  # CYONG/v1
13
 
14
- This model was trained from scratch on an unknown dataset.
15
  It achieves the following results on the evaluation set:
16
- - Train Loss: 0.0304
17
- - Validation Loss: 3.2014
18
- - Epoch: 44
19
 
20
  ## Model description
21
 
@@ -34,58 +36,14 @@ More information needed
34
  ### Training hyperparameters
35
 
36
  The following hyperparameters were used during training:
37
- - optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 6798000, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
38
  - training_precision: float32
39
 
40
  ### Training results
41
 
42
  | Train Loss | Validation Loss | Epoch |
43
  |:----------:|:---------------:|:-----:|
44
- | 1.4488 | 1.3082 | 0 |
45
- | 1.0474 | 1.3459 | 1 |
46
- | 0.7909 | 1.3772 | 2 |
47
- | 0.5886 | 1.5130 | 3 |
48
- | 0.4379 | 1.8039 | 4 |
49
- | 0.3335 | 1.9218 | 5 |
50
- | 0.2640 | 2.2600 | 6 |
51
- | 0.2151 | 2.2474 | 7 |
52
- | 0.1811 | 2.3162 | 8 |
53
- | 0.1582 | 2.4016 | 9 |
54
- | 0.1383 | 2.5507 | 10 |
55
- | 0.1235 | 2.6501 | 11 |
56
- | 0.1125 | 2.6810 | 12 |
57
- | 0.1033 | 2.7546 | 13 |
58
- | 0.0948 | 2.6557 | 14 |
59
- | 0.0870 | 2.8685 | 15 |
60
- | 0.0826 | 2.5886 | 16 |
61
- | 0.0775 | 2.7697 | 17 |
62
- | 0.0726 | 2.9925 | 18 |
63
- | 0.0687 | 2.9328 | 19 |
64
- | 0.0650 | 2.9124 | 20 |
65
- | 0.0623 | 2.9412 | 21 |
66
- | 0.0592 | 3.0854 | 22 |
67
- | 0.0575 | 2.9573 | 23 |
68
- | 0.0543 | 3.0900 | 24 |
69
- | 0.0526 | 2.8826 | 25 |
70
- | 0.0497 | 3.2169 | 26 |
71
- | 0.0485 | 3.1990 | 27 |
72
- | 0.0470 | 3.1993 | 28 |
73
- | 0.0456 | 2.9837 | 29 |
74
- | 0.0439 | 3.3015 | 30 |
75
- | 0.0431 | 3.3529 | 31 |
76
- | 0.0410 | 3.2827 | 32 |
77
- | 0.0403 | 3.0694 | 33 |
78
- | 0.0389 | 3.5464 | 34 |
79
- | 0.0378 | 3.3715 | 35 |
80
- | 0.0374 | 2.9658 | 36 |
81
- | 0.0364 | 3.3008 | 37 |
82
- | 0.0348 | 3.2827 | 38 |
83
- | 0.0345 | 3.2501 | 39 |
84
- | 0.0338 | 3.2528 | 40 |
85
- | 0.0320 | 3.3565 | 41 |
86
- | 0.0325 | 3.2310 | 42 |
87
- | 0.0306 | 3.2631 | 43 |
88
- | 0.0304 | 3.2014 | 44 |
89
 
90
 
91
  ### Framework versions
 
1
  ---
2
+ license: apache-2.0
3
+ base_model: distilbert-base-multilingual-cased
4
  tags:
5
  - generated_from_keras_callback
6
  model-index:
 
13
 
14
  # CYONG/v1
15
 
16
+ This model is a fine-tuned version of [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased) on an unknown dataset.
17
  It achieves the following results on the evaluation set:
18
+ - Train Loss: 1.3818
19
+ - Validation Loss: 0.8440
20
+ - Epoch: 0
21
 
22
  ## Model description
23
 
 
36
  ### Training hyperparameters
37
 
38
  The following hyperparameters were used during training:
39
+ - optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 1510100, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
40
  - training_precision: float32
41
 
42
  ### Training results
43
 
44
  | Train Loss | Validation Loss | Epoch |
45
  |:----------:|:---------------:|:-----:|
46
+ | 1.3818 | 0.8440 | 0 |
47
 
48
 
49
  ### Framework versions
config.json CHANGED
@@ -1,5 +1,5 @@
1
  {
2
- "_name_or_path": "models/distilbert-base-multilingual-cased/tf/news/v1",
3
  "activation": "gelu",
4
  "architectures": [
5
  "DistilBertForQuestionAnswering"
 
1
  {
2
+ "_name_or_path": "distilbert-base-multilingual-cased",
3
  "activation": "gelu",
4
  "architectures": [
5
  "DistilBertForQuestionAnswering"
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:cab8605d6d3a564c842037720f2e0d2a24beb4e76cb793a287f8e518b2d2a279
3
  size 539068456
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:09d8225d6a34531f6da389f7cf1f2d41b46bf3c68481d27c48f030c780c247f7
3
  size 539068456
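
Note on the optimizer change in README.md: the learning-rate schedule keeps `initial_learning_rate` at 2e-05 and `power` at 1.0, but `decay_steps` drops from 6798000 to 1510100. The sketch below is a plain-Python illustration of what that schedule implies, assuming the documented non-cycling `PolynomialDecay` formula from Keras. The function name `polynomial_decay_lr` is hypothetical and is not part of this repository or of Keras.

```python
def polynomial_decay_lr(step, initial_lr=2e-05, decay_steps=1510100,
                        end_lr=0.0, power=1.0):
    """Learning rate at `step` for a non-cycling polynomial decay.

    Defaults mirror the values in the new README.md optimizer entry.
    Assumption: this follows the formula documented for
    keras.optimizers.schedules.PolynomialDecay with cycle=False.
    """
    # With cycle=False the step is capped, so the rate stays at end_lr afterwards.
    step = min(step, decay_steps)
    fraction_remaining = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * fraction_remaining ** power + end_lr


if __name__ == "__main__":
    # Compare the schedule before and after this commit at the same step.
    for step in (0, 755050, 1510100):
        old = polynomial_decay_lr(step, decay_steps=6798000)
        new = polynomial_decay_lr(step)
        print(f"step={step:>8}  old_lr={old:.3e}  new_lr={new:.3e}")
```

With `power` equal to 1.0 the decay is linear, so under the new setting the rate reaches `end_learning_rate` after 1510100 optimizer steps instead of 6798000.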