CYONG/v1

README.md CHANGED

@@ -1,4 +1,6 @@
 ---
 tags:
 - generated_from_keras_callback
 model-index:
@@ -11,11 +13,11 @@ probably proofread and complete it, then remove this comment. -->
 # CYONG/v1
-This model was trained from scratch on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Train Loss: 0.0304
-- Validation Loss: 3.2014
-- Epoch: 44
 ## Model description
@@ -34,58 +36,14 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 6798000, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
 - training_precision: float32
 ### Training results
 | Train Loss | Validation Loss | Epoch |
 |:----------:|:---------------:|:-----:|
-| 1.4488     | 1.3082          | 0     |
-| 1.0474     | 1.3459          | 1     |
-| 0.7909     | 1.3772          | 2     |
-| 0.5886     | 1.5130          | 3     |
-| 0.4379     | 1.8039          | 4     |
-| 0.3335     | 1.9218          | 5     |
-| 0.2640     | 2.2600          | 6     |
-| 0.2151     | 2.2474          | 7     |
-| 0.1811     | 2.3162          | 8     |
-| 0.1582     | 2.4016          | 9     |
-| 0.1383     | 2.5507          | 10    |
-| 0.1235     | 2.6501          | 11    |
-| 0.1125     | 2.6810          | 12    |
-| 0.1033     | 2.7546          | 13    |
-| 0.0948     | 2.6557          | 14    |
-| 0.0870     | 2.8685          | 15    |
-| 0.0826     | 2.5886          | 16    |
-| 0.0775     | 2.7697          | 17    |
-| 0.0726     | 2.9925          | 18    |
-| 0.0687     | 2.9328          | 19    |
-| 0.0650     | 2.9124          | 20    |
-| 0.0623     | 2.9412          | 21    |
-| 0.0592     | 3.0854          | 22    |
-| 0.0575     | 2.9573          | 23    |
-| 0.0543     | 3.0900          | 24    |
-| 0.0526     | 2.8826          | 25    |
-| 0.0497     | 3.2169          | 26    |
-| 0.0485     | 3.1990          | 27    |
-| 0.0470     | 3.1993          | 28    |
-| 0.0456     | 2.9837          | 29    |
-| 0.0439     | 3.3015          | 30    |
-| 0.0431     | 3.3529          | 31    |
-| 0.0410     | 3.2827          | 32    |
-| 0.0403     | 3.0694          | 33    |
-| 0.0389     | 3.5464          | 34    |
-| 0.0378     | 3.3715          | 35    |
-| 0.0374     | 2.9658          | 36    |
-| 0.0364     | 3.3008          | 37    |
-| 0.0348     | 3.2827          | 38    |
-| 0.0345     | 3.2501          | 39    |
-| 0.0338     | 3.2528          | 40    |
-| 0.0320     | 3.3565          | 41    |
-| 0.0325     | 3.2310          | 42    |
-| 0.0306     | 3.2631          | 43    |
-| 0.0304     | 3.2014          | 44    |
 ### Framework versions

 ---
+license: apache-2.0
+base_model: distilbert-base-multilingual-cased
 tags:
 - generated_from_keras_callback
 model-index:
 # CYONG/v1
+This model is a fine-tuned version of [distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Train Loss: 1.3818
+- Validation Loss: 0.8440
+- Epoch: 0
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 1510100, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
 - training_precision: float32
 ### Training results
 | Train Loss | Validation Loss | Epoch |
 |:----------:|:---------------:|:-----:|
+| 1.3818     | 0.8440          | 0     |
 ### Framework versions
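
The optimizer entry above uses a `PolynomialDecay` schedule with `power: 1.0`, `cycle: False` and `end_learning_rate: 0.0`, which amounts to a linear ramp from `2e-05` down to zero over `decay_steps` steps. The sketch below re-implements that formula in plain Python so the effect of the `decay_steps` change in this commit (6798000 to 1510100) can be checked by hand. The function name `polynomial_decay` is hypothetical and chosen for illustration; it is not the Keras class itself and is not taken from the training script.

```python
def polynomial_decay(step, initial_lr=2e-05, decay_steps=1510100,
                     end_lr=0.0, power=1.0):
    """Learning rate at `step` for a non-cycling polynomial decay schedule.

    Defaults mirror the values in the new side of the diff; this is an
    illustrative re-implementation, not the Keras schedule object.
    """
    # With cycle=False the step is clamped, so the rate stays at end_lr
    # once decay_steps has been reached.
    step = min(step, decay_steps)
    remaining = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * remaining ** power + end_lr


if __name__ == "__main__":
    # Compare the schedule before and after this commit at the same step.
    for step in (0, 755050, 1510100):
        new_lr = polynomial_decay(step)
        old_lr = polynomial_decay(step, decay_steps=6798000)
        print(step, new_lr, old_lr)
```

With the shorter `decay_steps`, the rate reaches zero at step 1510100, whereas the previous schedule would still be well above zero at that point.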

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "models/distilbert-base-multilingual-cased/tf/news/v1",
   "activation": "gelu",
   "architectures": [
     "DistilBertForQuestionAnswering"

 {
+  "_name_or_path": "distilbert-base-multilingual-cased",
   "activation": "gelu",
   "architectures": [
     "DistilBertForQuestionAnswering"

tf_model.h5 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cab8605d6d3a564c842037720f2e0d2a24beb4e76cb793a287f8e518b2d2a279
 size 539068456

 version https://git-lfs.github.com/spec/v1
+oid sha256:09d8225d6a34531f6da389f7cf1f2d41b46bf3c68481d27c48f030c780c247f7
 size 539068456
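
The `tf_model.h5` hunk changes only the Git LFS pointer: the `oid` differs while `size` stays at 539068456 bytes, which is consistent with new weights stored in a file of the same layout. A minimal sketch of reading such pointer files follows; `parse_lfs_pointer` is a hypothetical helper written for this note, and it assumes the simple `key value` line format visible in the diff.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a small dictionary."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid is written as "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the old and new sides of the diff.
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:cab8605d6d3a564c842037720f2e0d2a24beb4e76cb793a287f8e518b2d2a279
size 539068456
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:09d8225d6a34531f6da389f7cf1f2d41b46bf3c68481d27c48f030c780c247f7
size 539068456
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

if __name__ == "__main__":
    print("same size:", old["size"] == new["size"])
    print("same content hash:", old["digest"] == new["digest"])
```

Comparing the two parsed pointers separates "the file content changed" (different digest) from "the file grew or shrank" (different size) without downloading the weights.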