liewchooichin/distilbert-base-uncased-tiny-imdb

liewchooichin committed on May 14, 2024

Commit e618099 · verified · 1 Parent(s): 7a1e60f

First commit

Files changed (1)

README.md +12 -5

@@ -2,10 +2,15 @@
 license: apache-2.0
 base_model: distilbert-base-uncased
 tags:
-- generated_from_keras_callback
 model-index:
 - name: liewchooichin/distilbert-base-uncased-tiny-imdb
   results: []
 ---
 <!-- This model card has been generated automatically according to the information Keras had access to. You should
@@ -21,15 +26,17 @@ It achieves the following results on the evaluation set:
 ## Model description
-More information needed
 ## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
@@ -53,4 +60,4 @@ The following hyperparameters were used during training:
 - Transformers 4.40.2
 - TensorFlow 2.15.0
 - Datasets 2.19.1
-- Tokenizers 0.19.1

 license: apache-2.0
 base_model: distilbert-base-uncased
 tags:
+- general
 model-index:
 - name: liewchooichin/distilbert-base-uncased-tiny-imdb
   results: []
+datasets:
+- stanfordnlp/imdb
+language:
+- en
+pipeline_tag: fill-mask
 ---
 <!-- This model card has been generated automatically according to the information Keras had access to. You should
 ## Model description
+This model was created by following a lesson from Hugging Face Learn:
+NLP Course -- Main NLP Tasks -- [Fine-tuning a masked language model](https://huggingface.co/learn/nlp-course/chapter7/3?fw=tf#the-dataset).
 ## Intended uses & limitations
+This is only a small-scale fine-tuning run on the `stanfordnlp/imdb` dataset. Only 1000 rows of the `unsupervised` split were used for training.
+The exercise was carried out on Google Colab with a T4 GPU.
 ## Training and evaluation data
+1000 rows from the `stanfordnlp/imdb` dataset.
 ## Training procedure
 - Transformers 4.40.2
 - TensorFlow 2.15.0
 - Datasets 2.19.1
+- Tokenizers 0.19.1
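
The updated card says only 1000 rows of the `unsupervised` split of `stanfordnlp/imdb` were used, but it does not say how those rows were chosen. The following is a minimal sketch of one way such a subset could be drawn with the Datasets library. The helper names `take_subset` and `load_imdb_subset`, the shuffle step, and the seed value are illustrative assumptions, not the author's actual procedure.

```python
def take_subset(dataset, n_rows=1000, seed=42):
    """Return a shuffled subset with at most n_rows rows.

    `dataset` is anything exposing `__len__`, `shuffle(seed=...)` and
    `select(indices)`, such as a `datasets.Dataset`.
    Shuffling and the seed are assumptions; the card only gives the row count.
    """
    n_rows = min(n_rows, len(dataset))
    return dataset.shuffle(seed=seed).select(range(n_rows))


def load_imdb_subset(n_rows=1000, seed=42):
    """Download the `unsupervised` split and keep a small subset (hypothetical helper)."""
    # Imported lazily so the module can be loaded without the library installed.
    from datasets import load_dataset

    full = load_dataset("stanfordnlp/imdb", split="unsupervised")
    return take_subset(full, n_rows=n_rows, seed=seed)
```

Calling `load_imdb_subset()` needs network access to the Hugging Face Hub. A fixed seed keeps the subset reproducible between runs, which helps when comparing results from such a small sample.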