Tags: Text Classification, Transformers, TensorFlow, distilbert, generated_from_keras_callback, text-embeddings-inference
How to use lgfunderburk/distilbert-truncated with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="lgfunderburk/distilbert-truncated")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("lgfunderburk/distilbert-truncated")
model = AutoModelForSequenceClassification.from_pretrained("lgfunderburk/distilbert-truncated")
```
distilbert-truncated
This model is a fine-tuned version of distilbert-base-uncased on the 20 Newsgroups dataset. The results it achieves on the evaluation set are reported under Training results.
Training and evaluation data
The data was split into a training set and a test set: the model was trained on 90% of the original dataset, and the remaining 10% was held out for testing.
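The 90/10 split can be illustrated with a minimal sketch. The card does not say how the split was performed, so the shuffling, the seed, and the helper name `train_test_split` below are assumptions made for illustration, and the documents are placeholders rather than real 20 Newsgroups posts.

```python
import random


def train_test_split(items, test_fraction=0.1, seed=0):
    """Shuffle a copy of `items` and hold out `test_fraction` of it for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled items form the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder documents standing in for the dataset (hypothetical data).
documents = [f"doc-{i}" for i in range(1000)]
train_docs, test_docs = train_test_split(documents)
print(len(train_docs), len(test_docs))  # 900 100
```

If the data is already a Datasets `Dataset` object, its `train_test_split(test_size=0.1)` method produces the same kind of split.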
Training procedure
DistilBERT accepts at most 512 input tokens, so with this in mind the inputs were prepared as follows:
- I used the distilbert-base-uncased pretrained model to initialize an AutoTokenizer.
- Setting a maximum length of 256, each entry in the training, testing and validation data was truncated if it exceeded the limit and padded if it didn't reach the limit.
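The truncate-or-pad step can be sketched on plain lists of token IDs. This is an illustration of the logic only, not the tokenizer's implementation: the helper `pad_or_truncate` is hypothetical, the padding ID of 0 is an assumption, and special tokens such as the closing separator are ignored.

```python
MAX_LENGTH = 256  # maximum length stated in the card
PAD_ID = 0        # assumed padding token ID


def pad_or_truncate(token_ids, max_length=MAX_LENGTH, pad_id=PAD_ID):
    """Return (input_ids, attention_mask), both exactly `max_length` long."""
    clipped = token_ids[:max_length]           # truncate entries that are too long
    n_pad = max_length - len(clipped)          # number of padding slots still needed
    input_ids = clipped + [pad_id] * n_pad     # pad entries that are too short
    attention_mask = [1] * len(clipped) + [0] * n_pad  # 1 = real token, 0 = padding
    return input_ids, attention_mask


short_ids, short_mask = pad_or_truncate(list(range(1, 11)))   # 10 tokens
long_ids, long_mask = pad_or_truncate(list(range(1, 401)))    # 400 tokens
print(len(short_ids), sum(short_mask))  # 256 10
print(len(long_ids), sum(long_mask))    # 256 256
```

With Transformers, the equivalent behaviour is usually requested directly from the tokenizer, for example `tokenizer(texts, truncation=True, padding="max_length", max_length=256)`.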
Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 1908, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
- training_precision: float32
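The learning-rate entry in the optimizer config describes a polynomial decay with `power` 1.0, which amounts to a linear ramp from 2e-05 down to 0.0 over 1908 steps. The sketch below re-implements that formula in plain Python for illustration; it is not the Keras class itself, and the function name `polynomial_decay` is chosen here.

```python
def polynomial_decay(step, initial_lr=2e-05, end_lr=0.0, decay_steps=1908, power=1.0):
    """Learning rate at `step` for a non-cycling polynomial decay schedule."""
    step = min(step, decay_steps)          # cycle=False: hold end_lr after decay_steps
    remaining = 1.0 - step / decay_steps   # fraction of the schedule still to run
    return (initial_lr - end_lr) * remaining ** power + end_lr


# Learning rate at the start and after each of the 3 epochs (636 batches per epoch).
for step in (0, 636, 1272, 1908):
    print(step, polynomial_decay(step))
```

Because `power` is 1.0, the rate halves at the midpoint (step 954) and reaches 0.0 exactly at the final training step.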
Training results
- EPOCHS = 3
- batches_per_epoch = 636
- total_train_steps = 1908
Model accuracy 0.8337758779525757
Model loss 0.568471074104309
Framework versions
- Transformers 4.28.0
- TensorFlow 2.12.0
- Datasets 2.12.0
- Tokenizers 0.13.3