metadata
license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - accuracy
  - f1
  - precision
  - recall
model-index:
  - name: FakevsRealNews
    results: []

Coding challenge

The challenge was to build a fake news classifier with the Hugging Face Transformers library.

The final model is a fine-tuned version of distilbert-base-uncased, trained on the fake-and-real-news dataset from Kaggle: https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset.

It achieves the following results on the evaluation set:

  • Loss: 0.0000
  • Accuracy: 1.0
  • F1: 1.0
  • Precision: 1.0
  • Recall: 1.0
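For reference, the four reported metrics can be computed from labels and predictions as in the minimal sketch below. It assumes binary scoring with a single positive class; the card does not state which averaging mode was used, and the label encoding (1 = fake, 0 = real) and the function name `binary_metrics` are illustrative assumptions.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("at least one example is required")

    # Count the confusion-matrix cells needed for the four metrics.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (assumed encoding: 1 = fake, 0 = real).
example = binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 0])
```

A score of 1.0 on all four metrics means that every evaluation example was classified correctly.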

Model description

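A fine-tuned DistilBERT model (distilbert-base-uncased) for single-label classification of news stories as fake or real.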

Training and evaluation data

The dataset was split into training, development, and test sets in an 80/10/10 ratio.
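One way to produce such a split is sketched below. The card does not say whether the data was shuffled or stratified, or which seed was used for splitting, so the shuffle and the reuse of seed 42 are assumptions, and `split_80_10_10` is a hypothetical helper name.

```python
import random


def split_80_10_10(examples, seed=42):
    """Shuffle the examples and split them into train/dev/test (80/10/10)."""
    items = list(examples)
    random.Random(seed).shuffle(items)  # seeded shuffle for reproducibility

    # Integer arithmetic avoids floating-point rounding on the boundaries.
    n_train = len(items) * 8 // 10
    n_dev = len(items) // 10

    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]  # the remainder goes to the test set
    return train, dev, test


# Illustration with 1000 placeholder examples.
train, dev, test = split_80_10_10(range(1000))
```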

Training procedure

The title and text of each news story were concatenated to form a single datapoint. The model was then fine-tuned to perform single-label classification on each datapoint. The final prediction is the class with the highest probability.
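The preprocessing and prediction steps described above can be sketched in plain Python. The separator between title and text, the label order `("real", "fake")`, and the helper names are assumptions for illustration; the card does not specify them.

```python
import math


def build_input(title, text, separator=" "):
    """Concatenate title and text into one datapoint (separator is assumed)."""
    return f"{title.strip()}{separator}{text.strip()}"


def softmax(logits):
    """Convert raw logits into probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits, labels=("real", "fake")):
    """Return the most probable class and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


datapoint = build_input("Example headline", "Example body text.")
# Made-up logits standing in for the classifier output.
label, confidence = predict_label([-1.2, 3.4])
```

Because softmax preserves the ordering of the logits, taking the argmax of the logits directly gives the same predicted class.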

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
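To make the scheduler settings concrete, the sketch below computes the learning rate at a given step. It assumes the linear scheduler ramps up linearly from zero over the 500 warmup steps and then decays linearly to zero at the final step, which is how the linear schedule with warmup in Transformers behaves. The total of 5868 steps is taken from the last row of the training results, and `linear_schedule_lr` is a hypothetical helper name.

```python
def linear_schedule_lr(step, base_lr=5e-5, warmup_steps=500, total_steps=5868):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup phase: scale up from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale down from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at a few points of the run.
schedule = {step: linear_schedule_lr(step) for step in (0, 250, 500, 3184, 5868)}
```

Under these assumptions the learning rate peaks at 5e-05 at step 500 and is half of that at step 3184, midway through the decay phase.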

Training results

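| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
|---------------|-------|------|-----------------|----------|--------|-----------|--------|
| 0.0503        | 1.0   | 1956 | 0.0025          | 0.9995   | 0.9995 | 0.9995    | 0.9995 |
| 0.001         | 2.0   | 3912 | 0.0001          | 1.0      | 1.0    | 1.0       | 1.0    |
| 0.0007        | 3.0   | 5868 | 0.0000          | 1.0      | 1.0    | 1.0       | 1.0    |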
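The step counts give a rough check on the size of the training split. The sketch below assumes a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped; none of these is stated in the card, and `training_set_size_bounds` is a hypothetical helper name.

```python
def training_set_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset sizes that yield `steps_per_epoch` batches."""
    low = (steps_per_epoch - 1) * batch_size + 1  # last batch holds one example
    high = steps_per_epoch * batch_size           # last batch is full
    return low, high


# 1956 steps per epoch at train_batch_size 16.
low, high = training_set_size_bounds(1956, 16)
```

Under those assumptions the training split would contain between 31,281 and 31,296 examples.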

Framework versions

  • Transformers 4.18.0
  • PyTorch 1.10.0+cu111
  • Datasets 2.1.0
  • Tokenizers 0.12.1