| | --- |
| | license: apache-2.0 |
| | datasets: |
| | - AyoubChLin/CNN_News_Articles_2011-2022 |
| | language: |
| | - en |
| | metrics: |
| | - accuracy |
| | pipeline_tag: text-classification |
| | --- |
| | # BertForSequenceClassification on CNN News Dataset |
| |
|
| | This repository contains a BERT base model fine-tuned for sequence classification on the CNN News dataset. The model classifies English news articles into one of six categories: business, entertainment, health, news, politics, and sport. |
| |
|
| | The model was fine-tuned for four epochs, reaching a training loss of 0.077900 and a validation loss of 0.190814, with the following evaluation metrics: |
| |
|
| | - accuracy: 0.956690 |
| | - f1: 0.956144 |
| | - precision: 0.956393 |
| | - recall: 0.956690 |
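
The card does not say how precision, recall, and F1 were averaged across the six classes. The reported recall matching the accuracy exactly is consistent with support-weighted averaging, because weighted recall always equals accuracy. The sketch below illustrates that identity on a small made-up confusion matrix. The function name `weighted_metrics` and the numbers are hypothetical; they are not the model's evaluation data, and the averaging scheme itself is an assumption.

```python
def weighted_metrics(confusion):
    """Return (accuracy, precision, recall, f1), support-weighted over classes.

    confusion[i][j] counts samples whose true class is i and predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[i][i] for i in range(n_classes)) / total

    precision = recall = f1 = 0.0
    for i in range(n_classes):
        tp = confusion[i][i]
        support = sum(confusion[i])                   # true samples of class i
        predicted = sum(row[i] for row in confusion)  # samples predicted as class i
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total                      # class share of all samples
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


# Hypothetical 3-class confusion matrix (illustration only, not the model's results).
toy_confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
accuracy, precision, recall, f1 = weighted_metrics(toy_confusion)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

If the metrics were instead macro-averaged, recall and accuracy would generally differ unless the classes were balanced.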
| |
|
| | ## Model Description |
| |
|
| |
|
| | - **Developed by:** [CHERGUELAINE Ayoub](https://www.linkedin.com/in/ayoub-cherguelaine/) |
| | - **Shared by:** Hugging Face |
| | - **Model type:** Language model |
| | - **Language(s) (NLP):** en |
| | - **Finetuned from model:** [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) |
| |
|
| |
|
| | ## Usage |
| |
|
| | You can use this model with the Hugging Face Transformers library to classify English news articles into the six categories listed above. |
| |
|
| | Here's an example of how to use this model for text classification in Python with TensorFlow: |
| |
|
| | ```python |
| | import tensorflow as tf |
| | from transformers import AutoTokenizer, TFAutoModelForSequenceClassification |
| | |
| | model_name = "AyoubChLin/bert_cnn_news" |
| | tokenizer = AutoTokenizer.from_pretrained(model_name) |
| | model = TFAutoModelForSequenceClassification.from_pretrained(model_name) |
| | |
| | # Tokenize the article and return TensorFlow tensors |
| | text = "This is a news article about politics." |
| | inputs = tokenizer(text, padding=True, truncation=True, return_tensors="tf") |
| | |
| | # Forward pass; the highest logit gives the predicted class index |
| | outputs = model(inputs) |
| | predicted_class_id = int(tf.argmax(outputs.logits, axis=-1).numpy()[0]) |
| | |
| | # Map the class index to its category name |
| | labels = ["business", "entertainment", "health", "news", "politics", "sport"] |
| | predicted_label = labels[predicted_class_id] |
| | print(predicted_label) |
| | ``` |
| |
|
| | In this example, we first load the tokenizer and the model using their respective `from_pretrained` methods. We then encode a news article with the tokenizer, pass the inputs through the model, and take the index of the highest logit with `argmax` to get the predicted class. Finally, we map that class index to its category name using a list of labels. |
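
The `argmax` step discards how confident the model is. A minimal sketch of turning one row of logits into per-category probabilities with a softmax follows, written in plain Python so that it runs without downloading the model. The logits are made up, `softmax` and `top_prediction` are hypothetical helper names, and the label order is copied from the example above; the card does not state whether that order matches the model's own label configuration.

```python
import math

# Label order copied from the usage example; assumed to match the model's class ids.
LABELS = ["business", "entertainment", "health", "news", "politics", "sport"]


def softmax(logits):
    """Convert raw logits to probabilities that sum to 1 (numerically stable)."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up logits for one article (illustration only, not real model output).
example_logits = [-1.2, -0.7, -2.0, 0.3, 4.1, -1.5]
label, prob = top_prediction(example_logits)
print(label, round(prob, 3))
```

With the TensorFlow example, the logits for the first article could be passed in as `outputs.logits[0].numpy().tolist()`.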
| |
|
| | ## Contributors |
| |
|
| | This model was fine-tuned by [CHERGUELAINE Ayoub](https://www.linkedin.com/in/ayoub-cherguelaine/). |