AyoubChLin
/

bert_cnn_news

Text Classification

text-embeddings-inference

Model card Files Files and versions

bert_cnn_news / README.md

AyoubChLin's picture

Update README.md

9658a2a almost 3 years ago

|

history blame contribute delete

2.27 kB

	---
	license: apache-2.0
	datasets:
	- AyoubChLin/CNN_News_Articles_2011-2022
	language:
	- en
	metrics:
	- accuracy
	pipeline_tag: text-classification
	---
	# BertForSequenceClassification on CNN News Dataset

	This repository contains a fine-tuned Bert base model for sequence classification on the CNN News dataset. The model is able to classify news articles into one of six categories: business, entertainment, health, news, politics, and sport.

	The model was fine-tuned for four epochs achieving a training loss of 0.077900, a validation loss of 0.190814

	- accuracy : 0.956690.
	- f1 : 0.956144.
	- precision : 0.956393
	- recall : 0.956690

	## Model Description


	- Developed by: [CHERGUELAINE Ayoub](https://www.linkedin.com/in/ayoub-cherguelaine/)
	- Shared by : HuggingFace
	- Model type: Language model
	- Language(s) (NLP): en
	- Finetuned from model : [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)


	## Usage

	You can use this model with the Hugging Face Transformers library for a variety of natural language processing tasks, such as text classification, sentiment analysis, and more.

	Here's an example of how to use this model for text classification in Python:

	```python
	from transformers import AutoTokenizer, BertForSequenceClassification

	model_name = "AyoubChLin/bert_cnn_news"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = TFAutoModelForSequenceClassification.from_pretrained(model_name)

	text = "This is a news article about politics."
	inputs = tokenizer(text, padding=True, truncation=True, return_tensors="tf")

	outputs = model(inputs)
	predicted_class_id = tf.argmax(outputs.logits, axis=-1).numpy()[0]

	labels = ["business", "entertainment", "health", "news", "politics", "sport"]
	predicted_label = labels[predicted_class_id]
	```

	In this example, we first load the tokenizer and the model using their respective `from_pretrained` methods. We then encode a news article using the tokenizer, pass the inputs through the model, and extract the predicted label using the `argmax` function. Finally, we map the predicted label to its corresponding category using a list of labels.

	## Contributors

	This model was fine-tuned by [CHERGUELAINE Ayoub](https://www.linkedin.com/in/ayoub-cherguelaine/).