shahxeebhassan/bert_base_ai_content_detector

	---
	license: mit
	metrics:
	- accuracy
	base_model:
	- google-bert/bert-base-uncased
	datasets:
	- shahxeebhassan/human_vs_ai_sentences
	pipeline_tag: text-classification
	library_name: transformers
	---

	## Model Description
	This model is a fine-tuned version of `google-bert/bert-base-uncased` for AI content detection. It classifies a sentence as either AI-generated or human-written.

	## Training Data
	The model was trained on a [dataset](https://huggingface.co/datasets/shahxeebhassan/human_vs_ai_sentences) of over 100,000 sentences, each labeled as either AI-generated or human-written. Because the training examples are individual sentences, the model makes one prediction per sentence, which is particularly useful for highlighting AI-written content within larger texts.
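To illustrate the sentence-level use described above, here is a minimal sketch of how a longer text might be split into sentences and classified one sentence at a time. The helper names (`split_sentences`, `label_sentences`, `make_bert_classifier`) and the regex-based splitter are assumptions made for this example and are not part of the model or the `transformers` library. A proper sentence tokenizer would likely handle abbreviations and quotations better.

```python
import re
from typing import Callable, List, Tuple


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def label_sentences(text: str, classify: Callable[[str], int]) -> List[Tuple[str, int]]:
    """Pair every sentence in `text` with the class index returned by `classify`."""
    return [(sentence, classify(sentence)) for sentence in split_sentences(text)]


def make_bert_classifier(
    model_id: str = "shahxeebhassan/bert_base_ai_content_detector",
) -> Callable[[str], int]:
    """Build a `classify` callable from the fine-tuned checkpoint (downloads weights)."""
    # Imported lazily so the splitting helpers work without torch installed.
    import torch
    from transformers import BertForSequenceClassification, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_id)
    model = BertForSequenceClassification.from_pretrained(model_id)
    model.eval()

    def classify(sentence: str) -> int:
        inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits
        # Index of the highest-scoring class for this single sentence
        return int(logits.argmax(dim=1).item())

    return classify
```

Calling `label_sentences(document_text, make_bert_classifier())` would return a list of `(sentence, class_index)` pairs that a front end could use for highlighting. This card does not state which class index corresponds to AI-generated text, so check the model's `config.id2label` or the dataset's label column before interpreting the indices.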

	## Evaluation Metrics
	The model achieved 90% accuracy on both the validation set and the test set.

	## Usage
	```python
	import torch
	from transformers import BertTokenizer, BertForSequenceClassification

	tokenizer = BertTokenizer.from_pretrained("shahxeebhassan/bert_base_ai_content_detector")
	model = BertForSequenceClassification.from_pretrained("shahxeebhassan/bert_base_ai_content_detector")

	inputs = tokenizer("Distance learning will not benefit students because the students are not able to develop as good of a relationship with their teachers.", return_tensors="pt")

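	# Run inference without tracking gradients
	with torch.no_grad():
	    outputs = model(**inputs)
	    logits = outputs.logits

	# Convert logits to class probabilities
	probabilities = torch.softmax(logits, dim=1).cpu().numpy()

	# Pick the most probable class index for each input
	predicted_label = probabilities.argmax(axis=1)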

	print(f"Predicted label for the input text: {predicted_label[0]}")