bert-imdb-1hidden / README.md

James Bishop

model card

7225122 over 5 years ago

	---
	language:
	- en
	datasets:
	- imdb
	metrics:
	- accuracy
	---

	# bert-imdb-1hidden

	## Model description

	A `bert-base-uncased` model was restricted to 1 hidden layer and
	fine-tuned for sequence classification on the
	`imdb` dataset, loaded using the `datasets` library.
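
The card does not say how the restriction to one hidden layer was implemented. A minimal sketch, assuming it was expressed through the `num_hidden_layers` config field: the snippet builds the architecture with randomly initialised weights so that nothing is downloaded. To start from the pretrained checkpoint instead, the same override could be passed via `BertConfig.from_pretrained("bert-base-uncased", num_hidden_layers=1)` and the resulting config handed to `from_pretrained`.

```python
from transformers import BertConfig, BertForSequenceClassification

# Assumption: the restriction is expressed through `num_hidden_layers`.
# All other values are BertConfig defaults, which match bert-base-uncased sizes.
config = BertConfig(num_hidden_layers=1, num_labels=2)

# Randomly initialised weights; this only illustrates the architecture.
model = BertForSequenceClassification(config)

# The encoder stack is a ModuleList with one entry per hidden layer.
num_layers = len(model.bert.encoder.layer)
print(num_layers)
```

With this config the encoder holds a single transformer layer in place of the 12 in the full `bert-base-uncased` model.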

	## Intended uses & limitations

	#### How to use

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification

	pretrained = "lannelin/bert-imdb-1hidden"

	tokenizer = AutoTokenizer.from_pretrained(pretrained)

	model = AutoModelForSequenceClassification.from_pretrained(pretrained)

	LABELS = ["negative", "positive"]

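	def get_sentiment(text: str) -> str:
	    # Tokenize, truncating to the model's maximum input length
	    inputs = tokenizer(text, return_tensors="pt", truncation=True)

	    # The first element of the model output holds the classification logits
	    output = model(**inputs)[0].squeeze()

	    # Map the index of the largest logit to its label
	    return LABELS[output.argmax().item()]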

	print(get_sentiment("What a terrible film!"))
	```
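
The `get_sentiment` function above returns only a label. If a confidence score is also wanted, the two logits can be passed through a softmax. The sketch below uses only the standard library; the logit values are hypothetical and made up for illustration, not real model output.

```python
import math

LABELS = ["negative", "positive"]

def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits, made up for illustration (not real model output)
example_logits = [2.0, -1.0]
probs = softmax(example_logits)

# Pick the index with the highest probability
best = max(range(len(probs)), key=probs.__getitem__)
print(LABELS[best], probs[best])
```

With real model output, the list passed to `softmax` would be the logits converted with `.tolist()`.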

	#### Limitations and bias

	No special consideration was given to limitations and bias.

	Any bias held by the imdb dataset may be reflected in the model's output.

	## Training data

	Initialised with [bert-base-uncased](https://huggingface.co/bert-base-uncased).

	Fine-tuned on [imdb](https://huggingface.co/datasets/imdb).


	## Training procedure

	The model was fine-tuned for 1 epoch with a batch size of 64,
	a learning rate of 5e-5, and a maximum sequence length of 512.
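
As a rough worked example of what these settings imply, the sketch below counts optimizer steps. It assumes the full 25,000-example `imdb` train split was used and that there was no gradient accumulation; the card states neither.

```python
import math

# Assumption: all 25,000 examples of the imdb train split were used,
# with no gradient accumulation (the card states neither).
TRAIN_EXAMPLES = 25_000
BATCH_SIZE = 64        # from the card
EPOCHS = 1             # from the card
MAX_SEQ_LENGTH = 512   # from the card

steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

# Upper bound on tokens per batch if every sequence hits the maximum length
max_tokens_per_batch = BATCH_SIZE * MAX_SEQ_LENGTH

print(total_steps)           # 391
print(max_tokens_per_batch)  # 32768
```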

	## Eval results

	Accuracy on imdb test set: 0.87132
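
To put the accuracy in absolute terms, the sketch below converts it to a count of correct predictions. It assumes evaluation on the full 25,000-example `imdb` test split, which is consistent with the reported figure since 21783 / 25000 = 0.87132 exactly.

```python
# Assumption: evaluation used the full 25,000-example imdb test split.
TEST_EXAMPLES = 25_000
ACCURACY = 0.87132  # from the card

correct = round(ACCURACY * TEST_EXAMPLES)
incorrect = TEST_EXAMPLES - correct

print(correct, incorrect)  # 21783 3217
```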