`google/electra-large-discriminator` fine-tuned on the IMDB dataset for 2 epochs.

Long reviews are tokenized using their head and tail parts, as described in [How to Fine-Tune BERT for Text Classification?](https://arxiv.org/abs/1905.05583):
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/electra-large-discriminator")

def preprocess_function(example):
    tokens = tokenizer(example["text"], truncation=False)
    if len(tokens["input_ids"]) > 512:
        # Keep the head ([CLS] + first 128 tokens) and the tail (last 382 tokens,
        # ending with [SEP]), joined by a [SEP] token (id 102), for 512 tokens total.
        tokens["input_ids"] = tokens["input_ids"][:129] + \
            [tokenizer.sep_token_id] + tokens["input_ids"][-382:]
        tokens["token_type_ids"] = [0] * 512
        tokens["attention_mask"] = [1] * 512
    return tokens
```
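
A minimal sketch of how this preprocessing might be wired into the 2-epoch fine-tuning run; the batch size, learning rate, and output directory below are illustrative assumptions, not the original training settings:

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

model = AutoModelForSequenceClassification.from_pretrained(
    "google/electra-large-discriminator", num_labels=2)

# Apply the head-and-tail preprocessing defined above to every IMDB review.
imdb = load_dataset("imdb")
encoded = imdb.map(preprocess_function, remove_columns=["text"])

# Hyperparameters here are assumptions for illustration, except num_train_epochs=2.
args = TrainingArguments(
    output_dir="electra-large-imdb",
    num_train_epochs=2,
    per_device_train_batch_size=8,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["test"],
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()
```

`DataCollatorWithPadding` pads the shorter reviews to a common length within each batch, since the preprocessing above only shortens reviews that exceed 512 tokens.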