cglez/bert-s140-uncased


	---
	library_name: transformers
	language: en
	license: apache-2.0
	datasets:
	- stanfordnlp/sentiment140
	base_model:
	- google-bert/bert-base-uncased
	---

	# Model Card: BERT-Sentiment140

	An in-domain BERT-base model, pre-trained from scratch on the Sentiment140 dataset text.

	## Model Details

	### Description

	This model uses the [BERT base (uncased)](https://huggingface.co/google-bert/bert-base-uncased)
	architecture and was pre-trained from scratch (in-domain) on the text of the Sentiment140 dataset, excluding its test split.
	Pre-training used only the masked language modeling (MLM) objective.
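As a rough illustration of what the MLM objective asks the model to do, the sketch below corrupts a token sequence with the standard BERT recipe: select about 15% of positions, then replace 80% of those with `[MASK]`, 10% with a random token, and leave 10% unchanged. This card does not state the exact masking rates used for this model, so the rates, the `mask_tokens` helper, and the toy vocabulary are assumptions for illustration only.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Corrupt `tokens` for MLM and return (corrupted_tokens, labels).

    labels[i] is the original token at positions the model must predict,
    and None everywhere else. Rates follow the standard BERT recipe and
    are an assumption, not a value documented in this card.
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            labels.append(token)  # this position contributes to the loss
            roll = rng.random()
            if roll < 0.8:
                corrupted.append(MASK_TOKEN)         # 80%: mask token
            elif roll < 0.9:
                corrupted.append(rng.choice(vocab))  # 10%: random token
            else:
                corrupted.append(token)              # 10%: keep original
        else:
            labels.append(None)  # position is ignored by the loss
            corrupted.append(token)
    return corrupted, labels


if __name__ == "__main__":
    tweet = "just landed in madrid and the weather is great".split()
    print(mask_tokens(tweet, vocab=tweet, seed=0))
```

In practice, the `transformers` library's `DataCollatorForLanguageModeling` applies this kind of corruption to token IDs at batch time.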

	- Developed by: [Cesar Gonzalez-Gutierrez](https://ceguel.es)
	- Funded by: [ERC](https://erc.europa.eu)
	- Architecture: BERT-base
	- Language: English
	- License: Apache 2.0
	- Base model: [BERT base model (uncased)](https://huggingface.co/google-bert/bert-base-uncased)

	### Checkpoints

	Intermediate checkpoints from the pre-training process are available and can be accessed using specific tags,
	which correspond to training epochs and steps:

	| Epoch | Step | Epoch tag | Step tag |
	|---|---|---|---|
	| 1 | 15000 | epoch-1 | step-15000 |
	| 2 | 30000 | epoch-2 | step-30000 |
	| 3 | 45000 | epoch-3 | step-45000 |
	| 5 | 75000 | epoch-5 | step-75000 |
	| 10 | 150000 | epoch-10 | step-150000 |
	| 15 | 225000 | epoch-15 | step-225000 |
	| 20 | 300000 | epoch-20 | step-300000 |
	| 25 | 375000 | epoch-25 | step-375000 |

	To load a model from a specific intermediate checkpoint, use the `revision` parameter with the corresponding tag:
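	```python
	from transformers import AutoModelForMaskedLM

	# Replace <model-name> with this repository's ID and <checkpoint-tag> with a tag
	# from the table above, for example "epoch-10" or "step-150000".
	model = AutoModelForMaskedLM.from_pretrained("<model-name>", revision="<checkpoint-tag>")
	```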
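Because only some epochs have a published checkpoint, it can help to validate the epoch before requesting a revision. The sketch below copies the epoch-to-step mapping from the checkpoint table. The helper names `checkpoint_tags` and `load_checkpoint` are hypothetical, and the default repository ID `cglez/bert-s140-uncased` is assumed from this page's repository name rather than stated in the card body.

```python
# Epoch -> step, copied from the checkpoint table in this card.
CHECKPOINT_STEPS = {
    1: 15000, 2: 30000, 3: 45000, 5: 75000,
    10: 150000, 15: 225000, 20: 300000, 25: 375000,
}


def checkpoint_tags(epoch):
    """Return the (epoch_tag, step_tag) pair for a published checkpoint."""
    if epoch not in CHECKPOINT_STEPS:
        available = sorted(CHECKPOINT_STEPS)
        raise ValueError(f"No checkpoint for epoch {epoch}; available: {available}")
    return f"epoch-{epoch}", f"step-{CHECKPOINT_STEPS[epoch]}"


def load_checkpoint(epoch, repo_id="cglez/bert-s140-uncased"):
    """Load the MLM model at the given epoch (downloads from the Hub).

    `repo_id` is an assumed default; pass the actual repository ID if it differs.
    """
    # Imported lazily so checkpoint_tags works without transformers installed.
    from transformers import AutoModelForMaskedLM

    epoch_tag, _ = checkpoint_tags(epoch)
    return AutoModelForMaskedLM.from_pretrained(repo_id, revision=epoch_tag)


if __name__ == "__main__":
    print(checkpoint_tags(10))
```

According to the table, the epoch tag and the step tag in each row refer to the same checkpoint, so either can be passed as `revision`.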

	### Sources

	- Paper: [Information pending]

	## Training Details

	For more details on the training procedure, please refer to the base model's documentation:
	[Training procedure](https://huggingface.co/google-bert/bert-base-uncased#training-procedure).

	### Training Data

	All texts from the Sentiment140 dataset, excluding the test partition.

	### Training Hyperparameters

	- Precision: fp16
	- Batch size: 32
	- Gradient accumulation steps: 3
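The batch size and gradient accumulation settings combine into an effective batch size per optimizer update. The arithmetic below is a sketch under three assumptions not stated in the card: that the batch size of 32 is per device, that training ran on a single device, and that each "step" in the checkpoint table is one optimizer update.

```python
per_device_batch_size = 32       # from the hyperparameters above
gradient_accumulation_steps = 3  # from the hyperparameters above
num_devices = 1                  # assumption: device count is not stated

# Sequences that contribute to each optimizer update.
effective_batch_size = (
    per_device_batch_size * gradient_accumulation_steps * num_devices
)

steps_per_epoch = 15000  # from the checkpoint table (epoch 1 -> step 15000)

# Sequences seen per epoch, if each step is one optimizer update.
sequences_per_epoch = effective_batch_size * steps_per_epoch

print(effective_batch_size)  # 96
print(sequences_per_epoch)   # 1440000
```

If any of the assumptions does not hold, for example if training used several GPUs, these figures scale accordingly.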

	## Uses

	For typical use cases and limitations, please refer to the base model's guidance:
	[Intended uses & limitations](https://huggingface.co/google-bert/bert-base-uncased#intended-uses--limitations).

	## Bias, Risks, and Limitations

	This model inherits potential risks and limitations from the base model. Refer to:
	[Limitations and bias](https://huggingface.co/google-bert/bert-base-uncased#limitations-and-bias).

	## Environmental Impact

	- Hardware Type: NVIDIA Tesla V100 PCIE 32GB
	- Runtime: 36.5 h
	- Cluster Provider: [Artemisa](https://artemisa.ific.uv.es/web/)
	- Compute Region: EU
	- Carbon Emitted: 6.79 kg CO2 eq.
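For a sense of scale, the reported totals can be turned into per-hour and per-epoch figures. This is a back-of-the-envelope sketch; it assumes the 36.5 h runtime covers all 25 epochs listed in the checkpoint table, which the card does not state explicitly.

```python
runtime_hours = 36.5  # reported runtime
carbon_kg = 6.79      # reported emissions, kg CO2 eq.
total_epochs = 25     # assumption: runtime spans all 25 epochs

kg_per_hour = carbon_kg / runtime_hours
hours_per_epoch = runtime_hours / total_epochs

print(f"{kg_per_hour:.3f} kg CO2 eq. per hour")  # 0.186
print(f"{hours_per_epoch:.2f} h per epoch")      # 1.46
```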

	## Citation

	BibTeX:

	[More Information Needed]