WeightWatcher/albert-large-v2-mrpc

Text Classification

cdhinrichs committed on Aug 2, 2023

Commit bae232f · 1 Parent(s): da87e0a

Added a model card

Files changed (1)

README.md +99 -0

@@ -1,3 +1,102 @@
 ---
+language:
+  - "en"
 license: mit
+datasets:
+  - https://huggingface.co/datasets/glue#mrpc
+metrics:
+  - F1 score
 ---
+# Model Card for cdhinrichs/albert-large-v2-mrpc
+This model was finetuned on the GLUE MRPC task, starting from the pretrained
+albert-large-v2 model. Hyperparameters were largely taken from the following
+publication, with minor exceptions noted under Training Details:
+ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
+https://arxiv.org/abs/1909.11942
+## Model Details
+### Model Description
+- **Developed by:** https://huggingface.co/cdhinrichs
+- **Model type:** Text Sequence Classification
+- **Language(s) (NLP):** English
+- **License:** MIT
+- **Finetuned from model:** https://huggingface.co/albert-large-v2
+## Uses
+Text classification, research and development.
+### Out-of-Scope Use
+Not intended for production use.
+See https://huggingface.co/albert-large-v2
+## Bias, Risks, and Limitations
+See https://huggingface.co/albert-large-v2
+### Recommendations
+See https://huggingface.co/albert-large-v2
+## How to Get Started with the Model
+Use the code below to get started with the model.
+```python
+from transformers import AlbertForSequenceClassification
+model = AlbertForSequenceClassification.from_pretrained("cdhinrichs/albert-large-v2-mrpc")
+```
+## Training Details
+### Training Data
+See https://huggingface.co/datasets/glue#mrpc
+MRPC is a classification task, and a part of the GLUE benchmark.
+### Training Procedure
+Adam optimization was used on the pretrained ALBERT model at
+https://huggingface.co/albert-large-v2.
+A checkpoint from MNLI was NOT used, which differs from footnote 4 in:
+ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
+https://arxiv.org/abs/1909.11942
+#### Training Hyperparameters
+Training hyperparameters (learning rate, batch size, ALBERT dropout rate,
+classifier dropout rate, warmup steps, training steps) were taken from Table
+A.4 in:
+ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
+https://arxiv.org/abs/1909.11942
+Max sequence length (MSL) was set to 128, differing from the above.
+## Evaluation
+F1 score is used to evaluate model performance.
+### Testing Data, Factors & Metrics
+#### Testing Data
+See https://huggingface.co/datasets/glue#mrpc
+#### Metrics
+F1 score
+### Results
+- Training F1 score: 0.9963621665319321
+- Evaluation F1 score: 0.9176882661996497
+## Environmental Impact
+The model was finetuned on a single-user workstation with a single GPU. CO2
+impact is expected to be minimal.
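
The model card reports F1 scores without spelling out how the metric is computed. The following is a minimal sketch, assuming the usual binary F1 with the paraphrase class (label 1) treated as the positive class. The `f1_score` helper and the toy labels are illustrative assumptions; they are not part of the commit and are not MRPC data.

```python
def f1_score(labels, preds, positive=1):
    """Binary F1: harmonic mean of precision and recall for `positive`."""
    pairs = list(zip(labels, preds))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up labels): 3 true positives, 1 false positive, 1 false negative.
labels = [1, 1, 1, 0, 0, 1]
preds = [1, 1, 0, 0, 1, 1]
score = f1_score(labels, preds)
print(f"F1 = {score:.2f}")
```

Here precision and recall are both 3/4, so the sketch yields an F1 of 0.75. In practice a library metric would normally be used instead of a hand-written helper.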