MENG21/stud-fac-eval-bert-large-uncased_v2

Text Classification

text-embeddings-inference


MENG21 committed on Jun 14, 2024

Commit aaffea4 · verified · 1 Parent(s): a080ec4

Update README.md

Files changed (1)

README.md +63 -0

@@ -93,6 +93,69 @@ Use the code below to get started with the model.
 ## Training Details
 ### Training Data
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

 ## Training Details
+<!-- <###################################################################> -->
+# results_bert-large-uncased
+This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.2128
+- Accuracy: 0.9141
+- Precision: 0.9182
+- Recall: 0.9421
+- F1: 0.9300
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 32
+- eval_batch_size: 32
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 1
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1     |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
+| 0.6415        | 0.09  | 50   | 0.5315          | 0.7175   | 0.6981    | 0.9394 | 0.8010 |
+| 0.4007        | 0.18  | 100  | 0.7702          | 0.7243   | 0.9892    | 0.5505 | 0.7074 |
+| 0.5158        | 0.28  | 150  | 0.4075          | 0.8591   | 0.8904    | 0.8748 | 0.8825 |
+| 0.3934        | 0.37  | 200  | 0.2809          | 0.8763   | 0.9354    | 0.8546 | 0.8932 |
+| 0.2691        | 0.46  | 250  | 0.3406          | 0.8832   | 0.8837    | 0.9294 | 0.9060 |
+| 0.2814        | 0.55  | 300  | 0.2582          | 0.8768   | 0.8512    | 0.9651 | 0.9046 |
+| 0.2735        | 0.64  | 350  | 0.2715          | 0.8953   | 0.8708    | 0.9711 | 0.9182 |
+| 0.2411        | 0.74  | 400  | 0.2389          | 0.9103   | 0.9242    | 0.9279 | 0.9260 |
+| 0.2371        | 0.83  | 450  | 0.2081          | 0.9104   | 0.9212    | 0.9316 | 0.9264 |
+| 0.1974        | 0.92  | 500  | 0.2128          | 0.9141   | 0.9182    | 0.9421 | 0.9300 |
+### Framework versions
+- Transformers 4.37.2
+- Pytorch 2.1.0+cu121
+- Datasets 2.17.0
+- Tokenizers 0.15.2
+<!-- <###################################################################> -->
 ### Training Data
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
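The numbers added in this commit can be cross-checked with a short script. The sketch below recomputes F1 from the reported precision and recall, and evaluates a linear warmup-then-decay learning rate at a given step. It is an illustration, not the training code: `TOTAL_STEPS = 543` is an assumption estimated from step 500 landing at epoch 0.92, rounding the warmup length up with `math.ceil` is also an assumption, and the helper names `f1_from` and `linear_lr` are hypothetical.

```python
import math

BASE_LR = 5e-05       # learning_rate from the card
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 543     # ASSUMPTION: estimated from step 500 at epoch 0.92


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0."""
    # ASSUMPTION: the warmup length is the ratio times total steps, rounded up.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# (step, precision, recall, reported F1) copied from the results table
rows = [
    (50, 0.6981, 0.9394, 0.8010),
    (100, 0.9892, 0.5505, 0.7074),
    (500, 0.9182, 0.9421, 0.9300),
]

for step, precision, recall, reported_f1 in rows:
    recomputed = f1_from(precision, recall)
    print(f"step {step}: reported F1 {reported_f1:.4f}, "
          f"recomputed {recomputed:.4f}, lr {linear_lr(step):.2e}")
```

Because the table values are rounded to four decimals, the recomputed F1 can differ from the reported one in the last digit. The learning rates printed here depend entirely on the assumed total step count, so treat them as approximate.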