eternis/eternis_router_encoder_sft_5Sep

Transformers

Safetensors

Generated from Trainer

eternis committed on Sep 6, 2025

Commit 176e890 (verified) · 1 Parent(s): 8fd0ad1

Model save

Files changed (1)

README.md +36 -16

@@ -16,9 +16,16 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.2443
-- Mse: 0.2443
-- Model Accuracy: 0.2018
 ## Model description
@@ -37,29 +44,42 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 1e-05
 - train_batch_size: 16
-- eval_batch_size: 16
 - seed: 42
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 32
 - optimizer: Use adamw_torch_fused with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.06
-- num_epochs: 3
 ### Training results
-| Training Loss | Epoch  | Step | Validation Loss | Mse    | Model Accuracy |
-|:-------------:|:------:|:----:|:---------------:|:------:|:--------------:|
-| 1.432         | 0.3429 | 300  | 0.6756          | 0.6756 | 0.031          |
-| 0.7353        | 0.6857 | 600  | 0.3514          | 0.3514 | 0.0825         |
-| 0.541         | 1.0286 | 900  | 0.2742          | 0.2742 | 0.1725         |
-| 0.5159        | 1.3714 | 1200 | 0.2563          | 0.2563 | 0.1578         |
-| 0.4806        | 1.7143 | 1500 | 0.2495          | 0.2495 | 0.1805         |
-| 0.4732        | 2.0571 | 1800 | 0.2462          | 0.2462 | 0.2035         |
-| 0.4781        | 2.4    | 2100 | 0.2447          | 0.2447 | 0.1988         |
-| 0.466         | 2.7429 | 2400 | 0.2443          | 0.2443 | 0.2018         |
 ### Framework versions

 This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.1954
+- Mse: 0.1954
+- Mae: 0.1976
+- Vector Accuracy: 0.2235
+- Complexity Accuracy: 0.8013
+- Accuracy Accuracy: 0.9885
+- Completeness Accuracy: 0.9928
+- Clarity Accuracy: 0.997
+- Relevance Accuracy: 0.9978
+- Model Accuracy: 0.2898
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 0.002
 - train_batch_size: 16
+- eval_batch_size: 32
 - seed: 42
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 32
 - optimizer: Use adamw_torch_fused with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.06
+- num_epochs: 6
 ### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Mse    | Mae    | Vector Accuracy | Complexity Accuracy | Accuracy Accuracy | Completeness Accuracy | Clarity Accuracy | Relevance Accuracy | Model Accuracy |
+|:-------------:|:------:|:----:|:---------------:|:------:|:------:|:---------------:|:-------------------:|:-----------------:|:---------------------:|:----------------:|:------------------:|:--------------:|
+| 0.425         | 0.2857 | 250  | 0.2167          | 0.2167 | 0.2256 | 0.164           | 0.7642              | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.2157         |
+| 0.4162        | 0.5714 | 500  | 0.2129          | 0.2129 | 0.2096 | 0.2405          | 0.7745              | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.3235         |
+| 0.3955        | 0.8571 | 750  | 0.2135          | 0.2135 | 0.2140 | 0.1708          | 0.782               | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.246          |
+| 0.3864        | 1.1429 | 1000 | 0.2014          | 0.2014 | 0.2046 | 0.195           | 0.8035              | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.254          |
+| 0.4043        | 1.4286 | 1250 | 0.2029          | 0.2029 | 0.2086 | 0.1893          | 0.806               | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.2507         |
+| 0.3942        | 1.7143 | 1500 | 0.2046          | 0.2046 | 0.2022 | 0.233           | 0.804               | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.2935         |
+| 0.3952        | 2.0    | 1750 | 0.2103          | 0.2103 | 0.2196 | 0.1762          | 0.721               | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.2622         |
+| 0.3929        | 2.2857 | 2000 | 0.2011          | 0.2011 | 0.2014 | 0.2305          | 0.788               | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.3023         |
+| 0.3921        | 2.5714 | 2250 | 0.1986          | 0.1986 | 0.2019 | 0.2258          | 0.7778              | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.3045         |
+| 0.3924        | 2.8571 | 2500 | 0.1981          | 0.1981 | 0.1980 | 0.235           | 0.8043              | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.2988         |
+| 0.3819        | 3.1429 | 2750 | 0.2035          | 0.2035 | 0.2084 | 0.218           | 0.7638              | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.294          |
+| 0.3874        | 3.4286 | 3000 | 0.1970          | 0.1970 | 0.1963 | 0.2233          | 0.8073              | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.286          |
+| 0.3934        | 3.7143 | 3250 | 0.1994          | 0.1994 | 0.2079 | 0.184           | 0.786               | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.2487         |
+| 0.3813        | 4.0    | 3500 | 0.1985          | 0.1985 | 0.1942 | 0.245           | 0.8005              | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.314          |
+| 0.3939        | 4.2857 | 3750 | 0.1986          | 0.1986 | 0.2017 | 0.1905          | 0.8033              | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.2507         |
+| 0.3985        | 4.5714 | 4000 | 0.1956          | 0.1956 | 0.1993 | 0.2062          | 0.797               | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.273          |
+| 0.378         | 4.8571 | 4250 | 0.1960          | 0.1960 | 0.1991 | 0.227           | 0.7887              | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.2983         |
+| 0.3853        | 5.1429 | 4500 | 0.1957          | 0.1957 | 0.1982 | 0.2122          | 0.803               | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.2747         |
+| 0.3727        | 5.4286 | 4750 | 0.1955          | 0.1955 | 0.1989 | 0.2122          | 0.8025              | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.2745         |
+| 0.3826        | 5.7143 | 5000 | 0.1956          | 0.1956 | 0.1975 | 0.2278          | 0.8007              | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.2945         |
+| 0.3746        | 6.0    | 5250 | 0.1954          | 0.1954 | 0.1976 | 0.2235          | 0.8013              | 0.9885            | 0.9928                | 0.997            | 0.9978             | 0.2898         |
 ### Framework versions
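
Reading aid for the diff above: the hyperparameters and the last row of the new results table are enough to sanity-check the run size. The sketch below uses only values copied from the diff. It assumes a single training device (consistent with `total_train_batch_size` being 16 × 2) and that the final logged step, 5250, closes epoch 6. The derived example count is an estimate, not a figure stated in the README.

```python
# Values copied from the "+" side of the diff.
train_batch_size = 16
gradient_accumulation_steps = 2
num_epochs = 6
final_step = 5250  # last row of the new results table (epoch 6.0)

# Effective batch size per optimizer step (single device assumed).
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# Optimizer steps per epoch, inferred from the last logged row.
steps_per_epoch = final_step // num_epochs

# Rough upper bound on the number of training examples; the true count
# may be slightly lower if the last batch of each epoch is partial.
approx_train_examples = steps_per_epoch * total_train_batch_size

# Relative change in final evaluation loss between the old and new README.
old_loss, new_loss = 0.2443, 0.1954
relative_loss_reduction = (old_loss - new_loss) / old_loss

print(total_train_batch_size, steps_per_epoch, approx_train_examples)
print(f"{relative_loss_reduction:.1%}")
```

Run as written, this prints `32 875 28000` and `20.0%`. The 875 steps per epoch also match the first row of the new table (250 / 875 ≈ 0.2857) and the first row of the old table (300 / 875 ≈ 0.3429), which suggests the training set size did not change between the two commits.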