davanstrien HF Staff committed on
Commit 9951f65 · verified · 1 Parent(s): dc92d83

Model save

Files changed (1)
  1. README.md +14 -23
README.md CHANGED
@@ -1,8 +1,7 @@
 ---
 library_name: transformers
 license: apache-2.0
-base_model:
-- lightonai/LightOnOCR-1B-1025
+base_model: lightonai/LightOnOCR-1B-1025
 tags:
 - generated_from_trainer
 model-index:
@@ -17,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [lightonai/LightOnOCR-1B-1025](https://huggingface.co/lightonai/LightOnOCR-1B-1025) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.1898
+- Loss: 0.1956
 
 ## Model description
 
@@ -37,11 +36,11 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 6e-05
-- train_batch_size: 4
+- train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 16
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 32
 - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 10
@@ -51,22 +50,14 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 0.9005 | 0.0619 | 50 | 0.2190 |
-| 0.7568 | 0.1239 | 100 | 0.2133 |
-| 0.7556 | 0.1858 | 150 | 0.2129 |
-| 0.8055 | 0.2478 | 200 | 0.2108 |
-| 0.7609 | 0.3097 | 250 | 0.2021 |
-| 0.6656 | 0.3716 | 300 | 0.1993 |
-| 0.7036 | 0.4336 | 350 | 0.1979 |
-| 0.8552 | 0.4955 | 400 | 0.1929 |
-| 0.8087 | 0.5574 | 450 | 0.1944 |
-| 0.7815 | 0.6194 | 500 | 0.1913 |
-| 0.6349 | 0.6813 | 550 | 0.1930 |
-| 0.7542 | 0.7433 | 600 | 0.1901 |
-| 0.6332 | 0.8052 | 650 | 0.1913 |
-| 0.6949 | 0.8671 | 700 | 0.1904 |
-| 0.7974 | 0.9291 | 750 | 0.1896 |
-| 0.5963 | 0.9910 | 800 | 0.1898 |
+| 0.4379 | 0.1238 | 50 | 0.2103 |
+| 0.4013 | 0.2475 | 100 | 0.2058 |
+| 0.3717 | 0.3713 | 150 | 0.1978 |
+| 0.4181 | 0.4950 | 200 | 0.2009 |
+| 0.4247 | 0.6188 | 250 | 0.2007 |
+| 0.3612 | 0.7426 | 300 | 0.1963 |
+| 0.3554 | 0.8663 | 350 | 0.1971 |
+| 0.3778 | 0.9901 | 400 | 0.1956 |
 
 
 ### Framework versions
@@ -74,4 +65,4 @@ The following hyperparameters were used during training:
 - Transformers 5.0.0.dev0
 - Pytorch 2.9.0+cu128
 - Datasets 4.4.0
-- Tokenizers 0.22.1
+- Tokenizers 0.22.1
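
For readers who want to reproduce the updated configuration, the sketch below maps the hyperparameters introduced in this commit onto `transformers.TrainingArguments`. It is a minimal, illustrative sketch only: the output directory, epoch count, and eval/logging cadence are inferred from the log table rather than stated in the card, and the actual training script is not part of this commit.

```python
# Hedged sketch of a TrainingArguments setup matching the updated hyperparameters.
# Values not listed in the card (output_dir, num_train_epochs, eval/logging steps)
# are assumptions inferred from the log table.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lightonocr-1b-finetune",   # hypothetical output directory
    learning_rate=6e-5,
    per_device_train_batch_size=16,        # "train_batch_size: 16" in the card
    per_device_eval_batch_size=8,          # "eval_batch_size: 8"
    gradient_accumulation_steps=2,         # effective batch = 16 * 2 = 32 on one device
    num_train_epochs=1,                    # log table ends near epoch 1.0 (assumed)
    lr_scheduler_type="linear",
    warmup_steps=10,
    optim="adamw_torch_fused",             # OptimizerNames.ADAMW_TORCH_FUSED
    seed=42,
    eval_strategy="steps",
    eval_steps=50,                         # validation loss logged every 50 steps
    logging_steps=50,
)
```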
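As a side note on the arithmetic behind the shorter log table: the effective batch size doubled from 4 × 4 = 16 to 16 × 2 = 32, so roughly half as many optimizer steps cover one epoch (800 → ~400). The snippet below shows that back-of-the-envelope calculation; the ~12.9k example count is an estimate back-solved from the step/epoch columns, not a figure from the card.

```python
# Derived, not stated in the card: how the batch-size change in this commit
# roughly halves the optimizer steps per epoch.
def steps_per_epoch(num_examples: int, per_device_batch: int, grad_accum: int) -> int:
    effective_batch = per_device_batch * grad_accum  # single device assumed
    return -(-num_examples // effective_batch)       # ceiling division

approx_examples = 12_900  # estimate from step 400 at epoch 0.9901 with batch 32
print(steps_per_epoch(approx_examples, 4, 4))    # previous run: ~807 steps/epoch
print(steps_per_epoch(approx_examples, 16, 2))   # this run: ~404 steps/epoch
```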