jordiferrero committed
Commit 6bce6da · verified · 1 Parent(s): 47b00aa

Add files using upload-large-folder tool

Files changed (50)
  1. README.md +120 -0
  2. metadata.json +2522 -0
  3. run_purpose.txt +4 -0
  4. test_smiles.txt +5 -0
  5. visualizations/predictions/predictions_bytes_152,000,000.pkl.gz +3 -0
  6. visualizations/predictions/predictions_bytes_153,000,000.pkl.gz +3 -0
  7. visualizations/predictions/predictions_bytes_154,000,000.pkl.gz +3 -0
  8. visualizations/predictions/predictions_bytes_155,000,000.pkl.gz +3 -0
  9. visualizations/predictions/predictions_bytes_156,000,000.pkl.gz +3 -0
  10. visualizations/predictions/predictions_bytes_157,000,000.pkl.gz +3 -0
  11. visualizations/predictions/predictions_bytes_158,000,000.pkl.gz +3 -0
  12. visualizations/predictions/predictions_bytes_159,000,000.pkl.gz +3 -0
  13. visualizations/predictions/predictions_bytes_160,000,000.pkl.gz +3 -0
  14. visualizations/predictions/predictions_bytes_161,000,000.pkl.gz +3 -0
  15. visualizations/predictions/predictions_bytes_162,000,000.pkl.gz +3 -0
  16. visualizations/predictions/predictions_bytes_163,000,000.pkl.gz +3 -0
  17. visualizations/predictions/predictions_bytes_164,000,000.pkl.gz +3 -0
  18. visualizations/predictions/predictions_bytes_165,000,000.pkl.gz +3 -0
  19. visualizations/predictions/predictions_bytes_166,000,000.pkl.gz +3 -0
  20. visualizations/predictions/predictions_bytes_167,000,000.pkl.gz +3 -0
  21. visualizations/predictions/predictions_bytes_168,000,000.pkl.gz +3 -0
  22. visualizations/predictions/predictions_bytes_169,000,000.pkl.gz +3 -0
  23. visualizations/predictions/predictions_bytes_170,000,000.pkl.gz +3 -0
  24. visualizations/predictions/predictions_bytes_171,000,000.pkl.gz +3 -0
  25. visualizations/predictions/predictions_bytes_172,000,000.pkl.gz +3 -0
  26. visualizations/predictions/predictions_bytes_173,000,000.pkl.gz +3 -0
  27. visualizations/predictions/predictions_bytes_174,000,000.pkl.gz +3 -0
  28. visualizations/predictions/predictions_bytes_175,000,000.pkl.gz +3 -0
  29. visualizations/predictions/predictions_bytes_176,000,000.pkl.gz +3 -0
  30. visualizations/predictions/predictions_bytes_177,000,000.pkl.gz +3 -0
  31. visualizations/predictions/predictions_bytes_178,000,000.pkl.gz +3 -0
  32. visualizations/predictions/predictions_bytes_179,000,000.pkl.gz +3 -0
  33. visualizations/predictions/predictions_bytes_180,000,000.pkl.gz +3 -0
  34. visualizations/predictions/predictions_bytes_181,000,000.pkl.gz +3 -0
  35. visualizations/predictions/predictions_bytes_182,000,000.pkl.gz +3 -0
  36. visualizations/predictions/predictions_bytes_183,000,000.pkl.gz +3 -0
  37. visualizations/predictions/predictions_bytes_184,000,000.pkl.gz +3 -0
  38. visualizations/predictions/predictions_bytes_185,000,000.pkl.gz +3 -0
  39. visualizations/predictions/predictions_bytes_186,000,000.pkl.gz +3 -0
  40. visualizations/predictions/predictions_bytes_187,000,000.pkl.gz +3 -0
  41. visualizations/predictions/predictions_bytes_188,000,000.pkl.gz +3 -0
  42. visualizations/predictions/predictions_bytes_189,000,000.pkl.gz +3 -0
  43. visualizations/predictions/predictions_bytes_190,000,000.pkl.gz +3 -0
  44. visualizations/predictions/predictions_bytes_191,000,000.pkl.gz +3 -0
  45. visualizations/predictions/predictions_bytes_192,000,000.pkl.gz +3 -0
  46. visualizations/predictions/predictions_bytes_193,000,000.pkl.gz +3 -0
  47. visualizations/predictions/predictions_bytes_194,000,000.pkl.gz +3 -0
  48. visualizations/predictions/predictions_bytes_195,000,000.pkl.gz +3 -0
  49. visualizations/predictions/predictions_bytes_196,000,000.pkl.gz +3 -0
  50. visualizations/predictions/predictions_bytes_197,000,000.pkl.gz +3 -0
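
The prediction snapshots listed above encode a training-byte threshold in the file name, written with thousands separators. A minimal sketch of recovering that number so the snapshots can be sorted chronologically, assuming the naming pattern seen in this commit holds for every snapshot (the helper `snapshot_bytes` is hypothetical and not part of the repository):

```python
import re
from pathlib import PurePosixPath

# Pattern inferred from the file names in this commit, e.g.
# "predictions_bytes_152,000,000.pkl.gz" (an assumption, not a documented format).
_SNAPSHOT_RE = re.compile(r"^predictions_bytes_([\d,]+)\.pkl\.gz$")


def snapshot_bytes(path: str) -> int:
    """Return the byte threshold encoded in a prediction snapshot file name."""
    name = PurePosixPath(path).name
    match = _SNAPSHOT_RE.match(name)
    if match is None:
        raise ValueError(f"not a prediction snapshot: {path!r}")
    # Drop the thousands separators before converting to an integer.
    return int(match.group(1).replace(",", ""))


paths = [
    "visualizations/predictions/predictions_bytes_197,000,000.pkl.gz",
    "visualizations/predictions/predictions_bytes_152,000,000.pkl.gz",
]
ordered = sorted(paths, key=snapshot_bytes)
print([snapshot_bytes(p) for p in ordered])
```

Sorting on the parsed integer avoids relying on lexical order of names that contain commas.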
README.md ADDED
@@ -0,0 +1,120 @@
+ ---
+ license: mit
+ tags:
+ - chemistry
+ - smiles
+ - tokenization
+ - dynamic-tokenization
+ - h-net
+ - hierarchical-networks
+ - molecular-representation
+ - polymer
+ - mamba
+ - transformer
+ datasets:
+ - PI1M
+ language:
+ - en
+ pipeline_tag: feature-extraction
+ ---
+
+ # PI1M-2stg
+
+ **H-Net model for dynamic SMILES tokenization**
+
+ Trained on the PI1M polymer dataset for 340M bytes (~5 epochs) with 10x SMILES concatenation and a 2-stage hierarchical architecture.
+
+ ## Model Details
+
+ | Property | Value |
+ |----------|-------|
+ | **Architecture** | H-Net (Hierarchical Network) |
+ | **Parameters** | ~623M (622,923,776 per `metadata.json`) |
+ | **Dataset** | PI1M |
+ | **Training Bytes** | 340M |
+ | **Training Epochs** | 5 |
+ | **Concatenation** | 10x SMILES per example |
+ | **Architecture Variant** | 2-stage |
+
+ ### Architecture Layout
+
+ 2-stage: `['m4', ['T1m4', ['T22'], 'm4T1'], 'm4']`
+
+ - **Encoder**: 4 Mamba blocks (`m4`) for byte-level encoding
+ - **Core**: two nested stages. Stage 0 applies 1 Transformer + 4 Mamba blocks (`T1m4`) on the way in and 4 Mamba + 1 Transformer blocks (`m4T1`) on the way out; Stage 1, nested inside it, is 22 Transformer blocks (`T22`)
+ - **Decoder**: 4 Mamba blocks (`m4`) for final decoding
+
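
The nested layout list can be tallied mechanically. The sketch below is illustrative only: it assumes each layout string is a sequence of letter-plus-count pairs, with `m` read as Mamba and `T` as Transformer as in the bullet list above, and the helper `count_blocks` is hypothetical rather than part of the H-Net code.

```python
import re
from collections import Counter

# Layout copied from the README. Reading "m" as Mamba and "T" as Transformer
# follows the bullet list; the parser itself is an illustrative assumption.
ARCH_LAYOUT = ["m4", ["T1m4", ["T22"], "m4T1"], "m4"]


def count_blocks(layout, depth=0, counts=None):
    """Recursively tally blocks per (nesting depth, block type)."""
    if counts is None:
        counts = Counter()
    for item in layout:
        if isinstance(item, list):
            # A nested list is one level deeper in the hierarchy.
            count_blocks(item, depth + 1, counts)
        else:
            # Split e.g. "T1m4" into ("T", "1") and ("m", "4").
            for kind, n in re.findall(r"([A-Za-z])(\d+)", item):
                counts[(depth, kind)] += int(n)
    return counts


counts = count_blocks(ARCH_LAYOUT)
for (depth, kind), n in sorted(counts.items()):
    print(f"depth {depth}: {n} x {kind}")
```

Depth 0 corresponds to the encoder and decoder, depth 1 to the outer core stage, and depth 2 to the innermost stage.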
+ ## Files
+
+ - `checkpoints/checkpoint_bytes_best.pt` - Best checkpoint (lowest validation loss)
+ - `checkpoints/checkpoint_epoch_*.pt` - Epoch checkpoints
+ - `metadata.json` - Training configuration and history
+ - `test_smiles.txt` - Test SMILES used during training
+ - `visualizations/` - Training evolution GIFs and prediction files
+
+ ## Usage
+
+ ```python
+ from pathlib import Path
+
+ import torch
+
+ # Load the best checkpoint onto the CPU
+ checkpoint_path = Path("checkpoints") / "checkpoint_bytes_best.pt"
+ checkpoint = torch.load(checkpoint_path, map_location="cpu")
+
+ # The checkpoint contains:
+ # - 'model_state_dict': Model weights
+ # - 'optimizer_state_dict': Optimizer state
+ # - 'epoch': Training epoch
+ # - 'metrics': Training metrics
+ # - 'cumulative_training_bytes': Total bytes processed
+
+ # Load into your H-Net model
+ # model.load_state_dict(checkpoint['model_state_dict'])
+ ```
+
+ ## Performance
+
+ ### Tokenization Metrics (from paper)
+
+ | Metric | Value |
+ |--------|-------|
+ | Bits-per-byte (BPB) | 0.83 |
+ | Mean token length | 2.6 |
+
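
Bits-per-byte is a per-byte cross-entropy expressed in base 2. A small sketch of the conversion, assuming the `ce_loss` values logged in `metadata.json` are mean per-byte cross-entropies in nats (a common convention, but not stated in this repository); the function name is illustrative:

```python
import math


def nats_to_bits_per_byte(ce_loss_nats: float) -> float:
    """Convert a mean per-byte cross-entropy from nats to bits."""
    return ce_loss_nats / math.log(2)


# ce_loss logged at the end of epoch 1 in metadata.json.
epoch1_ce = 0.5954869130583226
print(f"{nats_to_bits_per_byte(epoch1_ce):.3f} bits per byte")
```

This converts a training-loss value and need not match the 0.83 BPB in the table, which is quoted from the paper.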
+ ### Property Prediction (embeddings)
+
+ H-Net embeddings outperform RDKit descriptors on both classification tasks reported:
+ - BBBP: 0.950 AUC (vs. 0.927 for RDKit)
+ - HIV: 0.788 AUC (vs. 0.760 for RDKit)
+
+ ## Citation
+
+ ```bibtex
+ @inproceedings{hnet_smiles_2026,
+   title={Learning Chemical Grammar: Dynamic Tokenization for SMILES with Hierarchical Networks},
+   author={Anonymous},
+   booktitle={International Conference on Machine Learning (ICML)},
+   year={2026}
+ }
+ ```
+
+ ## Related Models
+
+ All models from the paper are available:
+
+ **Polymer (PI1M) Models:**
+ - [PI1M-68M](https://huggingface.co/jordiferrero/PI1M-68M) - 1 epoch, with concatenation
+ - [PI1M-340M](https://huggingface.co/jordiferrero/PI1M-340M) - 5 epochs, with concatenation
+ - [PI1M-1B](https://huggingface.co/jordiferrero/PI1M-1B) - 22 epochs, with concatenation (best compression)
+ - [PI1M-nocat](https://huggingface.co/jordiferrero/PI1M-nocat) - 5 epochs, no concatenation
+ - [PI1M-2stg](https://huggingface.co/jordiferrero/PI1M-2stg) - 5 epochs, 2-stage architecture
+
+ **Molecular (MOSES) Models:**
+ - [MOSES-340M](https://huggingface.co/jordiferrero/MOSES-340M) - 5 epochs, with concatenation
+ - [MOSES-nocat](https://huggingface.co/jordiferrero/MOSES-nocat) - 5 epochs, no concatenation
+ - [MOSES-2stg](https://huggingface.co/jordiferrero/MOSES-2stg) - 5 epochs, 2-stage architecture
+
+ ## License
+
+ MIT License
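
The `metadata.json` added in this commit logs `loss`, `ce_loss`, and `lb_loss` at every checkpoint. The logged numbers appear consistent with a total loss of `ce_loss + 0.01 * lb_loss`. The 0.01 weight is inferred from the values, not taken from the training code, so treat it as an assumption. A quick check on three records copied from the file:

```python
# (loss, ce_loss, lb_loss) triples copied from training_history in metadata.json.
records = [
    (3.0352404484382043, 3.0252403846153846, 0.9999999889960656),  # 1M-byte checkpoint
    (2.107340772335346, 2.097340745192308, 0.9999999871620765),    # 2M-byte checkpoint
    (0.6054869050538325, 0.5954869130583226, 0.9999999890922734),  # end of epoch 1
]

LB_WEIGHT = 0.01  # assumption: inferred from the logs, not documented

for loss, ce_loss, lb_loss in records:
    reconstructed = ce_loss + LB_WEIGHT * lb_loss
    implied_weight = (loss - ce_loss) / lb_loss
    print(f"residual={loss - reconstructed:+.1e}  implied weight={implied_weight:.6f}")
```

Small residuals are expected if the logged metrics are running averages computed in reduced precision.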
metadata.json ADDED
@@ -0,0 +1,2522 @@
+ {
+ "run_name": "run_large_20260115_191350",
+ "timestamp": "20260115_191350",
+ "phase": "large",
+ "config": {
+ "arch_layout": [
+ "m4",
+ [
+ "T1m4",
+ [
+ "T22"
+ ],
+ "m4T1"
+ ],
+ "m4"
+ ],
+ "d_model": [
+ 1024,
+ 1024,
+ 1536
+ ],
+ "d_intermediate": [
+ 0,
+ 2816,
+ 4096
+ ],
+ "vocab_size": 256,
+ "ssm_cfg": {
+ "chunk_size": 256,
+ "d_conv": 4,
+ "d_state": 128,
+ "expand": 2
+ },
+ "attn_cfg": {
+ "num_heads": [
+ 16,
+ 16,
+ 16
+ ],
+ "rotary_emb_dim": [
+ 32,
+ 32,
+ 48
+ ],
+ "window_size": [
+ 1023,
+ 1023,
+ -1
+ ]
+ },
+ "tie_embeddings": false
+ },
+ "training_args": {
+ "data": "datasets/PI1M/PI1M_v2.csv",
+ "max_samples": null,
+ "batch_size": 16,
+ "epochs": 5,
+ "lr": 0.0001,
+ "weight_decay": 0.1,
+ "gradient_accumulation": 8,
+ "concatenate": true,
+ "num_concatenate": 10,
+ "concatenate_separator": " ",
+ "checkpoint_bytes": 1000000,
+ "num_test_samples": 5,
+ "num_visualize": 5,
+ "skip_visualization": false
+ },
+ "dataset_info": {
+ "train_size": 99574,
+ "test_size": 5,
+ "test_smiles_file": "checkpoints/run_large_20260115_191350/test_smiles.txt"
+ },
+ "model_info": {
+ "num_parameters": 622923776,
+ "device": "cuda",
+ "dtype": "torch.bfloat16",
+ "use_amp": true
+ },
80
+ "training_history": [
81
+ {
82
+ "checkpoint_type": "bytes",
83
+ "bytes_threshold": 1000000,
84
+ "cumulative_training_bytes": 1000166,
85
+ "metrics": {
86
+ "loss": 3.0352404484382043,
87
+ "ce_loss": 3.0252403846153846,
88
+ "lb_loss": 0.9999999889960656
89
+ }
90
+ },
91
+ {
92
+ "checkpoint_type": "bytes",
93
+ "bytes_threshold": 2000000,
94
+ "cumulative_training_bytes": 2000240,
95
+ "metrics": {
96
+ "loss": 2.107340772335346,
97
+ "ce_loss": 2.097340745192308,
98
+ "lb_loss": 0.9999999871620765
99
+ }
100
+ },
101
+ {
102
+ "checkpoint_type": "bytes",
103
+ "bytes_threshold": 3000000,
104
+ "cumulative_training_bytes": 3001794,
105
+ "metrics": {
106
+ "loss": 1.7094185730380476,
107
+ "ce_loss": 1.6994185581841432,
108
+ "lb_loss": 0.9999999873473516
109
+ }
110
+ },
111
+ {
112
+ "checkpoint_type": "bytes",
113
+ "bytes_threshold": 4000000,
114
+ "cumulative_training_bytes": 4002359,
115
+ "metrics": {
116
+ "loss": 1.47650072853762,
117
+ "ce_loss": 1.4665007197696738,
118
+ "lb_loss": 0.9999999890171863
119
+ }
120
+ },
121
+ {
122
+ "checkpoint_type": "bytes",
123
+ "bytes_threshold": 5000000,
124
+ "cumulative_training_bytes": 5005670,
125
+ "metrics": {
126
+ "loss": 1.3171558716545808,
127
+ "ce_loss": 1.3071558665644172,
128
+ "lb_loss": 0.9999999897611653
129
+ }
130
+ },
131
+ {
132
+ "checkpoint_type": "bytes",
133
+ "bytes_threshold": 6000000,
134
+ "cumulative_training_bytes": 6001321,
135
+ "metrics": {
136
+ "loss": 1.2017559169808312,
137
+ "ce_loss": 1.1917559143222507,
138
+ "lb_loss": 0.9999999908535072
139
+ }
140
+ },
141
+ {
142
+ "checkpoint_type": "bytes",
143
+ "bytes_threshold": 7000000,
144
+ "cumulative_training_bytes": 7001673,
145
+ "metrics": {
146
+ "loss": 1.1151093587948484,
147
+ "ce_loss": 1.1051093578860898,
148
+ "lb_loss": 0.9999999904684795
149
+ }
150
+ },
151
+ {
152
+ "checkpoint_type": "bytes",
153
+ "bytes_threshold": 8000000,
154
+ "cumulative_training_bytes": 8004669,
155
+ "metrics": {
156
+ "loss": 1.0468063034773787,
157
+ "ce_loss": 1.0368063038793103,
158
+ "lb_loss": 0.9999999897804297
159
+ }
160
+ },
161
+ {
162
+ "checkpoint_type": "bytes",
163
+ "bytes_threshold": 9000000,
164
+ "cumulative_training_bytes": 9006752,
165
+ "metrics": {
166
+ "loss": 0.9919913549626127,
167
+ "ce_loss": 0.9819913563829787,
168
+ "lb_loss": 0.9999999897023465
169
+ }
170
+ },
171
+ {
172
+ "checkpoint_type": "bytes",
173
+ "bytes_threshold": 10000000,
174
+ "cumulative_training_bytes": 10007281,
175
+ "metrics": {
176
+ "loss": 0.9471440684010387,
177
+ "ce_loss": 0.9371440706355283,
178
+ "lb_loss": 0.9999999893660932
179
+ }
180
+ },
181
+ {
182
+ "checkpoint_type": "bytes",
183
+ "bytes_threshold": 11000000,
184
+ "cumulative_training_bytes": 11001365,
185
+ "metrics": {
186
+ "loss": 0.9100927569407938,
187
+ "ce_loss": 0.900092759836351,
188
+ "lb_loss": 0.999999989540132
189
+ }
190
+ },
191
+ {
192
+ "checkpoint_type": "bytes",
193
+ "bytes_threshold": 12000000,
194
+ "cumulative_training_bytes": 12005386,
195
+ "metrics": {
196
+ "loss": 0.8784949809940438,
197
+ "ce_loss": 0.868494984444799,
198
+ "lb_loss": 0.999999989882045
199
+ }
200
+ },
201
+ {
202
+ "checkpoint_type": "bytes",
203
+ "bytes_threshold": 13000000,
204
+ "cumulative_training_bytes": 13001269,
205
+ "metrics": {
206
+ "loss": 0.8592479796569771,
207
+ "ce_loss": 0.849247983573954,
208
+ "lb_loss": 0.999999989954668
209
+ }
210
+ },
211
+ {
212
+ "checkpoint_type": "bytes",
213
+ "bytes_threshold": 14000000,
214
+ "cumulative_training_bytes": 14005280,
215
+ "metrics": {
216
+ "loss": 0.8378439935604906,
217
+ "ce_loss": 0.8278439978801969,
218
+ "lb_loss": 0.9999999899245978
219
+ }
220
+ },
221
+ {
222
+ "checkpoint_type": "bytes",
223
+ "bytes_threshold": 15000000,
224
+ "cumulative_training_bytes": 15001797,
225
+ "metrics": {
226
+ "loss": 0.8179623213681307,
227
+ "ce_loss": 0.8079623260342186,
228
+ "lb_loss": 0.9999999895889742
229
+ }
230
+ },
231
+ {
232
+ "checkpoint_type": "bytes",
233
+ "bytes_threshold": 16000000,
234
+ "cumulative_training_bytes": 16003308,
235
+ "metrics": {
236
+ "loss": 0.7999628585397256,
237
+ "ce_loss": 0.7899628635112494,
238
+ "lb_loss": 0.999999989471463
239
+ }
240
+ },
241
+ {
242
+ "checkpoint_type": "bytes",
243
+ "bytes_threshold": 17000000,
244
+ "cumulative_training_bytes": 17001780,
245
+ "metrics": {
246
+ "loss": 0.783798369592028,
247
+ "ce_loss": 0.773798374831005,
248
+ "lb_loss": 0.9999999887720858
249
+ }
250
+ },
251
+ {
252
+ "checkpoint_type": "bytes",
253
+ "bytes_threshold": 18000000,
254
+ "cumulative_training_bytes": 18002585,
255
+ "metrics": {
256
+ "loss": 0.7691971354788922,
257
+ "ce_loss": 0.7591971409574468,
258
+ "lb_loss": 0.9999999888399814
259
+ }
260
+ },
261
+ {
262
+ "checkpoint_type": "bytes",
263
+ "bytes_threshold": 19000000,
264
+ "cumulative_training_bytes": 19004388,
265
+ "metrics": {
266
+ "loss": 0.7562685100266358,
267
+ "ce_loss": 0.746268515719468,
268
+ "lb_loss": 0.9999999887325359
269
+ }
270
+ },
271
+ {
272
+ "checkpoint_type": "bytes",
273
+ "bytes_threshold": 20000000,
274
+ "cumulative_training_bytes": 20001795,
275
+ "metrics": {
276
+ "loss": 0.7443181650561906,
277
+ "ce_loss": 0.7343181709418071,
278
+ "lb_loss": 0.9999999887043265
279
+ }
280
+ },
281
+ {
282
+ "checkpoint_type": "bytes",
283
+ "bytes_threshold": 21000000,
284
+ "cumulative_training_bytes": 21006219,
285
+ "metrics": {
286
+ "loss": 0.7334088699425653,
287
+ "ce_loss": 0.723408876002552,
288
+ "lb_loss": 0.9999999888743791
289
+ }
290
+ },
291
+ {
292
+ "checkpoint_type": "bytes",
293
+ "bytes_threshold": 22000000,
294
+ "cumulative_training_bytes": 22003647,
295
+ "metrics": {
296
+ "loss": 0.7233542565306926,
297
+ "ce_loss": 0.7133542627479986,
298
+ "lb_loss": 0.9999999891080966
299
+ }
300
+ },
301
+ {
302
+ "checkpoint_type": "bytes",
303
+ "bytes_threshold": 23000000,
304
+ "cumulative_training_bytes": 23000855,
305
+ "metrics": {
306
+ "loss": 0.7141935865044633,
307
+ "ce_loss": 0.7041935928654679,
308
+ "lb_loss": 0.9999999891627919
309
+ }
310
+ },
311
+ {
312
+ "checkpoint_type": "bytes",
313
+ "bytes_threshold": 24000000,
314
+ "cumulative_training_bytes": 24007583,
315
+ "metrics": {
316
+ "loss": 0.7056202586567953,
317
+ "ce_loss": 0.6956202651515152,
318
+ "lb_loss": 0.9999999891818045
319
+ }
320
+ },
321
+ {
322
+ "checkpoint_type": "bytes",
323
+ "bytes_threshold": 25000000,
324
+ "cumulative_training_bytes": 25004319,
325
+ "metrics": {
326
+ "loss": 0.6978230217149393,
327
+ "ce_loss": 0.687823028330781,
328
+ "lb_loss": 0.9999999895577774
329
+ }
330
+ },
331
+ {
332
+ "checkpoint_type": "bytes",
333
+ "bytes_threshold": 26000000,
334
+ "cumulative_training_bytes": 26000600,
335
+ "metrics": {
336
+ "loss": 0.6906206210337261,
337
+ "ce_loss": 0.6806206277614139,
338
+ "lb_loss": 0.9999999897293911
339
+ }
340
+ },
341
+ {
342
+ "checkpoint_type": "bytes",
343
+ "bytes_threshold": 27000000,
344
+ "cumulative_training_bytes": 27007515,
345
+ "metrics": {
346
+ "loss": 0.6838098439610576,
347
+ "ce_loss": 0.6738098507938758,
348
+ "lb_loss": 0.9999999897926835
349
+ }
350
+ },
351
+ {
352
+ "checkpoint_type": "bytes",
353
+ "bytes_threshold": 28000000,
354
+ "cumulative_training_bytes": 28003023,
355
+ "metrics": {
356
+ "loss": 0.6774992880874688,
357
+ "ce_loss": 0.6674992950164069,
358
+ "lb_loss": 0.9999999895687797
359
+ }
360
+ },
361
+ {
362
+ "checkpoint_type": "bytes",
363
+ "bytes_threshold": 29000000,
364
+ "cumulative_training_bytes": 29003935,
365
+ "metrics": {
366
+ "loss": 0.6715684946638226,
367
+ "ce_loss": 0.6615685016829461,
368
+ "lb_loss": 0.9999999895046732
369
+ }
370
+ },
371
+ {
372
+ "checkpoint_type": "bytes",
373
+ "bytes_threshold": 30000000,
374
+ "cumulative_training_bytes": 30001066,
375
+ "metrics": {
376
+ "loss": 0.6660281601701846,
377
+ "ce_loss": 0.6560281672728433,
378
+ "lb_loss": 0.9999999894573715
379
+ }
380
+ },
381
+ {
382
+ "checkpoint_type": "bytes",
383
+ "bytes_threshold": 31000000,
384
+ "cumulative_training_bytes": 31004436,
385
+ "metrics": {
386
+ "loss": 0.6609612641201458,
387
+ "ce_loss": 0.6509612713015559,
388
+ "lb_loss": 0.9999999894746058
389
+ }
390
+ },
391
+ {
392
+ "checkpoint_type": "bytes",
393
+ "bytes_threshold": 32000000,
394
+ "cumulative_training_bytes": 32006649,
395
+ "metrics": {
396
+ "loss": 0.6561554203763533,
397
+ "ce_loss": 0.646155427631579,
398
+ "lb_loss": 0.9999999895050194
399
+ }
400
+ },
401
+ {
402
+ "checkpoint_type": "bytes",
403
+ "bytes_threshold": 33000000,
404
+ "cumulative_training_bytes": 33004203,
405
+ "metrics": {
406
+ "loss": 0.6516305574961438,
407
+ "ce_loss": 0.6416305648201857,
408
+ "lb_loss": 0.9999999895588151
409
+ }
410
+ },
411
+ {
412
+ "checkpoint_type": "bytes",
413
+ "bytes_threshold": 34000000,
414
+ "cumulative_training_bytes": 34006104,
415
+ "metrics": {
416
+ "loss": 0.6472530922785559,
417
+ "ce_loss": 0.6372530996678676,
418
+ "lb_loss": 0.9999999896520646
419
+ }
420
+ },
421
+ {
422
+ "checkpoint_type": "bytes",
423
+ "bytes_threshold": 35000000,
424
+ "cumulative_training_bytes": 35005618,
425
+ "metrics": {
426
+ "loss": 0.6431124474281974,
427
+ "ce_loss": 0.6331124548785824,
428
+ "lb_loss": 0.9999999896725271
429
+ }
430
+ },
431
+ {
432
+ "checkpoint_type": "bytes",
433
+ "bytes_threshold": 36000000,
434
+ "cumulative_training_bytes": 36002823,
435
+ "metrics": {
436
+ "loss": 0.6391829455870056,
437
+ "ce_loss": 0.6291829530950862,
438
+ "lb_loss": 0.9999999896918579
439
+ }
440
+ },
441
+ {
442
+ "checkpoint_type": "bytes",
443
+ "bytes_threshold": 37000000,
444
+ "cumulative_training_bytes": 37006427,
445
+ "metrics": {
446
+ "loss": 0.6354130913090232,
447
+ "ce_loss": 0.6254130988721026,
448
+ "lb_loss": 0.9999999896752716
449
+ }
450
+ },
451
+ {
452
+ "checkpoint_type": "bytes",
453
+ "bytes_threshold": 38000000,
454
+ "cumulative_training_bytes": 38005922,
455
+ "metrics": {
456
+ "loss": 0.6318843585695924,
457
+ "ce_loss": 0.6218843661847673,
458
+ "lb_loss": 0.9999999897196099
459
+ }
460
+ },
461
+ {
462
+ "checkpoint_type": "bytes",
463
+ "bytes_threshold": 39000000,
464
+ "cumulative_training_bytes": 39004443,
465
+ "metrics": {
466
+ "loss": 0.6285198655931632,
467
+ "ce_loss": 0.6185198732577543,
468
+ "lb_loss": 0.9999999895276488
469
+ }
470
+ },
471
+ {
472
+ "checkpoint_type": "bytes",
473
+ "bytes_threshold": 40000000,
474
+ "cumulative_training_bytes": 40005613,
475
+ "metrics": {
476
+ "loss": 0.6254313996155814,
477
+ "ce_loss": 0.615431407326761,
478
+ "lb_loss": 0.9999999897083863
479
+ }
480
+ },
481
+ {
482
+ "checkpoint_type": "bytes",
483
+ "bytes_threshold": 41000000,
484
+ "cumulative_training_bytes": 41003596,
485
+ "metrics": {
486
+ "loss": 0.6224746753085582,
487
+ "ce_loss": 0.6124746830640643,
488
+ "lb_loss": 0.9999999896242941
489
+ }
490
+ },
491
+ {
492
+ "checkpoint_type": "bytes",
493
+ "bytes_threshold": 42000000,
494
+ "cumulative_training_bytes": 42004130,
495
+ "metrics": {
496
+ "loss": 0.619576180100767,
497
+ "ce_loss": 0.609576187898815,
498
+ "lb_loss": 0.9999999894482935
499
+ }
500
+ },
501
+ {
502
+ "checkpoint_type": "bytes",
503
+ "bytes_threshold": 43000000,
504
+ "cumulative_training_bytes": 43002856,
505
+ "metrics": {
506
+ "loss": 0.6168661168497852,
507
+ "ce_loss": 0.6068661246883903,
508
+ "lb_loss": 0.9999999894715442
509
+ }
510
+ },
511
+ {
512
+ "checkpoint_type": "bytes",
513
+ "bytes_threshold": 44000000,
514
+ "cumulative_training_bytes": 44000615,
515
+ "metrics": {
516
+ "loss": 0.6142508432585481,
517
+ "ce_loss": 0.6042508511355725,
518
+ "lb_loss": 0.9999999894192938
519
+ }
520
+ },
521
+ {
522
+ "checkpoint_type": "bytes",
523
+ "bytes_threshold": 45000000,
524
+ "cumulative_training_bytes": 45002728,
525
+ "metrics": {
526
+ "loss": 0.6117183565789184,
527
+ "ce_loss": 0.6017183644929386,
528
+ "lb_loss": 0.9999999893305962
529
+ }
530
+ },
531
+ {
532
+ "checkpoint_type": "bytes",
533
+ "bytes_threshold": 46000000,
534
+ "cumulative_training_bytes": 46000713,
535
+ "metrics": {
536
+ "loss": 0.6093004826243594,
537
+ "ce_loss": 0.5993004905734975,
538
+ "lb_loss": 0.9999999892538988
539
+ }
540
+ },
541
+ {
542
+ "checkpoint_type": "bytes",
543
+ "bytes_threshold": 47000000,
544
+ "cumulative_training_bytes": 47001586,
545
+ "metrics": {
546
+ "loss": 0.6069603338424916,
547
+ "ce_loss": 0.5969603418255132,
548
+ "lb_loss": 0.999999989075395
549
+ }
550
+ },
551
+ {
552
+ "epoch": 1,
553
+ "checkpoint_type": "epoch",
554
+ "metrics": {
555
+ "loss": 0.6054869050538325,
556
+ "ce_loss": 0.5954869130583226,
557
+ "lb_loss": 0.9999999890922734,
558
+ "training_bytes": 47653409
559
+ },
560
+ "cumulative_training_bytes": 47653409,
561
+ "training_bytes_this_epoch": 47653409
562
+ },
563
+ {
564
+ "checkpoint_type": "bytes",
565
+ "bytes_threshold": 48000000,
566
+ "cumulative_training_bytes": 48006676,
567
+ "metrics": {
568
+ "loss": 0.49496941981108294,
569
+ "ce_loss": 0.4849694293478261,
570
+ "lb_loss": 0.9999999935212343
571
+ }
572
+ },
573
+ {
574
+ "checkpoint_type": "bytes",
575
+ "bytes_threshold": 49000000,
576
+ "cumulative_training_bytes": 49000759,
577
+ "metrics": {
578
+ "loss": 0.49630592086098413,
579
+ "ce_loss": 0.4863059303977273,
580
+ "lb_loss": 0.9999999932267449
581
+ }
582
+ },
583
+ {
584
+ "checkpoint_type": "bytes",
585
+ "bytes_threshold": 50000000,
586
+ "cumulative_training_bytes": 50005240,
587
+ "metrics": {
588
+ "loss": 0.4959718451049506,
589
+ "ce_loss": 0.4859718546416938,
590
+ "lb_loss": 0.9999999914573148
591
+ }
592
+ },
593
+ {
594
+ "checkpoint_type": "bytes",
595
+ "bytes_threshold": 51000000,
596
+ "cumulative_training_bytes": 51007539,
597
+ "metrics": {
598
+ "loss": 0.49752317824864495,
599
+ "ce_loss": 0.4875231877853881,
600
+ "lb_loss": 0.9999999910184781
601
+ }
602
+ },
603
+ {
604
+ "checkpoint_type": "bytes",
605
+ "bytes_threshold": 52000000,
606
+ "cumulative_training_bytes": 52002554,
607
+ "metrics": {
608
+ "loss": 0.4988107849174822,
609
+ "ce_loss": 0.4888107944542254,
610
+ "lb_loss": 0.9999999891914112
611
+ }
612
+ },
613
+ {
614
+ "checkpoint_type": "bytes",
615
+ "bytes_threshold": 53000000,
616
+ "cumulative_training_bytes": 53005306,
617
+ "metrics": {
618
+ "loss": 0.49884286868214095,
619
+ "ce_loss": 0.4888428782188841,
620
+ "lb_loss": 0.9999999886589159
621
+ }
622
+ },
623
+ {
624
+ "checkpoint_type": "bytes",
625
+ "bytes_threshold": 54000000,
626
+ "cumulative_training_bytes": 54000123,
627
+ "metrics": {
628
+ "loss": 0.49843673654287085,
629
+ "ce_loss": 0.488436746079614,
630
+ "lb_loss": 0.9999999882803895
631
+ }
632
+ },
633
+ {
634
+ "checkpoint_type": "bytes",
635
+ "bytes_threshold": 55000000,
636
+ "cumulative_training_bytes": 55003152,
637
+ "metrics": {
638
+ "loss": 0.4980025132497152,
639
+ "ce_loss": 0.48800252278645834,
640
+ "lb_loss": 0.9999999890724818
641
+ }
642
+ },
643
+ {
644
+ "checkpoint_type": "bytes",
645
+ "bytes_threshold": 56000000,
646
+ "cumulative_training_bytes": 56002937,
647
+ "metrics": {
648
+ "loss": 0.4978086235979956,
649
+ "ce_loss": 0.48780863313473877,
650
+ "lb_loss": 0.9999999890733924
651
+ }
652
+ },
653
+ {
654
+ "checkpoint_type": "bytes",
655
+ "bytes_threshold": 57000000,
656
+ "cumulative_training_bytes": 57004703,
657
+ "metrics": {
658
+ "loss": 0.4975252436342879,
659
+ "ce_loss": 0.48752525317103107,
660
+ "lb_loss": 0.9999999889765551
661
+ }
662
+ },
663
+ {
664
+ "checkpoint_type": "bytes",
665
+ "bytes_threshold": 58000000,
666
+ "cumulative_training_bytes": 58002959,
667
+ "metrics": {
668
+ "loss": 0.49715732681680713,
669
+ "ce_loss": 0.4871573363535503,
670
+ "lb_loss": 0.9999999886698271
671
+ }
672
+ },
673
+ {
674
+ "checkpoint_type": "bytes",
675
+ "bytes_threshold": 59000000,
676
+ "cumulative_training_bytes": 59000108,
677
+ "metrics": {
678
+ "loss": 0.4970432515893526,
679
+ "ce_loss": 0.48704326112609575,
680
+ "lb_loss": 0.9999999883443378
681
+ }
682
+ },
683
+ {
684
+ "checkpoint_type": "bytes",
685
+ "bytes_threshold": 60000000,
686
+ "cumulative_training_bytes": 60007478,
687
+ "metrics": {
688
+ "loss": 0.4969303793951454,
689
+ "ce_loss": 0.48693038893188856,
690
+ "lb_loss": 0.9999999884481401
691
+ }
692
+ },
693
+ {
694
+ "checkpoint_type": "bytes",
695
+ "bytes_threshold": 61000000,
696
+ "cumulative_training_bytes": 61002660,
697
+ "metrics": {
698
+ "loss": 0.49673105242600757,
699
+ "ce_loss": 0.48673106196275073,
700
+ "lb_loss": 0.9999999883864875
701
+ }
702
+ },
703
+ {
704
+ "checkpoint_type": "bytes",
705
+ "bytes_threshold": 62000000,
706
+ "cumulative_training_bytes": 62003465,
707
+ "metrics": {
708
+ "loss": 0.49654987219300095,
709
+ "ce_loss": 0.4865498817297441,
710
+ "lb_loss": 0.9999999883713753
711
+ }
712
+ },
713
+ {
714
+ "checkpoint_type": "bytes",
715
+ "bytes_threshold": 63000000,
716
+ "cumulative_training_bytes": 63000868,
717
+ "metrics": {
718
+ "loss": 0.4964099013555799,
719
+ "ce_loss": 0.48640991089232305,
720
+ "lb_loss": 0.9999999887089905
721
+ }
722
+ },
723
+ {
724
+ "checkpoint_type": "bytes",
725
+ "bytes_threshold": 64000000,
726
+ "cumulative_training_bytes": 64003546,
727
+ "metrics": {
728
+ "loss": 0.49635096437528303,
729
+ "ce_loss": 0.4863509739120262,
730
+ "lb_loss": 0.9999999889827633
731
+ }
732
+ },
733
+ {
734
+ "checkpoint_type": "bytes",
735
+ "bytes_threshold": 65000000,
736
+ "cumulative_training_bytes": 65001846,
737
+ "metrics": {
738
+ "loss": 0.4962221452934289,
739
+ "ce_loss": 0.48622215483017206,
740
+ "lb_loss": 0.9999999886680185
741
+ }
742
+ },
743
+ {
744
+ "checkpoint_type": "bytes",
745
+ "bytes_threshold": 66000000,
746
+ "cumulative_training_bytes": 66004938,
747
+ "metrics": {
748
+ "loss": 0.4961587034532485,
749
+ "ce_loss": 0.48615871298999164,
750
+ "lb_loss": 0.9999999882679765
751
+ }
752
+ },
753
+ {
754
+ "checkpoint_type": "bytes",
755
+ "bytes_threshold": 67000000,
756
+ "cumulative_training_bytes": 67000216,
757
+ "metrics": {
758
+ "loss": 0.49601907669743406,
759
+ "ce_loss": 0.4860190862341772,
760
+ "lb_loss": 0.9999999884704623
761
+ }
762
+ },
763
+ {
764
+ "checkpoint_type": "bytes",
765
+ "bytes_threshold": 68000000,
766
+ "cumulative_training_bytes": 68000224,
767
+ "metrics": {
768
+ "loss": 0.4964207015242049,
769
+ "ce_loss": 0.4864207110609481,
770
+ "lb_loss": 0.9999999881822244
771
+ }
772
+ },
773
+ {
774
+ "checkpoint_type": "bytes",
775
+ "bytes_threshold": 69000000,
776
+ "cumulative_training_bytes": 69005372,
777
+ "metrics": {
778
+ "loss": 0.49684213258408866,
779
+ "ce_loss": 0.4868421421208318,
780
+ "lb_loss": 0.9999999881602821
781
+ }
782
+ },
783
+ {
784
+ "checkpoint_type": "bytes",
785
+ "bytes_threshold": 70000000,
786
+ "cumulative_training_bytes": 70001864,
787
+ "metrics": {
788
+ "loss": 0.497037369488608,
789
+ "ce_loss": 0.48703737902535116,
790
+ "lb_loss": 0.9999999881770848
791
+ }
792
+ },
793
+ {
794
+ "checkpoint_type": "bytes",
795
+ "bytes_threshold": 71000000,
796
+ "cumulative_training_bytes": 71000907,
797
+ "metrics": {
798
+ "loss": 0.49706029712117744,
799
+ "ce_loss": 0.4870603066579206,
800
+ "lb_loss": 0.9999999880360634
801
+ }
802
+ },
803
+ {
804
+ "checkpoint_type": "bytes",
805
+ "bytes_threshold": 72000000,
806
+ "cumulative_training_bytes": 72005398,
807
+ "metrics": {
808
+ "loss": 0.49712042088778513,
809
+ "ce_loss": 0.4871204304245283,
810
+ "lb_loss": 0.9999999880790711
811
+ }
812
+ },
813
+ {
814
+ "checkpoint_type": "bytes",
815
+ "bytes_threshold": 73000000,
816
+ "cumulative_training_bytes": 73003962,
817
+ "metrics": {
818
+ "loss": 0.49715716096929913,
819
+ "ce_loss": 0.4871571705060423,
820
+ "lb_loss": 0.9999999879890338
821
+ }
822
+ },
823
+ {
824
+ "checkpoint_type": "bytes",
825
+ "bytes_threshold": 74000000,
826
+ "cumulative_training_bytes": 74006324,
827
+ "metrics": {
828
+ "loss": 0.4971806565123705,
829
+ "ce_loss": 0.48718066604911364,
830
+ "lb_loss": 0.9999999879612822
831
+ }
832
+ },
833
+ {
834
+ "checkpoint_type": "bytes",
835
+ "bytes_threshold": 75000000,
836
+ "cumulative_training_bytes": 75002178,
837
+ "metrics": {
838
+ "loss": 0.4972360369138309,
839
+ "ce_loss": 0.48723604645057406,
840
+ "lb_loss": 0.999999987898805
841
+ }
842
+ },
843
+ {
844
+ "checkpoint_type": "bytes",
845
+ "bytes_threshold": 76000000,
846
+ "cumulative_training_bytes": 76006119,
847
+ "metrics": {
848
+ "loss": 0.49723345379388895,
849
+ "ce_loss": 0.4872334633306321,
850
+ "lb_loss": 0.9999999879728066
851
+ }
852
+ },
853
+ {
854
+ "checkpoint_type": "bytes",
855
+ "bytes_threshold": 77000000,
856
+ "cumulative_training_bytes": 77005284,
857
+ "metrics": {
858
+ "loss": 0.4972499007815454,
859
+ "ce_loss": 0.48724991031828857,
860
+ "lb_loss": 0.9999999881039516
861
+ }
862
+ },
863
+ {
864
+ "checkpoint_type": "bytes",
865
+ "bytes_threshold": 78000000,
866
+ "cumulative_training_bytes": 78007177,
867
+ "metrics": {
868
+ "loss": 0.4972263361683527,
869
+ "ce_loss": 0.4872263457050959,
870
+ "lb_loss": 0.9999999881362097
871
+ }
872
+ },
873
+ {
874
+ "checkpoint_type": "bytes",
875
+ "bytes_threshold": 79000000,
876
+ "cumulative_training_bytes": 79001491,
877
+ "metrics": {
878
+ "loss": 0.4971963830499691,
879
+ "ce_loss": 0.48719639258671227,
880
+ "lb_loss": 0.9999999881780725
881
+ }
882
+ },
883
+ {
884
+ "checkpoint_type": "bytes",
885
+ "bytes_threshold": 80000000,
886
+ "cumulative_training_bytes": 80002957,
887
+ "metrics": {
888
+ "loss": 0.49715744238633375,
889
+ "ce_loss": 0.4871574519230769,
890
+ "lb_loss": 0.9999999881778243
891
+ }
892
+ },
893
+ {
894
+ "checkpoint_type": "bytes",
895
+ "bytes_threshold": 81000000,
896
+ "cumulative_training_bytes": 81002131,
897
+ "metrics": {
898
+ "loss": 0.4970846991314543,
899
+ "ce_loss": 0.4870847086681975,
900
+ "lb_loss": 0.9999999881201305
901
+ }
902
+ },
903
+ {
904
+ "checkpoint_type": "bytes",
905
+ "bytes_threshold": 82000000,
906
+ "cumulative_training_bytes": 82000379,
907
+ "metrics": {
908
+ "loss": 0.497049108552869,
909
+ "ce_loss": 0.48704911808961215,
910
+ "lb_loss": 0.9999999881481625
911
+ }
912
+ },
913
+ {
914
+ "checkpoint_type": "bytes",
915
+ "bytes_threshold": 83000000,
916
+ "cumulative_training_bytes": 83002326,
917
+ "metrics": {
918
+ "loss": 0.49690102084670573,
919
+ "ce_loss": 0.4869010303834489,
920
+ "lb_loss": 0.9999999881849545
921
+ }
922
+ },
923
+ {
924
+ "checkpoint_type": "bytes",
925
+ "bytes_threshold": 84000000,
926
+ "cumulative_training_bytes": 84004823,
927
+ "metrics": {
928
+ "loss": 0.4968436548828903,
929
+ "ce_loss": 0.48684366441963345,
930
+ "lb_loss": 0.9999999882473252
931
+ }
932
+ },
933
+ {
934
+ "checkpoint_type": "bytes",
935
+ "bytes_threshold": 85000000,
936
+ "cumulative_training_bytes": 85001132,
937
+ "metrics": {
938
+ "loss": 0.496751819840948,
939
+ "ce_loss": 0.4867518293776912,
940
+ "lb_loss": 0.9999999883161697
941
+ }
942
+ },
943
+ {
944
+ "checkpoint_type": "bytes",
945
+ "bytes_threshold": 86000000,
946
+ "cumulative_training_bytes": 86000628,
947
+ "metrics": {
948
+ "loss": 0.4967399565175699,
949
+ "ce_loss": 0.4867399660543131,
950
+ "lb_loss": 0.9999999883718574
951
+ }
952
+ },
953
+ {
954
+ "checkpoint_type": "bytes",
955
+ "bytes_threshold": 87000000,
956
+ "cumulative_training_bytes": 87000672,
957
+ "metrics": {
958
+ "loss": 0.49681193101589355,
959
+ "ce_loss": 0.4868119405526367,
960
+ "lb_loss": 0.9999999883783122
961
+ }
962
+ },
963
+ {
964
+ "checkpoint_type": "bytes",
965
+ "bytes_threshold": 88000000,
966
+ "cumulative_training_bytes": 88002075,
967
+ "metrics": {
968
+ "loss": 0.49670176321425324,
969
+ "ce_loss": 0.4867017727509964,
970
+ "lb_loss": 0.9999999882917427
971
+ }
972
+ },
973
+ {
974
+ "checkpoint_type": "bytes",
975
+ "bytes_threshold": 89000000,
976
+ "cumulative_training_bytes": 89004728,
977
+ "metrics": {
978
+ "loss": 0.49663121152807166,
979
+ "ce_loss": 0.4866312210648148,
980
+ "lb_loss": 0.9999999883770943
981
+ }
982
+ },
983
+ {
984
+ "checkpoint_type": "bytes",
985
+ "bytes_threshold": 90000000,
986
+ "cumulative_training_bytes": 90003725,
987
+ "metrics": {
988
+ "loss": 0.49656294246108723,
989
+ "ce_loss": 0.4865629519978304,
990
+ "lb_loss": 0.999999988555391
991
+ }
992
+ },
993
+ {
994
+ "checkpoint_type": "bytes",
995
+ "bytes_threshold": 91000000,
996
+ "cumulative_training_bytes": 91002611,
997
+ "metrics": {
998
+ "loss": 0.4965044176845958,
999
+ "ce_loss": 0.48650442722133896,
1000
+ "lb_loss": 0.9999999886813296
1001
+ }
1002
+ },
1003
+ {
1004
+ "checkpoint_type": "bytes",
1005
+ "bytes_threshold": 92000000,
1006
+ "cumulative_training_bytes": 92003164,
1007
+ "metrics": {
1008
+ "loss": 0.4964984069213024,
1009
+ "ce_loss": 0.4864984164580456,
1010
+ "lb_loss": 0.9999999888961651
1011
+ }
1012
+ },
1013
+ {
1014
+ "checkpoint_type": "bytes",
1015
+ "bytes_threshold": 93000000,
1016
+ "cumulative_training_bytes": 93001402,
1017
+ "metrics": {
1018
+ "loss": 0.49645113397473944,
1019
+ "ce_loss": 0.4864511435114826,
1020
+ "lb_loss": 0.999999989119787
1021
+ }
1022
+ },
1023
+ {
1024
+ "checkpoint_type": "bytes",
1025
+ "bytes_threshold": 94000000,
1026
+ "cumulative_training_bytes": 94007638,
1027
+ "metrics": {
1028
+ "loss": 0.4963942520052126,
1029
+ "ce_loss": 0.48639426154195575,
1030
+ "lb_loss": 0.9999999891207247
1031
+ }
1032
+ },
1033
+ {
1034
+ "checkpoint_type": "bytes",
1035
+ "bytes_threshold": 95000000,
1036
+ "cumulative_training_bytes": 95004271,
1037
+ "metrics": {
1038
+ "loss": 0.4963107445261611,
1039
+ "ce_loss": 0.48631075406290425,
1040
+ "lb_loss": 0.9999999891373812
1041
+ }
1042
+ },
1043
+ {
1044
+ "epoch": 2,
1045
+ "checkpoint_type": "epoch",
1046
+ "metrics": {
1047
+ "loss": 0.4962876345627106,
1048
+ "ce_loss": 0.48628764409945374,
1049
+ "lb_loss": 0.999999989168886,
1050
+ "training_bytes": 47653416
1051
+ },
1052
+ "cumulative_training_bytes": 95306825,
1053
+ "training_bytes_this_epoch": 47653416
1054
+ },
1055
+ {
1056
+ "checkpoint_type": "bytes",
1057
+ "bytes_threshold": 96000000,
1058
+ "cumulative_training_bytes": 96003218,
1059
+ "metrics": {
1060
+ "loss": 0.49025411134237773,
1061
+ "ce_loss": 0.4802541208791209,
1062
+ "lb_loss": 0.9999999908300546
1063
+ }
1064
+ },
1065
+ {
1066
+ "checkpoint_type": "bytes",
1067
+ "bytes_threshold": 97000000,
1068
+ "cumulative_training_bytes": 97000816,
1069
+ "metrics": {
1070
+ "loss": 0.4910255136533021,
1071
+ "ce_loss": 0.48102552319004527,
1072
+ "lb_loss": 0.9999999905603504
1073
+ }
1074
+ },
1075
+ {
1076
+ "checkpoint_type": "bytes",
1077
+ "bytes_threshold": 98000000,
1078
+ "cumulative_training_bytes": 98005358,
1079
+ "metrics": {
1080
+ "loss": 0.49233333855107553,
1081
+ "ce_loss": 0.4823333480878187,
1082
+ "lb_loss": 0.9999999910508607
1083
+ }
1084
+ },
1085
+ {
1086
+ "checkpoint_type": "bytes",
1087
+ "bytes_threshold": 99000000,
1088
+ "cumulative_training_bytes": 99000141,
1089
+ "metrics": {
1090
+ "loss": 0.4918436110636709,
1091
+ "ce_loss": 0.4818436206004141,
1092
+ "lb_loss": 0.999999992102076
1093
+ }
1094
+ },
1095
+ {
1096
+ "checkpoint_type": "bytes",
1097
+ "bytes_threshold": 100000000,
1098
+ "cumulative_training_bytes": 100005926,
1099
+ "metrics": {
1100
+ "loss": 0.4912067290626054,
1101
+ "ce_loss": 0.48120673859934854,
1102
+ "lb_loss": 0.9999999912631629
1103
+ }
1104
+ },
1105
+ {
1106
+ "checkpoint_type": "bytes",
1107
+ "bytes_threshold": 101000000,
1108
+ "cumulative_training_bytes": 101001458,
1109
+ "metrics": {
1110
+ "loss": 0.4909990244014289,
1111
+ "ce_loss": 0.48099903393817206,
1112
+ "lb_loss": 0.999999990947144
1113
+ }
1114
+ },
1115
+ {
1116
+ "checkpoint_type": "bytes",
1117
+ "bytes_threshold": 102000000,
1118
+ "cumulative_training_bytes": 102004630,
1119
+ "metrics": {
1120
+ "loss": 0.49028549532595705,
1121
+ "ce_loss": 0.4802855048627002,
1122
+ "lb_loss": 0.9999999912707156
1123
+ }
1124
+ },
1125
+ {
1126
+ "checkpoint_type": "bytes",
1127
+ "bytes_threshold": 103000000,
1128
+ "cumulative_training_bytes": 103004382,
1129
+ "metrics": {
1130
+ "loss": 0.490558137229426,
1131
+ "ce_loss": 0.48055814676616915,
1132
+ "lb_loss": 0.99999999092586
1133
+ }
1134
+ },
1135
+ {
1136
+ "checkpoint_type": "bytes",
1137
+ "bytes_threshold": 104000000,
1138
+ "cumulative_training_bytes": 104002283,
1139
+ "metrics": {
1140
+ "loss": 0.49042572008880747,
1141
+ "ce_loss": 0.48042572962555063,
1142
+ "lb_loss": 0.9999999908623717
1143
+ }
1144
+ },
1145
+ {
1146
+ "checkpoint_type": "bytes",
1147
+ "bytes_threshold": 105000000,
1148
+ "cumulative_training_bytes": 105006513,
1149
+ "metrics": {
1150
+ "loss": 0.49059360480816605,
1151
+ "ce_loss": 0.4805936143449092,
1152
+ "lb_loss": 0.9999999903559967
1153
+ }
1154
+ },
1155
+ {
1156
+ "checkpoint_type": "bytes",
1157
+ "bytes_threshold": 106000000,
1158
+ "cumulative_training_bytes": 106006613,
1159
+ "metrics": {
1160
+ "loss": 0.4903415147116462,
1161
+ "ce_loss": 0.4803415242483894,
1162
+ "lb_loss": 0.9999999906561079
1163
+ }
1164
+ },
1165
+ {
1166
+ "checkpoint_type": "bytes",
1167
+ "bytes_threshold": 107000000,
1168
+ "cumulative_training_bytes": 107005607,
1169
+ "metrics": {
1170
+ "loss": 0.4906465298378475,
1171
+ "ce_loss": 0.4806465393745907,
1172
+ "lb_loss": 0.9999999903976801
1173
+ }
1174
+ },
1175
+ {
1176
+ "checkpoint_type": "bytes",
1177
+ "bytes_threshold": 108000000,
1178
+ "cumulative_training_bytes": 108001197,
1179
+ "metrics": {
1180
+ "loss": 0.4906608704421343,
1181
+ "ce_loss": 0.48066087997887746,
1182
+ "lb_loss": 0.9999999902877164
1183
+ }
1184
+ },
1185
+ {
1186
+ "checkpoint_type": "bytes",
1187
+ "bytes_threshold": 109000000,
1188
+ "cumulative_training_bytes": 109001691,
1189
+ "metrics": {
1190
+ "loss": 0.49069485728372664,
1191
+ "ce_loss": 0.4806948668204698,
1192
+ "lb_loss": 0.9999999900325566
1193
+ }
1194
+ },
1195
+ {
1196
+ "checkpoint_type": "bytes",
1197
+ "bytes_threshold": 110000000,
1198
+ "cumulative_training_bytes": 110007304,
1199
+ "metrics": {
1200
+ "loss": 0.4906437990875403,
1201
+ "ce_loss": 0.48064380862428346,
1202
+ "lb_loss": 0.9999999899985953
1203
+ }
1204
+ },
1205
+ {
1206
+ "checkpoint_type": "bytes",
1207
+ "bytes_threshold": 111000000,
1208
+ "cumulative_training_bytes": 111006246,
1209
+ "metrics": {
1210
+ "loss": 0.49070311546325684,
1211
+ "ce_loss": 0.480703125,
1212
+ "lb_loss": 0.9999999900562008
1213
+ }
1214
+ },
1215
+ {
1216
+ "checkpoint_type": "bytes",
1217
+ "bytes_threshold": 112000000,
1218
+ "cumulative_training_bytes": 112006808,
1219
+ "metrics": {
1220
+ "loss": 0.4907320227878786,
1221
+ "ce_loss": 0.48073203232462175,
1222
+ "lb_loss": 0.9999999894783181
1223
+ }
1224
+ },
1225
+ {
1226
+ "checkpoint_type": "bytes",
1227
+ "bytes_threshold": 113000000,
1228
+ "cumulative_training_bytes": 113006280,
1229
+ "metrics": {
1230
+ "loss": 0.4907356900739835,
1231
+ "ce_loss": 0.48073569961072665,
1232
+ "lb_loss": 0.999999989610436
1233
+ }
1234
+ },
1235
+ {
1236
+ "checkpoint_type": "bytes",
1237
+ "bytes_threshold": 114000000,
1238
+ "cumulative_training_bytes": 114000244,
1239
+ "metrics": {
1240
+ "loss": 0.4906710912515451,
1241
+ "ce_loss": 0.4806711007882883,
1242
+ "lb_loss": 0.9999999897974031
1243
+ }
1244
+ },
1245
+ {
1246
+ "checkpoint_type": "bytes",
1247
+ "bytes_threshold": 115000000,
1248
+ "cumulative_training_bytes": 115000090,
1249
+ "metrics": {
1250
+ "loss": 0.49064408903496304,
1251
+ "ce_loss": 0.4806440985717062,
1252
+ "lb_loss": 0.9999999897608811
1253
+ }
1254
+ },
1255
+ {
1256
+ "checkpoint_type": "bytes",
1257
+ "bytes_threshold": 116000000,
1258
+ "cumulative_training_bytes": 116003964,
1259
+ "metrics": {
1260
+ "loss": 0.4908688999492036,
1261
+ "ce_loss": 0.48086890948594674,
1262
+ "lb_loss": 0.9999999897499409
1263
+ }
1264
+ },
1265
+ {
1266
+ "checkpoint_type": "bytes",
1267
+ "bytes_threshold": 117000000,
1268
+ "cumulative_training_bytes": 117001141,
1269
+ "metrics": {
1270
+ "loss": 0.49077886969755463,
1271
+ "ce_loss": 0.4807788792342978,
1272
+ "lb_loss": 0.9999999896522636
1273
+ }
1274
+ },
1275
+ {
1276
+ "checkpoint_type": "bytes",
1277
+ "bytes_threshold": 118000000,
1278
+ "cumulative_training_bytes": 118002964,
1279
+ "metrics": {
1280
+ "loss": 0.49081061967910844,
1281
+ "ce_loss": 0.4808106292158516,
1282
+ "lb_loss": 0.9999999897073936
1283
+ }
1284
+ },
1285
+ {
1286
+ "checkpoint_type": "bytes",
1287
+ "bytes_threshold": 119000000,
1288
+ "cumulative_training_bytes": 119004829,
1289
+ "metrics": {
1290
+ "loss": 0.49074190038735244,
1291
+ "ce_loss": 0.4807419099240956,
1292
+ "lb_loss": 0.9999999899118753
1293
+ }
1294
+ },
1295
+ {
1296
+ "checkpoint_type": "bytes",
1297
+ "bytes_threshold": 120000000,
1298
+ "cumulative_training_bytes": 120005174,
1299
+ "metrics": {
1300
+ "loss": 0.49069510202198013,
1301
+ "ce_loss": 0.4806951115587233,
1302
+ "lb_loss": 0.999999989785755
1303
+ }
1304
+ },
1305
+ {
1306
+ "checkpoint_type": "bytes",
1307
+ "bytes_threshold": 121000000,
1308
+ "cumulative_training_bytes": 121000398,
1309
+ "metrics": {
1310
+ "loss": 0.4906328099449369,
1311
+ "ce_loss": 0.4806328194816801,
1312
+ "lb_loss": 0.9999999898084403
1313
+ }
1314
+ },
1315
+ {
1316
+ "checkpoint_type": "bytes",
1317
+ "bytes_threshold": 122000000,
1318
+ "cumulative_training_bytes": 122005153,
1319
+ "metrics": {
1320
+ "loss": 0.4905734521533371,
1321
+ "ce_loss": 0.48057346169008025,
1322
+ "lb_loss": 0.9999999895931111
1323
+ }
1324
+ },
1325
+ {
1326
+ "checkpoint_type": "bytes",
1327
+ "bytes_threshold": 123000000,
1328
+ "cumulative_training_bytes": 123002062,
1329
+ "metrics": {
1330
+ "loss": 0.49056105234136627,
1331
+ "ce_loss": 0.48056106187810943,
1332
+ "lb_loss": 0.9999999894398626
1333
+ }
1334
+ },
1335
+ {
1336
+ "checkpoint_type": "bytes",
1337
+ "bytes_threshold": 124000000,
1338
+ "cumulative_training_bytes": 124006089,
1339
+ "metrics": {
1340
+ "loss": 0.4904723872690717,
1341
+ "ce_loss": 0.4804723968058149,
1342
+ "lb_loss": 0.9999999896498737
1343
+ }
1344
+ },
1345
+ {
1346
+ "checkpoint_type": "bytes",
1347
+ "bytes_threshold": 125000000,
1348
+ "cumulative_training_bytes": 125006477,
1349
+ "metrics": {
1350
+ "loss": 0.4903383307249222,
1351
+ "ce_loss": 0.4803383402616654,
1352
+ "lb_loss": 0.9999999898584517
1353
+ }
1354
+ },
1355
+ {
1356
+ "checkpoint_type": "bytes",
1357
+ "bytes_threshold": 126000000,
1358
+ "cumulative_training_bytes": 126002630,
1359
+ "metrics": {
1360
+ "loss": 0.49058030584739254,
1361
+ "ce_loss": 0.4805803153841357,
1362
+ "lb_loss": 0.9999999897561486
1363
+ }
1364
+ },
1365
+ {
1366
+ "checkpoint_type": "bytes",
1367
+ "bytes_threshold": 127000000,
1368
+ "cumulative_training_bytes": 127007067,
1369
+ "metrics": {
1370
+ "loss": 0.49066594004055153,
1371
+ "ce_loss": 0.4806659495772947,
1372
+ "lb_loss": 0.9999999898067419
1373
+ }
1374
+ },
1375
+ {
1376
+ "checkpoint_type": "bytes",
1377
+ "bytes_threshold": 128000000,
1378
+ "cumulative_training_bytes": 128000583,
1379
+ "metrics": {
1380
+ "loss": 0.49058034760611396,
1381
+ "ce_loss": 0.48058035714285713,
1382
+ "lb_loss": 0.999999989768102
1383
+ }
1384
+ },
1385
+ {
1386
+ "checkpoint_type": "bytes",
1387
+ "bytes_threshold": 129000000,
1388
+ "cumulative_training_bytes": 129007289,
1389
+ "metrics": {
1390
+ "loss": 0.4905069065050655,
1391
+ "ce_loss": 0.4805069160418087,
1392
+ "lb_loss": 0.9999999897476218
1393
+ }
1394
+ },
1395
+ {
1396
+ "checkpoint_type": "bytes",
1397
+ "bytes_threshold": 130000000,
1398
+ "cumulative_training_bytes": 130006166,
1399
+ "metrics": {
1400
+ "loss": 0.49045753542133275,
1401
+ "ce_loss": 0.4804575449580759,
1402
+ "lb_loss": 0.9999999899782128
1403
+ }
1404
+ },
1405
+ {
1406
+ "checkpoint_type": "bytes",
1407
+ "bytes_threshold": 131000000,
1408
+ "cumulative_training_bytes": 131001304,
1409
+ "metrics": {
1410
+ "loss": 0.4904289406187695,
1411
+ "ce_loss": 0.4804289501555127,
1412
+ "lb_loss": 0.9999999901426038
1413
+ }
1414
+ },
1415
+ {
1416
+ "checkpoint_type": "bytes",
1417
+ "bytes_threshold": 132000000,
1418
+ "cumulative_training_bytes": 132007108,
1419
+ "metrics": {
1420
+ "loss": 0.4903701265992885,
1421
+ "ce_loss": 0.4803701361360317,
1422
+ "lb_loss": 0.9999999899394623
1423
+ }
1424
+ },
1425
+ {
1426
+ "checkpoint_type": "bytes",
1427
+ "bytes_threshold": 133000000,
1428
+ "cumulative_training_bytes": 133003089,
1429
+ "metrics": {
1430
+ "loss": 0.49030012820954977,
1431
+ "ce_loss": 0.48030013774629293,
1432
+ "lb_loss": 0.9999999899266576
1433
+ }
1434
+ },
1435
+ {
1436
+ "checkpoint_type": "bytes",
1437
+ "bytes_threshold": 134000000,
1438
+ "cumulative_training_bytes": 134000170,
1439
+ "metrics": {
1440
+ "loss": 0.49024726003084046,
1441
+ "ce_loss": 0.4802472695675836,
1442
+ "lb_loss": 0.999999989902716
1443
+ }
1444
+ },
1445
+ {
1446
+ "checkpoint_type": "bytes",
1447
+ "bytes_threshold": 135000000,
1448
+ "cumulative_training_bytes": 135007268,
1449
+ "metrics": {
1450
+ "loss": 0.4902310506127265,
1451
+ "ce_loss": 0.48023106014946965,
1452
+ "lb_loss": 0.999999989883879
1453
+ }
1454
+ },
1455
+ {
1456
+ "checkpoint_type": "bytes",
1457
+ "bytes_threshold": 136000000,
1458
+ "cumulative_training_bytes": 136002367,
1459
+ "metrics": {
1460
+ "loss": 0.49015822482355786,
1461
+ "ce_loss": 0.48015823436030103,
1462
+ "lb_loss": 0.9999999898845927
1463
+ }
1464
+ },
1465
+ {
1466
+ "checkpoint_type": "bytes",
1467
+ "bytes_threshold": 137000000,
1468
+ "cumulative_training_bytes": 137002293,
1469
+ "metrics": {
1470
+ "loss": 0.49018864670178053,
1471
+ "ce_loss": 0.4801886562385237,
1472
+ "lb_loss": 0.9999999900512997
1473
+ }
1474
+ },
1475
+ {
1476
+ "checkpoint_type": "bytes",
1477
+ "bytes_threshold": 138000000,
1478
+ "cumulative_training_bytes": 138004174,
1479
+ "metrics": {
1480
+ "loss": 0.4901451457887006,
1481
+ "ce_loss": 0.4801451553254438,
1482
+ "lb_loss": 0.9999999901139867
1483
+ }
1484
+ },
1485
+ {
1486
+ "checkpoint_type": "bytes",
1487
+ "bytes_threshold": 139000000,
1488
+ "cumulative_training_bytes": 139006240,
1489
+ "metrics": {
1490
+ "loss": 0.4903390567955974,
1491
+ "ce_loss": 0.4803390663323406,
1492
+ "lb_loss": 0.999999990163354
1493
+ }
1494
+ },
1495
+ {
1496
+ "checkpoint_type": "bytes",
1497
+ "bytes_threshold": 140000000,
1498
+ "cumulative_training_bytes": 140006436,
1499
+ "metrics": {
1500
+ "loss": 0.49048212032282185,
1501
+ "ce_loss": 0.480482129859565,
1502
+ "lb_loss": 0.9999999901594661
1503
+ }
1504
+ },
1505
+ {
1506
+ "checkpoint_type": "bytes",
1507
+ "bytes_threshold": 141000000,
1508
+ "cumulative_training_bytes": 141007445,
1509
+ "metrics": {
1510
+ "loss": 0.4905080058343408,
1511
+ "ce_loss": 0.48050801537108395,
1512
+ "lb_loss": 0.9999999901041711
1513
+ }
1514
+ },
1515
+ {
1516
+ "checkpoint_type": "bytes",
1517
+ "bytes_threshold": 142000000,
1518
+ "cumulative_training_bytes": 142004918,
1519
+ "metrics": {
1520
+ "loss": 0.4905039665249063,
1521
+ "ce_loss": 0.48050397606164946,
1522
+ "lb_loss": 0.9999999901685075
1523
+ }
1524
+ },
1525
+ {
1526
+ "epoch": 3,
1527
+ "checkpoint_type": "epoch",
1528
+ "metrics": {
1529
+ "loss": 0.49051486986155374,
1530
+ "ce_loss": 0.4805148793982969,
1531
+ "lb_loss": 0.9999999901265442,
1532
+ "training_bytes": 47653391
1533
+ },
1534
+ "cumulative_training_bytes": 142960216,
1535
+ "training_bytes_this_epoch": 47653391
1536
+ },
1537
+ {
1538
+ "checkpoint_type": "bytes",
1539
+ "bytes_threshold": 143000000,
1540
+ "cumulative_training_bytes": 143005202,
1541
+ "metrics": {
1542
+ "loss": 0.4950260321299235,
1543
+ "ce_loss": 0.4850260416666667,
1544
+ "lb_loss": 0.9999999701976776
1545
+ }
1546
+ },
1547
+ {
1548
+ "checkpoint_type": "bytes",
1549
+ "bytes_threshold": 144000000,
1550
+ "cumulative_training_bytes": 144006005,
1551
+ "metrics": {
1552
+ "loss": 0.4904259713026729,
1553
+ "ce_loss": 0.48042598083941607,
1554
+ "lb_loss": 0.9999999908635216
1555
+ }
1556
+ },
1557
+ {
1558
+ "checkpoint_type": "bytes",
1559
+ "bytes_threshold": 145000000,
1560
+ "cumulative_training_bytes": 145001749,
1561
+ "metrics": {
1562
+ "loss": 0.4900371510437812,
1563
+ "ce_loss": 0.48003716058052437,
1564
+ "lb_loss": 0.9999999908472268
1565
+ }
1566
+ },
1567
+ {
1568
+ "checkpoint_type": "bytes",
1569
+ "bytes_threshold": 146000000,
1570
+ "cumulative_training_bytes": 146005280,
1571
+ "metrics": {
1572
+ "loss": 0.4904491602627556,
1573
+ "ce_loss": 0.48044916979949875,
1574
+ "lb_loss": 0.9999999887961194
1575
+ }
1576
+ },
1577
+ {
1578
+ "checkpoint_type": "bytes",
1579
+ "bytes_threshold": 147000000,
1580
+ "cumulative_training_bytes": 147006364,
1581
+ "metrics": {
1582
+ "loss": 0.49022183598212477,
1583
+ "ce_loss": 0.48022184551886793,
1584
+ "lb_loss": 0.9999999902158413
1585
+ }
1586
+ },
1587
+ {
1588
+ "checkpoint_type": "bytes",
1589
+ "bytes_threshold": 148000000,
1590
+ "cumulative_training_bytes": 148004606,
1591
+ "metrics": {
1592
+ "loss": 0.4898206580768932,
1593
+ "ce_loss": 0.47982066761363634,
1594
+ "lb_loss": 0.9999999900658926
1595
+ }
1596
+ },
1597
+ {
1598
+ "checkpoint_type": "bytes",
1599
+ "bytes_threshold": 149000000,
1600
+ "cumulative_training_bytes": 149001684,
1601
+ "metrics": {
1602
+ "loss": 0.48951690106452267,
1603
+ "ce_loss": 0.47951691060126583,
1604
+ "lb_loss": 0.9999999901161918
1605
+ }
1606
+ },
1607
+ {
1608
+ "checkpoint_type": "bytes",
1609
+ "bytes_threshold": 150000000,
1610
+ "cumulative_training_bytes": 150003252,
1611
+ "metrics": {
1612
+ "loss": 0.4902524334599995,
1613
+ "ce_loss": 0.4802524429967427,
1614
+ "lb_loss": 0.9999999897099473
1615
+ }
1616
+ },
1617
+ {
1618
+ "checkpoint_type": "bytes",
1619
+ "bytes_threshold": 151000000,
1620
+ "cumulative_training_bytes": 151004021,
1621
+ "metrics": {
1622
+ "loss": 0.4901546794499362,
1623
+ "ce_loss": 0.48015468898667935,
1624
+ "lb_loss": 0.9999999898484954
1625
+ }
1626
+ },
1627
+ {
1628
+ "checkpoint_type": "bytes",
1629
+ "bytes_threshold": 152000000,
1630
+ "cumulative_training_bytes": 152003583,
1631
+ "metrics": {
1632
+ "loss": 0.4901396364200731,
1633
+ "ce_loss": 0.48013964595681624,
1634
+ "lb_loss": 0.9999999896032542
1635
+ }
1636
+ },
1637
+ {
1638
+ "checkpoint_type": "bytes",
1639
+ "bytes_threshold": 153000000,
1640
+ "cumulative_training_bytes": 153004258,
1641
+ "metrics": {
1642
+ "loss": 0.49013379143505564,
1643
+ "ce_loss": 0.4801338009717988,
1644
+ "lb_loss": 0.9999999890058506
1645
+ }
1646
+ },
1647
+ {
1648
+ "checkpoint_type": "bytes",
1649
+ "bytes_threshold": 154000000,
1650
+ "cumulative_training_bytes": 154004288,
1651
+ "metrics": {
1652
+ "loss": 0.4900680994376158,
1653
+ "ce_loss": 0.480068108974359,
1654
+ "lb_loss": 0.999999989632179
1655
+ }
1656
+ },
1657
+ {
1658
+ "checkpoint_type": "bytes",
1659
+ "bytes_threshold": 155000000,
1660
+ "cumulative_training_bytes": 155004149,
1661
+ "metrics": {
1662
+ "loss": 0.4901411515178947,
1663
+ "ce_loss": 0.4801411610546379,
1664
+ "lb_loss": 0.9999999897755691
1665
+ }
1666
+ },
1667
+ {
1668
+ "checkpoint_type": "bytes",
1669
+ "bytes_threshold": 156000000,
1670
+ "cumulative_training_bytes": 156001930,
1671
+ "metrics": {
1672
+ "loss": 0.4899712896123179,
1673
+ "ce_loss": 0.47997129914906106,
1674
+ "lb_loss": 0.9999999894012868
1675
+ }
1676
+ },
1677
+ {
1678
+ "checkpoint_type": "bytes",
1679
+ "bytes_threshold": 157000000,
1680
+ "cumulative_training_bytes": 157005966,
1681
+ "metrics": {
1682
+ "loss": 0.4899014294959544,
1683
+ "ce_loss": 0.47990143903269755,
1684
+ "lb_loss": 0.9999999894758012
1685
+ }
1686
+ },
1687
+ {
1688
+ "checkpoint_type": "bytes",
1689
+ "bytes_threshold": 158000000,
1690
+ "cumulative_training_bytes": 158006659,
1691
+ "metrics": {
1692
+ "loss": 0.48980809543528125,
1693
+ "ce_loss": 0.4798081049720244,
1694
+ "lb_loss": 0.9999999895403854
1695
+ }
1696
+ },
1697
+ {
1698
+ "checkpoint_type": "bytes",
1699
+ "bytes_threshold": 159000000,
1700
+ "cumulative_training_bytes": 159001028,
1701
+ "metrics": {
1702
+ "loss": 0.4895588359286506,
1703
+ "ce_loss": 0.4795588454653938,
1704
+ "lb_loss": 0.9999999895585181
1705
+ }
1706
+ },
1707
+ {
1708
+ "checkpoint_type": "bytes",
1709
+ "bytes_threshold": 160000000,
1710
+ "cumulative_training_bytes": 160001860,
1711
+ "metrics": {
1712
+ "loss": 0.4894983198657726,
1713
+ "ce_loss": 0.47949832940251574,
1714
+ "lb_loss": 0.9999999894232549
1715
+ }
1716
+ },
1717
+ {
1718
+ "checkpoint_type": "bytes",
1719
+ "bytes_threshold": 161000000,
1720
+ "cumulative_training_bytes": 161000396,
1721
+ "metrics": {
1722
+ "loss": 0.4892045148159733,
1723
+ "ce_loss": 0.4792045243527165,
1724
+ "lb_loss": 0.9999999891972906
1725
+ }
1726
+ },
1727
+ {
1728
+ "checkpoint_type": "bytes",
1729
+ "bytes_threshold": 162000000,
1730
+ "cumulative_training_bytes": 162002358,
1731
+ "metrics": {
1732
+ "loss": 0.4891760811347486,
1733
+ "ce_loss": 0.47917609067149175,
1734
+ "lb_loss": 0.9999999890233505
1735
+ }
1736
+ },
1737
+ {
1738
+ "checkpoint_type": "bytes",
1739
+ "bytes_threshold": 163000000,
1740
+ "cumulative_training_bytes": 163000910,
1741
+ "metrics": {
1742
+ "loss": 0.4890335630177085,
1743
+ "ce_loss": 0.47903357255445167,
1744
+ "lb_loss": 0.9999999890675471
1745
+ }
1746
+ },
1747
+ {
1748
+ "checkpoint_type": "bytes",
1749
+ "bytes_threshold": 164000000,
1750
+ "cumulative_training_bytes": 164005597,
1751
+ "metrics": {
1752
+ "loss": 0.48890226029586237,
1753
+ "ce_loss": 0.47890226983260553,
1754
+ "lb_loss": 0.9999999888729321
1755
+ }
1756
+ },
1757
+ {
1758
+ "checkpoint_type": "bytes",
1759
+ "bytes_threshold": 165000000,
1760
+ "cumulative_training_bytes": 165002975,
1761
+ "metrics": {
1762
+ "loss": 0.4889194060730553,
1763
+ "ce_loss": 0.47891941560979845,
1764
+ "lb_loss": 0.9999999890234671
1765
+ }
1766
+ },
1767
+ {
1768
+ "checkpoint_type": "bytes",
1769
+ "bytes_threshold": 166000000,
1770
+ "cumulative_training_bytes": 166007294,
1771
+ "metrics": {
1772
+ "loss": 0.48903683825322025,
1773
+ "ce_loss": 0.4790368477899634,
1774
+ "lb_loss": 0.9999999888872696
1775
+ }
1776
+ },
1777
+ {
1778
+ "checkpoint_type": "bytes",
1779
+ "bytes_threshold": 167000000,
1780
+ "cumulative_training_bytes": 167001945,
1781
+ "metrics": {
1782
+ "loss": 0.4890494737780068,
1783
+ "ce_loss": 0.47904948331474995,
1784
+ "lb_loss": 0.9999999891006479
1785
+ }
1786
+ },
1787
+ {
1788
+ "checkpoint_type": "bytes",
1789
+ "bytes_threshold": 168000000,
1790
+ "cumulative_training_bytes": 168005336,
1791
+ "metrics": {
1792
+ "loss": 0.48906435342565363,
1793
+ "ce_loss": 0.4790643629623968,
1794
+ "lb_loss": 0.9999999890849336
1795
+ }
1796
+ },
1797
+ {
1798
+ "checkpoint_type": "bytes",
1799
+ "bytes_threshold": 169000000,
1800
+ "cumulative_training_bytes": 169002071,
1801
+ "metrics": {
1802
+ "loss": 0.48898078195840533,
1803
+ "ce_loss": 0.4789807914951485,
1804
+ "lb_loss": 0.9999999892392673
1805
+ }
1806
+ },
1807
+ {
1808
+ "checkpoint_type": "bytes",
1809
+ "bytes_threshold": 170000000,
1810
+ "cumulative_training_bytes": 170002507,
1811
+ "metrics": {
1812
+ "loss": 0.48883532836328514,
1813
+ "ce_loss": 0.4788353379000283,
1814
+ "lb_loss": 0.9999999893484761
1815
+ }
1816
+ },
1817
+ {
1818
+ "checkpoint_type": "bytes",
1819
+ "bytes_threshold": 171000000,
1820
+ "cumulative_training_bytes": 171005319,
1821
+ "metrics": {
1822
+ "loss": 0.48872788846981063,
1823
+ "ce_loss": 0.4787278980065538,
1824
+ "lb_loss": 0.9999999894365335
1825
+ }
1826
+ },
1827
+ {
1828
+ "checkpoint_type": "bytes",
1829
+ "bytes_threshold": 172000000,
1830
+ "cumulative_training_bytes": 172007475,
1831
+ "metrics": {
1832
+ "loss": 0.4886464073825819,
1833
+ "ce_loss": 0.4786464169193251,
1834
+ "lb_loss": 0.999999989424222
1835
+ }
1836
+ },
1837
+ {
1838
+ "checkpoint_type": "bytes",
1839
+ "bytes_threshold": 173000000,
1840
+ "cumulative_training_bytes": 173006995,
1841
+ "metrics": {
1842
+ "loss": 0.48865697313400097,
1843
+ "ce_loss": 0.47865698267074414,
1844
+ "lb_loss": 0.9999999893671633
1845
+ }
1846
+ },
1847
+ {
1848
+ "checkpoint_type": "bytes",
1849
+ "bytes_threshold": 174000000,
1850
+ "cumulative_training_bytes": 174002372,
1851
+ "metrics": {
1852
+ "loss": 0.48858499138826916,
1853
+ "ce_loss": 0.4785850009250123,
1854
+ "lb_loss": 0.9999999893993713
1855
+ }
1856
+ },
1857
+ {
1858
+ "checkpoint_type": "bytes",
1859
+ "bytes_threshold": 175000000,
1860
+ "cumulative_training_bytes": 175000872,
1861
+ "metrics": {
1862
+ "loss": 0.48849087463510193,
1863
+ "ce_loss": 0.4784908841718451,
1864
+ "lb_loss": 0.9999999894580696
1865
+ }
1866
+ },
1867
+ {
1868
+ "checkpoint_type": "bytes",
1869
+ "bytes_threshold": 176000000,
1870
+ "cumulative_training_bytes": 176007018,
1871
+ "metrics": {
1872
+ "loss": 0.4885006819310511,
1873
+ "ce_loss": 0.4785006914677943,
1874
+ "lb_loss": 0.9999999893523677
1875
+ }
1876
+ },
1877
+ {
1878
+ "checkpoint_type": "bytes",
1879
+ "bytes_threshold": 177000000,
1880
+ "cumulative_training_bytes": 177003062,
1881
+ "metrics": {
1882
+ "loss": 0.4884071085188124,
1883
+ "ce_loss": 0.4784071180555556,
1884
+ "lb_loss": 0.9999999894492003
1885
+ }
1886
+ },
1887
+ {
1888
+ "checkpoint_type": "bytes",
1889
+ "bytes_threshold": 178000000,
1890
+ "cumulative_training_bytes": 178005739,
1891
+ "metrics": {
1892
+ "loss": 0.4883760760542553,
1893
+ "ce_loss": 0.4783760855909985,
1894
+ "lb_loss": 0.9999999893214313
1895
+ }
1896
+ },
1897
+ {
1898
+ "checkpoint_type": "bytes",
1899
+ "bytes_threshold": 179000000,
1900
+ "cumulative_training_bytes": 179002039,
1901
+ "metrics": {
1902
+ "loss": 0.48841644468038026,
1903
+ "ce_loss": 0.4784164542171234,
1904
+ "lb_loss": 0.9999999892871193
1905
+ }
1906
+ },
1907
+ {
1908
+ "checkpoint_type": "bytes",
1909
+ "bytes_threshold": 180000000,
1910
+ "cumulative_training_bytes": 180001975,
1911
+ "metrics": {
1912
+ "loss": 0.4885168265783871,
1913
+ "ce_loss": 0.47851683611513024,
1914
+ "lb_loss": 0.9999999893307933
1915
+ }
1916
+ },
1917
+ {
1918
+ "checkpoint_type": "bytes",
1919
+ "bytes_threshold": 181000000,
1920
+ "cumulative_training_bytes": 181002156,
1921
+ "metrics": {
1922
+ "loss": 0.4885435228641423,
1923
+ "ce_loss": 0.4785435324008855,
1924
+ "lb_loss": 0.9999999895041126
1925
+ }
1926
+ },
1927
+ {
1928
+ "checkpoint_type": "bytes",
1929
+ "bytes_threshold": 182000000,
1930
+ "cumulative_training_bytes": 182006789,
1931
+ "metrics": {
1932
+ "loss": 0.48842715038972745,
1933
+ "ce_loss": 0.4784271599264706,
1934
+ "lb_loss": 0.9999999895516564
1935
+ }
1936
+ },
1937
+ {
1938
+ "checkpoint_type": "bytes",
1939
+ "bytes_threshold": 183000000,
1940
+ "cumulative_training_bytes": 183001003,
1941
+ "metrics": {
1942
+ "loss": 0.4883744527003505,
1943
+ "ce_loss": 0.4783744622370937,
1944
+ "lb_loss": 0.9999999895720363
1945
+ }
1946
+ },
1947
+ {
1948
+ "checkpoint_type": "bytes",
1949
+ "bytes_threshold": 184000000,
1950
+ "cumulative_training_bytes": 184002846,
1951
+ "metrics": {
1952
+ "loss": 0.4883268971737586,
1953
+ "ce_loss": 0.47832690671050176,
1954
+ "lb_loss": 0.9999999894599509
1955
+ }
1956
+ },
1957
+ {
1958
+ "checkpoint_type": "bytes",
1959
+ "bytes_threshold": 185000000,
1960
+ "cumulative_training_bytes": 185004724,
1961
+ "metrics": {
1962
+ "loss": 0.4882663181899502,
1963
+ "ce_loss": 0.47826632772669336,
1964
+ "lb_loss": 0.9999999894400365
1965
+ }
1966
+ },
1967
+ {
1968
+ "checkpoint_type": "bytes",
1969
+ "bytes_threshold": 186000000,
1970
+ "cumulative_training_bytes": 186007260,
1971
+ "metrics": {
1972
+ "loss": 0.48828150444600626,
1973
+ "ce_loss": 0.47828151398274943,
1974
+ "lb_loss": 0.9999999894740508
1975
+ }
1976
+ },
1977
+ {
1978
+ "checkpoint_type": "bytes",
1979
+ "bytes_threshold": 187000000,
1980
+ "cumulative_training_bytes": 187007019,
1981
+ "metrics": {
1982
+ "loss": 0.4882290801773191,
1983
+ "ce_loss": 0.47822908971406225,
1984
+ "lb_loss": 0.9999999894010861
1985
+ }
1986
+ },
1987
+ {
1988
+ "checkpoint_type": "bytes",
1989
+ "bytes_threshold": 188000000,
1990
+ "cumulative_training_bytes": 188003736,
1991
+ "metrics": {
1992
+ "loss": 0.488216156216274,
1993
+ "ce_loss": 0.4782161657530172,
1994
+ "lb_loss": 0.9999999894326629
1995
+ }
1996
+ },
1997
+ {
1998
+ "checkpoint_type": "bytes",
1999
+ "bytes_threshold": 189000000,
2000
+ "cumulative_training_bytes": 189007403,
2001
+ "metrics": {
2002
+ "loss": 0.4881525303701408,
2003
+ "ce_loss": 0.47815253990688394,
2004
+ "lb_loss": 0.9999999895538252
2005
+ }
2006
+ },
2007
+ {
2008
+ "checkpoint_type": "bytes",
2009
+ "bytes_threshold": 190000000,
2010
+ "cumulative_training_bytes": 190003337,
2011
+ "metrics": {
2012
+ "loss": 0.4880743821461995,
2013
+ "ce_loss": 0.4780743916829427,
2014
+ "lb_loss": 0.9999999895905299
2015
+ }
2016
+ },
2017
+ {
2018
+ "epoch": 4,
2019
+ "checkpoint_type": "epoch",
2020
+ "metrics": {
2021
+ "loss": 0.488021058104645,
2022
+ "ce_loss": 0.4780210676413882,
2023
+ "lb_loss": 0.9999999895423727,
2024
+ "training_bytes": 47653398
2025
+ },
2026
+ "cumulative_training_bytes": 190613614,
2027
+ "training_bytes_this_epoch": 47653398
2028
+ },
2029
+ {
2030
+ "checkpoint_type": "bytes",
2031
+ "bytes_threshold": 191000000,
2032
+ "cumulative_training_bytes": 191004295,
2033
+ "metrics": {
2034
+ "loss": 0.48361365467894313,
2035
+ "ce_loss": 0.4736136642156863,
2036
+ "lb_loss": 0.9999999906502518
2037
+ }
2038
+ },
2039
+ {
2040
+ "checkpoint_type": "bytes",
2041
+ "bytes_threshold": 192000000,
2042
+ "cumulative_training_bytes": 192003486,
2043
+ "metrics": {
2044
+ "loss": 0.4822246106290027,
2045
+ "ce_loss": 0.47222462016574585,
2046
+ "lb_loss": 0.9999999911086994
2047
+ }
2048
+ },
2049
+ {
2050
+ "checkpoint_type": "bytes",
2051
+ "bytes_threshold": 193000000,
2052
+ "cumulative_training_bytes": 193000756,
2053
+ "metrics": {
2054
+ "loss": 0.48206590686197065,
2055
+ "ce_loss": 0.4720659163987138,
2056
+ "lb_loss": 0.9999999875424376
2057
+ }
2058
+ },
2059
+ {
2060
+ "checkpoint_type": "bytes",
2061
+ "bytes_threshold": 194000000,
2062
+ "cumulative_training_bytes": 194006438,
2063
+ "metrics": {
2064
+ "loss": 0.4826827534723066,
2065
+ "ce_loss": 0.47268276300904977,
2066
+ "lb_loss": 0.9999999888072726
2067
+ }
2068
+ },
2069
+ {
2070
+ "checkpoint_type": "bytes",
2071
+ "bytes_threshold": 195000000,
2072
+ "cumulative_training_bytes": 195005382,
2073
+ "metrics": {
2074
+ "loss": 0.48297037944927085,
2075
+ "ce_loss": 0.472970388986014,
2076
+ "lb_loss": 0.9999999882249565
2077
+ }
2078
+ },
2079
+ {
2080
+ "checkpoint_type": "bytes",
2081
+ "bytes_threshold": 196000000,
2082
+ "cumulative_training_bytes": 196002015,
2083
+ "metrics": {
2084
+ "loss": 0.4832766776071315,
2085
+ "ce_loss": 0.47327668714387466,
2086
+ "lb_loss": 0.999999988792289
2087
+ }
2088
+ },
2089
+ {
2090
+ "checkpoint_type": "bytes",
2091
+ "bytes_threshold": 197000000,
2092
+ "cumulative_training_bytes": 197006361,
2093
+ "metrics": {
2094
+ "loss": 0.48392085377260935,
2095
+ "ce_loss": 0.4739208633093525,
2096
+ "lb_loss": 0.9999999878503721
2097
+ }
2098
+ },
2099
+ {
2100
+ "checkpoint_type": "bytes",
2101
+ "bytes_threshold": 198000000,
2102
+ "cumulative_training_bytes": 198003880,
2103
+ "metrics": {
2104
+ "loss": 0.483928608201846,
2105
+ "ce_loss": 0.4739286177385892,
2106
+ "lb_loss": 0.99999998751023
2107
+ }
2108
+ },
2109
+ {
2110
+ "checkpoint_type": "bytes",
2111
+ "bytes_threshold": 199000000,
2112
+ "cumulative_training_bytes": 199006196,
2113
+ "metrics": {
2114
+ "loss": 0.48404037288334817,
2115
+ "ce_loss": 0.47404038242009133,
2116
+ "lb_loss": 0.9999999879702041
2117
+ }
2118
+ },
2119
+ {
2120
+ "checkpoint_type": "bytes",
2121
+ "bytes_threshold": 200000000,
2122
+ "cumulative_training_bytes": 200002073,
2123
+ "metrics": {
2124
+ "loss": 0.4839540720959099,
2125
+ "ce_loss": 0.47395408163265307,
2126
+ "lb_loss": 0.999999988760267
2127
+ }
2128
+ },
2129
+ {
2130
+ "checkpoint_type": "bytes",
2131
+ "bytes_threshold": 201000000,
2132
+ "cumulative_training_bytes": 201002611,
2133
+ "metrics": {
2134
+ "loss": 0.4842147074617819,
2135
+ "ce_loss": 0.4742147169985251,
2136
+ "lb_loss": 0.9999999887032495
2137
+ }
2138
+ },
2139
+ {
2140
+ "checkpoint_type": "bytes",
2141
+ "bytes_threshold": 202000000,
2142
+ "cumulative_training_bytes": 202000755,
2143
+ "metrics": {
2144
+ "loss": 0.48400739288586786,
2145
+ "ce_loss": 0.474007402422611,
2146
+ "lb_loss": 0.9999999884480903
2147
+ }
2148
+ },
2149
+ {
2150
+ "checkpoint_type": "bytes",
2151
+ "bytes_threshold": 203000000,
2152
+ "cumulative_training_bytes": 203001562,
2153
+ "metrics": {
2154
+ "loss": 0.4841745324391381,
2155
+ "ce_loss": 0.47417454197588127,
2156
+ "lb_loss": 0.9999999887573181
2157
+ }
2158
+ },
2159
+ {
2160
+ "checkpoint_type": "bytes",
2161
+ "bytes_threshold": 204000000,
2162
+ "cumulative_training_bytes": 204005682,
2163
+ "metrics": {
2164
+ "loss": 0.48423728844666647,
2165
+ "ce_loss": 0.47423729798340963,
2166
+ "lb_loss": 0.9999999889179008
2167
+ }
2168
+ },
2169
+ {
2170
+ "checkpoint_type": "bytes",
2171
+ "bytes_threshold": 205000000,
2172
+ "cumulative_training_bytes": 205003502,
2173
+ "metrics": {
2174
+ "loss": 0.484211044443555,
2175
+ "ce_loss": 0.47421105398029817,
2176
+ "lb_loss": 0.9999999888598348
2177
+ }
2178
+ },
2179
+ {
2180
+ "checkpoint_type": "bytes",
2181
+ "bytes_threshold": 206000000,
2182
+ "cumulative_training_bytes": 206008019,
2183
+ "metrics": {
2184
+ "loss": 0.48419132477137033,
2185
+ "ce_loss": 0.4741913343081135,
2186
+ "lb_loss": 0.9999999889632016
2187
+ }
2188
+ },
2189
+ {
2190
+ "checkpoint_type": "bytes",
2191
+ "bytes_threshold": 207000000,
2192
+ "cumulative_training_bytes": 207007717,
2193
+ "metrics": {
2194
+ "loss": 0.4842689376011073,
2195
+ "ce_loss": 0.47426894713785045,
2196
+ "lb_loss": 0.9999999892488818
2197
+ }
2198
+ },
2199
+ {
2200
+ "checkpoint_type": "bytes",
2201
+ "bytes_threshold": 208000000,
2202
+ "cumulative_training_bytes": 208005208,
2203
+ "metrics": {
2204
+ "loss": 0.4841757200888075,
2205
+ "ce_loss": 0.47417572962555066,
2206
+ "lb_loss": 0.999999989313176
2207
+ }
2208
+ },
2209
+ {
2210
+ "checkpoint_type": "bytes",
2211
+ "bytes_threshold": 209000000,
2212
+ "cumulative_training_bytes": 209005329,
2213
+ "metrics": {
2214
+ "loss": 0.4841251532236735,
2215
+ "ce_loss": 0.47412516276041666,
2216
+ "lb_loss": 0.9999999895443519
2217
+ }
2218
+ },
2219
+ {
2220
+ "checkpoint_type": "bytes",
2221
+ "bytes_threshold": 210000000,
2222
+ "cumulative_training_bytes": 210006121,
2223
+ "metrics": {
2224
+ "loss": 0.4841085443945093,
2225
+ "ce_loss": 0.47410855393125245,
2226
+ "lb_loss": 0.9999999895909708
2227
+ }
2228
+ },
2229
+ {
2230
+ "checkpoint_type": "bytes",
2231
+ "bytes_threshold": 211000000,
2232
+ "cumulative_training_bytes": 211003532,
2233
+ "metrics": {
2234
+ "loss": 0.48414475511940114,
2235
+ "ce_loss": 0.4741447646561443,
2236
+ "lb_loss": 0.9999999894947094
2237
+ }
2238
+ },
2239
+ {
2240
+ "checkpoint_type": "bytes",
2241
+ "bytes_threshold": 212000000,
2242
+ "cumulative_training_bytes": 212007723,
2243
+ "metrics": {
2244
+ "loss": 0.48415256366347176,
2245
+ "ce_loss": 0.4741525732002149,
2246
+ "lb_loss": 0.9999999894966027
2247
+ }
2248
+ },
2249
+ {
2250
+ "checkpoint_type": "bytes",
2251
+ "bytes_threshold": 213000000,
2252
+ "cumulative_training_bytes": 213005205,
2253
+ "metrics": {
2254
+ "loss": 0.48423584617656507,
2255
+ "ce_loss": 0.47423585571330823,
2256
+ "lb_loss": 0.9999999894371515
2257
+ }
2258
+ },
2259
+ {
2260
+ "checkpoint_type": "bytes",
2261
+ "bytes_threshold": 214000000,
2262
+ "cumulative_training_bytes": 214007542,
2263
+ "metrics": {
2264
+ "loss": 0.4842972747625365,
2265
+ "ce_loss": 0.47429728429927964,
2266
+ "lb_loss": 0.9999999894023176
2267
+ }
2268
+ },
2269
+ {
2270
+ "checkpoint_type": "bytes",
2271
+ "bytes_threshold": 215000000,
2272
+ "cumulative_training_bytes": 215006636,
2273
+ "metrics": {
2274
+ "loss": 0.48420266889447544,
2275
+ "ce_loss": 0.4742026784312186,
2276
+ "lb_loss": 0.9999999897414117
2277
+ }
2278
+ },
2279
+ {
2280
+ "checkpoint_type": "bytes",
2281
+ "bytes_threshold": 216000000,
2282
+ "cumulative_training_bytes": 216002411,
2283
+ "metrics": {
2284
+ "loss": 0.48442725453235763,
2285
+ "ce_loss": 0.4744272640691008,
2286
+ "lb_loss": 0.9999999896582165
2287
+ }
2288
+ },
2289
+ {
2290
+ "checkpoint_type": "bytes",
2291
+ "bytes_threshold": 217000000,
2292
+ "cumulative_training_bytes": 217003351,
2293
+ "metrics": {
2294
+ "loss": 0.48465302021652246,
2295
+ "ce_loss": 0.4746530297532656,
2296
+ "lb_loss": 0.9999999896708351
2297
+ }
2298
+ },
2299
+ {
2300
+ "checkpoint_type": "bytes",
2301
+ "bytes_threshold": 218000000,
2302
+ "cumulative_training_bytes": 218001934,
2303
+ "metrics": {
2304
+ "loss": 0.48466454465906106,
2305
+ "ce_loss": 0.4746645541958042,
2306
+ "lb_loss": 0.9999999894795718
2307
+ }
2308
+ },
2309
+ {
2310
+ "checkpoint_type": "bytes",
2311
+ "bytes_threshold": 219000000,
2312
+ "cumulative_training_bytes": 219001498,
2313
+ "metrics": {
2314
+ "loss": 0.48470003685327034,
2315
+ "ce_loss": 0.4747000463900135,
2316
+ "lb_loss": 0.9999999894626067
2317
+ }
2318
+ },
2319
+ {
2320
+ "checkpoint_type": "bytes",
2321
+ "bytes_threshold": 220000000,
2322
+ "cumulative_training_bytes": 220000802,
2323
+ "metrics": {
2324
+ "loss": 0.48471733482440416,
2325
+ "ce_loss": 0.4747173443611473,
2326
+ "lb_loss": 0.9999999896022145
2327
+ }
2328
+ },
2329
+ {
2330
+ "checkpoint_type": "bytes",
2331
+ "bytes_threshold": 221000000,
2332
+ "cumulative_training_bytes": 221002631,
2333
+ "metrics": {
2334
+ "loss": 0.4847172157847394,
2335
+ "ce_loss": 0.4747172253214826,
2336
+ "lb_loss": 0.9999999895699386
2337
+ }
2338
+ },
2339
+ {
2340
+ "checkpoint_type": "bytes",
2341
+ "bytes_threshold": 222000000,
2342
+ "cumulative_training_bytes": 222001546,
2343
+ "metrics": {
2344
+ "loss": 0.48476429971392804,
2345
+ "ce_loss": 0.4747643092506712,
2346
+ "lb_loss": 0.9999999897434038
2347
+ }
2348
+ },
2349
+ {
2350
+ "checkpoint_type": "bytes",
2351
+ "bytes_threshold": 223000000,
2352
+ "cumulative_training_bytes": 223002095,
2353
+ "metrics": {
2354
+ "loss": 0.4846939223298289,
2355
+ "ce_loss": 0.47469393186657205,
2356
+ "lb_loss": 0.9999999896217131
2357
+ }
2358
+ },
2359
+ {
2360
+ "checkpoint_type": "bytes",
2361
+ "bytes_threshold": 224000000,
2362
+ "cumulative_training_bytes": 224004704,
2363
+ "metrics": {
2364
+ "loss": 0.4846698684175937,
2365
+ "ce_loss": 0.47466987795433685,
2366
+ "lb_loss": 0.9999999897695562
2367
+ }
2368
+ },
2369
+ {
2370
+ "checkpoint_type": "bytes",
2371
+ "bytes_threshold": 225000000,
2372
+ "cumulative_training_bytes": 225002022,
2373
+ "metrics": {
2374
+ "loss": 0.48464635646704474,
2375
+ "ce_loss": 0.4746463660037879,
2376
+ "lb_loss": 0.9999999898666792
2377
+ }
2378
+ },
2379
+ {
2380
+ "checkpoint_type": "bytes",
2381
+ "bytes_threshold": 226000000,
2382
+ "cumulative_training_bytes": 226003031,
2383
+ "metrics": {
2384
+ "loss": 0.4846968945550516,
2385
+ "ce_loss": 0.47469690409179477,
2386
+ "lb_loss": 0.999999989960508
2387
+ }
2388
+ },
2389
+ {
2390
+ "checkpoint_type": "bytes",
2391
+ "bytes_threshold": 227000000,
2392
+ "cumulative_training_bytes": 227007213,
2393
+ "metrics": {
2394
+ "loss": 0.48466569770009893,
2395
+ "ce_loss": 0.4746657072368421,
2396
+ "lb_loss": 0.9999999899738713
2397
+ }
2398
+ },
2399
+ {
2400
+ "checkpoint_type": "bytes",
2401
+ "bytes_threshold": 228000000,
2402
+ "cumulative_training_bytes": 228002618,
2403
+ "metrics": {
2404
+ "loss": 0.4847170054192085,
2405
+ "ce_loss": 0.47471701495595164,
2406
+ "lb_loss": 0.9999999898766133
2407
+ }
2408
+ },
2409
+ {
2410
+ "checkpoint_type": "bytes",
2411
+ "bytes_threshold": 229000000,
2412
+ "cumulative_training_bytes": 229002817,
2413
+ "metrics": {
2414
+ "loss": 0.4849277419846056,
2415
+ "ce_loss": 0.47492775152134875,
2416
+ "lb_loss": 0.9999999898439013
2417
+ }
2418
+ },
2419
+ {
2420
+ "checkpoint_type": "bytes",
2421
+ "bytes_threshold": 230000000,
2422
+ "cumulative_training_bytes": 230004657,
2423
+ "metrics": {
2424
+ "loss": 0.48519230282958,
2425
+ "ce_loss": 0.47519231236632314,
2426
+ "lb_loss": 0.9999999897664978
2427
+ }
2428
+ },
2429
+ {
2430
+ "checkpoint_type": "bytes",
2431
+ "bytes_threshold": 231000000,
2432
+ "cumulative_training_bytes": 231006924,
2433
+ "metrics": {
2434
+ "loss": 0.4853118831206326,
2435
+ "ce_loss": 0.4753118926573758,
2436
+ "lb_loss": 0.9999999898737654
2437
+ }
2438
+ },
2439
+ {
2440
+ "checkpoint_type": "bytes",
2441
+ "bytes_threshold": 232000000,
2442
+ "cumulative_training_bytes": 232007018,
2443
+ "metrics": {
2444
+ "loss": 0.4854375916427203,
2445
+ "ce_loss": 0.4754376011794635,
2446
+ "lb_loss": 0.9999999898104178
2447
+ }
2448
+ },
2449
+ {
2450
+ "checkpoint_type": "bytes",
2451
+ "bytes_threshold": 233000000,
2452
+ "cumulative_training_bytes": 233006236,
2453
+ "metrics": {
2454
+ "loss": 0.48551042782778,
2455
+ "ce_loss": 0.47551043736452314,
2456
+ "lb_loss": 0.9999999898362022
2457
+ }
2458
+ },
2459
+ {
2460
+ "checkpoint_type": "bytes",
2461
+ "bytes_threshold": 234000000,
2462
+ "cumulative_training_bytes": 234000486,
2463
+ "metrics": {
2464
+ "loss": 0.4855104365451488,
2465
+ "ce_loss": 0.47551044608189197,
2466
+ "lb_loss": 0.9999999898169262
2467
+ }
2468
+ },
2469
+ {
2470
+ "checkpoint_type": "bytes",
2471
+ "bytes_threshold": 235000000,
2472
+ "cumulative_training_bytes": 235002824,
2473
+ "metrics": {
2474
+ "loss": 0.485555099787045,
2475
+ "ce_loss": 0.47555510932378814,
2476
+ "lb_loss": 0.9999999898311206
2477
+ }
2478
+ },
2479
+ {
2480
+ "checkpoint_type": "bytes",
2481
+ "bytes_threshold": 236000000,
2482
+ "cumulative_training_bytes": 236004788,
2483
+ "metrics": {
2484
+ "loss": 0.48557505530384387,
2485
+ "ce_loss": 0.47557506484058704,
2486
+ "lb_loss": 0.9999999897843591
2487
+ }
2488
+ },
2489
+ {
2490
+ "checkpoint_type": "bytes",
2491
+ "bytes_threshold": 237000000,
2492
+ "cumulative_training_bytes": 237001532,
2493
+ "metrics": {
2494
+ "loss": 0.4854937203452311,
2495
+ "ce_loss": 0.47549372988197425,
2496
+ "lb_loss": 0.999999989826477
2497
+ }
2498
+ },
2499
+ {
2500
+ "checkpoint_type": "bytes",
2501
+ "bytes_threshold": 238000000,
2502
+ "cumulative_training_bytes": 238004993,
2503
+ "metrics": {
2504
+ "loss": 0.485468689970447,
2505
+ "ce_loss": 0.4754686995071902,
2506
+ "lb_loss": 0.9999999898203087
2507
+ }
2508
+ },
2509
+ {
2510
+ "epoch": 5,
2511
+ "checkpoint_type": "epoch",
2512
+ "metrics": {
2513
+ "loss": 0.48547107517566046,
2514
+ "ce_loss": 0.4754710847124036,
2515
+ "lb_loss": 0.9999999897626342,
2516
+ "training_bytes": 47653400
2517
+ },
2518
+ "cumulative_training_bytes": 238267014,
2519
+ "training_bytes_this_epoch": 47653400
2520
+ }
2521
+ ]
2522
+ }
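The checkpoint records above share a small set of fields, so they can be flattened for comparison. The sketch below is a minimal illustration, not the project's own tooling: the function names are hypothetical, and because the top-level key that holds this list is not visible in this part of the diff, the helpers take the list of records directly. In the records shown, `loss` exceeds `ce_loss` by roughly 0.01 each time; that this reflects a small weight on `lb_loss` is an inference from the numbers, not something the file states.

```python
def summarize_checkpoints(records):
    """Flatten checkpoint records into rows that are easy to compare."""
    rows = []
    for rec in records:
        metrics = rec["metrics"]
        rows.append({
            "kind": rec["checkpoint_type"],       # "bytes" or "epoch"
            "epoch": rec.get("epoch"),            # only set on epoch checkpoints
            "cumulative_bytes": rec["cumulative_training_bytes"],
            "loss": metrics["loss"],
            "loss_minus_ce": metrics["loss"] - metrics["ce_loss"],
        })
    return rows


def bytes_per_epoch(records):
    """Difference in cumulative bytes between consecutive epoch checkpoints."""
    totals = [r["cumulative_training_bytes"] for r in records
              if r["checkpoint_type"] == "epoch"]
    return [later - earlier for earlier, later in zip(totals, totals[1:])]


# Three records copied from the diff above.
sample = [
    {"epoch": 4, "checkpoint_type": "epoch",
     "metrics": {"loss": 0.488021058104645, "ce_loss": 0.4780210676413882,
                 "lb_loss": 0.9999999895423727, "training_bytes": 47653398},
     "cumulative_training_bytes": 190613614,
     "training_bytes_this_epoch": 47653398},
    {"checkpoint_type": "bytes", "bytes_threshold": 238000000,
     "cumulative_training_bytes": 238004993,
     "metrics": {"loss": 0.485468689970447, "ce_loss": 0.4754686995071902,
                 "lb_loss": 0.9999999898203087}},
    {"epoch": 5, "checkpoint_type": "epoch",
     "metrics": {"loss": 0.48547107517566046, "ce_loss": 0.4754710847124036,
                 "lb_loss": 0.9999999897626342, "training_bytes": 47653400},
     "cumulative_training_bytes": 238267014,
     "training_bytes_this_epoch": 47653400},
]

rows = summarize_checkpoints(sample)
epoch_deltas = bytes_per_epoch(sample)
```

On this sample, the gap between the epoch 4 and epoch 5 cumulative totals equals the `training_bytes_this_epoch` value recorded for epoch 5, which is a quick consistency check worth running on the full file.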
run_purpose.txt ADDED
@@ -0,0 +1,4 @@
+ data: polymer
+ epoch: 5
+ concatenation: 10
+ architecture: 2-stage
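The four lines of `run_purpose.txt` follow a plain `key: value` layout. As a minimal sketch, assuming that layout holds for the whole file (the function name is hypothetical and values are kept as strings, since the file does not declare types), they can be read into a dictionary like this:

```python
def parse_run_purpose(text):
    """Parse simple 'key: value' lines into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition(":")
        settings[key.strip()] = value.strip()
    return settings


run_purpose = parse_run_purpose(
    "data: polymer\nepoch: 5\nconcatenation: 10\narchitecture: 2-stage"
)
```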
test_smiles.txt ADDED
@@ -0,0 +1,5 @@
+ *C(=O)c1ccc(C(=O)Nc2ccc(C(*)=O)cc2N)cc1 *OP(=O)(Oc1ccc(I)cc1)Oc1ccc(C(C)(C)c2ccc(*)cc2)c([N+](=O)[O-])c1 */C=C/C(C=C)CCNC(=O)CCCCCCCCCCC(=O)N* *C(=O)Oc1cc(C)c(*)cc1I *Oc1ccc(Oc2ccc(Nc3ccc(NC(=O)c4ccc(*)cc4)cc3)cc2)cc1 *CC(*)(C)C(=O)OCCCCOc1ccc(-c2ccccc2)c(Cl)c1 */N=N/c1ccccc1C(C)CCC(*)=O *CC(*)C(=O)OCCN(CCC)c1ccc(-c2ccc(-c3ccc(C#N)cc3)cc2)cc1 *CCCCCCCCCCC(C)NC(=O)CCCCCCCCCCCCCCCCC(=O)N* *O[Si](*)(C)CCCCCCCCC(=O)OCCCCCCOc1ccc(/C=C/c2ccc(OCCCC)cc2)cc1
+ *Oc1ccc(C(C)(C)c2ccc(OC(=O)COc3ccc([PH](*)(=O)=O)cc3)cc2)cc1 *C(=O)NSCCCCCCCCCCNC(=O)C(*)(C)CC *Oc1ccc(C(C)(CC)c2ccc(Oc3ccc(C(=O)c4ccc(NC(=O)c5cc(C(*)=O)cc(C(F)(F)F)c5)cc4)cc3)cc2)cc1 *C(=O)NCc1ccc(C(C)(C)c2ccc(*)o2)c(C)c1 *CC(*)OC(C#N)C[N+](c1ccccc1)c1ccc([N+](=O)[O-])cc1 *CCOCCOP(=O)(Oc1ccc(C#N)cc1)Oc1ccc(O*)cc1 *Oc1ccc(S(=O)(=O)c2ccc(Oc3ccc(/C=C/c4cc(C)c(*)c(OC)c4)cc3C)cc2)cc1Cl *CC(*)(C)C(=O)NCCCCCOc1ccc(/N=N/c2ccc(F)cc2)cc1 *CCOCCOCCOCCOCCOc1ccc(/N=C/CCCCCCCCCCCC/N=N/c2ccc(OCCCCCCCCCC)cc2)cc1* *CC(*)C(=O)NC(C#N)C(=O)OCCCCCCCCC
+ *Oc1ccc(CC(NC(=O)Oc2ccc(OC)cc2)C(*)C)cc1 *C=C(CC)S(=O)(=O)c1ccc(*)cc1 *CC(CCC(*)c1ccccc1-c1ccccc1)COc1ccc2ccccc2c1 *CC(*)(COCC)C(=O)OCCCCCOc1ccc(OC(=O)c2ccc(OCCCCC)cc2)cc1 *CCCCCCCCCCCCCCCCCCNC(=O)CCCCC(=O)NCCC(*)=O *Oc1ccc(Oc2ccc(S(=O)(=O)c3ccc(Oc4ccc(C(=O)C(=O)c5ccc(*)c(Br)c5)c(Cl)c4)nc3)cc2)cc1 *CCCCCCCCCCC(=O)NCCC(=O)NNC(=O)CCCO* *N=P(*)(OCCOCCOC)OCCOCC(O)CO *Nc1ccc(NC(=O)C(C)C(*)=CC(=O)Nc2ccc(C#N)cc2)cc1 *O[Si](*)(C)CCCCCCCCCCCOc1ccc(OC(=O)c2ccc(OCc3ccc(OCCCCCCCCCCCCCC)c(C(=O)OCCCC)c3)cc2)cc1
+ *CCCCCCC(=O)Oc1ccc(OC(=O)c2ccc(/C=[SH]/c3ccc(O*)cc3)cc2)cc1 *c1cc(OCC)c(CCCCCCCC)c(*)c1CCCCCC *CCCCCCOC(=O)c1ccc(C(=O)NCCCCNC(=O)CCCO*)cc1 */C=N/c1ccc(Cc2ccc(CSC(=O)c3ccc(NC(=O)CCCCCCCC(=O)Nc4cccc(C(*)=O)c4)cc3)cc2)cc1 *CC(*)(CC(=O)OCCOCCCCC[N+](=O)[O-])NC(C)C */C=C/C=C/CC=C(C=C(C#N)C#N)/C=C/C=C/C=C/C=C/CNC(=O)CCCCCCCCCCCCCCCCCCC* *CC(*)C(=O)c1ccc(C(=O)OCCCCCCCC)cc1 *CCCOC(=O)SCCCCOCCCCCCCCC(=O)O* *CCCCCCCCCCCCCCNC(=O)CCSCCCCCC(=O)S* *O[Si](*)(CCOCCOc1ccc(S(=O)(=O)O)cc1)c1ccc([N+](=O)[O-])cc1
+ *CCOC(=O)C(CCCCCCc1ccccc1)C(*)C(=O)O *O[Si](C)(C)COC(=O)NCCCCCCCCCCCCCCNC(=O)OCCCCCCOc1ccc(-c2ccc(OC(=O)c3ccc(C(*)=O)cc3)cc2)cc1 *OC(C[SH]C(=O)OC)C(*)OC(=O)c1ccc(O)cc1 *CCCCCCCC(=O)SCCC* *CC(*)c1cc(CO)c(OC)c(O)c1Br *CC(*)(C)C(=O)OCCOC(=O)N(C)CCN(C)c1ccc(OC)cc1 *Oc1ccc(/N=C/c2ccc([Si](*)(C)OCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOCCOC)cc2)cc1 *CCCCCCNC(=O)CCc1ccc(CCNC(=O)O*)cc1 *N=P(*)(OCCOc1ccc(Cl)cc1)Oc1ccc(COC(=O)c2ccccc2)cc1 *CC/C=C(\C(=*)C#N)C(C)=S
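Each line of `test_smiles.txt` holds several space-separated polymer SMILES strings, with `*` marking the repeat-unit attachment points. The first line contains ten entries, which would be consistent with `concatenation: 10` in `run_purpose.txt`, though the files do not state that link explicitly. The sketch below uses plain string operations only; the function names are hypothetical and nothing here validates the chemistry.

```python
def split_smiles_line(line):
    """Split one line of the test file into individual SMILES strings."""
    return line.split()


def attachment_points(smiles):
    """Count '*' wildcard characters, used here as attachment-point markers."""
    return smiles.count("*")


# Two short entries copied from the file above, joined as they appear on a line.
line = "*C(=O)Oc1cc(C)c(*)cc1I *CCCCCCCC(=O)SCCC*"
units = split_smiles_line(line)
stars = [attachment_points(u) for u in units]
```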
visualizations/predictions/predictions_bytes_152,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5c891d8026d4a291bf5d0953f37e3a60830b4d23e567068d07f50ffdb66e97b1
3
+ size 9837
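The `.pkl.gz` entries in this commit are Git LFS pointer files rather than the binary payloads: each records a spec version, a SHA-256 object id, and the payload size in bytes. A minimal sketch of reading one pointer follows; the function name and the returned dictionary layout are illustrative choices, not part of Git LFS.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.strip().partition(" ")
        if key:
            fields[key] = value
    # The oid is written as '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:5c891d8026d4a291bf5d0953f37e3a60830b4d23e567068d07f50ffdb66e97b1\n"
    "size 9837\n"
)
```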
visualizations/predictions/predictions_bytes_153,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c93e747aca9cdf9ea5348270c2b046c5afb9be63e1ec07da6103dd8fb81a8760
3
+ size 9783
visualizations/predictions/predictions_bytes_154,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9a04b23b12a2e4cf235b16e7b590b2053d2eea452c96bf320563cae7a90edfaa
3
+ size 9689
visualizations/predictions/predictions_bytes_155,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:7c526485994525dfb38ac156f5eff4d45c2c788aecd5972fbd8b15ae11f182f9
3
+ size 9607
visualizations/predictions/predictions_bytes_156,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:65a3b3fa86721b9ad731016e73021c6f222ec2ad6004d79d7ab10f203a3cf702
3
+ size 9544
visualizations/predictions/predictions_bytes_157,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:bd3313373ded547663ae3b47878ab02546970695b3bb10fb39fef114a5b84ed8
3
+ size 9640
visualizations/predictions/predictions_bytes_158,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:af95cf2c87d29376faece8d22bbcfc80fcf8abc78c5d0a3712ed7563b9e64261
3
+ size 9654
visualizations/predictions/predictions_bytes_159,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c1e6469b84b7d649ed2f9aa9a6d85f4b23af61b719b312d488b1046946493fb8
3
+ size 9650
visualizations/predictions/predictions_bytes_160,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d3d3b33dd522322f61257d8c012bff8e61fe62ebf531b74e89ac16c5c81ff95a
3
+ size 9593
visualizations/predictions/predictions_bytes_161,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:de93ac1de06d4c72f0000874233aa1251c4bf334b55f494dd6727f3451911bb3
3
+ size 9524
visualizations/predictions/predictions_bytes_162,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c39e40dedb07cf78fa8edb890020c7295b6da28ccb29a6823a219118f2c8161a
3
+ size 9569
visualizations/predictions/predictions_bytes_163,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:487eb576f9a5e52656eb5dc27a0267bda8981e47132d79575f981a05d1851149
3
+ size 9576
visualizations/predictions/predictions_bytes_164,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8292cd828a23074dfa2cf87ff5acf49d9d8e5a86092d14895382a9bede45faf3
3
+ size 9522
visualizations/predictions/predictions_bytes_165,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4723d23081d884b33dcf813bbd567f3cd8ddd4e769587ce06c153bd09a161866
3
+ size 9447
visualizations/predictions/predictions_bytes_166,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c234d19bcda02be5510b556d5bec3a806cb07c8de7b8143faba8a59e9ef069ef
3
+ size 9442
visualizations/predictions/predictions_bytes_167,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:773b6ef174bac7a148085e53b6d2b0b612eec05623561a6ab4a729a725af3656
3
+ size 9477
visualizations/predictions/predictions_bytes_168,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3431319a220f5ae2e05eaa380b89e1fc3ee2d69d29163b4423c5e12d1d8eb255
3
+ size 9490
visualizations/predictions/predictions_bytes_169,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a0fe65ff8bb42d5e0323636f71a0d74634cb906906eec6c8d177bbf413d285e9
3
+ size 9520
visualizations/predictions/predictions_bytes_170,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:dbb0a78d8de3103006c12b4402f17442643362644757c40b367eb26feeacdbd8
3
+ size 9648
visualizations/predictions/predictions_bytes_171,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6300eb6c515e02fa751183373ab20083b52ef3da9b81969b0bf0bb8abc571057
3
+ size 9549
visualizations/predictions/predictions_bytes_172,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4decfeeb27e2ed8a8bf9e0a96262aeec77ad9dec4d15a26f7fbd66af5efba0eb
3
+ size 9560
visualizations/predictions/predictions_bytes_173,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3a56fe96e2da60dd7fd50841dff8d243c7371948b3839dddd1e35fc76c4f1963
3
+ size 9561
visualizations/predictions/predictions_bytes_174,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:70f751f38d07ba0db4490cf37875b8687fe2470e3e5c85ccf6903176e866799e
3
+ size 9564
visualizations/predictions/predictions_bytes_175,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:57a760b57390abef5ae6fbb64c72e23491d201a4463e493cadbfabcaf2d0cb91
3
+ size 9537
visualizations/predictions/predictions_bytes_176,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0e54724a8ee92d04812c8588224e894b412ce9058ec00004a957220a918e4b61
3
+ size 9505
visualizations/predictions/predictions_bytes_177,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6c3aa4e631019f10004abb4a40d089cf89cb3d79434943e09a685874908c0d16
3
+ size 9510
visualizations/predictions/predictions_bytes_178,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6bbc7284ff71d34a7581dece457e127934691ad4a53173de635fcd6d50be68d3
3
+ size 9557
visualizations/predictions/predictions_bytes_179,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d252c84c5fabbf5e8db953fba9face805f8aadccdf74686c0b85bb87a67e5030
3
+ size 9633
visualizations/predictions/predictions_bytes_180,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8b6adb08ee6235c86fe3e32bcfa02dfcdc0d9453274d06b0e415c1f901158228
3
+ size 9694
visualizations/predictions/predictions_bytes_181,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:489cde43c8f84db88e56bf44c633f78f28f06de9435b0582ec7adb8b29113792
3
+ size 9576
visualizations/predictions/predictions_bytes_182,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:367b42ea228585d251d4687291ca23ceae30c962314e5b43a42324c489daf2d8
3
+ size 9614
visualizations/predictions/predictions_bytes_183,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e388536e4432f6d1928f11b305faf9efcd9400ae04f8c562248c027647ac949f
3
+ size 9510
visualizations/predictions/predictions_bytes_184,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2e298ffc5c564d28e9d90932fd79406d0cc0cf38a34d9db088a02bdd5adaec17
3
+ size 9440
visualizations/predictions/predictions_bytes_185,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d313110e2ccc3083cd3f0326bbdb9080e3d8c3e5fb02eadc8fd33a90cb95271e
3
+ size 9435
visualizations/predictions/predictions_bytes_186,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:86ad54fd35b8499f75dfe71744da9db6d578ca8638692df67212022ee7355b7c
3
+ size 9398
visualizations/predictions/predictions_bytes_187,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b76c3f2b5d062a71cc9359171ced28721929a02ba4ef825ac3db0614f64f7681
3
+ size 9457
visualizations/predictions/predictions_bytes_188,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2f7e21ccc428bb443f6e789e139eacfa262694d8e3c3efc66299ef8a42c6324f
3
+ size 9496
visualizations/predictions/predictions_bytes_189,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:769678dd876315eb9331b32ee81db4ebd4bbc9d4a6468c88022e42791a92c6c4
3
+ size 9561
visualizations/predictions/predictions_bytes_190,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f3044fdef421700619cb86d44dcd9a389035d36b0e1da0f212033da387295fd9
3
+ size 9551
visualizations/predictions/predictions_bytes_191,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f61977065223548e451337a4276e7a98f086c162c17ca7ddf1ba17d2b389b99d
3
+ size 9549
visualizations/predictions/predictions_bytes_192,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:96948acdfee20e135e2126d4e3dd21829540a259b404852d75a6083b81c514c4
3
+ size 9562
visualizations/predictions/predictions_bytes_193,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8762d93ccbbbdf221a616bb8d08beb9a95ac97939b69f23723d78c4ef1198441
3
+ size 9561
visualizations/predictions/predictions_bytes_194,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e0e1fcc9c80e6b0aab52ca9deb1d6a7d28713a0e1c0701d18d4cc25ad2febd38
3
+ size 9630
visualizations/predictions/predictions_bytes_195,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e0ff58cb25b4a7992d548f00f74c9176bec435702aa926ddd3ca132c2c4f7088
3
+ size 9668
visualizations/predictions/predictions_bytes_196,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:36b55dd90c8d357d3394b61e2f007112df233fd0227abaddd278e32b11b1da64
3
+ size 9728
visualizations/predictions/predictions_bytes_197,000,000.pkl.gz ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3ff3e7418bd64b986e33e624c318cf29c08b3a286911260c3521704c3e007ba1
3
+ size 9630
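The prediction files are named after the byte threshold at which they were written, with thousands separators in the name. The sketch below recovers that threshold from a file name and opens a gzip-compressed pickle. It assumes the payloads are ordinary pickles written through gzip, which the `.pkl.gz` extension suggests but the diff does not confirm; the structure of the unpickled object is not shown here, so it is returned as-is. Both function names are hypothetical. Unpickling can execute arbitrary code, so this should only be run on files from a trusted source.

```python
import gzip
import io
import pickle
import re
from pathlib import Path


def threshold_from_name(path):
    """Extract the byte threshold from names like predictions_bytes_152,000,000.pkl.gz."""
    match = re.search(r"predictions_bytes_([\d,]+)\.pkl\.gz$", Path(path).name)
    if match is None:
        raise ValueError(f"unexpected file name: {path}")
    return int(match.group(1).replace(",", ""))


def load_predictions(source):
    """Load a gzip-compressed pickle from a path or a binary file object."""
    with gzip.open(source, "rb") as handle:
        return pickle.load(handle)


threshold = threshold_from_name(
    "visualizations/predictions/predictions_bytes_152,000,000.pkl.gz"
)

# Round trip with an in-memory stand-in, since the real payload layout is unknown.
buffer = io.BytesIO()
with gzip.open(buffer, "wb") as handle:
    pickle.dump({"demo": [1, 2, 3]}, handle)
buffer.seek(0)
restored = load_predictions(buffer)
```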