Update model card and embedded training curves
- README.md +19 -3
- assets/tokens_per_sec.png +0 -0
- assets/train_loss.png +0 -0
- assets/val_perplexity.png +0 -0
README.md
CHANGED
|
@@ -7,14 +7,14 @@ tags:
|
|
| 7 |
- global
|
| 8 |
- 3b
|
| 9 |
- without-metadata
|
| 10 |
-
-
|
| 11 |
---
|
| 12 |
|
| 13 |
# combined_without_metadata_3b
|
| 14 |
|
| 15 |
## Summary
|
| 16 |
|
| 17 |
-
This repo contains the global combined model at the final 10k-step checkpoint for the metadata localization project. It
|
| 18 |
|
| 19 |
## Variant Metadata
|
| 20 |
|
|
@@ -51,6 +51,22 @@ This repo contains the global combined model at the final 10k-step checkpoint fo
|
|
| 51 |
- `min_decay_lr`: `0`
|
| 52 |
- `checkpoint_interval`: `100`
|
| 53 |
|
| 54 |
## Project Context
|
| 55 |
|
| 56 |
This model is part of the metadata localization release. Related checkpoints and variants are grouped in the public Hugging Face collection [Metadata Conditioned LLMs](https://huggingface.co/collections/iamshnoo/metadata-conditioned-llms).
|
|
@@ -58,4 +74,4 @@ This model is part of the metadata localization release. Related checkpoints and
|
|
| 58 |
- Project repository: [https://github.com/iamshnoo/metadata_localization](https://github.com/iamshnoo/metadata_localization)
|
| 59 |
- Paper: [https://arxiv.org/abs/2601.15236](https://arxiv.org/abs/2601.15236)
|
| 60 |
|
| 61 |
-
Last synced: `2026-04-02
|
|
|
|
| 7 |
- global
|
| 8 |
- 3b
|
| 9 |
- without-metadata
|
| 10 |
+
- pretraining
|
| 11 |
---
|
| 12 |
|
| 13 |
# combined_without_metadata_3b
|
| 14 |
|
| 15 |
## Summary
|
| 16 |
|
| 17 |
+
This repo contains the global combined model at the final 10k-step checkpoint for the metadata localization project. It was trained from scratch on the project corpus, using the Llama 3.2 tokenizer and vocabulary.
|
| 18 |
|
| 19 |
## Variant Metadata
|
| 20 |
|
|
|
|
| 51 |
- `min_decay_lr`: `0`
|
| 52 |
- `checkpoint_interval`: `100`
|
| 53 |
|
| 54 |
+
## Training Curves
|
| 55 |
+
|
| 56 |
+
Static plots below were exported from the private Weights & Biases run and embedded here for public access.
|
| 57 |
+
|
| 58 |
+
### Train Loss
|
| 59 |
+
|
| 60 |
+

|
| 61 |
+
|
| 62 |
+
### Validation Perplexity
|
| 63 |
+
|
| 64 |
+

|
| 65 |
+
|
| 66 |
+
### Throughput
|
| 67 |
+
|
| 68 |
+

|
| 69 |
+
|
| 70 |
## Project Context
|
| 71 |
|
| 72 |
This model is part of the metadata localization release. Related checkpoints and variants are grouped in the public Hugging Face collection [Metadata Conditioned LLMs](https://huggingface.co/collections/iamshnoo/metadata-conditioned-llms).
|
|
|
|
| 74 |
- Project repository: [https://github.com/iamshnoo/metadata_localization](https://github.com/iamshnoo/metadata_localization)
|
| 75 |
- Paper: [https://arxiv.org/abs/2601.15236](https://arxiv.org/abs/2601.15236)
|
| 76 |
|
| 77 |
+
Last synced: `2026-04-02 14:40:46 UTC`
|
assets/tokens_per_sec.png
ADDED
|
assets/train_loss.png
ADDED
|
assets/val_perplexity.png
ADDED
|