Update model card and embedded training curves
- README.md +19 -3
- assets/tokens_per_sec.png +0 -0
- assets/train_loss.png +0 -0
- assets/val_perplexity.png +0 -0
README.md
CHANGED
@@ -7,7 +7,7 @@ tags:
  - global
  - 3b
  - without-metadata
- -
+ - pretraining
  - intermediate-checkpoint
  ---

@@ -15,7 +15,7 @@ tags:

  ## Summary

- This repo contains the global combined model exported from the 8k checkpoint for the metadata localization project. It
+ This repo contains the global combined model exported from the 8k checkpoint for the metadata localization project. It was trained from scratch on the project corpus, using the Llama 3.2 tokenizer and vocabulary.

  ## Variant Metadata

@@ -53,6 +53,22 @@ This repo contains the global combined model exported from the 8k checkpoint for
  - `min_decay_lr`: `0`
  - `checkpoint_interval`: `100`

+ ## Training Curves
+
+ Static plots below were exported from the private Weights & Biases run and embedded here for public access.
+
+ ### Train Loss
+
+ 
+
+ ### Validation Perplexity
+
+ 
+
+ ### Throughput
+
+ 
+
  ## Project Context

  This model is part of the metadata localization release. Related checkpoints and variants are grouped in the public Hugging Face collection [Metadata Conditioned LLMs](https://huggingface.co/collections/iamshnoo/metadata-conditioned-llms).

@@ -60,4 +76,4 @@ This model is part of the metadata localization release. Related checkpoints and
  - Project repository: [https://github.com/iamshnoo/metadata_localization](https://github.com/iamshnoo/metadata_localization)
  - Paper: [https://arxiv.org/abs/2601.15236](https://arxiv.org/abs/2601.15236)

- Last synced: `2026-04-02
+ Last synced: `2026-04-02 14:40:40 UTC`
assets/tokens_per_sec.png
ADDED
assets/train_loss.png
ADDED
assets/val_perplexity.png
ADDED