Commit 21d2e05 by iamshnoo (verified) · 1 parent: 964fb77

Update model card and embedded training curves

README.md CHANGED
@@ -14,7 +14,7 @@ tags:
 
  ## Summary
 
- This repo contains the merged chat model for the combined with metadata branch of the metadata localization project. It was produced by supervised fine-tuning on the project QA benchmark after continued pretraining.
+ This repo contains the merged chat model for the combined with metadata branch of the metadata localization project. It was produced by supervised fine-tuning on the project QA benchmark after project pretraining.
 
  ## Variant Metadata
 
@@ -58,6 +58,22 @@ This repo contains the merged chat model for the combined with metadata branch o
  - `per_device_train_batch_size=2`, `gradient_accumulation_steps=8`
  - LoRA targets: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
 
+ ## Training Curves
+
+ Static plots below were exported from the private Weights & Biases run and embedded here for public access.
+
+ ### Train Loss
+
+ ![Train Loss](assets/train_loss.png)
+
+ ### Learning Rate
+
+ ![Learning Rate](assets/learning_rate.png)
+
+ ### Gradient Norm
+
+ ![Gradient Norm](assets/grad_norm.png)
+
  ## Project Context
 
  This model is part of the metadata localization release. Related checkpoints and variants are grouped in the public Hugging Face collection [Metadata Conditioned LLMs](https://huggingface.co/collections/iamshnoo/metadata-conditioned-llms).
@@ -65,4 +81,4 @@ This model is part of the metadata localization release. Related checkpoints and
  - Project repository: [https://github.com/iamshnoo/metadata_localization](https://github.com/iamshnoo/metadata_localization)
  - Paper: [https://arxiv.org/abs/2601.15236](https://arxiv.org/abs/2601.15236)
 
- Last synced: `2026-04-02 13:51:17 UTC`
+ Last synced: `2026-04-02 14:48:16 UTC`
assets/grad_norm.png ADDED
assets/learning_rate.png ADDED
assets/train_loss.png ADDED
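
The batch settings quoted in the README diff (`per_device_train_batch_size=2`, `gradient_accumulation_steps=8`) determine how many sequences feed each optimizer step, which helps when reading the x-axis of the embedded curves. The sketch below works through that arithmetic. The device count is an assumption, since the diff does not state how many GPUs were used, and `effective_batch_size` is a hypothetical helper rather than part of the project code.

```python
# Values quoted from the model card diff.
PER_DEVICE_TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 8


def effective_batch_size(num_devices: int) -> int:
    """Sequences contributing to one optimizer step across all devices."""
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    return PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_devices


if __name__ == "__main__":
    # The device counts here are illustrative assumptions, not project facts.
    for n in (1, 4, 8):
        print(f"{n} device(s): {effective_batch_size(n)} sequences per optimizer step")
```

On a single device this comes to 16 sequences per optimizer step; the figure scales linearly with the number of data-parallel devices.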