Collection: Metadata Conditioned LLMs (91 items)
Pretraining data: English NOW corpus (english-corpora.org/now)
Paper: arxiv.org/abs/2601.15236
Code: github.com/iamshnoo/metadata_localization
This repo contains the merged chat model for the combined_with_metadata_3b branch of the metadata localization project. It was produced by supervised fine-tuning (SFT) on the project's QA benchmark, starting from the project's pretrained checkpoint.
Run
Identifiers: sft_chat, chat, 3b, with_metadata, combined_with_metadata_3b
W&B run name: trim-snowball-4
W&B run URL: https://wandb.ai/iamshnoo/huggingface/runs/7zo3u4gd
State: finished
Runtime: 5h 19m 34s

Final training metrics
train/loss: 1.0289
train/global_step: 7,764
train/epoch: 3
train/learning_rate: 0
train/grad_norm: 0.1125

Hyperparameters
per_device_train_batch_size: 2
gradient_accumulation_steps: 8
learning_rate: 0.0002
num_train_epochs: 3
optim: adamw_bnb_8bit
bf16: True
gradient_checkpointing: True
use_liger_kernel: True

Training setup summary
Method: PEFT / LoRA
Optimizer: adamw_bnb_8bit
Precision and memory: bf16=True, gradient_checkpointing=True, use_liger_kernel=True
Batching: per_device_train_batch_size=2, gradient_accumulation_steps=8
LoRA target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj

Static plots below were exported from the private Weights & Biases run and embedded here for public access.
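The batching values and step count reported for this run imply an effective batch size and a rough per-epoch example count. The short calculation below is only a sketch: the number of training devices is not stated in this card, so `num_devices = 1` is an assumption, and the per-epoch figure is an approximation because a final partial batch or sequence packing would change it.

```python
# Values copied from the run summary in this card.
per_device_train_batch_size = 2
gradient_accumulation_steps = 8
num_train_epochs = 3
global_steps = 7764

# Assumption (not stated in the card): a single training device.
num_devices = 1

# Sequences consumed per optimizer step (the effective batch size).
sequences_per_step = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Optimizer steps per epoch, from the reported global step count.
steps_per_epoch = global_steps // num_train_epochs

# Rough number of training sequences seen per epoch under the assumption above.
approx_sequences_per_epoch = steps_per_epoch * sequences_per_step

print(sequences_per_step)          # 16
print(steps_per_epoch)             # 2588
print(approx_sequences_per_epoch)  # 41408
```

If the run used more than one device, multiply the effective batch size and the per-epoch estimate by the actual device count.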
This model is part of the metadata localization release. Related checkpoints and variants are grouped in the public Hugging Face collection Metadata Conditioned LLMs.
Last synced: 2026-04-02 14:48:05 UTC