Commit History
download model weights on preprocess step (#1693) 5783839 unverified
verbose failure message (#1694) cbbf039 unverified
fix for when sample_packing and eval_sample_packing are different (#1695) 18cabc0 unverified
add back packing efficiency estimate so epochs and multi-gpu works properly (#1697) ed8ef65 unverified
ensure explicit eval_sample_packing to avoid mismatch issues (#1692) 9c1af1a unverified
Phi-3 conversation format, example training script and perplexity metric (#1582) cf64284 unverified
add support for rpo_alpha (#1681) c996881 unverified
re-enable DPO for tests in modal ci (#1374) 1f151c0 unverified
need to add back drop_last for sampler (#1676) 05b0bd0 unverified
cleanup the deepspeed proxy model at the end of training (#1675) d4f6c65 unverified
load explicit splits on datasets (#1652) a944f7b unverified
set chat_template in datasets config automatically (#1664) 9d4225a unverified
use mixins for orpo and kto configs so they work with axolotl customizations (#1674) f7332ac unverified
revert multipack batch sampler changes (#1672) a6b37bd unverified
handle the system role too for chat templates (#1671) b752080 unverified
make sure the CI fails when pytest script fails (#1669) fe650dd unverified
Correct name of MixtralBlockSparseTop2MLP (L -> l) (#1667) 65db903 unverified
Fix: ensure correct handling of `val_set_size` as `float` or `int` (#1655) 6a5a725 unverified
Generalizing the chat_template prompt strategy (#1660) [skip ci] cc11c6b unverified
Keith Stevens committed
support for custom messages field in sharegpt (#1651) bbfed31 unverified
enable loraplus setting for dpo trainer (#1646) a27d5e1 unverified
allow report_to for multiple providers (#1647) 6299eb5 unverified
Fix llama3 chat_template (extra <|eot_id|> on last turn) (#1635) 7c2bf30 unverified
Add KTO support (#1640) 22ae21a unverified
fixes to save on fractional save_steps (#1643) ba45531 unverified
Unsloth optims for Llama (#1609) 8a1572a unverified
add save_only_model option (#1634) 702a669 unverified
Fix `total_num_steps` (#1566) 81da7d2 unverified
FIX: max_length and max_prompt_length were not being sent to ORPOTrainer (#1584) 1e1921b unverified
make sure to save on the last step (#1615) 1634ac8 unverified
fix attention mask collation (#1603) 0298273 unverified
feat: Add LLaMA-3 instruct prompt strategies for fine-tuning (#1553) 50421c8 unverified
adding llama3 fastchat conversation monkeypatch (#1539) b32c08f unverified
ignore the fsdp_config section too (#1606) [skip ci] fff06af unverified
make sure to save the lora adapter at the end of RL/dpo training (#1573) 796a085 unverified
improve tool handling roles (#1587) cb78a36 unverified
feat: exclude mamba blocks for jamba (#1578) 8b9c15b unverified
Pass deepspeed and fsdp as None explicitly when merging adapters to allow custom device_map (#1575) 9e1480e unverified
improve save callbacks (#1592) 29cf15a unverified
FIX: TRL trainer preprocessing step was running in one process (#1583) b9bb169 unverified
Ali Mosavian committed