Commit History
Train parameters exclusively in specific ranges (#1390) 05bcc9e unverified
FSDP + QLoRA (#1378) 9b6ee83 unverified
fix(examples): remove is_*_derived as it's parsed automatically (#1297) a7a9a14 unverified
Add seq2seq eval benchmark callback (#1274) 5a5d474 unverified
Mixtral fixes 20240124 (#1192) [skip ci] 54d2ac1 unverified
Fine-Tuning Mistral-7B for Real-World Chatbot Applications Using Axolotl (LoRA used) (#1155) cc25039 unverified
Set eval_sample_packing to false in mistral config.yaml (#1003) 384b817 unverified
Kevin Sydney committed on
set output_router_logits for mixtral config: (#995) 628b754 unverified
change val size (#992) 93ebec1 unverified
Fix Deepspeed loading (#950) 5ea3aa3 unverified
new evals_per_epoch and saves_per_epoch to make things cleaner (#944) 5f79b82 unverified
Mixtral official (#942) 7fabc4d unverified
update to latest transformers for Mixtral support (#929) 35f9b0f unverified
Mixtral multipack (#928) 68b227a unverified
Feat(wandb): Refactor to be more flexible (#767) a1da39c unverified
feature: loss watchdog for terminating training runs that are failing (#899) 58ec8b1 unverified
don't compile deepspeed or bitsandbytes from source (#837) f544ab2 unverified
fix eval_steps to be a sane default (#797) 8b79ff0 unverified
disable eval table w sample packing in examples (#778) 9b43e7e unverified
simplify by removing duplicate base_model_config (#772) 2d8def6 unverified
Fix: lowercase `True` values in config (#713) ace70b3 unverified
atgctg committed on
Get qlora mistral-7b fine tuning working on a single 4090 (#708) 295b266 unverified
lukemarsden committed on
fix unneeded space (#699) f91db19 unverified
lint 83a950b unverified
new lr, sample pack 4c8ddf2
Fix: Higher vram usage for mistral and sample_packing (#691) 669f1d0 unverified
Adding qlora config for Mistral (#675) d4a88e4 unverified
Abhishek Mishra committed on
prepared dataset caching, other misc fixes (#665) e50a64e unverified
Update mistral/README.md (#647) b88f515 unverified
Adarsh Shirawalmath committed on