Commit History
fix(readme): Clarify doc for tokenizer_config (#1323) [skip ci] 2ed52bd unverified
fix(readme): update inference md link (#1311) [skip ci] 3d2cd80 unverified
Add seq2seq eval benchmark callback (#1274) 5a5d474 unverified
Scheduler implementation of Continual Pre-Training of Large Language Models: How to (re)warm your model? (#1273) 8430db2 unverified
allow the optimizer prune ratio for ReLoRA to be configurable (#1287) 4b997c3 unverified
Update README.md (#1281) b2a4cb4 unverified
add support for https remote yamls (#1277) 9bca7db unverified
allow remote data paths (#1278) 91cf4ee unverified
copy edits (#1276) 1daecd1 unverified
Add link to axolotl cloud image on latitude (#1275) 4a654b3 unverified
contributor avatars (#1269) 411293b unverified
add contact info for dedicated support for axolotl [skip ci] (#1243) dfd1885 unverified
support for true batches with multipack (#1230) 00568c1 unverified
Fix and document test_datasets (#1228) 5787e1a unverified
Peft lotfq (#1222) 4cb7900 unverified
Feat/chatml add system message (#1117) 98b4762 unverified
Mixtral fixes 20240124 (#1192) [skip ci] 54d2ac1 unverified
update docs [skip ci] (#1176) b715cd5 unverified
Fine-Tuning Mistral-7b for Real-World Chatbot Applications Using Axolotl (Lora used) (#1155) cc25039 unverified
Update README.md (#1169) [skip ci] 9135b9e unverified
Ayush Singh committed on
Deprecate max packed sequence len (#1141) 2ce5c0d unverified
feat(dataset): add config to keep processed dataset in memory (#1152) 3db5f2f unverified
Fix link for Minotaur model (#1146) [skip-ci] 08b8ba0 unverified
Agnostic cloud gpu docker image and Jupyter lab (#1097) ece0211 unverified
Add `layers_to_transform` for `lora_config` (#1118) 8487b97 unverified
xzuyn committed on
fix(readme): clarify custom user prompt [no-ci] (#1124) 9cd27b2 unverified
Update README.md (#1103) b502392 unverified
paired kto support (#1069) d7057cc unverified
Add: mlflow for experiment tracking (#1059) [skip ci] 090c24d unverified
Sponsors (#1065) 1496441 unverified
feature: better device mapping for large models (#918) bdfefaf unverified
set default for merge (#1044) 63fb3eb unverified
chore(readme): update instruction to set config to load from cache (#1030) b31038a unverified
use recommended setting for use_reentrant w gradient checkpointing (#1021) 4d2e842 unverified
Adds chat templates (#1022) f8ae59b unverified
feat: remove need to add load_in* during merge (#1017) f6ecf14 unverified
[Docs] Nit: Remind people to auth to wandb if they are going to use it (#1013) dec66d7 unverified
Update README.md (#1012) 76357dc unverified
remove landmark attn and xpos rope implementations (#1010) 70b46ca unverified
Update README.md (#966) d25c34c unverified
fix: switch to using the HuggingFace Transformers NEFT implementation (#941) ef24342 unverified
kallewoof committed on