v9-fix: oracle training routing, calibrated inference, update docs 5a14212 natmin322 committed on Mar 23
v7: C5 Data-Informed Subspace Init + restructure contributions to 2 core claims aeb2d78 natmin322 committed on Mar 21
fix: reduce CPU RAM to prevent OOM SIGKILL - gc.collect, del temps, eval_accumulation_steps, cache cleanup c03ffe2 natmin322 committed on Mar 18
fix: add trust_remote_code=True to second load_dataset call in run_llama.py e9acf77 natmin322 committed on Mar 18
dataset: allow custom dataset code by setting trust_remote_code=True in run_llama.py 55f7d25 natmin322 committed on Mar 18
C4: Spectrally-Conditioned LoRA Training — preconditioned gradient + spectral entropy regularization 2d42b51 natmin322 committed on Mar 17
SpecRoute V3: adaptive bias, symmetric inference, threshold 0.995, batch size optimization 9ea634d natmin322 committed on Mar 17
fix: pass attention_mask directly to model.generate(), not via GenerationConfig 915a112 natmin322 committed on Mar 12
fix: override _save to disable safetensors for T5 shared embedding weights bb4c9d9 natmin322 committed on Mar 12
fix: denumpify_detensorize moved to trainer_utils in transformers 4.40+ a57a027 natmin322 committed on Mar 12
fix: add explicit trainer_pt_utils imports (nested_truncate etc.) removed from trainer.* in 4.40+ e4e078c natmin322 committed on Mar 12
fix: comprehensive transformers 4.40 compat across all trainer files 2e720a9 natmin322 committed on Mar 12