Continual/root_gainlora/src/run_t5.py

Commit History

fix FT score
b9eaa7b

natmin322 committed on

Fix OOM: prev LoRA on CPU + no_grad entire prev contribution (no saved tensors for backward)
3aef764

natmin322 committed on

Fix deprecation warnings: torch.load weights_only, cupy fromDlpack, as_target_tokenizer
3c3ef28

natmin322 committed on

fix: re-initialize trans_input/prompt_key after from_pretrained, add epsilon to norm, use_reentrant=False
d6a9f4f

natmin322 committed on

fix: restore _set_gradient_checkpointing + enable_input_require_grads for gradient checkpointing
2b87f4b

natmin322 committed on

fix: use default use_reentrant=True for gradient checkpointing (model expects it)
008c76c

natmin322 committed on

fix: add FP16 safety check, dataset script check, LoRA sanity check
dfdd675

natmin322 committed on

fix: LoRA reinit, gradient checkpointing use_reentrant=False, checkpoint existence check, collator padding alignment
5299479

natmin322 committed on

add root_gainlora to repo for testing
e2bef95

natmin322 committed on

reduce rubbish
5d89844

natmin322 committed on

fix: trust_remote_code and abs paths for load_dataset
5e730b3

natmin322 committed on

clean code
b8c0787

natmin322 committed on

revert: discard all 2-GPU DataParallel/DDP changes, back to f51f791
8644d30

natmin322 committed on

DDP 2 GPU
aa4ad72

natmin322 committed on

fix: SVD jitter + trans_input.pt exists guard in both root and improve
c1277de

natmin322 committed on

test root
276234e

natmin322 committed on

new change
92ad19e

natmin322 committed on