natmin322/Continual

Continual/root_gainlora/src/run_t5.py

Commit History

fix FT score

b9eaa7b

natmin322 committed on Mar 14

Fix OOM: prev LoRA on CPU + no_grad entire prev contribution (no saved tensors for backward)

3aef764

natmin322 committed on Mar 13

Fix deprecation warnings: torch.load weights_only, cupy fromDlpack, as_target_tokenizer

3c3ef28

natmin322 committed on Mar 13

fix: re-initialize trans_input/prompt_key after from_pretrained, add epsilon to norm, use_reentrant=False

d6a9f4f

natmin322 committed on Mar 13

fix: restore _set_gradient_checkpointing + enable_input_require_grads for gradient checkpointing

2b87f4b

natmin322 committed on Mar 12

fix: use default use_reentrant=True for gradient checkpointing (model expects it)

008c76c

natmin322 committed on Mar 12

fix: add FP16 safety check, dataset script check, LoRA sanity check

dfdd675

natmin322 committed on Mar 12

fix: LoRA reinit, gradient checkpointing use_reentrant=False, checkpoint existence check, collator padding alignment

5299479

natmin322 committed on Mar 12

add root_gainlora to repo for testing

e2bef95

natmin322 committed on Mar 12

reduce rubbish

5d89844

natmin322 committed on Mar 12

fix: trust_remote_code and abs paths for load_dataset

5e730b3

natmin322 committed on Mar 10

update

d8766fe

natmin322 committed on Mar 10

clean code

b8c0787

natmin322 committed on Mar 10

revert: discard all 2-GPU DataParallel/DDP changes, back to f51f791

8644d30

natmin322 committed on Mar 10

DDP 2 GPU

aa4ad72

natmin322 committed on Mar 10

fix: SVD jitter + trans_input.pt exists guard in both root and improve

c1277de

natmin322 committed on Mar 9

test root

276234e

natmin322 committed on Mar 9

new change

92ad19e

natmin322 committed on Mar 9