code2lora/code2lora-gru


code2lora committed about 19 hours ago

Commit 26460cb · verified · 1 Parent(s): bb8d446

Update dataset/model card

Files changed (1)

README.md +28 -0

README.md ADDED

	@@ -0,0 +1,28 @@

+---
+license: mit
+tags: [code, lora, hypernetwork, peft, recurrent]
+---
+# Code2LoRA-GRU — streaming hypernetwork
+Final checkpoint of the **streaming Code2LoRA-GRU** used in the paper. A
+1-layer GRU steps over per-commit diff embeddings, one commit at a time, and
+emits a rank-16 LoRA adapter for `Qwen/Qwen2.5-Coder-1.5B` at *O(1)* cost per commit.
+## Files
+| File | Description |
+|---|---|
+| `code2lora_gru.pt`  | Trained GRU + `Code2LoRAHead` weights (~2.85 GB, fp32). |
+| `metrics.jsonl`     | Per-step training metrics (loss; validation EM, EditSim, and CodeBLEU). |
+## Training recipe
+* 3 epochs of truncated BPTT (window K=16) on
+  `code2lora/code2lora-data-smartcap` (train QnAs) plus
+  `code2lora/code2lora-data-commits` (commit metadata + diff embeddings).
+* AdamW + cosine schedule, max-seq-len 8192, bf16, single H100 80 GB.
+## Companion model
+`code2lora/code2lora-direct` -- the static-snapshot variant.
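
Note on the card's "*O(1)* per commit" claim: the idea can be illustrated with a toy, stdlib-only sketch. It is not the checkpoint's code. `TinyGRUCell`, `TinyLoRAHead`, and all dimensions are hypothetical names and toy sizes chosen for illustration, biases are omitted, and the real `Code2LoRAHead` may be structured quite differently. The sketch only shows that a fixed-size hidden state is updated once per incoming diff embedding, so the work per commit does not grow with the number of earlier commits, and that low-rank factors `A` and `B` can be read off that state.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyGRUCell:
    """Hypothetical single GRU cell (biases omitted for brevity)."""

    def __init__(self, input_dim, hidden_dim, rng):
        # One input matrix and one recurrent matrix per gate:
        # update (z), reset (r), candidate (n).
        self.W = {g: rand_matrix(hidden_dim, input_dim, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        # Reset gate is applied to h before the recurrent product (one common GRU variant).
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # New state interpolates between the candidate and the previous state.
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


class TinyLoRAHead:
    """Hypothetical head mapping a hidden state to LoRA factors A and B."""

    def __init__(self, hidden_dim, d_in, d_out, rank, rng):
        self.d_in, self.d_out, self.rank = d_in, d_out, rank
        self.P_a = rand_matrix(rank * d_in, hidden_dim, rng)
        self.P_b = rand_matrix(d_out * rank, hidden_dim, rng)

    def emit(self, h):
        a_flat = matvec(self.P_a, h)
        b_flat = matvec(self.P_b, h)
        # A has shape (rank, d_in); B has shape (d_out, rank); the update is B @ A.
        A = [a_flat[i * self.d_in:(i + 1) * self.d_in] for i in range(self.rank)]
        B = [b_flat[i * self.rank:(i + 1) * self.rank] for i in range(self.d_out)]
        return A, B


rng = random.Random(0)
# Toy sizes only; the card states rank 16, the other real dimensions are not given.
EMBED_DIM, HIDDEN_DIM, D_IN, D_OUT, RANK = 8, 6, 5, 4, 2
cell = TinyGRUCell(EMBED_DIM, HIDDEN_DIM, rng)
head = TinyLoRAHead(HIDDEN_DIM, D_IN, D_OUT, RANK, rng)

h = [0.0] * HIDDEN_DIM
for _ in range(3):  # three incoming commits
    diff_embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
    h = cell.step(diff_embedding, h)  # one fixed-cost update per commit
    A, B = head.emit(h)               # adapter factors for the current repo state
```

Only `h` has to be kept between commits in this sketch; earlier diff embeddings are never revisited, which is the property the card summarizes as *O(1)* per commit.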