
Code2LoRA-GRU: streaming hypernetwork

This is the final checkpoint of the streaming Code2LoRA-GRU used in the paper. A 1-layer GRU runs over per-commit diff embeddings and emits a rank-16 LoRA adapter for Qwen/Qwen2.5-Coder-1.5B at O(1) cost per commit.
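The streaming idea can be illustrated with a toy, standard-library-only sketch: a GRU cell keeps one hidden state, each new commit embedding triggers a single constant-cost update, and a linear head turns the current state into LoRA factors A and B. Everything here is an assumption made for illustration (the class name `TinyStreamingHypernet`, the tiny dimensions, the random weights, and the flat head layout); it is not the released Code2LoRAHead, and it uses the classic GRU formulation rather than any specific framework's variant.

```python
import math
import random


def _matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def _rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyStreamingHypernet:
    """Toy 1-layer GRU cell plus a linear head that emits LoRA factors (hypothetical)."""

    def __init__(self, embed_dim, hidden_dim, rank, d_in, d_out, seed=0):
        rng = random.Random(seed)
        self.rank, self.d_in, self.d_out = rank, d_in, d_out
        # Input and recurrent weights for the update (z), reset (r), candidate (n) paths.
        self.W = {g: _rand_matrix(hidden_dim, embed_dim, rng) for g in "zrn"}
        self.U = {g: _rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrn"}
        # Head maps the hidden state to the flattened entries of A and B.
        self.head = _rand_matrix(rank * (d_in + d_out), hidden_dim, rng)
        self.h = [0.0] * hidden_dim

    def step(self, diff_embedding):
        """Consume one commit embedding; cost does not depend on history length."""
        x, h = diff_embedding, self.h
        z = [_sigmoid(a + b) for a, b in zip(_matvec(self.W["z"], x), _matvec(self.U["z"], h))]
        r = [_sigmoid(a + b) for a, b in zip(_matvec(self.W["r"], x), _matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(_matvec(self.W["n"], x), _matvec(self.U["n"], rh))]
        self.h = [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]
        return self.h

    def emit_lora(self):
        """Return A with shape (rank, d_in) and B with shape (d_out, rank)."""
        flat = _matvec(self.head, self.h)
        split = self.rank * self.d_in
        A = [flat[i * self.d_in:(i + 1) * self.d_in] for i in range(self.rank)]
        b_flat = flat[split:]
        B = [b_flat[i * self.rank:(i + 1) * self.rank] for i in range(self.d_out)]
        return A, B


# Feed three made-up commit embeddings, then read out the adapter factors.
net = TinyStreamingHypernet(embed_dim=8, hidden_dim=4, rank=2, d_in=6, d_out=5)
data_rng = random.Random(1)
for _ in range(3):
    net.step([data_rng.uniform(-1.0, 1.0) for _ in range(8)])
A, B = net.emit_lora()
```

Because only `self.h` is carried between commits, processing commit number 1,000 costs the same as processing commit number 1, which is the property the O(1) claim refers to.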

Files

File              Description
code2lora_gru.pt  Trained GRU + Code2LoRAHead weights (~2.85 GB, fp32).
metrics.jsonl     Per-step training metrics (loss, val EM/EditSim/CodeBLEU).
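A JSON Lines file such as metrics.jsonl holds one JSON object per line, so it can be scanned with the standard library alone. The sketch below is illustrative only: the key names (`step`, `loss`, `val_em`) and the sample values are made up, since the actual schema of the file is not documented here.

```python
import json


def read_jsonl(lines):
    """Parse an iterable of JSONL lines, skipping blank ones."""
    records = []
    for line in lines:
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


def best_record(records, key, higher_is_better=True):
    """Return the record with the best value for `key`, or None if no record has it."""
    scored = [r for r in records if key in r]
    if not scored:
        return None
    pick = max if higher_is_better else min
    return pick(scored, key=lambda r: r[key])


# Made-up sample lines standing in for the real file contents.
sample = [
    '{"step": 100, "loss": 1.9}',
    '{"step": 200, "loss": 1.5, "val_em": 0.21}',
    '',
    '{"step": 300, "loss": 1.4, "val_em": 0.25}',
]
records = read_jsonl(sample)
best = best_record(records, "val_em")
```

To read the real file, pass an open file handle instead of the sample list, for example `read_jsonl(open("metrics.jsonl", encoding="utf-8"))`, and substitute whatever key names the file actually uses.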

Training recipe

  • 3 epochs of truncated BPTT (window K=16) on code2lora/code2lora-data-smartcap (train QnAs) plus code2lora/code2lora-data-commits (commit metadata + diff embeddings).
  • AdamW + cosine schedule, max-seq-len 8192, bf16, single H100 80 GB.
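Two pieces of the recipe above lend themselves to a small sketch: how a repository's commit sequence splits into truncated-BPTT windows of K=16, and what a cosine learning-rate schedule looks like. This is a generic illustration, not the training script: the helper names, the base learning rate, and the absence of warmup are all assumptions, and the card does not state the actual values.

```python
import math


def tbptt_windows(num_commits, window=16):
    """Split commit indices into consecutive (start, end) windows.

    In truncated BPTT the hidden state is carried from one window to the
    next, but gradients are only propagated within a window.
    """
    return [(s, min(s + window, num_commits)) for s in range(0, num_commits, window)]


def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine decay from base_lr at step 0 to min_lr at total_steps (no warmup assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# A hypothetical repository with 40 commits yields two full windows and one partial one.
windows = tbptt_windows(40, window=16)

# Hypothetical base learning rate, chosen only for the example.
lr_start = cosine_lr(0, 100, base_lr=1e-4)
lr_mid = cosine_lr(50, 100, base_lr=1e-4)
lr_end = cosine_lr(100, 100, base_lr=1e-4)
```

With K=16, memory for backpropagation is bounded by the window length rather than by the full commit history, which is what makes long repositories trainable on a single GPU.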

Companion model

code2lora/code2lora-direct -- the static-snapshot variant.
