# NeuroMem SDA LoRA Calibration Checkpoints
Trained LoRA adapters plus saved SDA address banks for the NeuroMem SDA project. Each subfolder is one calibration run with its own hyperparameters.

- Base model: `meta-llama/Llama-3.2-1B-Instruct`
- Eval set: WikiText-103 test, causal language modeling (`--sda_causal_eval`)
- Eval pipeline: `evals/eval_perplexity.py` from https://github.com/Tejas-JB/NeuroMem (branch `vihaan/phase1-energy-baseline`)
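To pull a single run's files locally, something like the following should work (a minimal sketch using `huggingface_hub`; the run folder name is illustrative, pick one from the leaderboard below):

```python
# Minimal sketch: fetch one calibration run's subfolder from the Hub.
# Assumes huggingface_hub is installed; the run name below is an example.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="vhn1s/neuromem-lora-checkpoints",
    allow_patterns=["run_k_M16k_k32_r256_6k/*"],  # only this run's files
)
print(local_dir)  # root of the downloaded snapshot
```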
## Current Leaderboard (in our local env; see Known Limitation below)
| Run | M | k | LoRA r | Steps | Train CE | Eval PPL (5-win ctx=512, causal) |
|---|---|---|---|---|---|---|
| run_d (lost, evaluated only) | 16384 | 32 | 64 | 4000 | 4.93 | 414 (best) |
| run_f (lost, evaluated only) | 8192 | 16 | 128 | 4000 | 3.65 | 432 (ctx=1024) / 447 (ctx=512) |
| run_e (lost, evaluated only) | 8192 | 16 | 64 | 4000 | 5.07 | 523 |
| run_k_M16k_k32_r256_6k | 16384 | 32 | 256 | 6000 | 1.25 | 1,059 (overfit) |
| sweep_rank128_2k | 8192 | 16 | 128 | 2000 | - | pending Benji eval |
| sweep_rank64_2k | 8192 | 16 | 64 | 2000 | - | pending Benji eval |
| run_001_rank16_2k | 8192 | 16 | 16 | 2000 | 7.94 | 5,828 |
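As a rough sanity check on the overfit flag: train cross-entropy converts to train perplexity via PPL = exp(CE), so run_k's CE of 1.25 corresponds to a train PPL of roughly 3.5 against an eval PPL of 1,059, a large train/eval gap consistent with overfitting:

```python
import math

# Train perplexity is exp(cross-entropy); compare against eval PPL in the table.
train_ce = 1.25             # run_k_M16k_k32_r256_6k
print(math.exp(train_ce))   # ~3.49 train PPL vs 1,059 eval PPL -> overfit
```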
In-flight (auto-uploaded as they finish):
- `run_j_M32k_r128_8k`: M=32k, rank=128, 8000 steps
- `run_l_M32k_r256_8k`: M=32k, rank=256, 8000 steps
- `run_p_M16k_k32_r64_8k`: M=16k/k=32, rank=64, 8000 steps (sweet-spot push)
- `run_n_M32k_r256_12k`: chained after run_j
- `run_o_M16k_k32_r256_12k`: chained after run_l
## Known Limitation (please read before using)
The Eval PPL numbers above are from our local environment, where raw uncalibrated SDA produces PPL ~9,781 (under causal eval) instead of the published baseline of PPL ~17.95. We have a documented but unresolved environmental regression in our SDA forward pass, likely a CUDA/cuDNN-level fp16 numerical issue that doesn't affect Benji's setup.

The relative improvements are valid (rank-16 → 64 yields a ~10× PPL drop in our env, confirming capacity scaling), but the absolute PPL values are pessimistic. Re-evaluating these checkpoints in the original SDA paper environment is needed for paper-quality absolute numbers.
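If you want to check whether your own environment shows the same fp16 drift, one quick probe (a sketch, assuming the NeuroMem repo is on your `PYTHONPATH` and a CUDA GPU is available) is to compare SDA-patched logits in fp16 vs fp32 on a short prompt; a large divergence suggests the same numerical issue:

```python
# Sketch: fp16-vs-fp32 probe of the SDA forward pass (raw, uncalibrated).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from src.sda.wrapper import patch_all_layers

name = "meta-llama/Llama-3.2-1B-Instruct"
tok = AutoTokenizer.from_pretrained(name)
ids = tok("The quick brown fox", return_tensors="pt").input_ids.cuda()

logits = {}
for dtype in (torch.float16, torch.float32):
    model = AutoModelForCausalLM.from_pretrained(
        name, dtype=dtype, attn_implementation="eager"
    ).cuda().eval()
    patch_all_layers(model, M=8192, r=0.3601)  # raw SDA, no calibration
    with torch.no_grad():
        logits[dtype] = model(ids).logits.float()
    del model  # free VRAM before loading the next precision

# A large max |diff| here is consistent with the fp16 regression above.
print((logits[torch.float16] - logits[torch.float32]).abs().max())
```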
## How to use a checkpoint
```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    dtype=torch.float16,
    attn_implementation="eager",
)

# Patch with SDA, see https://github.com/Tejas-JB/NeuroMem/blob/vihaan/phase1-energy-baseline/src/sda/wrapper.py
from src.sda.wrapper import patch_all_layers
patch_all_layers(model, M=8192, r=0.3601)  # match the checkpoint's M, r

# Load LoRA + SDA banks
model = PeftModel.from_pretrained(model, "<path/to/checkpoint/dir>")
sda_banks = torch.load("<path/to/sda_address_banks.pt>", map_location="cpu", weights_only=True)
# (See `load_sda_checkpoint` at evals/eval_perplexity.py:268 for the full restore flow)
```
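Continuing from the snippet above, a quick smoke test that the patched, LoRA-loaded model actually runs (this scores one short string; it is not the full eval, and it skips the bank restore handled by `load_sda_checkpoint`):

```python
# Smoke test: per-token loss on a short string with the restored model.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")
batch = tok("Sparse distributed addressing is", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model(**batch, labels=batch["input_ids"])
print(torch.exp(out.loss))  # perplexity on this snippet (sanity check only)
```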
Run the eval:

```bash
python evals/eval_perplexity.py \
  --model meta-llama/Llama-3.2-1B-Instruct \
  --attention sda --M <checkpoint_M> --k <checkpoint_k> --r <checkpoint_r> \
  --patch_layers all --attn_implementation eager \
  --sda_checkpoint <path/to/checkpoint> \
  --dataset wikitext-103 --split test \
  --batch_size 1 --context_length 2048 \
  --output eval_result.json
```
Add `--sda_causal_eval` for spec-compliant causal evaluation (slow but correct).
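The output JSON can then be inspected directly (the schema is defined by the script, so this just pretty-prints whatever was written):

```python
import json

# Pretty-print eval_perplexity.py's output; key names are script-defined.
with open("eval_result.json") as f:
    print(json.dumps(json.load(f), indent=2))
```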