# ApexCoder-1.5B Β· LoRA Adapter

Last updated: 2026-03-20 β€” Cycle 2

Lightweight LoRA adapter (~150 MB) for the ApexCoder model. Apply on top of Gianloko/apex-coder-1.5b β€” no need to re-download the full 3 GB merged model every cycle.

| Adapter config | Value |
|---|---|
| Base model | `Gianloko/apex-coder-1.5b` |
| LoRA rank (r) | 32 |
| LoRA alpha | 64 |
| Target modules | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`, `embed_tokens`, `lm_head` |
| Training loss | 0.2274 |
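With r = 32 and lora_alpha = 64, the adapter's low-rank update is scaled by alpha/r = 2.0 before being added to the frozen base weight. A minimal NumPy sketch of that update rule (toy layer sizes, not the real model dimensions):

```python
import numpy as np

r, alpha = 32, 64           # values from the adapter config above
scaling = alpha / r         # LoRA scaling factor = 2.0

d_in, d_out = 8, 8          # toy layer sizes, purely illustrative
W = np.random.randn(d_out, d_in)   # frozen base weight
A = np.random.randn(r, d_in)       # LoRA A matrix (random init)
B = np.zeros((d_out, r))           # LoRA B matrix (zero init)

# Effective weight seen at inference: W + scaling * (B @ A)
W_eff = W + scaling * (B @ A)

# Because B starts at zero, the adapter is a no-op before training
print(np.array_equal(W_eff, W))  # True
```

This is why a freshly initialized LoRA adapter leaves the base model's behavior unchanged until training moves B away from zero.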

## πŸ“Š Evaluation β€” Cycle 2

| Metric | Value |
|---|---|
| LLM-as-judge (avg) | 12.6/15 |
| Perplexity | 1.14 |
| Ξ” vs previous cycle | +12.6 |
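Perplexity is the exponential of the mean per-token negative log-likelihood, so 1.14 means the model assigns its eval tokens an average probability of roughly 1/1.14 β‰ˆ 0.88. A quick sketch of the computation with made-up token losses (not the actual eval data):

```python
import math

# Hypothetical per-token negative log-likelihoods (in nats)
nlls = [0.10, 0.16, 0.13, 0.11]

# Perplexity = exp(mean NLL)
ppl = math.exp(sum(nlls) / len(nlls))
print(round(ppl, 3))
```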

## Cycle history

| Cycle | Date | Score | PPL | Ξ” | vs Published |
|---|---|---|---|---|---|
| 1 | 2026-03-20 | 12.9/15 | 1.17 | +12.9 | 12.9 |
| 2 | 2026-03-20 | 12.6/15 | 1.14 | +12.6 | 13.2 |

## πŸš€ Quick start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model
base = AutoModelForCausalLM.from_pretrained(
    "Gianloko/apex-coder-1.5b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Gianloko/apex-coder-1.5b")

# Apply the LoRA adapter
model = PeftModel.from_pretrained(base, "Gianloko/apex-coder-1.5b-lora")
model = model.merge_and_unload()  # optional: fuse weights for faster inference

messages = [
    {"role": "system", "content": "You are ApexCoder, a world-class Salesforce expert."},
    {"role": "user",   "content": "Write a bulkified Apex trigger on Opportunity..."},
]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

# Greedy decoding; `temperature` only has an effect when do_sample=True
output = model.generate(inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output[0][inputs.shape[1]:], skip_special_tokens=True))
```

## License

Apache 2.0
