# ApexCoder-1.5B · Merged 16-bit Model
Last updated: 2026-03-20 · Cycle 2
Production-ready merged model (base + LoRA fused into 16-bit weights). Trained on a single NVIDIA A40 (44 GB) using Unsloth QLoRA + TRL SFTTrainer.
Looking for a smaller download? Use the LoRA adapter (~150 MB) or the GGUF Q4_K_M (986 MB) for Ollama.
## Evaluation (Cycle 2)
| Metric | Value |
|---|---|
| LLM-as-judge (avg) | 12.6/15 |
| Perplexity | 1.14 |
| Δ vs previous cycle | +12.6 |
| Training loss | 0.2274 |
| Training samples | 8,990 |
| Training steps | 1,100 |
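For context, perplexity is conventionally defined as the exponential of the mean per-token cross-entropy loss. Assuming the reported PPL follows that standard convention (an assumption; the card does not state how it was computed), a PPL of 1.14 implies an eval loss of roughly 0.131:

```python
import math

# Standard relation between perplexity and mean cross-entropy loss:
# PPL = exp(loss), so loss = ln(PPL).
# (Assumes the reported PPL uses this convention; not stated in the card.)
ppl = 1.14
implied_loss = math.log(ppl)
print(f"implied per-token eval loss: {implied_loss:.3f}")  # ~0.131
```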
### By reasoning type
| Type | Status | Score | Progress |
|---|---|---|---|
### Cycle history
| Cycle | Date | Score | PPL | Δ | vs Published |
|---|---|---|---|---|---|
| 1 | 2026-03-20 | 12.9/15 | 1.17 | +12.9 | 12.9 |
| 2 | 2026-03-20 | 12.6/15 | 1.14 | +12.6 | 13.2 |
## Quick start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "Gianloko/apex-coder-1.5b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Gianloko/apex-coder-1.5b")

messages = [
    {"role": "system", "content": "You are ApexCoder, a world-class Salesforce expert."},
    {"role": "user", "content": "Write a bulkified Apex trigger on Opportunity that prevents status changes to Closed Won if no related Products exist."},
]

inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

# temperature is ignored when do_sample=False, so use plain greedy decoding
output = model.generate(inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[1]:], skip_special_tokens=True))
```
## Ollama (GGUF, recommended for local use)
```shell
ollama pull hf.co/Gianloko/apex-coder-1.5b-GGUF:Q4_K_M
ollama run hf.co/Gianloko/apex-coder-1.5b-GGUF:Q4_K_M
```
## LoRA adapter
If you already have the base model loaded, use the LoRA adapter (~150 MB) instead:
```python
from peft import PeftModel

# base_model: the already-loaded base model this adapter was trained on
model = PeftModel.from_pretrained(base_model, "Gianloko/apex-coder-1.5b-lora")
```
## V6 pipeline notes
- Warm-start training: cycle 2+ initialises from the previous cycle's LoRA adapter
- Best-ever gate: publishing is blocked if the new model regresses vs the published model
- Data quality: validated with langdetect + a non-ASCII ratio filter
- CanaryCallback: 3 probes per epoch; a majority failing aborts training
- Post-merge validation: 3 sanity + 3 hallucination probes gate every push
- Dataset versioned: cycle tags on Hugging Face give full rollback capability
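The best-ever gate described above can be sketched in a few lines. This is an illustrative sketch, not the actual V6 pipeline code; the function name and signature are assumptions:

```python
from typing import Optional

def should_publish(new_score: float, published_score: Optional[float]) -> bool:
    """Best-ever gate (illustrative): block the push if the new cycle's
    judge score regresses versus the currently published model."""
    if published_score is None:
        return True  # nothing published yet, so the first cycle always ships
    return new_score >= published_score

print(should_publish(13.2, 12.9))  # an improved score passes the gate
```

Combined with the warm-start step, this makes regression-free publishing the invariant: a cycle can train from the previous adapter, but its weights only replace the published model if the judge score does not drop.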
## License
Apache 2.0