sero-nouscoder-14b-sft
A personal coding assistant fine-tuned on 11,711 real coding conversations from my daily development work.
Model Details
| Property | Value |
|---|---|
| Base Model | NousResearch/NousCoder-14B |
| Parameters | 14.8B |
| Architecture | Qwen3-based decoder-only transformer |
| Training Method | QLoRA (4-bit quantization + LoRA r=64) |
| Training Tokens | ~51.75 million |
| Final Loss | 0.685 |
| Token Accuracy | 81.6% |
| License | Apache 2.0 |
The Experiment
Why I Did This
I've accumulated thousands of coding conversations with AI assistants over the past year. These conversations represent my actual coding style, problem-solving patterns, and domain expertise across:
- Solidity/Web3 - Smart contracts, DeFi protocols, ethers.js
- TypeScript/Node.js - Backend services, API development
- Python - Scripts, data processing, automation
- SQL - Database queries, schema design
- DevOps - Docker, deployment, infrastructure
The goal: create a coding assistant that thinks like me and understands my codebase patterns.
Data Extraction Pipeline
Raw Data Sources
├── Claude Projects conversations (233MB)
├── Claude chat history exports
├── Cursor IDE conversations
└── Various AI assistant logs
        │
        ▼
Extraction & Parsing
├── Parse JSONL conversation logs
├── Extract message pairs (user/assistant)
├── Normalize formats across sources
└── Deduplicate conversations
        │
        ▼
Security Scanning
├── Regex patterns for API keys, tokens
├── Private key detection
├── Path/username redaction
└── Quarantine flagged entries
        │
        ▼
Quality Filtering
├── Remove empty/trivial exchanges
├── Filter non-code conversations
├── Length-based filtering
└── Train/dev/test split (90/5/5)
        │
        ▼
Final Dataset
├── train.jsonl (11,711 conversations)
├── dev.jsonl (107 conversations)
└── test.jsonl (123 conversations)
Security Note: 95,561 conversations were quarantined due to potentially sensitive content. Only verified clean data was used for training.
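For illustration, the scanning stage boils down to regex screens over each conversation. The sketch below is a simplified stand-in: the patterns and file names are examples, not the exact rules used in the pipeline.

import json
import re

# Illustrative patterns only -- the real pipeline used a larger rule set.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                 # OpenAI-style API keys
    re.compile(r"ghp_[A-Za-z0-9]{36}"),                 # GitHub personal access tokens
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key IDs
    re.compile(r"0x[a-fA-F0-9]{64}"),                   # 32-byte hex strings (possible private keys)
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private keys
]

def is_clean(conversation: dict) -> bool:
    """Return False if any message matches a secret pattern."""
    text = " ".join(m.get("content", "") for m in conversation["messages"])
    return not any(p.search(text) for p in SECRET_PATTERNS)

with open("all_conversations.jsonl") as src, \
     open("clean.jsonl", "w") as clean, \
     open("quarantine.jsonl", "w") as quarantine:
    for line in src:
        convo = json.loads(line)
        (clean if is_clean(convo) else quarantine).write(line)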
Dataset Statistics
| Split | Conversations | Purpose |
|---|---|---|
| Train | 11,711 | Model training |
| Validation | 107 | Hyperparameter tuning |
| Test | 123 | Final evaluation |
| Quarantined | 95,561 | Excluded (potential secrets) |
Sample Training Conversation
{
"messages": [
{
"role": "user",
"content": "For the code present, we get this error:\n```\nConversion of type 'Contract' to type 'ICoboSafeAccount' may be a mistake...\n```\nHow can I resolve this?"
},
{
"role": "assistant",
"content": "The error occurs because TypeScript doesn't automatically recognize that the Contract instance implements the ICoboSafeAccount interface. You can fix this by using a double type assertion:\n\n```typescript\nconst coboSafeAccount = new ethers.Contract(\n coboSafeAccountAddress,\n COBO_SAFE_ACCOUNT_ABI,\n signer\n) as unknown as ICoboSafeAccount;\n```"
}
]
}
Training Details
Configuration
# Model
MODEL_ID = "NousResearch/NousCoder-14B"
# QLoRA Config
BITS = 4
LORA_R = 64
LORA_ALPHA = 128
LORA_DROPOUT = 0.05
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj",
"gate_proj", "up_proj", "down_proj"]
# Training
BATCH_SIZE = 2
GRADIENT_ACCUMULATION = 8 # Effective batch size: 16
LEARNING_RATE = 2e-5
EPOCHS = 3
MAX_LENGTH = 4096
PACKING = True # Efficient sequence packing
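For orientation, here is a minimal sketch of how these values could plug into a TRL + PEFT QLoRA run. It is not the exact training script used for this model; dataset paths assume the split files above, and a few argument names (max_seq_length vs. max_length, tokenizer vs. processing_class) differ across TRL/transformers releases.

import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

# 4-bit (NF4) base model for QLoRA
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# LoRA adapter over the attention and MLP projections listed above
peft_config = LoraConfig(
    r=LORA_R,
    lora_alpha=LORA_ALPHA,
    lora_dropout=LORA_DROPOUT,
    target_modules=TARGET_MODULES,
    task_type="CAUSAL_LM",
)

dataset = load_dataset("json", data_files={"train": "train.jsonl", "validation": "dev.jsonl"})

trainer = SFTTrainer(
    model=model,
    args=SFTConfig(
        output_dir="./sero-nouscoder-14b-sft",
        per_device_train_batch_size=BATCH_SIZE,
        gradient_accumulation_steps=GRADIENT_ACCUMULATION,
        learning_rate=LEARNING_RATE,
        num_train_epochs=EPOCHS,
        max_seq_length=MAX_LENGTH,  # renamed to max_length in newer TRL releases
        packing=PACKING,
        bf16=True,
    ),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    peft_config=peft_config,
    processing_class=tokenizer,  # `tokenizer=` on older TRL versions
)
trainer.train()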
Infrastructure
- Platform: HuggingFace Jobs
- GPU: NVIDIA A100 80GB
- Training Time: ~18 hours (timed out at 93% completion)
- Cost: ~$45 USD
Training Progress
| Epoch | Step | Loss | Token Accuracy | Learning Rate |
|---|---|---|---|---|
| 0.03 | ~10 | 1.355 | 71.2% | 2.0e-5 |
| 0.28 | ~80 | 0.920 | 77.2% | 1.9e-5 |
| 0.54 | ~160 | 0.781 | 79.5% | 1.8e-5 |
| 1.04 | ~320 | 0.743 | 80.4% | 1.5e-5 |
| 1.55 | ~480 | 0.711 | 80.8% | 1.1e-5 |
| 2.05 | ~640 | 0.722 | 80.7% | 6.5e-6 |
| 2.52 | ~800 | 0.705 | 81.2% | 1.4e-6 |
Loss Curve
Loss
1.4 │ ●
    │  ╲
1.2 │   ╲
    │    ╲
1.0 │     ●
    │      ╲
0.8 │       ●──────
    │              ╲
0.7 │               ●─────────
    │
0.6 │
    └─────────────────────────────────
      0    0.5    1.0    1.5    2.0    2.5   Epoch
The model showed strong convergence:
- Rapid initial loss drop (1.35 → 0.78 in the first 0.5 epochs)
- Stable training through epochs 1-2
- Final loss plateau around 0.70
Usage
With Transformers + PEFT
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
# Load base model + LoRA adapter
base = AutoModelForCausalLM.from_pretrained(
"NousResearch/NousCoder-14B",
torch_dtype=torch.bfloat16,
device_map="auto",
)
model = PeftModel.from_pretrained(base, "0xSero/sero-nouscoder-14b-sft")
tokenizer = AutoTokenizer.from_pretrained("0xSero/sero-nouscoder-14b-sft")
# Generate
messages = [{"role": "user", "content": "Write a Solidity ERC20 token with permit"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
With vLLM (Recommended for Serving)
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest
llm = LLM(
model="NousResearch/NousCoder-14B",
enable_lora=True,
max_lora_rank=64,
)
outputs = llm.generate(
["Explain how to deploy a contract with ethers.js v6"],
SamplingParams(temperature=0.7, max_tokens=512),
lora_request=LoRARequest("sero", 1, "0xSero/sero-nouscoder-14b-sft")
)
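The adapter can also be exposed through vLLM's OpenAI-compatible server, e.g. `vllm serve NousResearch/NousCoder-14B --enable-lora --max-lora-rank 64 --lora-modules sero=0xSero/sero-nouscoder-14b-sft` (flag names may differ slightly across vLLM releases); requests then select the adapter with `model="sero"`.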
Merge Adapter for Standalone Model
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel
base = AutoModelForCausalLM.from_pretrained(
"NousResearch/NousCoder-14B",
torch_dtype=torch.bfloat16,
device_map="cpu",
)
model = PeftModel.from_pretrained(base, "0xSero/sero-nouscoder-14b-sft")
merged = model.merge_and_unload()
merged.save_pretrained("./sero-nouscoder-merged")
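If you plan to serve the merged weights directly, also save the tokenizer and chat template next to them (e.g. `AutoTokenizer.from_pretrained("0xSero/sero-nouscoder-14b-sft").save_pretrained("./sero-nouscoder-merged")`) so the output directory is self-contained.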
VRAM Requirements
| Precision | VRAM Required |
|---|---|
| bfloat16 (full) | ~30GB |
| 8-bit (bitsandbytes) | ~16GB |
| 4-bit (GPTQ/AWQ) | ~8GB |
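GPTQ/AWQ exports are still on the roadmap (see Next Steps). In the meantime, a rough sketch of on-the-fly 4-bit loading with bitsandbytes, which lands in a similar memory range; exact VRAM use depends on context length and quantization method:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Load the base model with NF4 quantization, then attach the LoRA adapter
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "NousResearch/NousCoder-14B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "0xSero/sero-nouscoder-14b-sft")
tokenizer = AutoTokenizer.from_pretrained("0xSero/sero-nouscoder-14b-sft")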
Limitations
- Domain Focused: Optimized for Solidity, TypeScript, and Python; may underperform on other languages
- 93% Trained: Training timed out before completing epoch 3 (2.52/3.0 epochs)
- Personal Style: Tuned to my coding patterns, which may not generalize to all users
- LoRA Adapter: Requires base model + adapter loading (not standalone)
Files
sero-nouscoder-14b-sft/
├── adapter_config.json           # LoRA configuration
├── adapter_model.safetensors     # Trained LoRA weights (USE THIS)
├── tokenizer.json                # Tokenizer
├── tokenizer_config.json         # Tokenizer config
├── special_tokens_map.json       # Special tokens
├── chat_template.jinja           # Chat template
└── last-checkpoint/              # Training checkpoint (for resuming)
    ├── optimizer.pt
    ├── scheduler.pt
    ├── trainer_state.json
    └── ...
Next Steps
- DPO alignment training on preference pairs
- GPTQ/AWQ quantization for consumer GPU deployment
- Evaluation on coding benchmarks
- Tool/agent fine-tuning on 136K tool trajectory events
Citation
@misc{sero-nouscoder-14b-sft,
author = {0xSero},
title = {sero-nouscoder-14b-sft: Personal Coding Assistant},
year = {2026},
publisher = {HuggingFace},
url = {https://huggingface.co/0xSero/sero-nouscoder-14b-sft}
}
Acknowledgments
- NousResearch for the excellent NousCoder-14B base model
- HuggingFace for the Jobs compute platform
- The TRL and PEFT teams for making fine-tuning accessible
Built with ~$45 of compute and 11,711 real coding conversations.