sero-nouscoder-14b-sft

A personal coding assistant fine-tuned on 11,711 real coding conversations from my daily development work.

Model Details

| Property | Value |
|---|---|
| Base Model | NousResearch/NousCoder-14B |
| Parameters | 14.8B |
| Architecture | Qwen3-based decoder-only transformer |
| Training Method | QLoRA (4-bit quantization + LoRA r=64) |
| Training Tokens | ~51.75 million |
| Final Loss | 0.685 |
| Token Accuracy | 81.6% |
| License | Apache 2.0 |

The Experiment

Why I Did This

I've accumulated thousands of coding conversations with AI assistants over the past year. These conversations represent my actual coding style, problem-solving patterns, and domain expertise across:

  • Solidity/Web3 - Smart contracts, DeFi protocols, ethers.js
  • TypeScript/Node.js - Backend services, API development
  • Python - Scripts, data processing, automation
  • SQL - Database queries, schema design
  • DevOps - Docker, deployment, infrastructure

The goal: create a coding assistant that thinks like me and understands my codebase patterns.

Data Extraction Pipeline

Raw Data Sources
β”œβ”€β”€ Claude Projects conversations (233MB)
β”œβ”€β”€ Claude chat history exports
β”œβ”€β”€ Cursor IDE conversations
└── Various AI assistant logs
         β”‚
         β–Ό
    Extraction & Parsing
    β”œβ”€β”€ Parse JSONL conversation logs
    β”œβ”€β”€ Extract message pairs (user/assistant)
    β”œβ”€β”€ Normalize formats across sources
    └── Deduplicate conversations
         β”‚
         β–Ό
    Security Scanning
    β”œβ”€β”€ Regex patterns for API keys, tokens
    β”œβ”€β”€ Private key detection
    β”œβ”€β”€ Path/username redaction
    └── Quarantine flagged entries
         β”‚
         β–Ό
    Quality Filtering
    β”œβ”€β”€ Remove empty/trivial exchanges
    β”œβ”€β”€ Filter non-code conversations
    β”œβ”€β”€ Length-based filtering
    └── Train/dev/test split (90/5/5)
         β”‚
         β–Ό
    Final Dataset
    β”œβ”€β”€ train.jsonl (11,711 conversations)
    β”œβ”€β”€ dev.jsonl (107 conversations)
    └── test.jsonl (123 conversations)

Security Note: 95,561 conversations were quarantined due to potentially sensitive content. Only verified clean data was used for training.
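
The scanning pass is regex-driven; below is a minimal sketch of the idea. The pattern set and names here are illustrative assumptions, not the pipeline's actual ruleset:

import re

# Illustrative patterns only -- the real ruleset is broader
SECRET_PATTERNS = {
    "api_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
    "eth_private_key": re.compile(r"\b0x[0-9a-fA-F]{64}\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "local_path": re.compile(r"/(?:home|Users)/[A-Za-z0-9_.-]+"),
}

def scan_conversation(text: str) -> list[str]:
    """Return the names of all patterns that hit; an empty list means clean."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]

# Flagged conversations are quarantined rather than trained on (see note above).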

Dataset Statistics

| Split | Conversations | Purpose |
|---|---|---|
| Train | 11,711 | Model training |
| Validation | 107 | Hyperparameter tuning |
| Test | 123 | Final evaluation |
| Quarantined | 95,561 | Excluded (potential secrets) |

Sample Training Conversation

{
  "messages": [
    {
      "role": "user",
      "content": "For the code present, we get this error:\n```\nConversion of type 'Contract' to type 'ICoboSafeAccount' may be a mistake...\n```\nHow can I resolve this?"
    },
    {
      "role": "assistant",
      "content": "The error occurs because TypeScript doesn't automatically recognize that the Contract instance implements the ICoboSafeAccount interface. You can fix this by using a double type assertion:\n\n```typescript\nconst coboSafeAccount = new ethers.Contract(\n    coboSafeAccountAddress,\n    COBO_SAFE_ACCOUNT_ABI,\n    signer\n) as unknown as ICoboSafeAccount;\n```"
    }
  ]
}
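
Each line of train.jsonl is one object in this shape. With the Hugging Face datasets library, the splits load directly (paths assume the JSONL files sit in the working directory):

from datasets import load_dataset

# Each JSONL line is a {"messages": [...]} record like the sample above
dataset = load_dataset(
    "json",
    data_files={"train": "train.jsonl", "dev": "dev.jsonl", "test": "test.jsonl"},
)
print(dataset["train"][0]["messages"][0]["role"])  # -> "user"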

Training Details

Configuration

# Model
MODEL_ID = "NousResearch/NousCoder-14B"

# QLoRA Config
BITS = 4
LORA_R = 64
LORA_ALPHA = 128
LORA_DROPOUT = 0.05
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj",
                  "gate_proj", "up_proj", "down_proj"]

# Training
BATCH_SIZE = 2
GRADIENT_ACCUMULATION = 8  # Effective batch size: 16
LEARNING_RATE = 2e-5
EPOCHS = 3
MAX_LENGTH = 4096
PACKING = True  # Efficient sequence packing
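
For context, here is how those constants slot into a standard TRL + PEFT QLoRA setup. This is a minimal sketch under assumptions (dataset loaded as shown earlier; recent TRL, where the SFTConfig field is max_length rather than the older max_seq_length), not the actual training script:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# BITS = 4: load the frozen base model in 4-bit NF4
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, quantization_config=bnb_config, device_map="auto"
)

# Trainable LoRA adapter over every attention and MLP projection
peft_config = LoraConfig(
    r=LORA_R,
    lora_alpha=LORA_ALPHA,
    lora_dropout=LORA_DROPOUT,
    target_modules=TARGET_MODULES,
    task_type="CAUSAL_LM",
)

args = SFTConfig(
    per_device_train_batch_size=BATCH_SIZE,
    gradient_accumulation_steps=GRADIENT_ACCUMULATION,
    learning_rate=LEARNING_RATE,
    num_train_epochs=EPOCHS,
    max_length=MAX_LENGTH,      # max_seq_length in older TRL releases
    packing=PACKING,
    bf16=True,
    output_dir="./sero-nouscoder-14b-sft",
)

trainer = SFTTrainer(
    model=base,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["dev"],
    peft_config=peft_config,
)
trainer.train()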

Infrastructure

  • Platform: HuggingFace Jobs
  • GPU: NVIDIA A100 80GB
  • Training Time: ~18 hours (timed out at 93% completion)
  • Cost: ~$45 USD

Training Progress

| Epoch | Step | Loss | Token Accuracy | Learning Rate |
|---|---|---|---|---|
| 0.03 | ~10 | 1.355 | 71.2% | 2.0e-5 |
| 0.28 | ~80 | 0.920 | 77.2% | 1.9e-5 |
| 0.54 | ~160 | 0.781 | 79.5% | 1.8e-5 |
| 1.04 | ~320 | 0.743 | 80.4% | 1.5e-5 |
| 1.55 | ~480 | 0.711 | 80.8% | 1.1e-5 |
| 2.05 | ~640 | 0.722 | 80.7% | 6.5e-6 |
| 2.52 | ~800 | 0.705 | 81.2% | 1.4e-6 |

Loss Curve

Loss
1.4 β”‚ ●
    β”‚  β•²
1.2 β”‚   β•²
    β”‚    β•²
1.0 β”‚     ●
    β”‚      β•²
0.8 β”‚       ●──●──●
    β”‚              β•²
0.7 β”‚               ●──●──●──●
    β”‚
0.6 β”‚
    └────────────────────────────
        0    0.5   1.0   1.5   2.0   2.5  Epoch

The model showed strong convergence:

  • Rapid initial loss drop (1.35 β†’ 0.78 in first 0.5 epochs)
  • Stable training through epochs 1-2
  • Final loss plateau around 0.70

Usage

With Transformers + PEFT

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model + LoRA adapter
base = AutoModelForCausalLM.from_pretrained(
    "NousResearch/NousCoder-14B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "0xSero/sero-nouscoder-14b-sft")
tokenizer = AutoTokenizer.from_pretrained("0xSero/sero-nouscoder-14b-sft")

# Generate
messages = [{"role": "user", "content": "Write a Solidity ERC20 token with permit"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

With vLLM (Recommended for Serving)

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="NousResearch/NousCoder-14B",
    enable_lora=True,
    max_lora_rank=64,
)

outputs = llm.generate(
    ["Explain how to deploy a contract with ethers.js v6"],
    SamplingParams(temperature=0.7, max_tokens=512),
    lora_request=LoRARequest("sero", 1, "0xSero/sero-nouscoder-14b-sft")
)

Merge Adapter for Standalone Model

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "NousResearch/NousCoder-14B",
    torch_dtype=torch.bfloat16,
    device_map="cpu",
)
model = PeftModel.from_pretrained(base, "0xSero/sero-nouscoder-14b-sft")
merged = model.merge_and_unload()
merged.save_pretrained("./sero-nouscoder-merged")
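
The merged directory then loads with a plain AutoModelForCausalLM.from_pretrained call, with no PEFT dependency at inference time. Saving the tokenizer alongside it keeps the folder self-contained:

from transformers import AutoTokenizer

# Copy the adapter repo's tokenizer next to the merged weights
AutoTokenizer.from_pretrained("0xSero/sero-nouscoder-14b-sft").save_pretrained("./sero-nouscoder-merged")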

VRAM Requirements

| Precision | VRAM Required |
|---|---|
| bfloat16 (full) | ~30GB |
| 8-bit (bitsandbytes) | ~16GB |
| 4-bit (GPTQ/AWQ) | ~8GB |
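
The 16GB row is a one-flag change on the Transformers example above; a sketch using bitsandbytes 8-bit loading (GPTQ/AWQ exports for the 8GB row are listed under Next Steps and not yet published):

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Quantize the frozen base model to 8-bit at load time (~16GB VRAM)
base = AutoModelForCausalLM.from_pretrained(
    "NousResearch/NousCoder-14B",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "0xSero/sero-nouscoder-14b-sft")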

Limitations

  • Domain Focused: Optimized for Solidity, TypeScript, and Python; it may underperform on other languages
  • 93% Trained: Training timed out before epoch 3 completed (last logged step at 2.52/3.0 epochs)
  • Personal Style: Tuned to my coding patterns, which may not generalize to all users
  • LoRA Adapter: Requires loading the base model plus the adapter (not standalone)

Files

sero-nouscoder-14b-sft/
β”œβ”€β”€ adapter_config.json      # LoRA configuration
β”œβ”€β”€ adapter_model.safetensors # Trained LoRA weights (USE THIS)
β”œβ”€β”€ tokenizer.json           # Tokenizer
β”œβ”€β”€ tokenizer_config.json    # Tokenizer config
β”œβ”€β”€ special_tokens_map.json  # Special tokens
β”œβ”€β”€ chat_template.jinja      # Chat template
└── last-checkpoint/         # Training checkpoint (for resuming)
    β”œβ”€β”€ optimizer.pt
    β”œβ”€β”€ scheduler.pt
    β”œβ”€β”€ trainer_state.json
    └── ...
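
Because the run timed out before epoch 3 finished, last-checkpoint/ can pick up the remaining steps. With an SFTTrainer configured as in the sketch under Configuration, and assuming the same data and hyperparameters, resuming is one call:

# Restores optimizer, LR scheduler, and step counter, then continues training
trainer.train(resume_from_checkpoint="sero-nouscoder-14b-sft/last-checkpoint")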

Next Steps

  • DPO alignment training on preference pairs
  • GPTQ/AWQ quantization for consumer GPU deployment
  • Evaluation on coding benchmarks
  • Tool/agent fine-tuning on 136K tool trajectory events

Citation

@misc{sero-nouscoder-14b-sft,
  author = {0xSero},
  title = {sero-nouscoder-14b-sft: Personal Coding Assistant},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/0xSero/sero-nouscoder-14b-sft}
}

Acknowledgments

  • NousResearch for the excellent NousCoder-14B base model
  • HuggingFace for the Jobs compute platform
  • The TRL and PEFT teams for making fine-tuning accessible

Built with ~$45 of compute and 11,711 real coding conversations.
