sero-nouscoder-14b-sft

A personal coding assistant fine-tuned on 11,711 real coding conversations from my daily development work.

Model Details

| Property | Value |
|---|---|
| Base Model | NousResearch/NousCoder-14B |
| Parameters | 14.8B |
| Architecture | Qwen3-based decoder-only transformer |
| Training Method | QLoRA (4-bit quantization + LoRA r=64) |
| Training Tokens | ~51.75 million |
| Final Loss | 0.685 |
| Token Accuracy | 81.6% |
| License | Apache 2.0 |

The Experiment

Why I Did This

I've accumulated thousands of coding conversations with AI assistants over the past year. These conversations represent my actual coding style, problem-solving patterns, and domain expertise across:

  • Solidity/Web3 - Smart contracts, DeFi protocols, ethers.js
  • TypeScript/Node.js - Backend services, API development
  • Python - Scripts, data processing, automation
  • SQL - Database queries, schema design
  • DevOps - Docker, deployment, infrastructure

The goal: create a coding assistant that thinks like me and understands my codebase patterns.

Data Extraction Pipeline

Raw Data Sources
β”œβ”€β”€ Claude Projects conversations (233MB)
β”œβ”€β”€ Claude chat history exports
β”œβ”€β”€ Cursor IDE conversations
└── Various AI assistant logs
         β”‚
         β–Ό
    Extraction & Parsing
    β”œβ”€β”€ Parse JSONL conversation logs
    β”œβ”€β”€ Extract message pairs (user/assistant)
    β”œβ”€β”€ Normalize formats across sources
    └── Deduplicate conversations
         β”‚
         β–Ό
    Security Scanning
    β”œβ”€β”€ Regex patterns for API keys, tokens
    β”œβ”€β”€ Private key detection
    β”œβ”€β”€ Path/username redaction
    └── Quarantine flagged entries
         β”‚
         β–Ό
    Quality Filtering
    β”œβ”€β”€ Remove empty/trivial exchanges
    β”œβ”€β”€ Filter non-code conversations
    β”œβ”€β”€ Length-based filtering
    └── Train/dev/test split (90/5/5)
         β”‚
         β–Ό
    Final Dataset
    β”œβ”€β”€ train.jsonl (11,711 conversations)
    β”œβ”€β”€ dev.jsonl (107 conversations)
    └── test.jsonl (123 conversations)

Security Note: 95,561 conversations were quarantined due to potentially sensitive content. Only verified clean data was used for training.
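
The scanning pass is regex-driven; below is a minimal sketch of the idea. The pattern set and names here are illustrative assumptions, not the pipeline's actual ruleset:

import re

# Illustrative patterns only -- the real ruleset is broader
SECRET_PATTERNS = {
    "api_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
    "eth_private_key": re.compile(r"\b0x[0-9a-fA-F]{64}\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "local_path": re.compile(r"/(?:home|Users)/[A-Za-z0-9_.-]+"),
}

def scan_conversation(text: str) -> list[str]:
    """Return the names of all patterns that hit; an empty list means clean."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]

# Flagged conversations are quarantined rather than trained on (see note above).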

Dataset Statistics

| Split | Conversations | Purpose |
|---|---|---|
| Train | 11,711 | Model training |
| Validation | 107 | Hyperparameter tuning |
| Test | 123 | Final evaluation |
| Quarantined | 95,561 | Excluded (potential secrets) |

Sample Training Conversation

{
  "messages": [
    {
      "role": "user",
      "content": "For the code present, we get this error:\n```\nConversion of type 'Contract' to type 'ICoboSafeAccount' may be a mistake...\n```\nHow can I resolve this?"
    },
    {
      "role": "assistant",
      "content": "The error occurs because TypeScript doesn't automatically recognize that the Contract instance implements the ICoboSafeAccount interface. You can fix this by using a double type assertion:\n\n```typescript\nconst coboSafeAccount = new ethers.Contract(\n    coboSafeAccountAddress,\n    COBO_SAFE_ACCOUNT_ABI,\n    signer\n) as unknown as ICoboSafeAccount;\n```"
    }
  ]
}
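
Each line of train.jsonl is one object in this shape. With the Hugging Face datasets library, the splits load directly (paths assume the JSONL files sit in the working directory):

from datasets import load_dataset

# Each JSONL line is a {"messages": [...]} record like the sample above
dataset = load_dataset(
    "json",
    data_files={"train": "train.jsonl", "dev": "dev.jsonl", "test": "test.jsonl"},
)
print(dataset["train"][0]["messages"][0]["role"])  # -> "user"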

Training Details

Configuration

# Model
MODEL_ID = "NousResearch/NousCoder-14B"

# QLoRA Config
BITS = 4
LORA_R = 64
LORA_ALPHA = 128
LORA_DROPOUT = 0.05
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj",
                  "gate_proj", "up_proj", "down_proj"]

# Training
BATCH_SIZE = 2
GRADIENT_ACCUMULATION = 8  # Effective batch size: 16
LEARNING_RATE = 2e-5
EPOCHS = 3
MAX_LENGTH = 4096
PACKING = True  # Efficient sequence packing
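
For context, here is how those constants slot into a standard TRL + PEFT QLoRA setup. This is a minimal sketch under assumptions (dataset loaded as shown earlier; recent TRL, where the SFTConfig field is max_length rather than the older max_seq_length), not the actual training script:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# BITS = 4: load the frozen base model in 4-bit NF4
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, quantization_config=bnb_config, device_map="auto"
)

# Trainable LoRA adapter over every attention and MLP projection
peft_config = LoraConfig(
    r=LORA_R,
    lora_alpha=LORA_ALPHA,
    lora_dropout=LORA_DROPOUT,
    target_modules=TARGET_MODULES,
    task_type="CAUSAL_LM",
)

args = SFTConfig(
    per_device_train_batch_size=BATCH_SIZE,
    gradient_accumulation_steps=GRADIENT_ACCUMULATION,
    learning_rate=LEARNING_RATE,
    num_train_epochs=EPOCHS,
    max_length=MAX_LENGTH,      # max_seq_length in older TRL releases
    packing=PACKING,
    bf16=True,
    output_dir="./sero-nouscoder-14b-sft",
)

trainer = SFTTrainer(
    model=base,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["dev"],
    peft_config=peft_config,
)
trainer.train()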

Infrastructure

  • Platform: HuggingFace Jobs
  • GPU: NVIDIA A100 80GB
  • Training Time: ~18 hours (timed out at 93% completion)
  • Cost: ~$45 USD

Training Progress

| Epoch | Step | Loss | Token Accuracy | Learning Rate |
|---|---|---|---|---|
| 0.03 | ~10 | 1.355 | 71.2% | 2.0e-5 |
| 0.28 | ~80 | 0.920 | 77.2% | 1.9e-5 |
| 0.54 | ~160 | 0.781 | 79.5% | 1.8e-5 |
| 1.04 | ~320 | 0.743 | 80.4% | 1.5e-5 |
| 1.55 | ~480 | 0.711 | 80.8% | 1.1e-5 |
| 2.05 | ~640 | 0.722 | 80.7% | 6.5e-6 |
| 2.52 | ~800 | 0.705 | 81.2% | 1.4e-6 |

Loss Curve

Loss
1.4 β”‚ ●
    β”‚  β•²
1.2 β”‚   β•²
    β”‚    β•²
1.0 β”‚     ●
    β”‚      β•²
0.8 β”‚       ●──●──●
    β”‚              β•²
0.7 β”‚               ●──●──●──●
    β”‚
0.6 β”‚
    └────────────────────────────
        0    0.5   1.0   1.5   2.0   2.5  Epoch

The model showed strong convergence:

  • Rapid initial loss drop (1.35 β†’ 0.78 in first 0.5 epochs)
  • Stable training through epochs 1-2
  • Final loss plateau around 0.70

Usage

With Transformers + PEFT

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model + LoRA adapter
base = AutoModelForCausalLM.from_pretrained(
    "NousResearch/NousCoder-14B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "0xSero/sero-nouscoder-14b-sft")
tokenizer = AutoTokenizer.from_pretrained("0xSero/sero-nouscoder-14b-sft")

# Generate
messages = [{"role": "user", "content": "Write a Solidity ERC20 token with permit"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

With vLLM (Recommended for Serving)

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="NousResearch/NousCoder-14B",
    enable_lora=True,
    max_lora_rank=64,
)

outputs = llm.generate(
    ["Explain how to deploy a contract with ethers.js v6"],
    SamplingParams(temperature=0.7, max_tokens=512),
    lora_request=LoRARequest("sero", 1, "0xSero/sero-nouscoder-14b-sft")
)

Merge Adapter for Standalone Model

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "NousResearch/NousCoder-14B",
    torch_dtype=torch.bfloat16,
    device_map="cpu",
)
model = PeftModel.from_pretrained(base, "0xSero/sero-nouscoder-14b-sft")
merged = model.merge_and_unload()
merged.save_pretrained("./sero-nouscoder-merged")
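
The merged directory then loads with a plain AutoModelForCausalLM.from_pretrained call, with no PEFT dependency at inference time. Saving the tokenizer alongside it keeps the folder self-contained:

from transformers import AutoTokenizer

# Copy the adapter repo's tokenizer next to the merged weights
AutoTokenizer.from_pretrained("0xSero/sero-nouscoder-14b-sft").save_pretrained("./sero-nouscoder-merged")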

VRAM Requirements

| Precision | VRAM Required |
|---|---|
| bfloat16 (full) | ~30GB |
| 8-bit (bitsandbytes) | ~16GB |
| 4-bit (GPTQ/AWQ) | ~8GB |
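
The 16GB row is a one-flag change on the Transformers example above; a sketch using bitsandbytes 8-bit loading (GPTQ/AWQ exports for the 8GB row are listed under Next Steps and not yet published):

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Quantize the frozen base model to 8-bit at load time (~16GB VRAM)
base = AutoModelForCausalLM.from_pretrained(
    "NousResearch/NousCoder-14B",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "0xSero/sero-nouscoder-14b-sft")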

Limitations

  • Domain Focused: Optimized for Solidity, TypeScript, and Python; it may underperform on other languages
  • 93% Trained: Training timed out before epoch 3 completed (last logged step at 2.52/3.0 epochs)
  • Personal Style: Tuned to my coding patterns, which may not generalize to all users
  • LoRA Adapter: Requires loading the base model plus the adapter (not standalone)

Files

sero-nouscoder-14b-sft/
β”œβ”€β”€ adapter_config.json      # LoRA configuration
β”œβ”€β”€ adapter_model.safetensors # Trained LoRA weights (USE THIS)
β”œβ”€β”€ tokenizer.json           # Tokenizer
β”œβ”€β”€ tokenizer_config.json    # Tokenizer config
β”œβ”€β”€ special_tokens_map.json  # Special tokens
β”œβ”€β”€ chat_template.jinja      # Chat template
└── last-checkpoint/         # Training checkpoint (for resuming)
    β”œβ”€β”€ optimizer.pt
    β”œβ”€β”€ scheduler.pt
    β”œβ”€β”€ trainer_state.json
    └── ...
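
Because the run timed out before epoch 3 finished, last-checkpoint/ can pick up the remaining steps. With an SFTTrainer configured as in the sketch under Configuration, and assuming the same data and hyperparameters, resuming is one call:

# Restores optimizer, LR scheduler, and step counter, then continues training
trainer.train(resume_from_checkpoint="sero-nouscoder-14b-sft/last-checkpoint")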

Next Steps

  • DPO alignment training on preference pairs
  • GPTQ/AWQ quantization for consumer GPU deployment
  • Evaluation on coding benchmarks
  • Tool/agent fine-tuning on 136K tool trajectory events

Citation

@misc{sero-nouscoder-14b-sft,
  author = {0xSero},
  title = {sero-nouscoder-14b-sft: Personal Coding Assistant},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/0xSero/sero-nouscoder-14b-sft}
}

Acknowledgments

  • NousResearch for the excellent NousCoder-14B base model
  • HuggingFace for the Jobs compute platform
  • The TRL and PEFT teams for making fine-tuning accessible

Built with ~$45 of compute and 11,711 real coding conversations.
