
---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-14B-Instruct
tags:
  - csharp
  - dotnet
  - code
  - fine-tuned
  - dpo
  - qlora
  - coding-assistant
  - aspnet
  - entity-framework
language:
  - en
pipeline_tag: text-generation
library_name: transformers
---

# dotnet-coder-14b

A C#/.NET specialist fine-tuned from Qwen2.5-Coder-14B-Instruct. On our self-reported evaluation suite it reaches a 97% compile rate on C# code generation, compared with 80% for Qwen2.5-Coder-32B-Instruct and 80% for Qwen2.5-72B-Instruct.

Designed for coding agents and experienced .NET developers who need compilable, self-contained C# code.

## Quick Specs

| Spec | Value |
|---|---|
| **Parameters** | 14.7B |
| **Base Model** | Qwen2.5-Coder-14B-Instruct |
| **Max Context** | 32,768 tokens (base model) |
| **Trained Sequence Length** | 2,048 tokens |
| **Training Method** | QLoRA SFT + Iterative DPO |
| **Training Data** | 107K C# records |
| **License** | Apache 2.0 |
| **VRAM (Q4_K_M)** | ~9GB |

## What Makes This Different: Compile-Verified DPO

Most code models are trained on code and hope it works. We verified it.

After SFT, the compile rate dropped to 57%, below the 83% of the base model. We then ran 3 rounds of **iterative DPO using `dotnet build` as the reward signal**. Each round generates code, compiles it, and trains the model to prefer the candidates that compile:

| Stage | Method | Compile Rate |
|---|---|---|
| Base (Qwen2.5-Coder-14B) | — | 83% |
| After SFT (107K records) | QLoRA | 57% |
| After DPO Round 1 | +126 pairs, beta=0.1 | 73% |
| After DPO Round 2 | +256 pairs, beta=0.2 | 87% |
| **After DPO Round 3** | **+468 pairs, beta=0.3** | **97%** |

The preference signal is binary and objective: the compiler says yes or no. No human labeling, no LLM-as-judge — just `dotnet build`.
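
The binary reward can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used for training: the `compiles` helper, the `template_dir` argument (a folder assumed to already contain a `.csproj` with the required NuGet references), and the `Generated.cs` file name are all hypothetical.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def compiles(code: str, template_dir: str, build_cmd=("dotnet", "build")) -> bool:
    """Return True if `code` builds inside a fresh copy of the template project."""
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp) / "proj"
        # Copy the template project (assumed to hold the .csproj) so builds stay isolated.
        shutil.copytree(template_dir, work)
        # Hypothetical file name for the generated source.
        (work / "Generated.cs").write_text(code, encoding="utf-8")
        result = subprocess.run(
            list(build_cmd), cwd=work, capture_output=True, text=True
        )
        # Binary signal: exit code 0 means the build succeeded.
        return result.returncode == 0
```

Keeping `build_cmd` as a parameter makes it possible to swap in another SDK invocation or a stub command when `dotnet` is not installed.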

## Benchmarks

Evaluated on **120+ unique prompts** across 4 independent test sets (original 30, holdout 30, and two validation sets of 40 each). No test prompts were used during training. All benchmarks are self-reported using our evaluation suite — results may vary with different prompts, inference settings, or SDK versions.

### Compile Rate (code compiles with `dotnet build`)

| Model | Parameters | Compile Rate | Holdout (unseen prompts) |
|---|---|---|---|
| **dotnet-coder-14b** | **14B** | **97%** | **97%** |
| StarCoder2-15B-Instruct | 15B | 83% | — |
| Phi-4-14B | 14B | 83% | — |
| Qwen2.5-Coder-14B-Instruct | 14B | 83% | 93% |
| Qwen2.5-72B-Instruct | 72B | 80% | 80% |
| Qwen2.5-Coder-32B-Instruct | 32B | 80% | 87% |
| Yi-Coder-9B-Chat | 9B | 70% | — |
| Qwen2.5-Coder-7B-Instruct | 7B | 57% | 67% |
| DeepSeek-R1-Distill-Qwen-14B | 14B | 10% | 13% |

All competitor models were evaluated with the same 30-prompt test suite, the same system prompt, the same .NET 8 SDK with identical NuGet packages, and the same code extraction pipeline. Competitor models were loaded with 4-bit quantization on an A100 80GB.

### Multi-SDK Compatibility

We compiled the same generated code against three .NET SDK versions:

| .NET Version | Compile Rate |
|---|---|
| .NET 6.0 | 29/30 (97%) |
| .NET 8.0 | 29/30 (97%) |
| .NET 9.0 | 29/30 (97%) |

The single failure is the same prompt on every SDK: a `??` operator applied to an incompatible type in a value object base class. The generated code is not tied to a specific SDK version and compiles on .NET 6, 8, and 9.

### Temperature Robustness

Compile rate at different sampling temperatures (30 prompts each):

| Temperature | Compile Rate |
|---|---|
| 0.2 | 29/30 (97%) |
| 0.5 | 29/30 (97%) |
| 0.8 | 29/30 (97%) |
| 1.0 | 26/30 (87%) |

The model maintains a 97% compile rate at temperatures from 0.2 to 0.8. At the highest temperature tested (1.0), it still achieves 87%.

### Expert .NET Knowledge

| Model | Parameters | Expert Gotchas (/10) | C# 13/14 Features (/10) |
|---|---|---|---|
| **dotnet-coder-14b** | **14B** | **10** | **10** |
| Qwen2.5-72B-Instruct | 72B | 9 | 10 |
| Qwen2.5-Coder-14B-Instruct | 14B | 8 | 10 |
| Qwen2.5-Coder-32B-Instruct | 32B | 6 | 10 |
| DeepSeek-R1-Distill-Qwen-14B | 14B | 3 | 5 |

Expert gotcha questions test: ConcurrentDictionary.GetOrAdd atomicity, async void crash scenarios, ConfigureAwait deadlocks, HttpClient socket exhaustion, IEnumerable vs IQueryable in EF Core, DI captive dependency, ValueTask consumption rules, N+1 queries, GC generation costs, and middleware pipeline internals.

### Complex Multi-Class Tasks

10 tasks requiring 3-6 interacting classes (CQRS with dispatcher, event sourcing, fluent validation framework, resilience pipeline with circuit breaker, state machines, plugin systems):

- **First try: 70-90%**
- **With retry: 90%+**
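
The card does not describe how retries were run. One plausible shape for a retry loop, sketched here purely as an assumption, is to feed the compiler errors back into the next prompt. The `generate` and `check` callables and the feedback wording are hypothetical placeholders for a model call and a build step.

```python
from typing import Callable, Optional, Tuple


def generate_with_retry(
    prompt: str,
    generate: Callable[[str], str],
    check: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 2,
) -> Optional[str]:
    """Return the first candidate that passes `check`, or None if all attempts fail."""
    current_prompt = prompt
    for _ in range(max_attempts):
        code = generate(current_prompt)
        ok, errors = check(code)
        if ok:
            return code
        # Feed the compiler output back so the next attempt can address it.
        current_prompt = (
            f"{prompt}\n\nThe previous attempt failed to compile:\n{errors}\n"
            "Fix the code."
        )
    return None
```

Returning `None` instead of the last failing candidate keeps the caller from mistaking uncompiled code for a valid result.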

## Example Output

**Prompt:** "Write a retry policy with exponential backoff for HttpClient in C#"

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public class RetryPolicyHttpClient
{
    private readonly HttpClient _client = new();
    private static readonly Random Random = new();

    public async Task<string> GetAsync(string url)
    {
        int maxRetries = 3;
        for (int attempt = 0; attempt < maxRetries; attempt++)
        {
            try
            {
                return await _client.GetStringAsync(url);
            }
            catch (Exception ex) when (attempt < maxRetries - 1 && IsTransient(ex))
            {
                double delayInSeconds = Math.Pow(2, attempt) + Random.NextDouble();
                await Task.Delay(TimeSpan.FromSeconds(delayInSeconds));
            }
        }
        return await _client.GetStringAsync(url);
    }

    private static bool IsTransient(Exception exception)
    {
        return exception is HttpRequestException;
    }
}
```

No preamble, no "Certainly!", no package install instructions — just correct, compilable code with jitter-based backoff.

### Error Diagnosis

**Prompt:** "I'm getting `error CS0246: The type or namespace name 'ILogger<>' could not be found`. What's wrong?"

> The error message indicates that the compiler cannot find the `ILogger<>` type. This is because the `ILogger<>` interface is part of the Microsoft.Extensions.Logging namespace, which is not included in your code.
>
> Add the using directive: `using Microsoft.Extensions.Logging;`
>
> Make sure you also have the NuGet package installed:
> ```
> Install-Package Microsoft.Extensions.Logging
> ```

### Performance Issue Detection

**Prompt:** "My API endpoint is slow" + code with N+1 query

> The problem is that you're loading all Orders into memory, then for each one making a separate database call to load the related Customer. This is N+1 queries. The solution is to use Include() to eagerly load the related data in a single query:

```csharp
public async Task<List<OrderDto>> GetOrders()
{
    return await _context.Orders
        .Include(o => o.Customer)
        .Select(o => new OrderDto(o.Id, o.Customer.Name, o.Total))
        .ToListAsync();
}
```

The model handles both code generation and interactive debugging — it can diagnose errors, suggest fixes, recommend package installations, and identify performance issues like N+1 queries.

## What This Model Is Good At

- Writing complete, compilable C# classes and files
- **Diagnosing build errors** and suggesting fixes with correct NuGet packages
- **Identifying performance issues** like N+1 queries, socket exhaustion, GC pressure
- **Modifying existing code** on request (adding methods, refactoring)
- ASP.NET Core middleware, controllers, and minimal APIs
- Entity Framework Core queries, configurations, and patterns
- Design patterns implemented in idiomatic C#
- Modern C# features (records, primary constructors, collection expressions, pattern matching)
- Explaining C# gotchas and .NET internals
- Self-contained code — defines all types it references

## Recommended Settings

**System prompt** (used during training — best results with this or similar):

> You are an expert C# and .NET developer. Write complete, compilable C# code. Include all necessary using statements and namespace declarations.

**Inference parameters**: temperature=0.2, top_p=0.9, max_new_tokens=2048

The base model supports up to 32,768 tokens, so you can use the full 32K context window. The fine-tuning was done on sequences up to 2,048 tokens — the model performs best within this range but still works beyond it thanks to the base model's capabilities. The model will stop generating when done (EOS token), so setting a higher limit won't cause unnecessary output.
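
When the model is served behind an OpenAI-compatible chat completions endpoint, the settings above map onto a request roughly as in the sketch below. The `build_payload` and `chat` helpers and the default `base_url` are assumptions for illustration; adjust the URL and model name to whatever your server reports.

```python
import json
import urllib.request

SYSTEM_PROMPT = (
    "You are an expert C# and .NET developer. Write complete, compilable C# code. "
    "Include all necessary using statements and namespace declarations."
)


def build_payload(user_prompt: str, model: str = "zipaltrivedi/dotnet-coder-14b") -> dict:
    """Build a chat completions request body using the recommended settings."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": 0.2,
        "top_p": 0.9,
        "max_tokens": 2048,
    }


def chat(user_prompt: str, base_url: str = "http://localhost:8080/v1") -> str:
    """POST the payload to an assumed local server and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(user_prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Only `build_payload` is pure; `chat` needs a running server and is included to show where the payload goes.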

## Usage

### With Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "zipaltrivedi/dotnet-coder-14b",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("zipaltrivedi/dotnet-coder-14b")

messages = [
    {"role": "system", "content": "You are an expert C# and .NET developer. Write complete, compilable C# code. Include all necessary using statements and namespace declarations."},
    {"role": "user", "content": "Write a thread-safe LRU cache with generic key and value types in C#."},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
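# do_sample=True ensures temperature and top_p are applied (they are ignored under greedy decoding).
output = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.2, top_p=0.9)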
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

### With Ollama

Download a GGUF file from the Files tab, then create a `Modelfile`:

```
FROM ./dotnet-coder-14b-Q4_K_M.gguf

SYSTEM "You are an expert C# and .NET developer. Write complete, compilable C# code. Include all necessary using statements and namespace declarations."

PARAMETER temperature 0.2
PARAMETER top_p 0.9
```

Then run:

```bash
ollama create dotnet-coder -f Modelfile
ollama run dotnet-coder
```

### With llama.cpp

```bash
./llama-cli -m dotnet-coder-14b-Q4_K_M.gguf -p "Write a C# class for..." --temp 0.2
```

### With LM Studio / Jan / GPT4All

Download the GGUF file matching your hardware from the Files tab and load it in your preferred UI.

## GGUF Quantizations

| Quantization | Size | Min RAM | Recommended For |
|---|---|---|---|
| Q8_0 | 14.6 GB | 16GB+ | Best quality — RTX 4090, A100, M3 Max 36GB |
| Q6_K | 11.3 GB | 14GB+ | High quality — RTX 4080, M2 Max 32GB |
| **Q4_K_M** | **8.4 GB** | **10GB+** | **Recommended — RTX 3080/4070, M2 Pro 16GB** |
| Q4_K_S | 8.0 GB | 10GB+ | Slightly smaller — M1 Pro 16GB |
| Q3_K_M | 6.8 GB | 8GB+ | Budget GPU, Apple M1/M2 8GB |
| Q2_K | 5.4 GB | 6GB+ | CPU-only inference, minimum viable |
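
As a rough aid for choosing a file, the table can be turned into a small lookup. This sketch simply walks the "Min RAM" column from best quality to smallest; the `pick_quant` helper is hypothetical and ignores context length and other memory overhead.

```python
# (quantization, minimum RAM in GB) copied from the table above, best quality first.
QUANTS = [
    ("Q8_0", 16),
    ("Q6_K", 14),
    ("Q4_K_M", 10),
    ("Q4_K_S", 10),
    ("Q3_K_M", 8),
    ("Q2_K", 6),
]


def pick_quant(available_gb: float):
    """Return the highest-quality quantization whose minimum RAM fits, else None."""
    for name, min_ram_gb in QUANTS:
        if available_gb >= min_ram_gb:
            return name
    return None
```

For example, `pick_quant(12)` selects `Q4_K_M`, which matches the recommended row for 10GB+ machines.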

## Training Details

### Dataset (107K records)

| Source | Records | Description |
|---|---|---|
| Expert C# knowledge | 54,443 | Curated Q&A covering gotchas, patterns, best practices, version-specific features |
| Compile-verified repos | 35,736 | Self-contained C# files from 140 GitHub repos, filtered and verified |
| .NET runtime source | 12,352 | Code from dotnet/runtime, aspnetcore, and other core .NET repos |
| Synthetic examples | 4,906 | C# 13/14 features, debugging pairs, code review examples |

### SFT Hyperparameters

- **Method**: QLoRA 4-bit, LoRA rank 64, alpha 128
- **Training**: 2 epochs, lr=2e-4, cosine schedule, 3% warmup, packing enabled
- **Batch**: effective batch size 16 (2 per device x 8 gradient accumulation)
- **Hardware**: RunPod A100 80GB SXM, ~13 hours
- **Cost**: ~$20

### DPO Hyperparameters

- **Rounds**: 3 iterative rounds
- **Pair generation**: Model generates 3-5 responses per prompt at different temperatures, compiled with `dotnet build`, pass=chosen / fail=rejected
- **Training**: beta=0.1→0.2→0.3 (increasing preference strength), lr=5e-5, 1-3 epochs per round
- **Total pairs**: 850 across all rounds
- **Hardware**: Same A100, ~2 hours total
- **Cost**: ~$5
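
The pass=chosen / fail=rejected rule can be sketched as below. How passing and failing responses were actually combined is not stated here, so pairing every passing candidate with every failing one is an assumption, as is the `build_preference_pairs` name. The `prompt` / `chosen` / `rejected` keys follow the usual preference-dataset layout.

```python
from typing import Callable, Dict, List


def build_preference_pairs(
    prompt: str,
    candidates: List[str],
    compiles: Callable[[str], bool],
) -> List[Dict[str, str]]:
    """Pair each compiling candidate (chosen) with each failing one (rejected)."""
    # Run the compile check once per candidate; it is the expensive step.
    results = [(code, compiles(code)) for code in candidates]
    passed = [code for code, ok in results if ok]
    failed = [code for code, ok in results if not ok]
    return [
        {"prompt": prompt, "chosen": good, "rejected": bad}
        for good in passed
        for bad in failed
    ]
```

A prompt where every candidate passes, or every candidate fails, yields no pairs, since there is no contrast to learn from.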

### Evaluation Methodology

All compile tests use actual `dotnet build` with .NET 8 SDK against a project with common NuGet packages (EF Core, ASP.NET Core, Microsoft.Extensions). Pass/fail is binary based on compiler exit code — no manual evaluation or LLM-as-judge.

Tests are run across 4 independent prompt sets totaling 120+ unique prompts. Holdout and validation prompts were never used during any stage of training or DPO pair generation.
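
A minimal sketch of the two bookkeeping steps around the build, assuming responses wrap code in Markdown fences: pull the code out of a model reply, then turn a list of pass/fail results into a rate. The `extract_code` and `compile_rate` helpers are hypothetical and are not the extraction pipeline used for the reported numbers.

```python
import re
from typing import Iterable

TICKS = "`" * 3  # Markdown code fence marker
FENCE = re.compile(
    TICKS + r"(?:csharp|cs|c#)?[ \t]*\n(.*?)" + TICKS,
    re.DOTALL | re.IGNORECASE,
)


def extract_code(response: str) -> str:
    """Return the longest fenced block, or the whole reply if no fence is found."""
    blocks = FENCE.findall(response)
    if blocks:
        return max(blocks, key=len).strip()
    return response.strip()


def compile_rate(results: Iterable[bool]) -> float:
    """Fraction of builds that passed; 0.0 for an empty result list."""
    results = list(results)
    return sum(results) / len(results) if results else 0.0
```

Picking the longest block is one simple heuristic for replies that contain several snippets; other choices (first block, concatenation) would change which code gets compiled.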

## Limitations

- **Optimized for single-file generation** — for multi-project solutions, use as a component alongside an agent framework
- **Best for experienced developers** — gives direct code answers, not step-by-step tutorials
- **English only** — trained on English C# content
- **14B parameter model** — for extremely complex architectural decisions, larger models may provide more nuanced analysis
- **Compile rate is not 100%** — the remaining ~3% failures are typically complex generic dispatch patterns (e.g., CQRS mediator with runtime handler resolution) that produce type constraint errors
- **Compile ≠ correct** — code that compiles is not guaranteed to be logically correct or free of runtime errors. Compilation is a necessary but not sufficient measure of code quality. Always review generated code before production use

## Benchmark Limitations

- All benchmarks are **self-reported** using our custom evaluation suite — not a standardized benchmark like HumanEval or MBPP
- Compile rate is primarily tested against **.NET 8 SDK** with a specific set of NuGet packages. Re-compiling the same generated code against .NET 6, 8, and 9 gives identical results (97%), but results may differ with other package configurations
- Expert knowledge evaluation involved checking whether responses address the core question with code examples — this has a subjective component
- Sample sizes (30 to 40 prompts per test set) are small; results have inherent variance
- No formal analysis of training/test data overlap with the Qwen2.5-Coder base model's pre-training data
- **Metric circularity**: DPO training uses `dotnet build` as the reward signal, and compile rate is measured using the same tool. While the evaluation prompts are completely separate from DPO training prompts, the model is optimized for the same metric it's evaluated on
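
To make the small-sample caveat concrete, the sketch below computes a Wilson score interval for a pass rate such as 29/30. The choice of the Wilson interval and the 95% level (`z = 1.96`) are assumptions made for illustration; no interval is reported in this card.

```python
import math


def wilson_interval(passed: int, total: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (default about 95%)."""
    p = passed / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


low, high = wilson_interval(29, 30)
print(f"29/30 passes: {low:.1%} to {high:.1%}")
```

For 29 passes out of 30, the interval spans roughly 83% to 99%, which is why differences of a few points between models on a 30-prompt set should be read with caution.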

## Ethical Considerations

- **No safety alignment**: This model has no specific safety training beyond what exists in the Qwen2.5-Coder base model. It may generate code with security vulnerabilities if prompted
- **Bias**: Training data is sourced from public repositories and Q&A sites, which may reflect coding conventions and patterns from specific communities
- **Not a substitute for code review**: Generated code should be reviewed by a developer before use in production
- **Training data provenance**: Training data includes content from StackOverflow (CC-BY-SA 4.0), Microsoft Learn (CC-BY-4.0), The Stack (permissive licenses), and GitHub repos (Apache/MIT). The relationship between CC-BY-SA training data and model output licensing is an open legal question across the LLM industry. Users should be aware of this when using generated code in commercial settings

## Use Cases

- **Coding agent backend** — serve via OpenAI-compatible API for use with OpenCode, Continue, Cursor, Claude Code
- **Local code assistant** — run with Ollama or LM Studio for offline C# development
- **CI/CD code generation** — generate boilerplate, tests, and implementations in automated pipelines
- **Code review** — get expert-level feedback on C# patterns and .NET best practices

## Reproducibility

- **Base model**: `Qwen/Qwen2.5-Coder-14B-Instruct` from HuggingFace
- **Training framework**: Unsloth 2026.4.5 + PEFT + TRL on RunPod A100 80GB
- **Random seed**: 42 (dataset shuffle)
- Training scripts, evaluation code, and LoRA adapter weights available upon request

## License

Apache 2.0 (same as base model Qwen2.5-Coder-14B-Instruct)

## Citation

```bibtex
@misc{dotnet-coder-14b,
  author = {Zipal Trivedi},
  title = {dotnet-coder-14b: A C#/.NET Specialist Language Model},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/zipaltrivedi/dotnet-coder-14b}
}
```

## Acknowledgments

- Base model: [Qwen2.5-Coder-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-14B-Instruct) by Alibaba
- Training framework: [Unsloth](https://github.com/unslothai/unsloth)
- Training data sources: The Stack (permissive licenses), StackOverflow (CC-BY-SA 4.0), Microsoft Learn (CC-BY-4.0), GitHub repos (Apache/MIT licensed)

## Contact

For issues, questions, or feedback: [HuggingFace Discussions](https://huggingface.co/zipaltrivedi/dotnet-coder-14b/discussions)