---
license: apache-2.0
base_model: Nanbeige/Nanbeige4.1-3B
tags:
- code
- python
- fine-tuned
- lora
- direct-output
language:
- en
pipeline_tag: text-generation
---

# Nanbeige 4.1 Python DeepThink - 3B

Fine-tuned version of [Nanbeige/Nanbeige4.1-3B](https://huggingface.co/Nanbeige/Nanbeige4.1-3B) specialized for Python code generation with direct, focused output.

**Version:** E1 (Experiment 1)  
**Training Focus:** Code accuracy and clean output format  
**Status:** Production-ready for direct code generation tasks

## Model Description

This model was fine-tuned using LoRA on 45,757 examples (84% Python code, 16% mathematical reasoning) to specialize in Python code generation. It achieves 87.4% token-level accuracy while providing clean, direct responses optimized for production use.

## Training Details

- **Base Model:** Nanbeige/Nanbeige4.1-3B (3B parameters)
- **Method:** LoRA (r=16, alpha=16)
- **Trainable Parameters:** 28.4M (0.72%)
- **Training Time:** ~16 hours on RTX 5060 Ti 16GB
- **Datasets:** Magicoder-OSS-Instruct-75K (Python), GSM8K (reasoning)
- **Framework:** Transformers + PEFT
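As a rough sanity check on the trainable-parameter count, LoRA adds two low-rank factors per adapted weight matrix, contributing r × (d_in + d_out) parameters each. A minimal arithmetic sketch, assuming hypothetical module dimensions (the hidden size and set of adapted projections below are illustrative, not taken from the actual Nanbeige4.1-3B config):

```python
# LoRA wraps a frozen (d_out x d_in) weight with factors A (r x d_in)
# and B (d_out x r), so each adapted matrix adds r * (d_in + d_out) params.
def lora_params(d_in: int, d_out: int, r: int = 16) -> int:
    return r * (d_in + d_out)

# Hypothetical per-layer attention projections (dims are illustrative,
# NOT read from the real model config):
hidden = 2048
per_layer = sum(lora_params(hidden, hidden) for _ in ("q", "k", "v", "o"))
print(lora_params(2048, 2048))  # 65536 params for one square projection
print(per_layer)                # 262144 params per hypothetical layer
```

Summing this over all adapted layers is what yields a trainable fraction well under 1% of the 3B base parameters.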

### Performance Improvements

| Metric | Baseline | Fine-tuned | Change |
|--------|----------|------------|--------|
| Loss | 1.04 | 0.45 | -57% |
| Token Accuracy | 76.3% | 87.4% | +11.1 pts |
| Entropy | 0.78 | 0.44 | -44% |
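Token accuracy as reported above is typically the fraction of positions where the model's argmax prediction matches the reference token, skipping masked/padded positions. A minimal sketch assuming that definition (the actual training script's metric may differ in details):

```python
# Token-level accuracy: share of non-masked positions where the
# predicted token id equals the reference token id.
def token_accuracy(pred_ids, ref_ids, ignore_id=-100):
    scored = [(p, r) for p, r in zip(pred_ids, ref_ids) if r != ignore_id]
    hits = sum(p == r for p, r in scored)
    return hits / len(scored)

print(token_accuracy([5, 9, 2, 7], [5, 9, 3, 7]))  # 0.75
```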

## Key Features

- **Direct Output Format** - Clean code responses without verbose preambles
- **High Accuracy** - 87% token-level accuracy on Python tasks
- **Fast Inference** - Optimized for quick responses
- ⚠️ **Suppressed Chain-of-Thought** - E1 focuses on direct answers (reasoning occurs internally but isn't narrated)

## Usage

### Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    'deltakitsune/Nanbeige-4.1-Python-DeepThink-3B',
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    'deltakitsune/Nanbeige-4.1-Python-DeepThink-3B',
    trust_remote_code=True
)

prompt = 'Write a Python function to validate email addresses'
inputs = tokenizer(prompt, return_tensors='pt')
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0]))
```

### Ollama
```bash
# Pull from Ollama registry
ollama pull fauxpaslife/nanbeige4.1-python-deepthink:3b

# Run
ollama run fauxpaslife/nanbeige4.1-python-deepthink:3b
```
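The pulled model can also be driven programmatically through Ollama's local HTTP API (`/api/generate` on the default port 11434). A minimal sketch; the request only succeeds with an Ollama server running, so connection errors are caught here:

```python
import json
from urllib import request, error

# Build a non-streaming request for Ollama's local /api/generate endpoint.
payload = json.dumps({
    "model": "fauxpaslife/nanbeige4.1-python-deepthink:3b",
    "prompt": "Write a Python function to validate email addresses",
    "stream": False,
}).encode()

req = request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
try:
    with request.urlopen(req, timeout=60) as resp:
        print(json.loads(resp.read())["response"])
except (error.URLError, OSError) as e:
    print(f"Ollama server not reachable: {e}")
```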

### llama.cpp
```bash
# Download GGUF
wget https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B/resolve/main/nanbeige4.1-python-deepthink-q8.gguf

# Run
./llama-cli -m nanbeige4.1-python-deepthink-q8.gguf -p "Write a binary search function"
```

## File Structure

- `*.safetensors` - Merged model weights (Transformers)
- `config.json` - Model configuration
- `tokenizer.json` - Tokenizer files
- `nanbeige4.1-python-deepthink-fp16.gguf` - Full precision GGUF (7.9GB)
- `nanbeige4.1-python-deepthink-q8.gguf` - 8-bit quantized GGUF (4.2GB)

## Best Use Cases

- Direct Python code generation
- Algorithm implementations
- Flask/FastAPI endpoint creation
- Code debugging with concise explanations
- Production codebases requiring deterministic output

## When to Use Base Model Instead

- Complex problems requiring visible reasoning
- Exploring multiple solution approaches
- Educational explanations with thought process
- Research/debugging requiring transparency

## Training Notes

E1 focused on direct output format. Training data contained no chain-of-thought examples, resulting in suppressed `<think>` tag behavior. Internal reasoning capability is preserved (evidenced by accuracy gains), but output format is optimized for production code generation.
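Because E1 suppresses rather than removes the `<think>` behavior, residual think tags can occasionally surface in generated text. A small illustrative post-processing helper (not part of the released tooling) that strips them before use:

```python
import re

# Remove any residual <think>...</think> spans from model output.
# Illustrative cleanup step; not shipped with the model.
THINK_RE = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def strip_think(text: str) -> str:
    return THINK_RE.sub("", text).strip()

print(strip_think("<think>plan the loop...</think>def add(a, b):\n    return a + b"))
```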

**E2 Development:** Next iteration will reintroduce chain-of-thought reasoning while maintaining code quality.

## Citation
```bibtex
@misc{nanbeige-python-deepthink-e1,
  title={Nanbeige 4.1 Python DeepThink 3B},
  author={deltakitsune},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B}
}
```

## License

Apache 2.0 (same as base model)

## Developed By

**deltakitsune** (fauxpaslife)  
Part of the Delta:Kitsune AI platform development  
February 2026