---
license: apache-2.0
base_model: Nanbeige/Nanbeige4.1-3B
tags:
- code
- python
- fine-tuned
- lora
- direct-output
language:
- en
pipeline_tag: text-generation
---
# Nanbeige 4.1 Python DeepThink - 3B
Fine-tuned version of [Nanbeige/Nanbeige4.1-3B](https://huggingface.co/Nanbeige/Nanbeige4.1-3B) specialized for Python code generation with direct, focused output.
**Version:** E1 (Experiment 1)
**Training Focus:** Code accuracy and clean output format
**Status:** Production-ready for direct code generation tasks
## Model Description
This model was fine-tuned using LoRA on 45,757 examples (84% Python code, 16% mathematical reasoning) to specialize in Python code generation. It achieves 87.4% token-level accuracy while providing clean, direct responses optimized for production use.
## Training Details
- **Base Model:** Nanbeige/Nanbeige4.1-3B (3B parameters)
- **Method:** LoRA (r=16, alpha=16)
- **Trainable Parameters:** 28.4M (0.72%)
- **Training Time:** ~16 hours on RTX 5060 Ti 16GB
- **Datasets:** Magicoder-OSS-Instruct-75K (Python), GSM8K (reasoning)
- **Framework:** Transformers + PEFT
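The LoRA settings above (r=16, alpha=16) can be sketched numerically. This toy NumPy example is only an illustration of the low-rank update, with made-up dimensions; it is not the actual training code:

```python
import numpy as np

# Toy LoRA update: the frozen weight W is adapted by a low-rank product B @ A.
# r and alpha follow this card's settings (r=16, alpha=16), so scaling = alpha / r = 1.0.
d_out, d_in, r, alpha = 64, 64, 16, 16
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))    # frozen base weight (not trained)
A = rng.standard_normal((r, d_in)) * 0.01 # trainable down-projection
B = np.zeros((d_out, r))                  # trainable up-projection, zero-initialised

W_adapted = W + (alpha / r) * (B @ A)
print(np.allclose(W_adapted, W))  # True: zero-initialised B means no change before training
```

Because only A and B are trained, the adapter adds just r * (d_in + d_out) parameters per weight matrix, which is how the trainable fraction stays below 1%.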
### Performance Improvements
| Metric | Baseline | Fine-tuned | Change |
|--------|----------|------------|--------|
| Loss | 1.04 | 0.45 | -57% |
| Token Accuracy | 76.3% | 87.4% | +11.1 pts |
| Entropy | 0.78 | 0.44 | -44% |
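Token accuracy here is assumed to mean the fraction of positions where the model's argmax prediction matches the reference token. A minimal sketch of that metric with made-up arrays:

```python
import numpy as np

# Assumed definition of token-level accuracy: share of positions where the
# predicted token id equals the reference token id.
labels = np.array([5, 2, 9, 9, 1, 7, 3, 3])
preds  = np.array([5, 2, 4, 9, 1, 7, 3, 0])  # 6 of 8 positions match

accuracy = float((preds == labels).mean())
print(accuracy)  # 0.75
```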
## Key Features
- ✅ **Direct Output Format** - Clean code responses without verbose preambles
- ✅ **High Accuracy** - 87.4% token-level accuracy on Python tasks
- ✅ **Fast Inference** - Optimized for quick responses
- ⚠️ **Suppressed Chain-of-Thought** - E1 focuses on direct answers (reasoning occurs internally but isn't narrated)
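To illustrate the direct-output style: given a prompt like "Write a Python function to validate email addresses", the model returns code with no preamble. The snippet below is a hand-written example of that style, not actual model output:

```python
import re

# Illustrative of the direct-output format (hand-written, not model output):
# a simple regex-based email validator with no surrounding explanation.
EMAIL_RE = re.compile(r'^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$')

def is_valid_email(address: str) -> bool:
    """Return True if the address matches a simple email pattern."""
    return bool(EMAIL_RE.match(address))

print(is_valid_email('user@example.com'))  # True
print(is_valid_email('not-an-email'))      # False
```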
## Usage
### Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    'deltakitsune/Nanbeige-4.1-Python-DeepThink-3B',
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    'deltakitsune/Nanbeige-4.1-Python-DeepThink-3B',
    trust_remote_code=True
)

prompt = 'Write a Python function to validate email addresses'
inputs = tokenizer(prompt, return_tensors='pt')
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Ollama
```bash
# Pull from Ollama registry
ollama pull fauxpaslife/nanbeige4.1-python-deepthink:3b
# Run
ollama run fauxpaslife/nanbeige4.1-python-deepthink:3b
```
### llama.cpp
```bash
# Download GGUF
wget https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B/resolve/main/nanbeige4.1-python-deepthink-q8.gguf
# Run
./llama-cli -m nanbeige4.1-python-deepthink-q8.gguf -p "Write a binary search function"
```
## File Structure
- `*.safetensors` - Merged model weights (Transformers)
- `config.json` - Model configuration
- `tokenizer.json` - Tokenizer files
- `nanbeige4.1-python-deepthink-fp16.gguf` - Full-precision GGUF (7.9 GB)
- `nanbeige4.1-python-deepthink-q8.gguf` - 8-bit quantized GGUF (4.2 GB)
## Best Use Cases
- Direct Python code generation
- Algorithm implementations
- Flask/FastAPI endpoint creation
- Code debugging with concise explanations
- Production codebases requiring deterministic output
## When to Use Base Model Instead
- Complex problems requiring visible reasoning
- Exploring multiple solution approaches
- Educational explanations with thought process
- Research/debugging requiring transparency
## Training Notes
E1 focused on direct output format. The training data contained no chain-of-thought examples, which suppressed the base model's `<think>` tag behavior. Internal reasoning capability is preserved (evidenced by the accuracy gains), but the output format is optimized for production code generation.
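If raw `<think>` spans ever do surface in a generation, a small post-processing pass can drop them. This is a hypothetical helper sketched for illustration, not part of the released model:

```python
import re

def strip_think(text: str) -> str:
    """Remove any <think>...</think> spans that may surface in generations."""
    return re.sub(r'<think>.*?</think>', '', text, flags=re.DOTALL).strip()

raw = "<think>reason silently</think>def add(a, b):\n    return a + b"
print(strip_think(raw))  # prints only the code, with the think span removed
```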
**E2 Development:** Next iteration will reintroduce chain-of-thought reasoning while maintaining code quality.
## Citation
```bibtex
@misc{nanbeige-python-deepthink-e1,
title={Nanbeige 4.1 Python DeepThink 3B},
author={deltakitsune},
year={2026},
publisher={HuggingFace},
url={https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B}
}
```
## License
Apache 2.0 (same as base model)
## Developed By
**deltakitsune** (fauxpaslife)
Part of the Delta:Kitsune AI platform development
February 2026