---
license: apache-2.0
base_model: Nanbeige/Nanbeige4.1-3B
tags:
- code
- python
- fine-tuned
- lora
- direct-output
language:
- en
pipeline_tag: text-generation
---

# Nanbeige 4.1 Python DeepThink - 3B

Fine-tuned version of [Nanbeige/Nanbeige4.1-3B](https://huggingface.co/Nanbeige/Nanbeige4.1-3B) specialized for Python code generation with direct, focused output.

**Version:** E1 (Experiment 1)

**Training Focus:** Code accuracy and clean output format

**Status:** Production-ready for direct code generation tasks

## Model Description

This model was fine-tuned using LoRA on 45,757 examples (84% Python code, 16% mathematical reasoning) to specialize in Python code generation. It achieves 87.4% token-level accuracy while producing clean, direct responses optimized for production use.

## Training Details

- **Base Model:** Nanbeige/Nanbeige4.1-3B (3B parameters)
- **Method:** LoRA (r=16, alpha=16)
- **Trainable Parameters:** 28.4M (0.72%)
- **Training Time:** ~16 hours on RTX 5060 Ti 16GB
- **Datasets:** Magicoder-OSS-Instruct-75K (Python), GSM8K (reasoning)
- **Framework:** Transformers + PEFT

### Performance Improvements

| Metric | Baseline | Fine-tuned | Change |
|--------|----------|------------|--------|
| Loss | 1.04 | 0.45 | -57% |
| Token Accuracy | 76.3% | 87.4% | +11.1 pts |
| Entropy | 0.78 | 0.44 | -44% |

## Key Features

- ✅ **Direct Output Format** - Clean code responses without verbose preambles
- ✅ **High Accuracy** - 87.4% token-level accuracy on Python tasks
- ✅ **Fast Inference** - Optimized for quick responses
- ⚠️ **Suppressed Chain-of-Thought** - E1 focuses on direct answers (reasoning occurs internally but isn't narrated)

## Usage

### Transformers

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = 'deltakitsune/Nanbeige-4.1-Python-DeepThink-3B'

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map='auto',
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

prompt = 'Write a Python function to validate email addresses'
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)

# max_new_tokens bounds only the generated continuation;
# max_length would also count the prompt tokens.
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Ollama

```bash
# Pull from Ollama registry
ollama pull fauxpaslife/nanbeige4.1-python-deepthink:3b

# Run
ollama run fauxpaslife/nanbeige4.1-python-deepthink:3b
```

### llama.cpp

```bash
# Download GGUF
wget https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B/resolve/main/nanbeige4.1-python-deepthink-q8.gguf

# Run
./llama-cli -m nanbeige4.1-python-deepthink-q8.gguf -p "Write a binary search function"
```

## File Structure

- `*.safetensors` - Merged model weights (Transformers)
- `config.json` - Model configuration
- `tokenizer.json` - Tokenizer files
- `nanbeige4.1-python-deepthink-fp16.gguf` - Full-precision GGUF (7.9GB)
- `nanbeige4.1-python-deepthink-q8.gguf` - 8-bit quantized GGUF (4.2GB)
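If you want to call the GGUF build from Python rather than the llama.cpp CLI, a minimal sketch using the third-party `llama-cpp-python` bindings (not part of this release; `pip install llama-cpp-python`) might look like the following. The file name matches the q8 GGUF listed above; the context size and sampling parameters are illustrative assumptions, not tuned values:

```python
from llama_cpp import Llama

# Load the 8-bit quantized GGUF listed above.
# n_ctx sets the context window used for this session.
llm = Llama(
    model_path='nanbeige4.1-python-deepthink-q8.gguf',
    n_ctx=4096,
)

result = llm(
    'Write a binary search function',
    max_tokens=256,    # cap on generated tokens
    temperature=0.2,   # low temperature suits deterministic code output
)
print(result['choices'][0]['text'])
```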
## Best Use Cases

- Direct Python code generation
- Algorithm implementations
- Flask/FastAPI endpoint creation
- Code debugging with concise explanations
- Production codebases requiring deterministic output

## When to Use the Base Model Instead

- Complex problems requiring visible reasoning
- Exploring multiple solution approaches
- Educational explanations with a visible thought process
- Research/debugging requiring transparency

## Training Notes

E1 focused on direct output format. The training data contained no chain-of-thought examples, resulting in suppressed reasoning-tag behavior. Internal reasoning capability is preserved (evidenced by the accuracy gains), but the output format is optimized for production code generation.

**E2 Development:** The next iteration will reintroduce chain-of-thought reasoning while maintaining code quality.

## Citation

```bibtex
@misc{nanbeige-python-deepthink-e1,
  title={Nanbeige 4.1 Python DeepThink 3B},
  author={deltakitsune},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B}
}
```

## License

Apache 2.0 (same as the base model)

## Developed By

**deltakitsune** (fauxpaslife)

Part of the Delta:Kitsune AI platform development.
February 2026