---
license: apache-2.0
base_model: Nanbeige/Nanbeige4.1-3B
tags:
- code
- python
- fine-tuned
- lora
- direct-output
language:
- en
pipeline_tag: text-generation
---
# Nanbeige 4.1 Python DeepThink - 3B
Fine-tuned version of [Nanbeige/Nanbeige4.1-3B](https://huggingface.co/Nanbeige/Nanbeige4.1-3B) specialized for Python code generation with direct, focused output.
**Version:** E1 (Experiment 1)
**Training Focus:** Code accuracy and clean output format
**Status:** Production-ready for direct code generation tasks
## Model Description
This model was fine-tuned using LoRA on 45,757 examples (84% Python code, 16% mathematical reasoning) to specialize in Python code generation. It achieves 87.4% token-level accuracy while providing clean, direct responses optimized for production use.
## Training Details
- **Base Model:** Nanbeige/Nanbeige4.1-3B (3B parameters)
- **Method:** LoRA (r=16, alpha=16)
- **Trainable Parameters:** 28.4M (0.72%)
- **Training Time:** ~16 hours on RTX 5060 Ti 16GB
- **Datasets:** Magicoder-OSS-Instruct-75K (Python), GSM8K (reasoning)
- **Framework:** Transformers + PEFT
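To make the LoRA figures above more concrete, here is a minimal arithmetic sketch. It assumes the standard LoRA formulation (two low-rank factors per adapted linear layer, update scaled by alpha / r). The layer shape used below is hypothetical and is not taken from this model; the card does not state which modules were adapted.

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one linear layer.

    Standard LoRA learns two factors: A with shape (r, d_in) and B with shape (d_out, r).
    """
    return r * (d_in + d_out)


def lora_scaling(alpha: float, r: int) -> float:
    """Multiplier applied to the low-rank update in standard LoRA (alpha / r)."""
    return alpha / r


# Values taken from the list above.
R, ALPHA = 16, 16
TRAINABLE = 28.4e6
TRAINABLE_FRACTION = 0.0072

# Hypothetical 2048 x 2048 projection, for illustration only.
print(lora_params(d_in=2048, d_out=2048, r=R))  # 65536
print(lora_scaling(ALPHA, R))                   # 1.0

# Total parameter count implied by "28.4M (0.72%)".
implied_total = TRAINABLE / TRAINABLE_FRACTION
print(f"{implied_total / 1e9:.2f}B")            # 3.94B
```

With r equal to alpha, the adapter update is applied at a scale of 1.0. The implied total of roughly 3.94B suggests the percentage was computed against a count larger than the nominal 3B, though the exact denominator is not stated here.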
### Performance Improvements
| Metric | Baseline | Fine-tuned | Change |
|--------|----------|------------|--------|
| Loss | 1.04 | 0.45 | -57% |
| Token Accuracy | 76.3% | 87.4% | +11.1 pts |
| Entropy | 0.78 | 0.44 | -44% |
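
The Change column mixes two conventions: loss and entropy are relative changes, while token accuracy is a difference in percentage points. A short sketch that reproduces the column from the Baseline and Fine-tuned values (the helper names are illustrative):

```python
def relative_change(before: float, after: float) -> float:
    """Relative change in percent: (after - before) / before * 100."""
    return (after - before) / before * 100


def point_change(before_pct: float, after_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return after_pct - before_pct


loss_change = f"{relative_change(1.04, 0.45):+.0f}%"
accuracy_change = f"{point_change(76.3, 87.4):+.1f} pts"
entropy_change = f"{relative_change(0.78, 0.44):+.0f}%"

print(loss_change)      # -57%
print(accuracy_change)  # +11.1 pts
print(entropy_change)   # -44%
```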
## Key Features
- **Direct Output Format** - Clean code responses without verbose preambles
- **High Accuracy** - 87.4% token-level accuracy on Python tasks
- **Fast Inference** - Optimized for quick responses
- ⚠️ **Suppressed Chain-of-Thought** - E1 focuses on direct answers (reasoning occurs internally but isn't narrated)
## Usage
### Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = 'deltakitsune/Nanbeige-4.1-Python-DeepThink-3B'

# trust_remote_code=True allows Transformers to run custom model code from the repo
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

prompt = 'Write a Python function to validate email addresses'
inputs = tokenizer(prompt, return_tensors='pt')

# max_new_tokens caps only the generated tokens; max_length would also count the prompt
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
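
Depending on the prompt, the decoded text may or may not wrap the code in a Markdown fence. The sketch below is one possible post-processing step and is not part of the model or of Transformers: it pulls out the first fenced block if there is one and checks that the result parses as Python. The helper names `extract_python` and `is_valid_python` are hypothetical.

```python
import ast
import re

# Built programmatically so the example itself can sit inside a fenced block.
FENCE = "`" * 3
_CODE_BLOCK = re.compile(r"`{3}(?:python|py)?[ \t]*\n(.*?)`{3}", re.DOTALL)


def extract_python(response: str) -> str:
    """Return the first fenced code block, or the whole response if none is fenced."""
    match = _CODE_BLOCK.search(response)
    return (match.group(1) if match else response).strip()


def is_valid_python(source: str) -> bool:
    """True if the source parses. This checks syntax only, not behavior."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


sample = f"{FENCE}python\ndef add(a, b):\n    return a + b\n{FENCE}"
code = extract_python(sample)
print(code)
print(is_valid_python(code))  # True
```

A syntax check like this catches truncated generations (for example, output cut off by the token limit) before the code reaches a test suite.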
### Ollama
```bash
# Pull from Ollama registry
ollama pull fauxpaslife/nanbeige4.1-python-deepthink:3b
# Run
ollama run fauxpaslife/nanbeige4.1-python-deepthink:3b
```
### llama.cpp
```bash
# Download GGUF
wget https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B/resolve/main/nanbeige4.1-python-deepthink-q8.gguf
# Run
./llama-cli -m nanbeige4.1-python-deepthink-q8.gguf -p "Write a binary search function"
```
## File Structure
- `*.safetensors` - Merged model weights (Transformers)
- `config.json` - Model configuration
- `tokenizer.json` - Tokenizer files
- `nanbeige4.1-python-deepthink-fp16.gguf` - Full precision GGUF (7.9GB)
- `nanbeige4.1-python-deepthink-q8.gguf` - 8-bit quantized GGUF (4.2GB)
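
The two GGUF sizes listed above are roughly what the bit widths predict. The following is a back-of-the-envelope check, not a measurement. It assumes the q8 file uses llama.cpp's Q8_0 layout (8.5 bits per weight: 32 int8 values plus one fp16 scale per block) and takes the parameter count implied by the Training Details figures (28.4M trainable at 0.72%). File metadata and any tensors stored at a different precision are ignored.

```python
# Assumption: total parameter count implied by "28.4M (0.72%)" in Training Details.
PARAMS = 28.4e6 / 0.0072

FP16_BITS = 16.0
# Assumption: Q8_0 blocks hold 32 int8 weights plus one fp16 scale = 34 bytes per 32 weights.
Q8_0_BITS = 34 * 8 / 32  # 8.5 bits per weight


def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


fp16_gb = gguf_size_gb(PARAMS, FP16_BITS)
q8_gb = gguf_size_gb(PARAMS, Q8_0_BITS)

print(f"{fp16_gb:.1f} GB")  # 7.9 GB
print(f"{q8_gb:.1f} GB")    # 4.2 GB
```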
## Best Use Cases
- Direct Python code generation
- Algorithm implementations
- Flask/FastAPI endpoint creation
- Code debugging with concise explanations
- Production codebases requiring deterministic output
## When to Use Base Model Instead
- Complex problems requiring visible reasoning
- Exploring multiple solution approaches
- Educational explanations with thought process
- Research/debugging requiring transparency
## Training Notes
E1 focused on direct output format. The training data contained no chain-of-thought examples, which suppresses the model's `<think>` tag output. Internal reasoning capability is preserved (evidenced by the accuracy gains), but the output format is optimized for production code generation.
**E2 Development:** Next iteration will reintroduce chain-of-thought reasoning while maintaining code quality.
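
Because the `<think>` behavior is described as suppressed rather than removed, downstream code may still want to guard against an occasional reasoning block. The sketch below is one way to do that. It assumes reasoning is delimited by a `<think>` and `</think>` pair (the closing tag is an assumption, since only `<think>` is mentioned above), and `strip_think` is a hypothetical helper, not part of the model or any library.

```python
import re

# Assumption: reasoning, when present, is wrapped in <think> ... </think>.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_think(text: str) -> str:
    """Remove any <think>...</think> blocks and surrounding whitespace."""
    return _THINK_BLOCK.sub("", text).strip()


raw = "<think>\nNeed a loop over the list.\n</think>\ndef total(xs):\n    return sum(xs)"
print(strip_think(raw))
```

Text without any reasoning block passes through unchanged apart from leading and trailing whitespace.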
## Citation
```bibtex
@misc{nanbeige-python-deepthink-e1,
title={Nanbeige 4.1 Python DeepThink 3B},
author={deltakitsune},
year={2026},
publisher={HuggingFace},
url={https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B}
}
```
## License
Apache 2.0 (same as base model)
## Developed By
**deltakitsune** (fauxpaslife)
Part of the Delta:Kitsune AI platform development
February 2026