---
license: apache-2.0
base_model: Nanbeige/Nanbeige4.1-3B
tags:
- code
- python
- fine-tuned
- lora
- direct-output
language:
- en
pipeline_tag: text-generation
---

# Nanbeige 4.1 Python DeepThink - 3B

Fine-tuned version of [Nanbeige/Nanbeige4.1-3B](https://huggingface.co/Nanbeige/Nanbeige4.1-3B) specialized for Python code generation with direct, focused output.

**Version:** E1 (Experiment 1)
**Training Focus:** Code accuracy and clean output format
**Status:** Production-ready for direct code generation tasks

## Model Description

This model was fine-tuned using LoRA on 45,757 examples (84% Python code, 16% mathematical reasoning) to specialize in Python code generation. It achieves 87.4% token-level accuracy while providing clean, direct responses optimized for production use.

## Training Details

- **Base Model:** Nanbeige/Nanbeige4.1-3B (3B parameters)
- **Method:** LoRA (r=16, alpha=16); a configuration sketch follows below
- **Trainable Parameters:** 28.4M (0.72% of total)
- **Training Time:** ~16 hours on an RTX 5060 Ti 16GB
- **Datasets:** Magicoder-OSS-Instruct-75K (Python), GSM8K (reasoning)
- **Framework:** Transformers + PEFT
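
For reproducibility, here is a minimal PEFT setup sketch. Only `r=16` and `lora_alpha=16` are confirmed above; the target modules and dropout are illustrative assumptions, not the exact training configuration.

```python
# Minimal LoRA fine-tuning setup (sketch).
# Confirmed by this card: r=16, lora_alpha=16, base model Nanbeige4.1-3B.
# Assumed for illustration: target_modules, lora_dropout.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    'Nanbeige/Nanbeige4.1-3B',
    trust_remote_code=True,
)

lora_config = LoraConfig(
    r=16,                # LoRA rank (from this card)
    lora_alpha=16,       # LoRA scaling factor (from this card)
    target_modules=['q_proj', 'k_proj', 'v_proj', 'o_proj'],  # assumed
    lora_dropout=0.05,   # assumed
    task_type='CAUSAL_LM',
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # the card reports 28.4M (0.72%) for the actual run
```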

### Performance Improvements

| Metric | Baseline | Fine-tuned | Change |
|--------|----------|------------|--------|
| Loss | 1.04 | 0.45 | -57% |
| Token Accuracy | 76.3% | 87.4% | +11.1 pts |
| Entropy | 0.78 | 0.44 | -44% |
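
The card does not state how token accuracy was measured; a common convention is teacher-forced next-token accuracy over non-ignored positions, sketched below under that assumption.

```python
# Teacher-forced token accuracy (sketch; assumes HF-style labels where
# ignored positions are marked with -100).
import torch

def token_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    # Shift so the logits at position t are scored against token t+1,
    # the standard causal-LM alignment.
    preds = logits[:, :-1, :].argmax(dim=-1)
    targets = labels[:, 1:]
    mask = targets != -100
    correct = (preds == targets) & mask
    return correct.sum().item() / mask.sum().item()
```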

## Key Features

- ✅ **Direct Output Format** - Clean code responses without verbose preambles
- ✅ **High Accuracy** - 87.4% token-level accuracy on Python tasks
- ✅ **Fast Inference** - Optimized for quick responses
- ⚠️ **Suppressed Chain-of-Thought** - E1 focuses on direct answers (reasoning occurs internally but is not narrated)

## Usage

### Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    'deltakitsune/Nanbeige-4.1-Python-DeepThink-3B',
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    'deltakitsune/Nanbeige-4.1-Python-DeepThink-3B',
    trust_remote_code=True,
)

prompt = 'Write a Python function to validate email addresses'
inputs = tokenizer(prompt, return_tensors='pt')
# max_new_tokens bounds the completion only; max_length would also count
# the prompt and could cut the generation short.
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
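
If the tokenizer ships a chat template (typical for instruction-tuned models, though not confirmed here), routing the prompt through it usually gives better results than raw text. A sketch under that assumption:

```python
# Chat-template variant (sketch; assumes tokenizer.chat_template is defined).
messages = [{'role': 'user', 'content': prompt}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors='pt',
)
outputs = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```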

### Ollama
```bash
# Pull from the Ollama registry
ollama pull fauxpaslife/nanbeige4.1-python-deepthink:3b

# Run
ollama run fauxpaslife/nanbeige4.1-python-deepthink:3b
```
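
For programmatic access, the running Ollama server can also be queried from Python. A sketch assuming Ollama's default port (11434) and that the model has already been pulled:

```python
# Query the local Ollama HTTP API (sketch; assumes a default local install).
import json
import urllib.request

payload = json.dumps({
    'model': 'fauxpaslife/nanbeige4.1-python-deepthink:3b',
    'prompt': 'Write a Python function to reverse a linked list',
    'stream': False,
}).encode('utf-8')

req = urllib.request.Request(
    'http://localhost:11434/api/generate',
    data=payload,
    headers={'Content-Type': 'application/json'},
)
with urllib.request.urlopen(req) as resp:
    # With stream=False, the server returns one JSON object whose
    # 'response' field holds the full completion.
    print(json.loads(resp.read())['response'])
```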

### llama.cpp
```bash
# Download the 8-bit GGUF
wget https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B/resolve/main/nanbeige4.1-python-deepthink-q8.gguf

# Run
./llama-cli -m nanbeige4.1-python-deepthink-q8.gguf -p "Write a binary search function"
```

## File Structure

- `*.safetensors` - Merged model weights (Transformers)
- `config.json` - Model configuration
- `tokenizer.json` - Tokenizer files
- `nanbeige4.1-python-deepthink-fp16.gguf` - Full-precision GGUF (7.9GB)
- `nanbeige4.1-python-deepthink-q8.gguf` - 8-bit quantized GGUF (4.2GB)

## Best Use Cases

- Direct Python code generation
- Algorithm implementations
- Flask/FastAPI endpoint creation
- Code debugging with concise explanations
- Production codebases requiring deterministic output

## When to Use the Base Model Instead

- Complex problems requiring visible reasoning
- Exploring multiple solution approaches
- Educational explanations with a narrated thought process
- Research or debugging requiring transparency

## Training Notes

E1 focused on direct output format. The training data contained no chain-of-thought examples, which suppressed the model's `<think>` tag behavior. Internal reasoning capability is preserved (evidenced by the accuracy gains), but the output format is optimized for production code generation.

**E2 Development:** The next iteration will reintroduce chain-of-thought reasoning while maintaining code quality.

## Citation
```bibtex
@misc{nanbeige-python-deepthink-e1,
  title={Nanbeige 4.1 Python DeepThink 3B},
  author={deltakitsune},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B}
}
```

## License

Apache 2.0 (same as the base model)

## Developed By

**deltakitsune** (fauxpaslife)
Part of the Delta:Kitsune AI platform development
February 2026