---
license: apache-2.0
base_model: Nanbeige/Nanbeige4.1-3B
tags:
- code
- python
- fine-tuned
- lora
- direct-output
language:
- en
pipeline_tag: text-generation
---

# Nanbeige 4.1 Python DeepThink - 3B

Fine-tuned version of [Nanbeige/Nanbeige4.1-3B](https://huggingface.co/Nanbeige/Nanbeige4.1-3B) specialized for Python code generation with direct, focused output.

**Version:** E1 (Experiment 1)

**Training Focus:** Code accuracy and clean output format

**Status:** Production-ready for direct code generation tasks

## Model Description

This model was fine-tuned using LoRA on 45,757 examples (84% Python code, 16% mathematical reasoning) to specialize in Python code generation. It achieves 87.4% token-level accuracy while producing clean, direct responses optimized for production use.

## Training Details

- **Base Model:** Nanbeige/Nanbeige4.1-3B (3B parameters)
- **Method:** LoRA (r=16, alpha=16)
- **Trainable Parameters:** 28.4M (0.72%)
- **Training Time:** ~16 hours on RTX 5060 Ti 16GB
- **Datasets:** Magicoder-OSS-Instruct-75K (Python), GSM8K (reasoning)
- **Framework:** Transformers + PEFT

### Performance Improvements

| Metric | Baseline | Fine-tuned | Change |
|--------|----------|------------|--------|
| Loss | 1.04 | 0.45 | -57% |
| Token Accuracy | 76.3% | 87.4% | +11.1 pts |
| Entropy | 0.78 | 0.44 | -44% |

## Key Features

- ✅ **Direct Output Format** - Clean code responses without verbose preambles
- ✅ **High Accuracy** - 87.4% token-level accuracy on Python tasks
- ✅ **Fast Inference** - Optimized for quick responses
- ⚠️ **Suppressed Chain-of-Thought** - E1 focuses on direct answers (reasoning occurs internally but isn't narrated)

## Usage

### Transformers

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = 'deltakitsune/Nanbeige-4.1-Python-DeepThink-3B'

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map='auto',
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

prompt = 'Write a Python function to validate email addresses'
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)

# max_new_tokens bounds only the generated continuation;
# max_length would also count the prompt tokens.
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Ollama

```bash
# Pull from Ollama registry
ollama pull fauxpaslife/nanbeige4.1-python-deepthink:3b

# Run
ollama run fauxpaslife/nanbeige4.1-python-deepthink:3b
```

### llama.cpp

```bash
# Download GGUF
wget https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B/resolve/main/nanbeige4.1-python-deepthink-q8.gguf

# Run
./llama-cli -m nanbeige4.1-python-deepthink-q8.gguf -p "Write a binary search function"
```

## File Structure

- `*.safetensors` - Merged model weights (Transformers)
- `config.json` - Model configuration
- `tokenizer.json` - Tokenizer files
- `nanbeige4.1-python-deepthink-fp16.gguf` - Full-precision GGUF (7.9GB)
- `nanbeige4.1-python-deepthink-q8.gguf` - 8-bit quantized GGUF (4.2GB)
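If you want to call the GGUF build from Python rather than the llama.cpp CLI, a minimal sketch using the third-party `llama-cpp-python` bindings (not part of this release; `pip install llama-cpp-python`) might look like the following. The file name matches the q8 GGUF listed above; the context size and sampling parameters are illustrative assumptions, not tuned values:

```python
from llama_cpp import Llama

# Load the 8-bit quantized GGUF listed above.
# n_ctx sets the context window used for this session.
llm = Llama(
    model_path='nanbeige4.1-python-deepthink-q8.gguf',
    n_ctx=4096,
)

result = llm(
    'Write a binary search function',
    max_tokens=256,    # cap on generated tokens
    temperature=0.2,   # low temperature suits deterministic code output
)
print(result['choices'][0]['text'])
```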
## Best Use Cases

- Direct Python code generation
- Algorithm implementations
- Flask/FastAPI endpoint creation
- Code debugging with concise explanations
- Production codebases requiring deterministic output

## When to Use the Base Model Instead

- Complex problems requiring visible reasoning
- Exploring multiple solution approaches
- Educational explanations with a visible thought process
- Research/debugging requiring transparency

## Training Notes

E1 focused on direct output format. The training data contained no chain-of-thought examples, resulting in suppressed reasoning-tag behavior. Internal reasoning capability is preserved (evidenced by the accuracy gains), but the output format is optimized for production code generation.

**E2 Development:** The next iteration will reintroduce chain-of-thought reasoning while maintaining code quality.

## Citation

```bibtex
@misc{nanbeige-python-deepthink-e1,
  title={Nanbeige 4.1 Python DeepThink 3B},
  author={deltakitsune},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B}
}
```

## License

Apache 2.0 (same as the base model)

## Developed By

**deltakitsune** (fauxpaslife)

Part of the Delta:Kitsune AI platform development.
February 2026