---
license: apache-2.0
base_model: Nanbeige/Nanbeige4.1-3B
tags:
- code
- python
- fine-tuned
- lora
- direct-output
language:
- en
pipeline_tag: text-generation
---

# Nanbeige 4.1 Python DeepThink - 3B

Fine-tuned version of [Nanbeige/Nanbeige4.1-3B](https://huggingface.co/Nanbeige/Nanbeige4.1-3B) specialized for Python code generation with direct, focused output.

**Version:** E1 (Experiment 1)
**Training Focus:** Code accuracy and clean output format
**Status:** Production-ready for direct code generation tasks

## Model Description

This model was fine-tuned using LoRA on 45,757 examples (84% Python code, 16% mathematical reasoning) to specialize in Python code generation. It achieves 87.4% token-level accuracy while providing clean, direct responses optimized for production use.

## Training Details

- **Base Model:** Nanbeige/Nanbeige4.1-3B (3B parameters)
- **Method:** LoRA (r=16, alpha=16); a configuration sketch follows below
- **Trainable Parameters:** 28.4M (0.72% of total)
- **Training Time:** ~16 hours on an RTX 5060 Ti 16GB
- **Datasets:** Magicoder-OSS-Instruct-75K (Python), GSM8K (reasoning)
- **Framework:** Transformers + PEFT
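
For reproducibility, here is a minimal PEFT setup sketch. Only `r=16` and `lora_alpha=16` are confirmed above; the target modules and dropout are illustrative assumptions, not the exact training configuration.

```python
# Minimal LoRA fine-tuning setup (sketch).
# Confirmed by this card: r=16, lora_alpha=16, base model Nanbeige4.1-3B.
# Assumed for illustration: target_modules, lora_dropout.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    'Nanbeige/Nanbeige4.1-3B',
    trust_remote_code=True,
)

lora_config = LoraConfig(
    r=16,                # LoRA rank (from this card)
    lora_alpha=16,       # LoRA scaling factor (from this card)
    target_modules=['q_proj', 'k_proj', 'v_proj', 'o_proj'],  # assumed
    lora_dropout=0.05,   # assumed
    task_type='CAUSAL_LM',
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # the card reports 28.4M (0.72%) for the actual run
```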

### Performance Improvements

| Metric | Baseline | Fine-tuned | Change |
|--------|----------|------------|--------|
| Loss | 1.04 | 0.45 | -57% |
| Token Accuracy | 76.3% | 87.4% | +11.1 pts |
| Entropy | 0.78 | 0.44 | -44% |
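
The card does not state how token accuracy was measured; a common convention is teacher-forced next-token accuracy over non-ignored positions, sketched below under that assumption.

```python
# Teacher-forced token accuracy (sketch; assumes HF-style labels where
# ignored positions are marked with -100).
import torch

def token_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    # Shift so the logits at position t are scored against token t+1,
    # the standard causal-LM alignment.
    preds = logits[:, :-1, :].argmax(dim=-1)
    targets = labels[:, 1:]
    mask = targets != -100
    correct = (preds == targets) & mask
    return correct.sum().item() / mask.sum().item()
```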

## Key Features

- ✅ **Direct Output Format** - Clean code responses without verbose preambles
- ✅ **High Accuracy** - 87.4% token-level accuracy on Python tasks
- ✅ **Fast Inference** - Optimized for quick responses
- ⚠️ **Suppressed Chain-of-Thought** - E1 focuses on direct answers (reasoning occurs internally but is not narrated)

## Usage

### Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    'deltakitsune/Nanbeige-4.1-Python-DeepThink-3B',
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    'deltakitsune/Nanbeige-4.1-Python-DeepThink-3B',
    trust_remote_code=True,
)

prompt = 'Write a Python function to validate email addresses'
inputs = tokenizer(prompt, return_tensors='pt')
# max_new_tokens bounds the completion only; max_length would also count
# the prompt and could cut the generation short.
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
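
If the tokenizer ships a chat template (typical for instruction-tuned models, though not confirmed here), routing the prompt through it usually gives better results than raw text. A sketch under that assumption:

```python
# Chat-template variant (sketch; assumes tokenizer.chat_template is defined).
messages = [{'role': 'user', 'content': prompt}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors='pt',
)
outputs = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```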

### Ollama
```bash
# Pull from the Ollama registry
ollama pull fauxpaslife/nanbeige4.1-python-deepthink:3b

# Run
ollama run fauxpaslife/nanbeige4.1-python-deepthink:3b
```
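
For programmatic access, the running Ollama server can also be queried from Python. A sketch assuming Ollama's default port (11434) and that the model has already been pulled:

```python
# Query the local Ollama HTTP API (sketch; assumes a default local install).
import json
import urllib.request

payload = json.dumps({
    'model': 'fauxpaslife/nanbeige4.1-python-deepthink:3b',
    'prompt': 'Write a Python function to reverse a linked list',
    'stream': False,
}).encode('utf-8')

req = urllib.request.Request(
    'http://localhost:11434/api/generate',
    data=payload,
    headers={'Content-Type': 'application/json'},
)
with urllib.request.urlopen(req) as resp:
    # With stream=False, the server returns one JSON object whose
    # 'response' field holds the full completion.
    print(json.loads(resp.read())['response'])
```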

### llama.cpp
```bash
# Download the 8-bit GGUF
wget https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B/resolve/main/nanbeige4.1-python-deepthink-q8.gguf

# Run
./llama-cli -m nanbeige4.1-python-deepthink-q8.gguf -p "Write a binary search function"
```

## File Structure

- `*.safetensors` - Merged model weights (Transformers)
- `config.json` - Model configuration
- `tokenizer.json` - Tokenizer files
- `nanbeige4.1-python-deepthink-fp16.gguf` - Full-precision GGUF (7.9GB)
- `nanbeige4.1-python-deepthink-q8.gguf` - 8-bit quantized GGUF (4.2GB)

## Best Use Cases

- Direct Python code generation
- Algorithm implementations
- Flask/FastAPI endpoint creation
- Code debugging with concise explanations
- Production codebases requiring deterministic output

## When to Use the Base Model Instead

- Complex problems requiring visible reasoning
- Exploring multiple solution approaches
- Educational explanations with a narrated thought process
- Research or debugging requiring transparency

## Training Notes

E1 focused on direct output format. The training data contained no chain-of-thought examples, which suppressed the model's `<think>` tag behavior. Internal reasoning capability is preserved (evidenced by the accuracy gains), but the output format is optimized for production code generation.

**E2 Development:** The next iteration will reintroduce chain-of-thought reasoning while maintaining code quality.

## Citation
```bibtex
@misc{nanbeige-python-deepthink-e1,
  title={Nanbeige 4.1 Python DeepThink 3B},
  author={deltakitsune},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/deltakitsune/Nanbeige-4.1-Python-DeepThink-3B}
}
```

## License

Apache 2.0 (same as the base model)

## Developed By

**deltakitsune** (fauxpaslife)
Part of the Delta:Kitsune AI platform development
February 2026