# DeepSeek-R1-Distill-Qwen-7B – Python Code Fine-tune
A LoRA fine-tuned version of DeepSeek-R1-Distill-Qwen-7B specialized for Python code generation.
## Model Details

### Model Description
- Developed by: Armand (@ArmanS11)
- Model type: Large Language Model (LoRA fine-tune)
- Language(s): English
- License: MIT
- Finetuned from: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
### Model Sources
- Base model: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- Training dataset: https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca
## Uses

### Direct Use
Generate Python code from natural language instructions. Examples:
- Writing functions, classes, algorithms
- Async/await patterns
- Data structures and error handling
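The training dataset follows the Alpaca instruction format, so prompts at inference time likely work best when they mirror that template. A minimal sketch (the exact template wording is an assumption based on the standard Alpaca format, not confirmed by the card):

```python
# Hypothetical prompt builder mirroring the Alpaca format used by
# iamtarun/python_code_instructions_18k_alpaca. The template text is an
# assumption; verify against the dataset before relying on it.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a natural-language request in the Alpaca instruction template."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

print(build_prompt("Write a Python function that reverses a string."))
```

The generated text after `### Response:` is then the model's Python code.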
### Out-of-Scope Use

- Not intended for code generation in programming languages other than Python
- Not suitable for production or security-critical code without human review
## Bias, Risks, and Limitations
Generated code should always be reviewed before use in production. The model may occasionally produce syntactically incorrect code, particularly for complex async patterns.
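Since the model can emit syntactically invalid code, a cheap first-pass filter is to parse the output with Python's standard `ast` module before any human review. This is a generic sketch, not part of the author's pipeline:

```python
import ast

def is_valid_python(code: str) -> bool:
    """Return True if the string parses as Python source.

    This only catches syntax errors; it says nothing about whether the
    code is correct, safe, or does what was asked.
    """
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

print(is_valid_python("def add(a, b):\n    return a + b"))  # True
print(is_valid_python("def broken(:"))                      # False
```

Outputs that fail this check can be regenerated or flagged before reaching a reviewer.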
## Training Details

### Training Data

iamtarun/python_code_instructions_18k_alpaca – 18,612 Python code instruction/response pairs.
- Train split: 17,681 examples
- Validation split: 931 examples
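The reported split sizes are consistent with a roughly 5% validation hold-out (the exact ratio used is an assumption; only the resulting counts are stated on the card):

```python
# Sanity check: the card's split sizes match a ~5% validation hold-out.
total = 18_612
val_ratio = 0.05  # assumption; inferred from the reported counts

val = round(total * val_ratio)  # 931
train = total - val             # 17681
print(train, val)
```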
### Training Hyperparameters
| Parameter | Value |
|---|---|
| Method | LoRA |
| LoRA Rank | 8 |
| LoRA Layers | 8 |
| Learning Rate | 5e-6 |
| Batch Size | 2 |
| Iterations | 2000 |
| Quantization | 4-bit |
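Since training ran on MLX, the table above maps roughly onto an `mlx_lm.lora`-style configuration. The key names below are assumptions based on that tool's conventions; the author used the M-Courtyard app, whose configuration may differ:

```yaml
# Hypothetical MLX LoRA config mirroring the table above.
# Key names are assumptions, not taken from the author's actual setup.
model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
train: true
data: path/to/python_code_instructions_18k_alpaca
batch_size: 2
iters: 2000
learning_rate: 5.0e-6
num_layers: 8          # LoRA applied to the last 8 transformer layers
lora_parameters:
  rank: 8
# 4-bit quantization is typically handled by loading a pre-quantized
# MLX model rather than by a training flag.
```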
## Technical Specifications

### Compute Infrastructure

#### Hardware
- Apple MacBook Pro (M4, 16 GB unified memory)
#### Software
- MLX (Apple Silicon optimized)
- M-Courtyard fine-tuning app
## Model Card Authors

Armand (@ArmandS11)