ArmandS11
/

DeepSeekR1-7B-FineTuned-python

Text Generation

Model card Files Files and versions

DeepSeekR1-7B-FineTuned-python / README.md

ArmandS11's picture

Update README.md

ea0f264 verified 2 months ago

|

history blame contribute delete

2.31 kB

	---
	license: mit
	datasets:
	- iamtarun/python_code_instructions_18k_alpaca
	language:
	- en
	- fr
	base_model:
	- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
	pipeline_tag: text-generation
	library_name: mlx
	tags:
	- code
	- python
	- deepseek
	- fine-tuned
	- lora
	---
	# DeepSeek-R1-Distill-Qwen-7B — Python Code Fine-tune

	A LoRA fine-tuned version of [DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) specialized for Python code generation.

	## Model Details

	### Model Description

	- Developed by: Armand (@ArmanS11)
	- Model type: Large Language Model — LoRA fine-tune
	- Language(s): English
	- License: MIT
	- Finetuned from: [deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)

	### Model Sources

	- Base model: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
	- Training dataset: https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca

	## Uses

	### Direct Use

	Generate Python code from natural language instructions. Examples:
	- Writing functions, classes, algorithms
	- Async/await patterns
	- Data structures and error handling

	### Out-of-Scope Use

	- Not intended for other programming languages
	- Not suitable for production security-critical code without review

	## Bias, Risks, and Limitations

	Generated code should always be reviewed before use in production. The model may occasionally produce syntactically incorrect code, particularly for complex async patterns.


	## Training Details

	### Training Data

	[iamtarun/python_code_instructions_18k_alpaca](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca) — 18,612 Python code instruction/response pairs.

	- Train split: 17,681 examples
	- Validation split: 931 examples

	### Training Hyperparameters

	\| Parameter \| Value \|
	\|---\|---\|
	\| Method \| LoRA \|
	\| LoRA Rank \| 8 \|
	\| LoRA Layers \| 8 \|
	\| Learning Rate \| 5e-6 \|
	\| Batch Size \| 2 \|
	\| Iterations \| 2000 \|
	\| Quantization \| 4-bit \|

	## Technical Specifications

	### Compute Infrastructure

	#### Hardware
	- Apple MacBook Pro M4 — 16 GB unified memory

	#### Software
	- MLX (Apple Silicon optimized)
	- M-Courtyard fine-tuning app

	## Model Card Authors

	Armand — [@ArmandS11](https://huggingface.co/ArmandS11/)