ArmandS11/DeepSeekR1-7B-FineTuned-python

Text Generation


ArmandS11 committed on Mar 15

Commit ee9e517 · verified · 1 Parent(s): c0af3b9

Upload README.md

Files changed (1)

README.md +73 -16

@@ -1,16 +1,73 @@
----
-license: mit
-datasets:
-- iamtarun/python_code_instructions_18k_alpaca
-language:
-- en
-base_model:
-- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
-pipeline_tag: text-generation
-library_name: mlx
-tags:
-- python
-- code
-- lora
-- fine-tuned
----

+# DeepSeek-R1-Distill-Qwen-7B — Python Code Fine-tune
+A LoRA fine-tuned version of [DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) specialized for Python code generation.
+## Model Details
+### Model Description
+- **Developed by:** Armand (@ArmandS11)
+- **Model type:** Large Language Model — LoRA fine-tune
+- **Language(s):** English
+- **License:** MIT
+- **Finetuned from:** [deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)
+### Model Sources
+- **Base model:** https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
+- **Training dataset:** https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca
+## Uses
+### Direct Use
+Generate Python code from natural language instructions. Examples:
+- Writing functions, classes, algorithms
+- Async/await patterns
+- Data structures and error handling
+### Out-of-Scope Use
+- Not intended for other programming languages
+- Not suitable for production or security-critical code without review
+## Bias, Risks, and Limitations
+Generated code should always be reviewed before use in production. The model may occasionally produce syntactically incorrect code, particularly for complex async patterns.
+## Training Details
+### Training Data
+[iamtarun/python_code_instructions_18k_alpaca](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca) — 18,612 Python code instruction/response pairs.
+- **Train split:** 17,681 examples
+- **Validation split:** 931 examples
+### Training Hyperparameters
+| Parameter | Value |
+|---|---|
+| Method | LoRA |
+| LoRA Rank | 8 |
+| LoRA Layers | 8 |
+| Learning Rate | 5e-6 |
+| Batch Size | 2 |
+| Iterations | 2000 |
+| Quantization | 4-bit |
+## Technical Specifications
+### Compute Infrastructure
+#### Hardware
+- Apple MacBook Pro M4 — 16 GB unified memory
+#### Software
+- MLX (Apple Silicon optimized)
+- M-Courtyard fine-tuning app
+## Model Card Authors
+Armand — [@ArmandS11](https://huggingface.co/ArmandS11/)
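
The training figures in the added model card can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the commit: it assumes that "Iterations" counts optimizer steps that each consume exactly one batch (the card does not say whether gradient accumulation was used), and the function names are hypothetical helpers introduced only for illustration.

```python
# Figures copied from the card's "Training Data" and
# "Training Hyperparameters" sections.
TOTAL_PAIRS = 18_612
TRAIN_EXAMPLES = 17_681
VALID_EXAMPLES = 931
BATCH_SIZE = 2
ITERATIONS = 2_000


def validation_fraction(train: int, valid: int) -> float:
    """Share of the full dataset held out for validation."""
    return valid / (train + valid)


def epochs_covered(iterations: int, batch_size: int, train: int) -> float:
    """Fraction of one pass over the training split seen during the run.

    Assumption: one iteration consumes exactly one batch
    (no gradient accumulation).
    """
    return iterations * batch_size / train


# The two splits should add up to the stated dataset size.
assert TRAIN_EXAMPLES + VALID_EXAMPLES == TOTAL_PAIRS

print(f"validation share: {validation_fraction(TRAIN_EXAMPLES, VALID_EXAMPLES):.1%}")
print(f"examples seen:    {ITERATIONS * BATCH_SIZE}")
print(f"epochs covered:   {epochs_covered(ITERATIONS, BATCH_SIZE, TRAIN_EXAMPLES):.2f}")
```

Under that assumption the split is roughly 95/5, and the run sees 4,000 examples, which is about 0.23 of one pass over the training split.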