Jessylg27/specialized-coding-logic-llm

@@ -1,56 +1,78 @@
 ---
-base_model: unsloth/qwen2.5-coder-32b-instruct-bnb-4bit
 library_name: peft
-model_name: output_model
 tags:
-- base_model:adapter:unsloth/qwen2.5-coder-32b-instruct-bnb-4bit
-- lora
 - sft
-- transformers
 - trl
-- unsloth
-licence: license
-pipeline_tag: text-generation
 ---
-# Model Card for output_model
-This model is a fine-tuned version of [unsloth/qwen2.5-coder-32b-instruct-bnb-4bit](https://huggingface.co/unsloth/qwen2.5-coder-32b-instruct-bnb-4bit).
-It has been trained using [TRL](https://github.com/huggingface/trl).
-## Quick start
 ```python
 from transformers import pipeline
-question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
-generator = pipeline("text-generation", model="None", device="cuda")
-output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
-print(output["generated_text"])
-```
-## Training procedure
-This model was trained with SFT.
-### Framework versions
-- PEFT 0.18.1
-- TRL: 0.23.1
-- Transformers: 4.57.1
-- Pytorch: 2.9.0+cu128
-- Datasets: 4.3.0
-- Tokenizers: 0.22.2
-## Citations
-Cite TRL as:
 ```bibtex
 @misc{vonwerra2022trl,
 	title        = {{TRL: Transformer Reinforcement Learning}},
@@ -58,6 +80,7 @@ Cite TRL as:
 	year         = 2020,
 	journal      = {GitHub repository},
 	publisher    = {GitHub},
-	howpublished = {\url{https://github.com/huggingface/trl}}
 }
 ```

 ---
+base_model:
+- Qwen/Qwen2.5-Coder-32B-Instruct
 library_name: peft
+license: cc-by-nc-4.0
+datasets:
+- Jessylg27/DeepThink-Code-Lite
+language:
+- en
+- fr
 tags:
+- code
+- logic
+- reasoning
+- qwen2.5
+- unsloth
 - sft
 - trl
 ---
+# Specialized Coding Logic LLM (32B)
+This model is a specialized fine-tuned version of [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct).
+It has been optimized to enhance **logical reasoning** and **code generation capabilities**.
+## 🧠 Model Description
+**Specialized Coding Logic LLM** builds upon the powerful Qwen 2.5 Coder architecture (32B parameters). It has been fine-tuned using the **DeepThink-Code-Lite** dataset to improve its ability to:
+- Solve complex algorithmic problems.
+- Follow multi-step logical instructions.
+- Generate cleaner and more optimized code.
+## 📊 Dataset
+This model was trained on the custom dataset:
+👉 **[Jessylg27/DeepThink-Code-Lite](https://huggingface.co/datasets/Jessylg27/DeepThink-Code-Lite)**
+## 🚀 Quick Start
+You can use this model directly with the Hugging Face `pipeline`.
 ```python
 from transformers import pipeline
+# Define the model ID
+model_id = "Jessylg27/specialized-coding-logic-llm"
+# Initialize the pipeline
+generator = pipeline("text-generation", model=model_id, device_map="auto")
+# Prompt the model
+question = "Write a Python function to solve the Traveling Salesman Problem using dynamic programming."
+output = generator([{"role": "user", "content": question}], max_new_tokens=512, return_full_text=False)[0]
+print(output["generated_text"])
+```
+## 🛠️ Training procedure
+This model was trained with **SFT (Supervised Fine-Tuning)** using the [TRL library](https://github.com/huggingface/trl) and [Unsloth](https://github.com/unslothai/unsloth) for efficient training.
+### Framework versions
+* **PEFT:** 0.18.1
+* **TRL:** 0.24.0
+* **Transformers:** 4.57.3
+* **Pytorch:** 2.8.0+cu128
+* **Datasets:** 4.3.0
+* **Tokenizers:** 0.22.2
+## 📜 Citations
+If you use this model or the TRL library, please cite:
 ```bibtex
 @misc{vonwerra2022trl,
 	title        = {{TRL: Transformer Reinforcement Learning}},
 	year         = 2020,
 	journal      = {GitHub repository},
 	publisher    = {GitHub},
+	howpublished = {\url{https://github.com/huggingface/trl}}
 }
 ```
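
The front matter declares `library_name: peft`, which suggests the repository may hold a LoRA adapter rather than merged weights. Under that assumption (the card does not confirm it), the Quick Start could also be written by loading the base model and attaching the adapter explicitly. The helper names `build_messages`, `load_adapter_model`, and `generate` below are hypothetical, and the `torch_dtype="auto"` setting is an illustrative choice, not something the card specifies. This is a minimal sketch, not the author's documented usage.

```python
# Sketch only: assumes the repo contains PEFT adapter weights for the base model.
ADAPTER_ID = "Jessylg27/specialized-coding-logic-llm"  # model ID from the card
BASE_ID = "Qwen/Qwen2.5-Coder-32B-Instruct"            # base_model from the card


def build_messages(question):
    """Wrap a plain question in the chat format used in the Quick Start."""
    return [{"role": "user", "content": question}]


def load_adapter_model(adapter_id=ADAPTER_ID, base_id=BASE_ID):
    """Load the base model, then attach the adapter on top of it."""
    # Imports are local so the module can be imported without these packages.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype="auto",   # assumption: use the dtype stored in the checkpoint
        device_map="auto",    # same placement strategy as the Quick Start
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model


def generate(question, tokenizer, model, max_new_tokens=512):
    """Apply the chat template, generate, and return only the new text."""
    inputs = tokenizer.apply_chat_template(
        build_messages(question),
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the model's answer is decoded.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

To try it, call `tokenizer, model = load_adapter_model()` and then `generate("Write a Python function to reverse a linked list.", tokenizer, model)`. Loading a 32B base model this way needs substantial GPU memory, so the `pipeline` route in the Quick Start remains the simpler option when it works.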