---
base_model:
- Qwen/Qwen2.5-Coder-32B-Instruct
library_name: peft
license: cc-by-nc-4.0
datasets:
- Jessylg27/DeepThink-Code-Lite
language:
- en
- fr
tags:
- code
- logic
- reasoning
- qwen2.5
- unsloth
- sft
- trl
---

# Specialized Coding Logic LLM (32B)

This model is a specialized fine-tuned version of [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct). It has been optimized to enhance **logical reasoning** and **code generation capabilities**.

## 🧠 Model Description

**Specialized Coding Logic LLM** builds upon the powerful Qwen 2.5 Coder architecture (32B parameters). It has been fine-tuned using the **DeepThink-Code-Lite** dataset to improve its ability to:

- Solve complex algorithmic problems.
- Follow multi-step logical instructions.
- Generate cleaner and more optimized code.

## 📊 Dataset

This model was trained on the custom dataset:
👉 **[Jessylg27/DeepThink-Code-Lite](https://huggingface.co/datasets/Jessylg27/DeepThink-Code-Lite)**

## 🚀 Quick Start

You can use this model directly with the Hugging Face `pipeline`:

```python
from transformers import pipeline

# Define the model ID
model_id = "Jessylg27/specialized-coding-logic-llm"

# Initialize the text-generation pipeline
generator = pipeline("text-generation", model=model_id, device_map="auto")

# Prompt the model with a chat-style message
question = "Write a Python function to solve the Traveling Salesman Problem using dynamic programming."
output = generator(
    [{"role": "user", "content": question}],
    max_new_tokens=512,
    return_full_text=False,
)[0]
print(output["generated_text"])
```

## 🛠️ Training procedure

This model was trained with **SFT (Supervised Fine-Tuning)** using the [TRL library](https://github.com/huggingface/trl) and [Unsloth](https://github.com/unslothai/unsloth) for efficient training.
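As a reference point for the kind of answer the Quick Start prompt asks for, here is a standalone Held-Karp dynamic-programming TSP solver (a classical algorithm, written here for illustration and not produced by the model):

```python
from itertools import combinations

def tsp_held_karp(dist):
    """Held-Karp dynamic programming over city subsets; O(n^2 * 2^n).

    dist: square matrix of pairwise distances. Returns the length of the
    shortest tour that starts and ends at city 0 and visits every city once.
    """
    n = len(dist)
    # dp[(mask, j)] = min cost of starting at 0, visiting exactly the
    # cities in `mask` (a bitmask over cities 1..n-1), and ending at j
    dp = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask ^ (1 << j)  # same subset without the endpoint j
                dp[(mask, j)] = min(
                    dp[(prev, k)] + dist[k][j] for k in subset if k != j
                )
    full = (1 << n) - 2  # every city except 0
    return min(dp[(full, j)] + dist[j][0] for j in range(1, n))

# Classic 4-city example: the optimal tour 0-1-3-2-0 has length 80
dist = [[0, 10, 15, 20], [10, 0, 35, 25], [15, 35, 0, 30], [20, 25, 30, 0]]
print(tsp_held_karp(dist))  # → 80
```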
### Framework versions

* **PEFT:** 0.18.1
* **TRL:** 0.24.0
* **Transformers:** 4.57.3
* **Pytorch:** 2.8.0+cu128
* **Datasets:** 4.3.0
* **Tokenizers:** 0.22.2

## 📜 Citations

If you use this model or the TRL library, please cite:

```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```
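Assuming the packages in the framework-versions list match their PyPI names, the training environment can be approximated by pinning those versions (the `+cu128` PyTorch build comes from the PyTorch CUDA wheel index; the plain `2.8.0` wheel is shown here as a sketch):

```shell
pip install "peft==0.18.1" "trl==0.24.0" "transformers==4.57.3" \
            "torch==2.8.0" "datasets==4.3.0" "tokenizers==0.22.2"
```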