---
base_model:
- Qwen/Qwen2.5-Coder-32B-Instruct
library_name: peft
license: cc-by-nc-4.0
datasets:
- Jessylg27/DeepThink-Code-Lite
language:
- en
- fr
tags:
- code
- logic
- reasoning
- qwen2.5
- unsloth
- sft
- trl
---

# Specialized Coding Logic LLM (32B)

This model is a specialized fine-tuned version of [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct). It has been optimized to enhance **logical reasoning** and **code generation** capabilities.

## 🧠 Model Description

**Specialized Coding Logic LLM** builds upon the powerful Qwen2.5 Coder architecture (32B parameters). It has been fine-tuned using the **DeepThink-Code-Lite** dataset to improve its ability to:

- Solve complex algorithmic problems.
- Follow multi-step logical instructions.
- Generate cleaner and more optimized code.

## 📊 Dataset

This model was trained on the custom dataset:
👉 **[Jessylg27/DeepThink-Code-Lite](https://huggingface.co/datasets/Jessylg27/DeepThink-Code-Lite)**

## 🚀 Quick Start

You can use this model directly with the Hugging Face `pipeline`:

```python
from transformers import pipeline

# Define the model ID
model_id = "Jessylg27/specialized-coding-logic-llm"

# Initialize the text-generation pipeline
generator = pipeline("text-generation", model=model_id, device_map="auto")

# Prompt the model with a chat-style message
question = "Write a Python function to solve the Traveling Salesman Problem using dynamic programming."
output = generator(
    [{"role": "user", "content": question}],
    max_new_tokens=512,
    return_full_text=False,
)[0]

print(output["generated_text"])
```
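
Since this repository uses PEFT, the published weights may be a LoRA adapter rather than a full merged checkpoint. In that case the adapter can be loaded on top of the base model. The sketch below assumes the repository contains adapter weights and that `transformers` and `peft` are installed; the helper name `load_finetuned` is illustrative, not part of any library.

```python
base_id = "Qwen/Qwen2.5-Coder-32B-Instruct"            # base model
adapter_id = "Jessylg27/specialized-coding-logic-llm"  # fine-tuned PEFT adapter


def load_finetuned(device_map="auto"):
    # Imports are kept local so the sketch can be read without the
    # libraries installed; nothing is downloaded until this is called.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype="auto", device_map=device_map
    )
    # Apply the fine-tuned LoRA weights on top of the base model.
    model = PeftModel.from_pretrained(base, adapter_id)
    return model, tokenizer
```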

## 🛠️ Training procedure

This model was trained with **SFT (Supervised Fine-Tuning)** using the [TRL library](https://github.com/huggingface/trl) and [Unsloth](https://github.com/unslothai/unsloth) for efficient training.
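
The exact training hyperparameters were not published. The sketch below shows the general shape of a TRL SFT run on this dataset; every numeric value (batch size, learning rate, epochs) is an illustrative assumption, not the setting actually used.

```python
def build_sft_trainer(output_dir="outputs"):
    # Illustrative sketch only: the real hyperparameters for this model
    # were not published, so every value below is an assumption.
    from datasets import load_dataset
    from trl import SFTConfig, SFTTrainer

    dataset = load_dataset("Jessylg27/DeepThink-Code-Lite", split="train")
    args = SFTConfig(
        output_dir=output_dir,
        per_device_train_batch_size=2,   # assumed
        gradient_accumulation_steps=8,   # assumed
        learning_rate=2e-4,              # typical for LoRA SFT, assumed
        num_train_epochs=1,              # assumed
        logging_steps=10,
    )
    # TRL accepts a model ID string and loads the model itself.
    return SFTTrainer(
        model="Qwen/Qwen2.5-Coder-32B-Instruct",
        train_dataset=dataset,
        args=args,
    )
```

In practice, Unsloth would typically wrap the base model before training to enable memory-efficient LoRA fine-tuning; the TRL-only version above keeps the sketch minimal.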
### Framework versions

* **PEFT:** 0.18.1
* **TRL:** 0.24.0
* **Transformers:** 4.57.3
* **PyTorch:** 2.8.0+cu128
* **Datasets:** 4.3.0
* **Tokenizers:** 0.22.2

## 📚 Citations

If you use this model or the TRL library, please cite:

```bibtex
@misc{vonwerra2022trl,
    title = {{TRL: Transformer Reinforcement Learning}},
    author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year = 2020,
    journal = {GitHub repository},
    publisher = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```