Jessylg27 committed 077ae6f (verified) · 1 parent: c955ab2

Update README.md

Files changed (1): README.md (+54 −34)
 
---
base_model:
- Qwen/Qwen2.5-Coder-32B-Instruct
library_name: peft
license: apache-2.0
datasets:
- Jessylg27/DeepThink-Code-Lite
language:
- en
- fr
tags:
- code
- logic
- reasoning
- qwen2.5
- unsloth
- sft
- trl
---

# Specialized Coding Logic LLM (32B)

This model is a specialized fine-tuned version of [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct).
It has been optimized to enhance **logical reasoning** and **code generation** capabilities.

## 🧠 Model Description

**Specialized Coding Logic LLM** builds upon the powerful Qwen 2.5 Coder architecture (32B parameters). It has been fine-tuned on the **DeepThink-Code-Lite** dataset to improve its ability to:

- Solve complex algorithmic problems.
- Follow multi-step logical instructions.
- Generate cleaner, more optimized code.

## 📊 Dataset

This model was trained on the custom dataset:
👉 **[Jessylg27/DeepThink-Code-Lite](https://huggingface.co/datasets/Jessylg27/DeepThink-Code-Lite)**

## 🚀 Quick Start

You can use this model directly with the Hugging Face `pipeline`:

```python
from transformers import pipeline

# Define the model ID
model_id = "Jessylg27/specialized-coding-logic-llm"

# Initialize the pipeline
generator = pipeline("text-generation", model=model_id, device_map="auto")

# Prompt the model
question = "Write a Python function to solve the Traveling Salesman Problem using dynamic programming."
output = generator([{"role": "user", "content": question}], max_new_tokens=512, return_full_text=False)[0]

print(output["generated_text"])
```
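The prompt above asks for exactly the kind of algorithmic problem this model targets. As a hand-written reference sketch for sanity-checking the model's answer (this is not model output, and the model's solution need not match it), here is the classic Held-Karp bitmask dynamic program for a tour starting and ending at city 0:

```python
from itertools import combinations

def tsp_held_karp(dist):
    """Held-Karp DP: length of the shortest tour visiting every city
    exactly once, starting and ending at city 0. dist is an n x n matrix."""
    n = len(dist)
    # dp[(mask, j)] = cheapest path cost from city 0 through the cities in
    # mask, ending at city j (mask always contains bit 0 and bit j).
    dp = {(1 | (1 << j), j): dist[0][j] for j in range(1, n)}
    for size in range(3, n + 1):
        for subset in combinations(range(1, n), size - 1):
            mask = 1
            for c in subset:
                mask |= 1 << c
            for j in subset:
                prev = mask ^ (1 << j)  # drop j, extend a smaller path
                dp[(mask, j)] = min(
                    dp[(prev, k)] + dist[k][j] for k in subset if k != j
                )
    full = (1 << n) - 1
    # Close the tour by returning to city 0.
    return min(dp[(full, j)] + dist[j][0] for j in range(1, n))
```

The table has O(n·2^n) states with O(n) work each, i.e. O(n²·2^n) time, which is the standard exact-DP benchmark a generated solution can be compared against.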

## 🛠️ Training Procedure

This model was trained with **SFT (Supervised Fine-Tuning)** using the [TRL library](https://github.com/huggingface/trl) and [Unsloth](https://github.com/unslothai/unsloth) for efficient training.

### Framework versions

* **PEFT:** 0.18.1
* **TRL:** 0.24.0
* **Transformers:** 4.57.3
* **PyTorch:** 2.8.0+cu128
* **Datasets:** 4.3.0
* **Tokenizers:** 0.22.2
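To reproduce the environment pinned above, one possible install sequence (package names are the standard PyPI ones; the `+cu128` PyTorch build comes from the CUDA-specific wheel index, not PyPI):

```shell
pip install "peft==0.18.1" "trl==0.24.0" "transformers==4.57.3" \
    "datasets==4.3.0" "tokenizers==0.22.2"
# PyTorch 2.8.0+cu128 (CUDA 12.8 build):
pip install "torch==2.8.0" --index-url https://download.pytorch.org/whl/cu128
```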

## 📜 Citations

If you use this model or the TRL library, please cite:

```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thibaut and Nathan Lambert},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```