Jessylg27 committed
Commit c0799bc · verified · 1 Parent(s): 5b30bec

Update README.md

Files changed (1):
  1. README.md +54 -31
README.md CHANGED

---
base_model:
- Qwen/Qwen2.5-Coder-32B-Instruct
library_name: peft
license: cc-by-nc-4.0
datasets:
- Jessylg27/DeepThink-Code-Lite
language:
- en
- fr
tags:
- code
- logic
- reasoning
- qwen2.5
- unsloth
- sft
- trl
---

# Specialized Coding Logic LLM (32B)

This model is a specialized fine-tuned version of [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct).
It has been optimized to enhance **logical reasoning** and **code generation** capabilities.

## 🧠 Model Description

**Specialized Coding Logic LLM** builds on the Qwen 2.5 Coder architecture (32B parameters). It has been fine-tuned on the **DeepThink-Code-Lite** dataset to improve its ability to:

- Solve complex algorithmic problems.
- Follow multi-step logical instructions.
- Generate cleaner, more optimized code.

## 📊 Dataset

This model was trained on the custom dataset:
👉 **[Jessylg27/DeepThink-Code-Lite](https://huggingface.co/datasets/Jessylg27/DeepThink-Code-Lite)**

## 🚀 Quick Start

You can use this model directly with the Hugging Face `pipeline`:

```python
from transformers import pipeline

# Define the model ID
model_id = "Jessylg27/specialized-coding-logic-llm"

# Initialize the pipeline
generator = pipeline("text-generation", model=model_id, device_map="auto")

# Prompt the model
question = "Write a Python function to solve the Traveling Salesman Problem using dynamic programming."
output = generator([{"role": "user", "content": question}], max_new_tokens=512, return_full_text=False)[0]

print(output["generated_text"])
```
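
For reference, a correct answer to the example prompt above would use the Held-Karp dynamic-programming algorithm. A minimal hand-written sketch of that algorithm (not actual model output) for comparison:

```python
from itertools import combinations

def tsp_held_karp(dist):
    """Minimum tour cost over all cities, starting and ending at city 0.

    Held-Karp dynamic programming: dp[(mask, j)] is the cost of the shortest
    path that visits exactly the cities in bitmask `mask` and ends at city j.
    Runs in O(n^2 * 2^n) time, versus O(n!) for brute-force enumeration.
    """
    n = len(dist)
    dp = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask ^ (1 << j)  # same subset without city j
                dp[(mask, j)] = min(dp[(prev, k)] + dist[k][j]
                                    for k in subset if k != j)
    full = (1 << n) - 2  # every city except the start
    return min(dp[(full, j)] + dist[j][0] for j in range(1, n))

dist = [[0, 10, 15, 20],
        [10, 0, 35, 25],
        [15, 35, 0, 30],
        [20, 25, 30, 0]]
print(tsp_held_karp(dist))  # 80
```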

## 🛠️ Training procedure

This model was trained with **SFT (Supervised Fine-Tuning)** using the [TRL library](https://github.com/huggingface/trl) and [Unsloth](https://github.com/unslothai/unsloth) for efficient training.
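
Concretely, SFT optimizes the model on supervised prompt/response pairs. TRL's `SFTTrainer` accepts conversational datasets where each row carries a `messages` list; a minimal sketch of that formatting step (the keys `problem` and `solution` are hypothetical; the real DeepThink-Code-Lite column names may differ):

```python
def to_sft_example(record):
    """Map one raw record to the conversational "messages" format that
    chat-style SFT trainers (e.g. TRL's SFTTrainer) consume.

    NOTE: the keys "problem" and "solution" are hypothetical placeholders;
    the actual DeepThink-Code-Lite schema may differ.
    """
    return {
        "messages": [
            {"role": "user", "content": record["problem"]},
            {"role": "assistant", "content": record["solution"]},
        ]
    }

example = to_sft_example({
    "problem": "Reverse a singly linked list.",
    "solution": "def reverse(head): ...",
})
print(example["messages"][1]["role"])  # assistant
```

At training time, each `messages` list is rendered through the tokenizer's chat template before tokenization, so the model learns in the same format used at inference.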

### Framework versions

* **PEFT:** 0.18.1
* **TRL:** 0.24.0
* **Transformers:** 4.57.3
* **PyTorch:** 2.8.0+cu128
* **Datasets:** 4.3.0
* **Tokenizers:** 0.22.2

## 📜 Citations

If you use this model or the TRL library, please cite:

```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```