prxshetty committed
Commit fc969c2 · verified · 1 Parent(s): 3dbf976

Update README.md

Files changed (1)
  1. README.md +124 -16
README.md CHANGED
@@ -1,22 +1,130 @@
  ---
- base_model: unsloth/gpt-oss-20b-unsloth-bnb-4bit
- tags:
- - text-generation-inference
- - transformers
- - unsloth
- - gpt_oss
- - trl
- license: apache-2.0
- language:
- - en
  ---
 
- # Uploaded model
 
- - **Developed by:** prxshetty
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/gpt-oss-20b-unsloth-bnb-4bit
 
- This gpt_oss model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+ ---
+ license: mit
+ library_name: transformers
+ base_model: unsloth/gpt-oss-20b
+ tags:
+ - gpt-oss
+ - lora
+ - unsloth
+ - text-generation
+ - instruction-following
+ - multilingual
+ datasets:
+ - HuggingFaceH4/Multilingual-Thinking
+ pipeline_tag: text-generation
+ language:
+ - en
+ ---
+
+ # GPT-OSS-20B Fine-Tuned
+
+ A **gpt-oss-20b** model fine-tuned with LoRA for *efficient text generation, multilingual conversation, and instruction following*.
+
+ ---
+
+ ## Overview
+
+ | Item | Details |
+ |---|---|
+ | **Base checkpoint** | `unsloth/gpt-oss-20b` |
+ | **Fine-tune method** | LoRA (PEFT) with Unsloth |
+ | **Training run** | 30 steps • Multilingual-Thinking dataset |
+ | **Trainable params** | [To be calculated, if available] |
+ | **Loss** | [Loss metrics unavailable] |
+ | **Hardware** | [Hardware details unavailable] |
+ | **License** | MIT (adapter); for the base model, refer to the gpt-oss-20b license |
+ | **Intended use** | Educational, research, and chat-based applications |
+
+ ---
+
+ ## Datasets
+
+ | Dataset | Size | Focus |
+ |---|---|---|
+ | `HuggingFaceH4/Multilingual-Thinking` | [Size unavailable] | Multilingual reasoning and conversational tasks |
+
+ The dataset was wrapped with the **chat template** before training.
+
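+ A minimal sketch of that wrapping step (an assumption-based reconstruction: it presumes the dataset exposes a `messages` column of chat turns and reuses the `tokenizer` loaded in the Usage section below):
+
+ ```python
+ from datasets import load_dataset
+
+ # Assumption: each example's "messages" is a list of {"role", "content"} dicts.
+ dataset = load_dataset("HuggingFaceH4/Multilingual-Thinking", split="train")
+
+ def to_text(example):
+     # Render each conversation as one training string via the chat template.
+     return {"text": tokenizer.apply_chat_template(example["messages"], tokenize=False)}
+
+ dataset = dataset.map(to_text)
+ ```
+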
+ ---
+
+ ## Installation
+
+ To use this model, install the required dependencies (version specifiers are quoted so the shell does not treat `>=` as a redirection):
+
+ ```bash
+ pip install "torch>=2.8.0" "triton>=3.4.0" "transformers>=4.55.3" bitsandbytes unsloth
+ ```
+
+ ## Usage
+
+ ### Loading the Model
+
+ ```python
+ from unsloth import FastLanguageModel
+ import torch
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="unsloth/gpt-oss-20b",
+     max_seq_length=1024,
+     dtype=torch.float16,
+     load_in_4bit=True,
+ )
+ ```
+
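+ The snippet above loads the 4-bit **base** checkpoint. To run the fine-tuned variant, the LoRA adapter from this repository can be attached on top; a hedged sketch, where the repo id is a placeholder to replace with this model's actual Hub path:
+
+ ```python
+ from peft import PeftModel
+
+ # Placeholder repo id -- substitute the actual Hub path of this adapter.
+ model = PeftModel.from_pretrained(model, "<this-repo-id>")
+ ```
+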
+ ### Fine-Tuning with LoRA
+
+ ```python
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r=8,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
+     lora_alpha=16,
+     lora_dropout=0,
+     bias="none",
+     use_gradient_checkpointing="unsloth",
+ )
+ ```
+
+ ### Inference
+
+ ```python
+ from transformers import TextStreamer
+
+ messages = [
+     {"role": "user", "content": "Solve x^5 + 3x^4 - 10 = 3."},
+ ]
+ # add_generation_prompt=True appends the assistant header so the model
+ # answers the prompt instead of continuing the user turn.
+ inputs = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt",
+     return_dict=True,
+ ).to(model.device)
+
+ outputs = model.generate(**inputs, max_new_tokens=512, streamer=TextStreamer(tokenizer))
+ ```
+
  ---
+
+ ## Training Details
+
+ ### Training Configuration
+
+ - **Batch Size**: 1
+ - **Gradient Accumulation Steps**: 4
+ - **Learning Rate**: 2e-4
+ - **Optimizer**: adamw_8bit
+ - **Warmup Steps**: 5
+ - **Max Steps**: 30
+
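+ As a rough reconstruction rather than the author's exact script, these hyperparameters map onto a TRL `SFTTrainer` call like the one below; `model`, `tokenizer`, and `dataset` are the objects built in the sections above, and `output_dir` is assumed:
+
+ ```python
+ from trl import SFTConfig, SFTTrainer
+
+ trainer = SFTTrainer(
+     model=model,
+     tokenizer=tokenizer,  # named processing_class in newer TRL releases
+     train_dataset=dataset,
+     args=SFTConfig(
+         per_device_train_batch_size=1,  # Batch Size
+         gradient_accumulation_steps=4,  # effective batch size of 4
+         learning_rate=2e-4,
+         optim="adamw_8bit",
+         warmup_steps=5,
+         max_steps=30,
+         output_dir="outputs",  # assumed output location
+     ),
+ )
+ trainer.train()
+ ```
+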
  ---
 
+ ## Responsible Use
+
+ - **Bias**: The model may reflect biases present in its training data; evaluate outputs for fairness before relying on them.
+ - **Misuse**: Do not use it to generate harmful or misleading content.
+ - **Limitations**: 4-bit quantization trades a small amount of accuracy for efficiency, and sequences are capped at 1024 tokens.
+ - **Disclaimer**: Not intended for critical decision-making. The author and base-model creators accept no liability for misuse or errors.
+
+ ---
 
+ ## Acknowledgements
+
+ - The [Unsloth](https://github.com/unslothai/unsloth) library for enabling efficient fine-tuning.
+ - Hugging Face for providing the base model and training infrastructure.
+
+ ---