Saif10 committed, commit `e98565a` (verified), parent `ce3461f`.
Files changed (1): README.md (+15, −23)
datasets:
  - stanfordnlp/sst2
base_model:
  - openai-community/gpt2
pipeline_tag: text-generation
---

# GPT-2 SFT Model – Supervised Fine-Tuning for Positive Sentiment

This model is the **first stage** in a 3-step RLHF (Reinforcement Learning from Human Feedback) pipeline using **GPT-2**. It has been fine-tuned on the **Stanford Sentiment Treebank v2 (SST-2)** dataset, focusing on generating sentences with a positive sentiment tone.

---

## Context

This model is part of the following RLHF project structure:

You are currently viewing the **SFT model**.

---

## Model Objective

Train GPT-2 on sentiment-labeled sentences to mimic human-like, sentiment-aware generation.

- **Output:** Given a prompt, GPT-2 completes it with a positively-toned sentence.

---

## Training Details

### Dataset

- **Source:** `stanfordnlp/sst2`
- **Type:** Movie review sentences
- **Labels:** Positive and Negative
- **Preprocessing:** Only positive samples retained for SFT

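The preprocessing step above can be sketched as follows. This is a minimal illustration, not the author's actual script: the sample rows below are invented, and in practice one would load the real split with the `datasets` library and call `.filter(...)` on it.

```python
# In SST-2, label 1 marks a positive sentence and label 0 a negative one.
# With the `datasets` library, the real preprocessing would look like:
#     ds = load_dataset("stanfordnlp/sst2", split="train")
#     positive = ds.filter(lambda row: row["label"] == 1)
# Here the same selection is shown on hand-written sample rows.
rows = [
    {"sentence": "a delightful, engaging film", "label": 1},
    {"sentence": "dull and lifeless", "label": 0},
    {"sentence": "a triumph of storytelling", "label": 1},
]

# Keep only the positive sentences for supervised fine-tuning.
positive_only = [row["sentence"] for row in rows if row["label"] == 1]
print(positive_only)
```

Discarding the negative class means the SFT model only ever sees positively-toned targets, which is what biases its completions toward positive sentiment.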
### Configuration

- **Model Base:** `gpt2`
- **Max Sequence Length:** 128
- **Batch Size:** 8
- **Epochs:** 3
- **Optimizer:** AdamW
- **Learning Rate:** 5e-5
- **Precision:** FP16

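Expressed as a Hugging Face `TrainingArguments` config fragment, the hyperparameters above might look roughly like this. This is a sketch under assumptions, not the author's actual training script; in particular `output_dir` is invented here.

```python
from transformers import TrainingArguments

# Hyperparameters taken from the Configuration section above.
# output_dir is an assumption for illustration only.
training_args = TrainingArguments(
    output_dir="gpt2-sft-positive",
    per_device_train_batch_size=8,   # Batch Size: 8
    num_train_epochs=3,              # Epochs: 3
    learning_rate=5e-5,              # Learning Rate: 5e-5
    fp16=True,                       # Precision: FP16
    optim="adamw_torch",             # Optimizer: AdamW
)
```

The 128-token maximum sequence length is not a `TrainingArguments` field; it would be applied at tokenization time (e.g. `tokenizer(..., truncation=True, max_length=128)`).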
---

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Saif10/sft-model")
tokenizer = AutoTokenizer.from_pretrained("Saif10/sft-model")

prompt = "The movie was"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## Author

Saif Rathod

- Hugging Face: Saif10
- GitHub: Saif-rathod