samkeet committed · Commit 778291b · verified · Parent(s): d4050a7

Update README.md

Files changed (1): README.md (+33 −6)
README.md CHANGED
@@ -70,10 +70,9 @@ GPT-124M is a lightweight generative language model fine-tuned on the `fineweb-e
 - **Validation Dataset:** 100 million tokens of HuggingFaceFW/fineweb-edu
 
 ## Usage
-
-### Direct Use
 You can use this model for text generation using the `transformers` library.
 
+### Method 1: Using Pipeline
 ```python
 # Import necessary modules from transformers
 from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
@@ -81,14 +80,42 @@ from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
 # Load tokenizer and model
 model_name = "samkeet/GPT_124M"
 tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
-model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
 
 # Create text generation pipeline
-pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, trust_remote_code=True, device="cpu")
+pipe = pipeline("text-generation", model=model_name, tokenizer=tokenizer, trust_remote_code=True, device="cpu")
 
 # Generate text
 result = pipe("Earth revolves around the", do_sample=True, max_length=40, temperature=0.9, top_p=0.5, top_k=50)
-print(result)
+print("Pipeline Output:", result)
+```
+
+### Method 2: Direct Generation
+```python
+# Import necessary libraries
+import torch
+
+# Function for direct tokenization and text generation
+def generate_text(input_text, device):
+    tokens = tokenizer.encode(input_text, return_tensors='pt').to(device)
+    model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True).to(device)
+
+    # Generate output
+    output = model.generate(
+        tokens,
+        do_sample=True,
+        max_length=40,
+        temperature=0.9,
+        top_p=0.5,
+        top_k=50,
+    )
+
+    # Decode generated text
+    generated_sentence = tokenizer.decode(output[0], skip_special_tokens=True)
+    return generated_sentence
+
+# Generate text
+input_text = "Earth revolves around the"
+print("Direct Model Output:", generate_text(input_text, "cpu"))
 ```
 
 ### Fine-tuning & Downstream Use
@@ -169,4 +196,4 @@ If you use this model, please cite:
 ```
 
 ## Contact
-For inquiries, contact [Samkeet Sangai](https://www.linkedin.com/in/samkeet-sangai/).
+For inquiries, contact [Samkeet Sangai](https://www.linkedin.com/in/samkeet-sangai/).
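
Both generation snippets in this commit sample with `temperature=0.9`, `top_p=0.5`, and `top_k=50`. As a rough, self-contained illustration of what those parameters do (this is a sketch of the general technique, not the `transformers` implementation), the code below applies temperature scaling, then top-k and top-p (nucleus) filtering to a toy next-token distribution; the token names and logit values are invented for illustration.

```python
import math

def filter_candidates(logits, temperature=0.9, top_k=50, top_p=0.5):
    """Temperature-scale raw logits, then keep only the tokens that
    survive top-k and top-p (nucleus) filtering, renormalized."""
    # Temperature scaling: <1 sharpens the distribution, >1 flattens it
    scaled = {tok: logit / temperature for tok, logit in logits.items()}

    # Numerically stable softmax over the scaled logits
    m = max(scaled.values())
    exps = {tok: math.exp(v - m) for tok, v in scaled.items()}
    z = sum(exps.values())
    probs = {tok: e / z for tok, e in exps.items()}

    # Top-k: keep at most the k most probable tokens
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability >= top_p
    kept, cum = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cum += p
        if cum >= top_p:
            break

    # Renormalize the surviving candidates
    total = sum(p for _, p in kept)
    return [(tok, p / total) for tok, p in kept]

# Toy next-token distribution (raw logits), purely illustrative
logits = {"sun": 4.0, "moon": 2.5, "planet": 2.0, "star": 1.0, "table": -1.0}
print(filter_candidates(logits, temperature=0.9, top_k=50, top_p=0.5))
```

With this toy distribution, the temperature-scaled softmax gives "sun" roughly three quarters of the probability mass, so `top_p=0.5` keeps only that single candidate; raising `top_p` or the temperature widens the sampling pool.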