wizardoftrap committed on
Commit 7ab43e4 · verified · 1 parent: f7e18f7

Update README.md

Files changed (1)
  1. README.md +33 -17
README.md CHANGED

@@ -1,5 +1,4 @@
 ---
-license: mit
 tags:
 - gpt
 - language-model
@@ -36,20 +35,45 @@ The model uses a transformer architecture with:
 
 ## Usage
 
+### Quick Start
+
 ```python
-from transformers import AutoTokenizer, AutoModelForCausalLM
+from transformers import AutoTokenizer
+from huggingface_hub import hf_hub_download
+from safetensors.torch import load_file
+import json
+import torch
+from sp_lm import GPT
+
+repo_id = "wizardoftrap/SP-LM-alpha"
+
+tokenizer = AutoTokenizer.from_pretrained(repo_id)
+
+config_dict = json.load(open(hf_hub_download(repo_id=repo_id, filename="config.json")))
+config = type('Config', (), config_dict)()
 
-model_id = "your-username/SP-LM-alpha"
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForCausalLM.from_pretrained(model_id)
+model_weights = load_file(hf_hub_download(repo_id=repo_id, filename="model.safetensors"))
+model = GPT(config)
+model.load_state_dict(model_weights)
 
-# Generate text
 prompt = "Once upon a time"
 inputs = tokenizer(prompt, return_tensors="pt")
-outputs = model.generate(**inputs, max_length=100)
-print(tokenizer.decode(outputs[0]))
+with torch.no_grad():
+    generated_ids = model.generate(inputs["input_ids"], max_new_tokens=50, temperature=1.0, top_k=50)
+print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
 ```
 
+### Installation
+
+1. Download the `sp_lm.py` file from this repo; it defines the `GPT` model class.
+
+2. Install the required packages:
+```bash
+pip install transformers safetensors huggingface-hub torch
+```
+
+3. Load the model and generate text as shown above.
+
 ## Training Details
 
 - **Learning Rate**: 1e-4 with linear warmup and cosine annealing decay
@@ -57,12 +81,4 @@ print(tokenizer.decode(outputs[0]))
 - **Gradient Accumulation Steps**: 32
 - **Max Iterations**: 20000
 - **Optimizer**: AdamW with weight decay
-- **Mixed Precision**: bfloat16 / float16
-
-## License
-
-MIT License
-
-## Model Card Contact
-
-For questions or issues, please contact the model author.
+- **Mixed Precision**: bfloat16 / float16
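The Quick Start in this diff builds an ad-hoc config object from `config.json` with `type('Config', (), config_dict)()`, which turns each JSON key into a class attribute. A minimal sketch of that pattern, using hypothetical field names (the real keys come from the repo's `config.json`):

```python
import json
from types import SimpleNamespace

# Hypothetical config values for illustration only.
config_json = '{"n_layer": 12, "n_head": 12, "n_embd": 768, "vocab_size": 50257}'
config_dict = json.loads(config_json)

# The README's trick: a throwaway class whose class attributes are the dict entries.
config = type('Config', (), config_dict)()
print(config.n_embd)  # attribute access instead of dict indexing

# An equivalent, arguably clearer alternative from the standard library.
config2 = SimpleNamespace(**config_dict)
print(config2.n_embd)
```

Either form gives `GPT(config)` the `config.n_layer`-style attribute access a model constructor typically expects.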
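The new `generate` call samples with `temperature=1.0` and `top_k=50`. A self-contained sketch of what top-k sampling with temperature does, on toy logits (the function name and logits are illustrative, not part of the repo's API):

```python
import math
import random

def sample_top_k(logits, k, temperature=1.0, rng=random.Random(0)):
    """Keep the k highest-scoring tokens, softmax over them, then sample."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in top]
    m = max(scaled)                                  # subtract max for stability
    probs = [math.exp(s - m) for s in scaled]
    total = sum(probs)
    probs = [p / total for p in probs]
    return rng.choices(top, weights=probs, k=1)[0]

logits = [0.1, 2.5, -1.0, 0.7]        # toy scores, one per vocabulary token
token = sample_top_k(logits, k=2)     # only indices 1 and 3 can be chosen
```

Lower temperatures sharpen the distribution toward the top token; `top_k` simply zeroes out everything outside the k best candidates before sampling.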
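Training Details lists a peak learning rate of 1e-4 with linear warmup and cosine annealing over at most 20000 iterations. A sketch of that schedule; the warmup length and floor LR below are assumed values, since the README does not state them:

```python
import math

# max_lr (1e-4) and max_steps (20000) come from the README;
# warmup_steps and min_lr are assumptions for illustration.
def lr_at(step, max_lr=1e-4, min_lr=1e-5, warmup_steps=2000, max_steps=20000):
    if step < warmup_steps:
        # linear warmup from ~0 up to max_lr
        return max_lr * (step + 1) / warmup_steps
    # cosine annealing from max_lr down to min_lr
    progress = (step - warmup_steps) / (max_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

The LR rises linearly to the peak at `warmup_steps`, then follows a half cosine down to `min_lr` at the final iteration.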