wizardoftrap/SP-LM-alpha


wizardoftrap committed on Jan 17

Commit 7ab43e4 · verified · 1 Parent(s): f7e18f7

Update README.md

Files changed (1)

README.md +33 -17

README.md CHANGED

@@ -1,5 +1,4 @@
 ---
-license: mit
 tags:
 - gpt
 - language-model
@@ -36,20 +35,45 @@ The model uses a transformer architecture with:
 ## Usage
 ```python
-from transformers import AutoTokenizer, AutoModelForCausalLM
-model_id = "your-username/SP-LM-alpha"
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForCausalLM.from_pretrained(model_id)
-# Generate text
 prompt = "Once upon a time"
 inputs = tokenizer(prompt, return_tensors="pt")
-outputs = model.generate(**inputs, max_length=100)
-print(tokenizer.decode(outputs[0]))
 ```
 ## Training Details
 - **Learning Rate**: 1e-4 with linear warmup and cosine annealing decay
@@ -57,12 +81,4 @@ print(tokenizer.decode(outputs[0]))
 - **Gradient Accumulation Steps**: 32
 - **Max Iterations**: 20000
 - **Optimizer**: AdamW with weight decay
-- **Mixed Precision**: bfloat16 / float16
-## License
-MIT License
-## Model Card Contact
-For questions or issues, please contact the model author.

 ---
 tags:
 - gpt
 - language-model
 ## Usage
+### Quick Start
 ```python
+from transformers import AutoTokenizer
+from huggingface_hub import hf_hub_download
+from safetensors.torch import load_file
+import json
+import torch
+from sp_lm import GPT
+repo_id = "wizardoftrap/SP-LM-alpha"
+tokenizer = AutoTokenizer.from_pretrained(repo_id)
+config_dict = json.load(open(hf_hub_download(repo_id=repo_id, filename="config.json")))
+config = type('Config', (), config_dict)()
+model_weights = load_file(hf_hub_download(repo_id=repo_id, filename="model.safetensors"))
+model = GPT(config)
+model.load_state_dict(model_weights)
 prompt = "Once upon a time"
 inputs = tokenizer(prompt, return_tensors="pt")
+with torch.no_grad():
+    generated_ids = model.generate(inputs["input_ids"], max_new_tokens=50, temperature=1.0, top_k=50)
+print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
 ```
+### Installation
+1. Download the `sp_lm.py` file from this repo; it provides the `GPT` model class.
+2. Install required packages:
+```bash
+pip install transformers safetensors huggingface-hub torch
+```
+3. Load the model and generate text as shown in Quick Start.
 ## Training Details
 - **Learning Rate**: 1e-4 with linear warmup and cosine annealing decay
 - **Gradient Accumulation Steps**: 32
 - **Max Iterations**: 20000
 - **Optimizer**: AdamW with weight decay
+- **Mixed Precision**: bfloat16 / float16
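
The Training Details list "1e-4 with linear warmup and cosine annealing decay" over 20000 iterations, but the README does not give the warmup length or the final learning rate. The following is a minimal sketch of such a schedule, not the author's training code: `lr_at`, `warmup_steps=1000`, and `min_lr=0.0` are hypothetical names and values chosen only for illustration.

```python
import math


def lr_at(step, max_lr=1e-4, warmup_steps=1000, max_steps=20000, min_lr=0.0):
    """Learning rate at a given step: linear warmup, then cosine decay.

    max_lr and max_steps come from the README; warmup_steps and min_lr
    are assumed values, since the README does not state them.
    """
    if step < warmup_steps:
        # Linear warmup: ramp from max_lr / warmup_steps up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    if step >= max_steps:
        # Past the end of training, hold the floor value.
        return min_lr
    # Cosine annealing from max_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / (max_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + cosine * (max_lr - min_lr)


if __name__ == "__main__":
    for s in (0, 500, 1000, 10500, 19999):
        print(s, lr_at(s))
```

In a PyTorch training loop, a function like this could be applied by writing its value into each `param_group["lr"]` of the AdamW optimizer before every optimizer step.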