```bash
chainlit run app_pretrain.py
```

This will launch a web application where you can input text and see the model's generated responses.
### Downloading from Huggingface 🤗

To interact with the model by downloading it from Hugging Face:

- First, clone the repo locally.

```python
from transformer_blocks.gpt2 import GPT2
from gpt_Pretraining.text_generation import Text_Generation

# Load the pretrained 53M-parameter model from the Hugging Face Hub
model = GPT2.from_pretrained("NamrataThakur/Small_Language_Model_MHA_53M_Pretrained")
model.eval()

# ---------------------------- Check the generation to make sure everything is okay ----------------------------
generation = Text_Generation(model=model, device='cpu', tokenizer_model='gpt2',
                             arch_type='original')
start_context = "One day, a "
response = generation.text_generation(input_text=start_context, max_new_tokens=160, temp=0.5, top_k=10, kv_cache=False)
print(response)
```
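For context, `temp` and `top_k` control how each next token is sampled: temperature rescales the logits (lower values sharpen the distribution) and top-k restricts sampling to the k most likely tokens. A minimal, self-contained sketch of this standard technique in plain Python (illustrative only, not the project's actual `Text_Generation` implementation):

```python
import math
import random

def top_k_sample(logits, temp=0.5, top_k=10, seed=None):
    """Sample an index from `logits` using temperature scaling and top-k filtering."""
    rng = random.Random(seed)
    # Keep only the indices of the top_k highest logits
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Temperature-scale the kept logits, then softmax them
    scaled = [logits[i] / temp for i in top]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index according to the resulting distribution
    r = rng.random()
    cum = 0.0
    for idx, p in zip(top, probs):
        cum += p
        if r < cum:
            return idx
    return top[-1]

# Lower temp makes the highest logit dominate; top_k caps the candidate set
logits = [2.0, 1.0, 0.5, -1.0]
print(top_k_sample(logits, temp=0.5, top_k=2, seed=0))  # prints 0 with this seed
```

With `temp=0.5, top_k=10` as in the snippet above, generation stays fairly focused while still allowing some variation between runs.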
## Model Architecture and Objective
Stories-SLM uses a standard GPT decoder-only transformer architecture with: