NamrataThakur committed on
Commit e819bec · verified · Parent(s): a9b31d7

Update README.md

Files changed (1): README.md (+21 −0)
README.md CHANGED
@@ -70,6 +70,27 @@ chainlit run app_pretrain.py
 
 This will launch a web application where you can input text and see the model's generated responses.
 
+### Downloading from Huggingface 🤗
+
+To interact with the model by downloading it from Huggingface:
+
+- First, clone the repo locally
+
+```python
+from transformer_blocks.gpt2 import GPT2
+from gpt_Pretraining.text_generation import Text_Generation
+
+model = GPT2.from_pretrained("NamrataThakur/Small_Language_Model_MHA_53M_Pretrained")
+model.eval()
+
+# ---------- Check the generation to make sure everything is okay ----------
+generation = Text_Generation(model=model, device='cpu', tokenizer_model='gpt2',
+                             arch_type='original')
+start_context = "One day, a "
+response = generation.text_generation(input_text=start_context, max_new_tokens=160, temp=0.5, top_k=10, kv_cache=False)
+print(response)
+```
+
 ## Model Architecture and Objective
 
 Stories-SLM uses a standard GPT decoder-only transformer architecture with:
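For readers unfamiliar with the `temp` and `top_k` arguments in the generation call added above: they control how the next token is drawn from the model's output logits. Below is a minimal plain-Python sketch of the standard top-k filtering plus temperature scaling technique; it is an illustration only, not the project's actual `Text_Generation` internals.

```python
import math
import random

def sample_top_k(logits, temp=0.5, top_k=10, rng=None):
    """Sample an index from logits using top-k filtering and temperature scaling."""
    rng = rng or random.Random(0)
    # Keep only the indices of the top_k largest logits.
    kept = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Temperature-scale the kept logits; lower temp sharpens the distribution.
    scaled = [logits[i] / temp for i in kept]
    # Softmax over the filtered logits (subtract max for numerical stability).
    m = max(scaled)
    exps = [math.exp(v - m) for v in scaled]
    z = sum(exps)
    probs = [e / z for e in exps]
    # Draw one index from the renormalised distribution.
    r = rng.random()
    acc = 0.0
    for idx, p in zip(kept, probs):
        acc += p
        if r <= acc:
            return idx
    return kept[-1]
```

With `temp=0.5` and `top_k=10`, as in the README's example call, generation samples only among the ten most likely tokens, with probabilities sharpened toward the top candidates.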