dilip025 committed on
Commit 8cbc022 · verified · 1 Parent(s): e73154c

Upload hf_model/README.md with huggingface_hub

Files changed (1)
  1. hf_model/README.md +33 -0
hf_model/README.md ADDED
@@ -0,0 +1,33 @@
+ # Mini GPT1 Clone
+
+ This is a decoder-only Transformer language model (GPT-1-style) trained from scratch in PyTorch.
+
+ ## Model Details
+
+ - **Architecture**: Decoder-only Transformer
+ - **Layers**: 6
+ - **Embedding Size**: 512
+ - **Heads**: 8
+ - **Feedforward Dim**: 2048
+ - **Sequence Length**: 256
+ - **Vocab Size**: 35,000
+
+ ## Tokenizer
+
+ The tokenizer was trained with `ByteLevelBPETokenizer` from the `tokenizers` library.
+
+ ## Inference Example
+
+ ```python
+ from transformers import PreTrainedTokenizerFast, AutoModelForCausalLM
+ import torch  # PyTorch backend, needed for return_tensors="pt"
+
+ # Load the tokenizer from a local tokenizer.json and the model from the Hub.
+ tokenizer = PreTrainedTokenizerFast(tokenizer_file="tokenizer/tokenizer.json")
+ model = AutoModelForCausalLM.from_pretrained("dilip025/mini-gpt1")
+
+ # Encode the prompt as a batch of token IDs.
+ prompt = "Once upon a time,"
+ input_ids = tokenizer(prompt, return_tensors="pt").input_ids
+
+ # max_length counts the prompt tokens plus the newly generated ones.
+ outputs = model.generate(input_ids, max_length=50)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+ ## License
+
+ MIT
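
To give a feel for the scale implied by the Model Details in the README, here is a rough sketch that derives the per-head dimension and an approximate parameter count from the listed hyperparameters. The class and function names are hypothetical and do not come from the repository. The count also assumes learned positional embeddings, bias terms on every linear layer, two LayerNorms per block, and an output projection tied to the token embedding; the README confirms none of these, so treat the result as an estimate only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniGPTConfig:
    """Hyperparameters as listed in the README (class name is hypothetical)."""
    n_layers: int = 6
    d_model: int = 512
    n_heads: int = 8
    d_ff: int = 2048
    seq_len: int = 256
    vocab_size: int = 35_000

    @property
    def head_dim(self) -> int:
        # Each attention head works on an equal slice of the embedding.
        if self.d_model % self.n_heads != 0:
            raise ValueError("d_model must be divisible by n_heads")
        return self.d_model // self.n_heads


def estimate_parameters(cfg: MiniGPTConfig) -> int:
    """Approximate parameter count under the assumptions stated above."""
    d, f = cfg.d_model, cfg.d_ff

    # Token embedding plus (assumed) learned positional embedding.
    embeddings = cfg.vocab_size * d + cfg.seq_len * d

    # Q, K, V and output projections, each d x d with a bias vector.
    attention = 4 * (d * d + d)

    # Two linear layers: d -> d_ff and d_ff -> d, each with a bias.
    feed_forward = (d * f + f) + (f * d + d)

    # Two LayerNorms per block (assumed), each with a scale and a shift.
    layer_norms = 2 * (2 * d)

    per_block = attention + feed_forward + layer_norms

    # Output head assumed tied to the token embedding, so it adds nothing.
    return embeddings + cfg.n_layers * per_block


if __name__ == "__main__":
    cfg = MiniGPTConfig()
    print(f"head_dim = {cfg.head_dim}")
    print(f"approx. parameters = {estimate_parameters(cfg):,}")
```

Under these assumptions the estimate comes to roughly 37 million parameters, about half of them in the token embedding. An untied output head would add another `vocab_size * d_model` weights.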