Pavloria committed · Commit 95cb943 · verified · 1 Parent(s): 7c2b43c

Update README.md


Added full model documentation and metadata

Files changed (1)
  1. README.md +35 -1
README.md CHANGED
@@ -13,4 +13,38 @@ pipeline_tag: text-generation
 
 # Mini Language Model
 
- This is a toy decoder-only model trained on Tiny Shakespeare.
+ ## 🧠 Model Description
+ This is a toy decoder-only language model based on a TransformerDecoder architecture. It was trained from scratch on the [Tiny Shakespeare dataset](https://huggingface.co/datasets/tiny_shakespeare) using PyTorch.
+
+ The goal was to explore autoregressive language modeling with minimal resources, using libraries such as `torch.nn` and `transformers`.
+
+ ## 🏋️ Training Details
+ - **Architecture**: TransformerDecoder
+ - **Tokenizer**: GPT2Tokenizer from Hugging Face
+ - **Vocabulary Size**: 50257 (from GPT-2)
+ - **Sequence Length**: 64 tokens
+ - **Batch Size**: 8
+ - **Epochs**: 5
+ - **Learning Rate**: 1e-3
+ - **Number of Parameters**: ~900k
+ - **Hardware**: CPU (Google Colab)
+
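The hyperparameters above can be turned into a concrete sketch. The class below is a hypothetical reconstruction, not the published implementation: the internal dimensions (`d_model=16`, 2 heads, 1 layer) are guesses chosen so that, with tied input/output embeddings, the 50257-token vocabulary dominates and the total lands in the same ballpark as the stated ~900k parameters.

```python
import torch
import torch.nn as nn

class MiniDecoderModel(nn.Module):
    """Hypothetical reconstruction of a tiny decoder-only LM.

    Internal dimensions are assumptions; only vocab_size (50257) and
    max_len (64) come from the documented training details.
    """
    def __init__(self, vocab_size=50257, max_len=64, d_model=16,
                 nhead=2, num_layers=1, dim_ff=64):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerDecoderLayer(
            d_model, nhead, dim_feedforward=dim_ff, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)
        self.lm_head.weight = self.tok_emb.weight  # weight tying keeps params small

    def forward(self, ids):
        t = ids.size(1)
        x = self.tok_emb(ids) + self.pos_emb(torch.arange(t, device=ids.device))
        # Causal mask: position i may only attend to positions <= i.
        causal = torch.triu(torch.full((t, t), float("-inf")), diagonal=1)
        # Decoder-only trick: reuse the token states as "memory".
        h = self.decoder(x, memory=x, tgt_mask=causal)
        return self.lm_head(h)

model = MiniDecoderModel()
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
```

With these guessed dimensions the count comes out near 860k, consistent with "~900k" once the embedding table is tied to the output head.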
+ ## 📊 Evaluation
+ The model was evaluated on a 10% validation split. Training and validation loss decreased consistently, though the model is not expected to produce coherent long text given the small size of the training corpus.
+
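A 10% validation split over fixed-length token blocks can be sketched as follows. This is illustrative, not the repository's actual training script; the random blocks stand in for the tokenized Tiny Shakespeare corpus, and `mean_loss` is a name we introduce here.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Stand-in for the tokenized corpus: 1,000 blocks of 65 token ids
# (64 inputs plus the shifted next-token targets).
blocks = torch.randint(0, 50257, (1000, 65))
dataset = TensorDataset(blocks[:, :-1], blocks[:, 1:])

n_val = len(dataset) // 10                      # 10% validation split
train_ds, val_ds = random_split(dataset, [len(dataset) - n_val, n_val])

def mean_loss(model, ds, vocab_size=50257, seq_len=64):
    """Average next-token cross-entropy over a dataset."""
    model.eval()
    total = 0.0
    with torch.no_grad():
        for x, y in DataLoader(ds, batch_size=8):
            logits = model(x)
            total += torch.nn.functional.cross_entropy(
                logits.reshape(-1, vocab_size), y.reshape(-1),
                reduction="sum").item()
    return total / (len(ds) * seq_len)
```

Tracking `mean_loss` on both splits after each epoch is what "consistent training and validation loss decrease" refers to.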
+ ## 📂 Intended Use
+ This model is intended for educational purposes only. It is **not suitable for production use**.
+
+ ## 🚫 Limitations
+ - Only trained on a tiny dataset
+ - Small architecture, limited capacity
+ - Limited ability to generalize or generate meaningful long text
+
+ ## 💬 Example Usage (Python)
+ ```python
+ import torch
+ from transformers import GPT2Tokenizer
+ from model import MiniDecoderModel  # assuming you restore the class definition
+
+ tokenizer = GPT2Tokenizer.from_pretrained("Pavloria/mini-language-model")
+ model = MiniDecoderModel(...)  # load with your config
+ model.load_state_dict(torch.load("pytorch_model.bin"))
+ model.eval()
+ ```
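Since a custom `nn.Module` has no `generate()` method, text generation is a manual loop. A minimal greedy-decoding sketch (the function name is ours, not from the repository; it works with any model that maps token ids to per-position logits):

```python
import torch

@torch.no_grad()
def greedy_generate(model, ids, max_new_tokens=20, block_size=64):
    """Repeatedly append the argmax token, cropping the context to
    the 64-token length the model was trained on."""
    for _ in range(max_new_tokens):
        logits = model(ids[:, -block_size:])            # (batch, t, vocab)
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=1)
    return ids
```

Typical use with the tokenizer above would be `ids = tokenizer("ROMEO:", return_tensors="pt").input_ids` followed by `tokenizer.decode(greedy_generate(model, ids)[0])`.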