gopi87 committed on
Commit a49cd12 · verified · 1 Parent(s): e8bf402

Update README.md

  1. README.md +3 -3
README.md CHANGED
@@ -13,7 +13,7 @@ This is a **learning project** demonstrating how to train a transformer-based la
 
  - **Model Type:** Character-level Transformer Language Model
  - **Architecture:** 6-layer Transformer Encoder with causal masking
- - **Parameters:** ~1M parameters
+ - **Parameters:** ~4M parameters
  - **Training Data:** Shakespeare's plays (~1.1M characters)
  - **Framework:** PyTorch
  - **Training Time:** ~8 hours on single GPU
@@ -138,7 +138,7 @@ What light through yonder window breaks?
  - ❌ Not suitable for production use
 
  ### What This Model Is NOT
- - ❌ Not comparable to GPT-2, GPT-3, or modern LLMs
+ - ❌ Not comparable to GPT-2, GPT-3, or modern LLMs (GPT-2 Small has 117M, ~30x larger)
  - ❌ Not fine-tuned for instruction following
  - ❌ Not suitable for serious text generation applications
  - ❌ Not production-ready
@@ -184,7 +184,7 @@ This project was an educational exercise in:
 
  | Model | Parameters | Quality |
  |-------|------------|---------|
- | This Model | 1M | Low (educational) |
+ | This Model | 4M | Low (educational) |
  | GPT-2 Small | 117M | High |
  | GPT-3 | 175B | Very High |
 
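
The parameter figures touched by this change can be sanity-checked with a little arithmetic. The sketch below is a rough estimate only: the 6-layer depth comes from the README, but `d_model`, `d_ff`, `vocab_size`, and `block_size` are hypothetical values chosen for illustration, not the model's confirmed configuration, and the helper `estimate_transformer_params` is not part of the project.

```python
def estimate_transformer_params(n_layers, d_model, d_ff, vocab_size, block_size):
    """Rough parameter count for a transformer with learned positional embeddings."""
    # Four attention projections (Q, K, V, output), each d_model x d_model plus a bias.
    attention = 4 * (d_model * d_model + d_model)
    # Two feed-forward linear layers: d_model -> d_ff and d_ff -> d_model, with biases.
    feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two layer norms per layer, each with a weight and a bias vector.
    layer_norms = 2 * (2 * d_model)
    per_layer = attention + feed_forward + layer_norms
    # Token embedding table plus learned positional embedding table.
    embeddings = vocab_size * d_model + block_size * d_model
    # Untied output projection back to the vocabulary, with bias.
    output_head = d_model * vocab_size + vocab_size
    return n_layers * per_layer + embeddings + output_head


# Hypothetical configuration: only n_layers=6 is stated in the README.
total = estimate_transformer_params(
    n_layers=6, d_model=256, d_ff=1024, vocab_size=65, block_size=256
)
print(f"Estimated parameters: {total:,}")

# Size ratio quoted in the README, using its own rounded figures (117M vs 4M).
size_ratio = 117_000_000 / 4_000_000
print(f"GPT-2 Small is roughly {size_ratio:.0f}x larger")
```

Under these assumed dimensions the estimate lands in the low millions, the same order of magnitude as the corrected "~4M" figure, and the ratio of the README's rounded numbers is consistent with its "~30x larger" note. A different hidden size would shift the estimate substantially, since the per-layer cost grows with the square of `d_model`.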