houcine-bdk committed
Commit 0091501 · verified · 1 Parent(s): e0cd09a

Update README.md

Files changed (1): README.md (+3 −5)
README.md CHANGED

@@ -15,7 +15,7 @@ model-index:

  # NanoGPT Personal Experiment

- This repository contains my personal experiment with training and fine-tuning a GPT-2 style language model using the nanoGPT architecture. This project was undertaken as a learning exercise to understand transformer-based language models and explore the capabilities of modern AI architectures.
+ This repository contains my personal experiment with training and fine-tuning a GPT-2 style language model. This project was undertaken as a learning exercise to understand transformer-based language models and explore the capabilities of modern AI architectures.

  ## Model Description

@@ -24,16 +24,14 @@ This model is based on the nanoGPT implementation, which is a minimal, clean imp

  ### Technical Details

  - Base Architecture: GPT-2
- - Implementation: nanoGPT
  - Training Infrastructure: 8x A100 80GB GPUs
  - Parameters: ~124M (similar to GPT-2 small)

  ### Training Process

  The model underwent a multi-stage training process:
- 1. Initial training on a subset of the OpenWebText dataset
- 2. Fine-tuning experiments on various datasets including Shakespeare's works
- 3. Experimentation with different hyperparameters and optimization techniques
+ - Initial training on a subset of the OpenWebText dataset
+ - Experimentation with different hyperparameters and optimization techniques

  ### Features
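The README's "~124M (similar to GPT-2 small)" figure can be sanity-checked by counting weights for the standard GPT-2 small shape. The sketch below is an assumption-laden back-of-the-envelope calculation using the published GPT-2 small hyperparameters (vocab 50257, context 1024, width 768, 12 layers), not values read from this repository's config; it assumes a tied LM head, as nanoGPT and GPT-2 both use.

```python
# Rough parameter count for a GPT-2-small-shaped transformer.
# Dimensions are the published GPT-2 small hyperparameters (an assumption here,
# not values taken from this repo's training config).
n_vocab, n_ctx, n_embd, n_layer = 50257, 1024, 768, 12

embed = n_vocab * n_embd + n_ctx * n_embd      # token + position embeddings
per_block = (
    2 * (2 * n_embd)                           # two LayerNorms (weight + bias)
    + (n_embd * 3 * n_embd + 3 * n_embd)       # fused QKV projection
    + (n_embd * n_embd + n_embd)               # attention output projection
    + (n_embd * 4 * n_embd + 4 * n_embd)       # MLP up-projection
    + (4 * n_embd * n_embd + n_embd)           # MLP down-projection
)
final_ln = 2 * n_embd
total = embed + n_layer * per_block + final_ln  # LM head tied to token embeddings

print(f"{total:,} parameters (~{total / 1e6:.0f}M)")  # → 124,439,808 (~124M)
```

This lands on roughly 124M, matching the README's claim; note that nanoGPT's own startup log reports a slightly smaller number because it excludes the position embeddings from its count.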