Update README.md
README.md CHANGED

@@ -15,7 +15,7 @@ model-index:
 
 # NanoGPT Personal Experiment
 
-This repository contains my personal experiment with training and fine-tuning a GPT-2 style language model
+This repository contains my personal experiment with training and fine-tuning a GPT-2 style language model. This project was undertaken as a learning exercise to understand transformer-based language models and explore the capabilities of modern AI architectures.
 
 ## Model Description
 
@@ -24,16 +24,14 @@ This model is based on the nanoGPT implementation, which is a minimal, clean imp
 ### Technical Details
 
 - Base Architecture: GPT-2
-- Implementation: nanoGPT
 - Training Infrastructure: 8x A100 80GB GPUs
 - Parameters: ~124M (similar to GPT-2 small)
 
 ### Training Process
 
 The model underwent a multi-stage training process:
-
-
-3. Experimentation with different hyperparameters and optimization techniques
+- Initial training on a subset of the OpenWebText dataset
+- Experimentation with different hyperparameters and optimization techniques
 
 ### Features
 
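For context on the setup the README describes: nanoGPT drives training from a small Python config file passed to its train.py, and an 8-GPU run of this size is typically launched with `torchrun --standalone --nproc_per_node=8 train.py config/train_gpt2.py`. The sketch below mirrors the upstream config/train_gpt2.py defaults for a ~124M-parameter GPT-2 run on OpenWebText; the hyperparameters actually used in this experiment are not stated in the diff, so treat every value as illustrative.

```python
# Illustrative nanoGPT-style training config for a ~124M-parameter GPT-2 run.
# Values mirror the upstream config/train_gpt2.py defaults, NOT the (unstated)
# settings of this experiment. Typical multi-GPU launch:
#   torchrun --standalone --nproc_per_node=8 train.py config/train_gpt2.py

dataset = 'openwebtext'            # prepared with data/openwebtext/prepare.py

# 12 sequences x 1024 tokens x 40 accumulation micro-steps ~= 0.5M tokens per update
batch_size = 12
block_size = 1024
gradient_accumulation_steps = 5 * 8

max_iters = 600000                 # total optimizer steps
learning_rate = 6e-4               # peak LR, cosine-decayed
lr_decay_iters = 600000
weight_decay = 1e-1

eval_interval = 1000               # run validation every N steps
eval_iters = 200
log_interval = 10

wandb_log = False                  # set True to log metrics to Weights & Biases
```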