MuzzammilShah committed · verified
Commit: b2f466a · 1 Parent(s): fc3fc2e

Update README.md

Files changed (1)
  1. README.md +33 -67
README.md CHANGED
@@ -1,67 +1,33 @@
- ## SET 1 - MAKEMORE (PART 1) 🔗
-
- [![Documentation](https://img.shields.io/badge/Documentation-Available-blue)](https://muzzammilshah.github.io/Road-to-GPT/Makemore-part1/)
- ![Number of Commits](https://img.shields.io/github/commit-activity/m/MuzzammilShah/NeuralNetworks-LanguageModels-1?label=Commits)
- [![Last Commit](https://img.shields.io/github/last-commit/MuzzammilShah/NeuralNetworks-LanguageModels-1.svg?style=flat)](https://github.com/MuzzammilShah/NeuralNetworks-LanguageModels-1/commits/main)
- ![Project Status](https://img.shields.io/badge/Status-Done-success)
-
-  
-
- ### **Overview**
- Introduced to the concept of a bigram character-level language model, this repository explores its **training**, **sampling**, and **evaluation** processes. The model evaluation was conducted using the **Negative Log Likelihood (NLL)** loss to assess its quality.
-
- The model was trained in two distinct ways, both yielding identical results:
-
- 1. **Frequency-Based Approach**: Directly counting and normalizing bigram frequencies.
- 2. **Gradient-Based Optimization**: Optimizing the counts matrix using a gradient-based framework guided by minimizing the NLL loss.
-
- This demonstrated that **both methods converge to the same result**, showcasing their equivalence in achieving the desired outcome.
-
-  
-
- ### **🗂️Repository Structure**
-
- ```plaintext
- ├── .gitignore
- ├── A-Main-Notebook.ipynb
- ├── B-Main-Notebook.ipynb
- ├── C-Main-Notebook.ipynb
- ├── README.md
- ├── notes/
- │ ├── A-main-makemore-part1.md
- │ ├── B-main-makemore-part1.md
- │ ├── C-main-makemore-part1.md
- │ └── README.md
- └── names.txt
- ```
-
- - **Notes Directory**: Contains detailed notes corresponding to each notebook section.
- - **Jupyter Notebooks**: Step-by-step implementation and exploration of the bigram model.
- - **README.md**: Overview and guide for this repository.
- - **names.txt**: Supplementary data file used in training the model.
-
-  
-
- ### **📄Instructions**
-
- To get the best understanding:
-
- 1. Start by reading the notes in the `notes/` directory. Each section corresponds to a notebook for step-by-step explanations.
- 2. Open the corresponding Jupyter Notebook (e.g., `A-Main-Notebook.ipynb` for `A-main-makemore-part1.md`).
- 3. Follow the code and comments for a deeper dive into the implementation details.
-
-  
-
- ### **⭐Documentation**
-
- For a better reading experience and detailed notes, visit my **[Road to GPT Documentation Site](https://muzzammilshah.github.io/Road-to-GPT/)**.
-
- > **💡Pro Tip**: This site provides an interactive and visually rich explanation of the notes and code. It is highly recommended you view this project from there.
-
-  
-
- ### **✍🏻Acknowledgments**
- Notes and implementations inspired by the **Makemore - Part 1** video by [Andrej Karpathy](https://karpathy.ai/).
-
- For more of my projects, visit my [Portfolio Site](https://muhammedshah.com).
 
+ ---
+ license: mit
+ datasets:
+ - MuzzammilShah/people-names
+ language:
+ - en
+ model_name: Bigram Character-Level Language Model
+ library_name: pytorch
+ tags:
+ - makemore
+ - bigram
+ - language-model
+ - andrej-karpathy
+ ---
+
+ # Bigram Character-Level Language Model: Makemore (Part 1)
+
+ This repository explores the **training**, **sampling**, and **evaluation** of a bigram character-level language model. Model quality was assessed using the **Negative Log Likelihood (NLL)** loss.
+
+ ## Overview
+ The model was trained in two distinct ways, both yielding identical results:
+ 1. **Frequency-Based Approach**: Directly counting and normalizing bigram frequencies.
+ 2. **Gradient-Based Optimization**: Optimizing a weight matrix by gradient descent, guided by minimizing the NLL loss.
+
+ This demonstrated that **both methods converge to the same result**, showcasing their equivalence.
+
+ ## Documentation
+ For a better reading experience and detailed notes, visit my **[Road to GPT Documentation Site](https://muzzammilshah.github.io/Road-to-GPT/Makemore-part1/)**.
+
+ ## Acknowledgments
+ Notes and implementations inspired by the **Makemore - Part 1** video by [Andrej Karpathy](https://karpathy.ai/).
+
+ For more of my projects, visit my [Portfolio Site](https://muhammedshah.com).
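The two training approaches described in the README's overview can be sketched as follows. This is a minimal numpy illustration, not the repository's actual code (which uses PyTorch and trains on `names.txt`); the word list here is a hypothetical stand-in corpus. It builds the bigram table by counting, then fits the same model by gradient descent on a logits matrix, and shows that both minimize the same NLL.

```python
import numpy as np

# Hypothetical stand-in corpus; the actual repository trains on names.txt.
words = ["emma", "olivia", "ava", "isabella", "sophia"]

# Vocabulary: '.' is a special token marking the start and end of a name.
chars = sorted(set("".join(words)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0
V = len(stoi)

# Collect every bigram as an (input index, target index) pair.
xs, ys = [], []
for w in words:
    seq = ["."] + list(w) + ["."]
    for c1, c2 in zip(seq, seq[1:]):
        xs.append(stoi[c1])
        ys.append(stoi[c2])
xs, ys = np.array(xs), np.array(ys)
n = len(xs)

# Approach 1: count bigram frequencies and row-normalize into probabilities.
N = np.zeros((V, V))
for i, j in zip(xs, ys):
    N[i, j] += 1
P = N / N.sum(axis=1, keepdims=True)  # every row occurs as an input here
nll_counts = -np.log(P[xs, ys]).mean()

# Approach 2: gradient descent on a logits matrix W, minimizing the same NLL.
rng = np.random.default_rng(42)
W = rng.normal(size=(V, V)) * 0.01
for _ in range(1000):
    logits = W[xs]                                       # one-hot(xs) @ W == row lookup
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)            # softmax per example
    nll_grad = -np.log(probs[np.arange(n), ys]).mean()
    # Gradient of mean NLL w.r.t. logits: (softmax - one-hot(target)) / n
    d_logits = probs.copy()
    d_logits[np.arange(n), ys] -= 1.0
    d_logits /= n
    dW = np.zeros_like(W)
    np.add.at(dW, xs, d_logits)                          # scatter rows back into W
    W -= 50.0 * dW

print(f"counting NLL: {nll_counts:.4f}, gradient NLL: {nll_grad:.4f}")
```

Since the counting solution is the exact maximum-likelihood estimate, the gradient-descent NLL approaches it from above, illustrating the equivalence the README claims.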