---
license: mit
datasets:
- MuzzammilShah/people-names
language:
- en
model_name: Bigram Character-Level Language Model
library_name: pytorch
tags:
- makemore
- bigram
- language-model
- andrej-karpathy
---

# Bigram Character-Level Language Model: Makemore (Part 1)

This repository explores the **training**, **sampling**, and **evaluation** of a bigram character-level language model. Model quality was assessed using the **Negative Log Likelihood (NLL)** loss.

## Overview

The model was trained in two distinct ways, both yielding equivalent results:

1. **Frequency-Based Approach**: Directly counting and normalizing bigram frequencies.
2. **Gradient-Based Optimization**: Optimizing a weight matrix (whose exponentiated entries act as counts) with gradient descent, minimizing the NLL loss.

This demonstrates that **both methods converge to the same result**, showcasing their equivalence.

## Documentation

For a better reading experience and detailed notes, visit my **[Road to GPT Documentation Site](https://muzzammilshah.github.io/Road-to-GPT/Makemore-part1/)**.

## Acknowledgments

Notes and implementations inspired by the **Makemore - Part 1** video by [Andrej Karpathy](https://karpathy.ai/).

For more of my projects, visit my [Portfolio Site](https://muhammedshah.com).
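The frequency-based approach described in the Overview can be sketched as follows. This is a minimal illustration, not the repository's exact code: a tiny hard-coded word list stands in for the `MuzzammilShah/people-names` dataset, and `.` marks both the start and end of a name, as in the Makemore video.

```python
import torch

# Assumption: a small hard-coded word list stands in for the actual dataset.
words = ["emma", "olivia", "ava", "isabella", "sophia"]

# Build the character vocabulary; index 0 is the '.' start/end token.
chars = sorted(set("".join(words)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0
itos = {i: c for c, i in stoi.items()}
V = len(stoi)

# Count bigram frequencies (initialized to 1 for add-one smoothing,
# so no bigram has zero probability).
N = torch.ones(V, V, dtype=torch.float32)
for w in words:
    seq = ["."] + list(w) + ["."]
    for a, b in zip(seq, seq[1:]):
        N[stoi[a], stoi[b]] += 1

# Normalize each row into a probability distribution over the next character.
P = N / N.sum(dim=1, keepdim=True)

# Evaluate the average NLL over all training bigrams.
log_likelihood, n = 0.0, 0
for w in words:
    seq = ["."] + list(w) + ["."]
    for a, b in zip(seq, seq[1:]):
        log_likelihood += torch.log(P[stoi[a], stoi[b]]).item()
        n += 1
nll = -log_likelihood / n
print(f"average NLL: {nll:.4f}")

# Sample one name: repeatedly draw the next character until '.' is hit.
g = torch.Generator().manual_seed(42)
ix = 0
out = []
while True:
    ix = torch.multinomial(P[ix], num_samples=1, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("sample:", "".join(out))
```

A lower average NLL means the model assigns higher probability to the bigrams it saw during training.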
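The gradient-based alternative from the Overview can be sketched in the same spirit. Again this is an illustrative sketch with an assumed tiny word list, learning rate, and step count, not the repository's exact code: a one-hot input row multiplied by a weight matrix `W` produces logits, softmax turns them into next-character probabilities, and gradient descent on the NLL drives `W` toward the same distribution that direct counting produces.

```python
import torch

# Assumption: a small hard-coded word list stands in for the actual dataset.
words = ["emma", "olivia", "ava", "isabella", "sophia"]

chars = sorted(set("".join(words)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0
V = len(stoi)

# Build (input char, target char) index pairs for every bigram.
xs, ys = [], []
for w in words:
    seq = ["."] + list(w) + ["."]
    for a, b in zip(seq, seq[1:]):
        xs.append(stoi[a])
        ys.append(stoi[b])
xs, ys = torch.tensor(xs), torch.tensor(ys)

g = torch.Generator().manual_seed(42)
W = torch.randn(V, V, generator=g, requires_grad=True)

# Assumed hyperparameters: 200 steps at learning rate 10.
for step in range(200):
    logits = torch.nn.functional.one_hot(xs, num_classes=V).float() @ W
    counts = logits.exp()                          # exp(logits) act as "counts"
    probs = counts / counts.sum(1, keepdim=True)   # softmax
    loss = -probs[torch.arange(len(xs)), ys].log().mean()  # NLL
    W.grad = None
    loss.backward()
    W.data += -10.0 * W.grad                       # plain gradient descent

print(f"final NLL: {loss.item():.4f}")
```

Because `exp(W)` plays the role of the counts matrix, minimizing the NLL here converges toward the same probabilities as the frequency-based approach, which is the equivalence the Overview highlights.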