---
license: mit
datasets:
- allenai/c4
pipeline_tag: text-generation
---
# Wigip-1: A 473M Parameter Language Model
This repository contains the code and documentation for Wigip-1, a ~473M-parameter GPT-style language model built from scratch in JAX/Flax.

## Project Overview
This project was an end-to-end journey into building and training a large language model on public resources. It involved:

- **Architecture:** A 24-layer Transformer with a 1280-dimensional embedding.
- **Training:** Trained on the C4 dataset for over 500,000 steps (~8 hours on a TPU v3-8).
- **Frameworks:** Built with JAX, Flax, and Optax.
- **Deployment:** A live demo built with Gradio.
Code: https://github.com/skibidibrainrot20times/wigip-1-jax-model.git
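As a rough sanity check on the stated size, the configuration above (24 layers, 1280-dimensional embeddings) is consistent with ~473M parameters under standard GPT assumptions. The 4x MLP expansion and the exclusion of embedding/LayerNorm parameters below are assumptions for illustration, not details taken from the training code:

```python
# Back-of-the-envelope parameter count for a GPT-style Transformer.
# Only n_layers and d_model come from the model card; the 4x MLP
# expansion factor is an assumed (but conventional) choice.
d_model = 1280
n_layers = 24

attn_params = 4 * d_model**2        # Q, K, V, and output projections
mlp_params = 2 * (4 * d_model**2)   # up- and down-projection with 4x expansion
per_block = attn_params + mlp_params

non_embedding = n_layers * per_block
print(f"{non_embedding:,}")  # 471,859,200 -> ~472M, matching the stated ~473M
```

Embedding tables, biases, and LayerNorm weights add a few million more parameters on top of this, which accounts for the small gap to the exact figure.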

## My Journey
This project was a deep dive into the real-world challenges of MLOps, including debugging file corruption, resolving JAX compiler errors (`XlaRuntimeError`), and managing long-running jobs in a cloud environment. It was built with the help of an AI assistant for debugging and guidance.