Wigip-1: A 473M Parameter Language Model

This repository contains the code and documentation for Wigip-1, a 473M-parameter GPT-style language model built from scratch in JAX/Flax.

Project Overview

This project was an end-to-end journey through building and training a large language model using publicly available resources. It involved:

  • Architecture: A 24-layer Transformer with a 1280-dimensional model (embedding) width; a minimal sketch follows this list.
  • Training: Trained on the C4 dataset for over 500,000 steps (~8 hours on a TPU v3-8).
  • Frameworks: Built with JAX, Flax, and Optax.
  • Deployment: A live demo was created using Gradio (see the loading/demo sketch further below).
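For a concrete picture of the shape above, here is a minimal Flax sketch of a GPT-style decoder with 24 layers and a 1280-dimensional model width. This is not the Wigip-1 source: the vocabulary size, head count, and context length below are assumptions, and details such as dropout, initialisation, and weight tying are omitted. As a rough sanity check, 12 × layers × d_model² ≈ 472M non-embedding parameters, which lines up with the size in the title.

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

class Block(nn.Module):
    """One pre-norm Transformer block: causal self-attention + 4x MLP."""
    d_model: int = 1280
    num_heads: int = 20  # assumed; gives 64-dimensional heads

    @nn.compact
    def __call__(self, x, mask):
        h = nn.LayerNorm()(x)
        h = nn.SelfAttention(num_heads=self.num_heads)(h, mask=mask)
        x = x + h
        h = nn.LayerNorm()(x)
        h = nn.Dense(4 * self.d_model)(h)
        h = nn.gelu(h)
        h = nn.Dense(self.d_model)(h)
        return x + h

class GPT(nn.Module):
    """GPT-style decoder with the shape described above."""
    vocab_size: int = 32_000   # assumed; not stated in this README
    d_model: int = 1280
    num_layers: int = 24
    max_len: int = 1024        # assumed context length

    @nn.compact
    def __call__(self, tokens):
        pos = jnp.arange(tokens.shape[-1])
        x = nn.Embed(self.vocab_size, self.d_model)(tokens)
        x = x + nn.Embed(self.max_len, self.d_model)(pos)
        mask = nn.make_causal_mask(tokens)
        for _ in range(self.num_layers):
            x = Block(self.d_model)(x, mask)
        x = nn.LayerNorm()(x)
        return nn.Dense(self.vocab_size, use_bias=False)(x)

# Shape check: initialise parameters for a dummy batch of 16 tokens.
params = GPT().init(jax.random.PRNGKey(0), jnp.ones((1, 16), dtype=jnp.int32))
```

The last line only verifies that the shapes are consistent; the actual training loop (Optax optimiser, C4 data pipeline, checkpointing) lives in the repository code.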

The trained model weights are hosted separately on the Hugging Face Hub, as they are too large for a standard Git repository:
https://huggingface.co/Nottybro/wigip-1
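As a sketch of how the released weights and the Gradio demo fit together (assuming the huggingface_hub and gradio packages are installed), the snippet below downloads the checkpoint files from the repository above and wraps a text box around a generation function. The checkpoint loading and sampling code is specific to this project, so it is only indicated by a placeholder here.

```python
from huggingface_hub import snapshot_download
import gradio as gr

# Download all checkpoint files from the Hub into the local cache and get their path.
ckpt_dir = snapshot_download(repo_id="Nottybro/wigip-1")

def respond(prompt: str) -> str:
    # Placeholder: load the Flax parameters from ckpt_dir and sample a
    # continuation of `prompt` with the repository's own generation code.
    return f"(model output for: {prompt!r})"

demo = gr.Interface(fn=respond, inputs="text", outputs="text", title="Wigip-1 demo")
demo.launch()
```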

My Journey

This project was a deep dive into the real-world challenges of MLOps, including debugging file corruption, solving JAX compiler errors (XlaRuntimeError), and managing long-running jobs in a cloud environment. It was built with the help of an AI assistant for debugging and guidance.
