Upload README.md with huggingface_hub

README.md (ADDED)

---
language:
- en
tags:
- llama
- decoder-only
- educational
- pretrained
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb-edu
---

# LLM-1B-Lab

Educational implementation of a **1.1B parameter LLaMA-style Decoder-Only Transformer**,
trained from scratch on [FineWeb-Edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu).

## Model Details

| Attribute | Value |
|-----------|-------|
| Parameters | ~1.1B |
| Architecture | LLaMA-style (RMSNorm, RoPE, GQA, SwiGLU, Weight Tying) |
| Hidden dim | 2048 |
| Layers | 22 |
| Attention heads | 16 (Q) / 4 (KV) |
| Max sequence length | 2048 |
| Vocab size | 32,000 |
| Training steps | 20,000 |
| Best val loss | 2.6276 (perplexity: 13.84) |
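
The hyperparameters above correspond roughly to the config sketched below. The field names are illustrative and not necessarily those used by `llm_lab.config.ModelConfig`; only the values are taken from the table.

```python
from dataclasses import dataclass

@dataclass
class ModelConfigSketch:
    # Illustrative field names; the values come from the Model Details table.
    vocab_size: int = 32_000
    hidden_dim: int = 2_048
    num_layers: int = 22
    num_query_heads: int = 16
    num_kv_heads: int = 4          # grouped-query attention (GQA)
    max_seq_len: int = 2_048
    tie_embeddings: bool = True    # output head shares the input embedding matrix
```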

## Training

- **Dataset**: FineWeb-Edu (sample-10BT)
- **Tokenizer**: Custom BPE (trained from the dataset via `train_new` mode)
- **Hardware**: Google Colab Pro+ (A100 40GB)
- **Precision**: bfloat16 mixed precision
- **Optimizer**: AdamW (lr=3e-4, weight_decay=0.1, beta2=0.95)
- **Scheduler**: Cosine decay with linear warmup (2,000 warmup steps); see the sketch below
- **Effective batch size**: 128
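
The settings listed above correspond to a setup like the following minimal sketch. The actual training script may differ; in particular, `beta1 = 0.9` and a decay towards zero are assumptions, since only `beta2` and the warmup length are stated.

```python
import math

import torch
from llm_lab.config import ModelConfig
from llm_lab.model import LLMModel

model = LLMModel(ModelConfig.base_1b())

max_steps, warmup_steps = 20_000, 2_000
optimizer = torch.optim.AdamW(
    model.parameters(), lr=3e-4, weight_decay=0.1, betas=(0.9, 0.95)
)

def lr_lambda(step: int) -> float:
    # Linear warmup for the first 2,000 steps, then cosine decay towards zero.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```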

## Usage

```python
import json

import torch
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
from tokenizers import Tokenizer

from llm_lab.config import ModelConfig
from llm_lab.model import LLMModel

REPO_ID = "Vjeong/LLM-1B-Lab"

# 1. Rebuild the model and load the pretrained weights
model = LLMModel(ModelConfig.base_1b())
ckpt_path = hf_hub_download(repo_id=REPO_ID, filename="model.safetensors")
state_dict = load_file(ckpt_path)
model.load_state_dict(state_dict, strict=False)  # strict=False because of weight tying
model.eval()

# 2. Load the tokenizer (custom BPE trained with the `tokenizers` library)
tok_path = hf_hub_download(repo_id=REPO_ID, filename="tokenizer/tokenizer.json")
meta_path = hf_hub_download(repo_id=REPO_ID, filename="tokenizer/tokenizer_meta.json")
tokenizer = Tokenizer.from_file(tok_path)
with open(meta_path) as f:
    tok_meta = json.load(f)  # tokenizer metadata; not needed for basic generation

# 3. Generate text
prompt = "The future of AI is"
input_ids = torch.tensor([tokenizer.encode(prompt).ids])
output = model.generate(input_ids, max_new_tokens=100, temperature=0.8, top_p=0.9)
print(tokenizer.decode(output[0].tolist()))
```
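
For GPU inference, the loaded model can be moved to CUDA and cast to bfloat16 (the training precision). This is a minimal sketch; it assumes the repo's `generate` accepts inputs placed on the model's device.

```python
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device=device, dtype=torch.bfloat16)

with torch.inference_mode():
    output = model.generate(
        input_ids.to(device), max_new_tokens=100, temperature=0.8, top_p=0.9
    )
print(tokenizer.decode(output[0].tolist()))
```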

## License

Apache 2.0