---
library_name: pytorch
pipeline_tag: text-generation
tags:
- text-generation
- pytorch
- fineweb-edu
- ultrachat
- homework
datasets:
- HuggingFaceFW/fineweb-edu
- HuggingFaceH4/ultrachat_200k
---

# Chat-Tuning-Homework

This is a course-homework model repo containing both checkpoints and derived data artifacts.

## Contents

- `model_base.pth`: 1.1M-step base model checkpoint in the homework's LLaMA-like single-file format.
- `model_chat.pth`: chat-tuned checkpoint in the homework model format.
- `params.json`: model architecture parameters used by the homework `LLM` loader.
- `ultrachat_short.json`: filtered short-form UltraChat conversations used for chat tuning.
- `ultrachat_dpo_pos.json`: positive DPO preference data.
- `ultrachat_dpo_neg.json`: negative DPO preference data.

## Model Card

### Architecture

The checkpoints use the Homework 5 transformer architecture with:

- dimension: 1024
- feed-forward dimension: 4096
- heads: 16
- layers: 8
- maximum sequence length: 1024
- vocabulary size: 50432

These values are also stored in `params.json`.

### Training Summary

- `model_base.pth` is the pretrained base checkpoint exported from the ~1.1T-token FineWeb-Edu run.
- `model_chat.pth` is the chat-tuned checkpoint saved after supervised chat tuning on a subset of the UltraChat 200k dataset.

These files are intended for use with the homework's basic exercises.

## Data Card

### Data Sources

- FineWeb-Edu for base pretraining
- UltraChat 200k for chat tuning and preference-style data preparation

### Included Data Files

- `ultrachat_short.json`: short chat-tuning conversations selected from UltraChat 200k
- `ultrachat_dpo_pos.json`: preferred responses
- `ultrachat_dpo_neg.json`: dispreferred responses

## File Format Notes

- `model_base.pth` and `model_chat.pth` are PyTorch checkpoint dictionaries
- attention weights are stored in the homework-compatible unpacked format
- all exported weights are stored as `bfloat16`
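## Usage Sketch

A minimal sketch of reading the architecture parameters. The exact key names inside `params.json` are assumptions here (they depend on the homework's `LLM` loader); the values match the architecture listed in this card.

```python
import json

# Assumed key names; actual params.json keys come from the homework loader.
params = {
    "dim": 1024,          # model dimension
    "ffn_dim": 4096,      # feed-forward dimension
    "n_heads": 16,
    "n_layers": 8,
    "max_seq_len": 1024,
    "vocab_size": 50432,
}

# Round-trip through a params.json-style file, as the loader would read it.
with open("params.json", "w") as f:
    json.dump(params, f)

with open("params.json") as f:
    loaded = json.load(f)
```

In the homework, these values would then be passed to the `LLM` constructor before loading a checkpoint.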
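Since the checkpoints are plain PyTorch state dictionaries with `bfloat16` weights, they can be inspected with `torch.load` alone. The tensor name and shape below are illustrative placeholders, not the actual keys in `model_base.pth` or `model_chat.pth`.

```python
import io
import torch

# Build a tiny stand-in checkpoint dictionary; "tok_embeddings.weight" is a
# hypothetical key used only to illustrate the save/load round trip.
state = {"tok_embeddings.weight": torch.zeros(4, 4, dtype=torch.bfloat16)}

# Serialize and reload via an in-memory buffer, as one would with the .pth files.
buf = io.BytesIO()
torch.save(state, buf)
buf.seek(0)
loaded = torch.load(buf)

# Exported weights in this repo are stored as bfloat16.
print(loaded["tok_embeddings.weight"].dtype)
```

The same pattern with a file path (`torch.load("model_base.pth", map_location="cpu")`) lists the checkpoint's keys and dtypes without instantiating the model.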