---
library_name: pytorch
pipeline_tag: text-generation
tags:
- text-generation
- pytorch
- fineweb-edu
- ultrachat
- homework
datasets:
- HuggingFaceFW/fineweb-edu
- HuggingFaceH4/ultrachat_200k
---
| |
| # Chat-Tuning-Homework |
|
|
This repository contains the checkpoints and derived data artifacts for a course homework assignment.
|
|
| ## Contents |
|
|
| - `model_base.pth`: 1.1M-step base model checkpoint in the homework's LLaMA-like single-file format. |
| - `model_chat.pth`: chat-tuned checkpoint in the homework model format. |
| - `params.json`: model architecture parameters used by the homework `LLM` loader. |
| - `ultrachat_short.json`: filtered short-form UltraChat conversations used for chat tuning. |
| - `ultrachat_dpo_pos.json`: positive DPO preference data. |
| - `ultrachat_dpo_neg.json`: negative DPO preference data. |
|
|
| ## Model Card |
|
|
| ### Architecture |
|
|
| The checkpoints use the Homework 5 transformer architecture with: |
|
|
| - dimension: 1024 |
| - feed-forward dimension: 4096 |
| - heads: 16 |
| - layers: 8 |
| - maximum sequence length: 1024 |
| - vocabulary size: 50432 |
|
|
| These values are also stored in `params.json`. |
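The hyperparameters above could be read from `params.json` along these lines. This is a minimal sketch: the field names below mirror the bullet list and are assumptions about the keys in `params.json`, not the homework `LLM` loader's actual API.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class ModelParams:
    # Defaults match the architecture listed above; key names are illustrative.
    dim: int = 1024
    ffn_dim: int = 4096
    n_heads: int = 16
    n_layers: int = 8
    max_seq_len: int = 1024
    vocab_size: int = 50432


def load_params(path: str) -> ModelParams:
    """Read architecture hyperparameters from a params.json-style file."""
    with open(path) as f:
        raw = json.load(f)
    # Ignore any extra keys the dataclass does not know about.
    known = {f.name for f in fields(ModelParams)}
    return ModelParams(**{k: v for k, v in raw.items() if k in known})
```

With these values, the per-head dimension works out to 1024 / 16 = 64.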
|
|
| ### Training Summary |
|
|
| - `model_base.pth` is the pretrained base checkpoint exported from the ~1.1T-token FineWebEDU run. |
- `model_chat.pth` is the chat-tuned checkpoint saved after supervised chat tuning on a subset of the UltraChat 200k dataset.
|
|
These files are intended for use with the homework's basic exercises.
|
|
| ## Data Card |
|
|
| ### Data Sources |
|
|
| - FineWebEDU for base pretraining |
| - UltraChat 200k for chat tuning and preference-style data preparation |
|
|
| ### Included Data Files |
|
|
- `ultrachat_short.json`: a set of short chat-tuning conversations selected from UltraChat 200k
| - `ultrachat_dpo_pos.json`: preferred responses |
| - `ultrachat_dpo_neg.json`: dispreferred responses |
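The positive and negative preference files might be combined into chosen/rejected pairs roughly as follows. This sketch assumes each file is a JSON list and that the two lists are aligned by index (an assumption about this repo's layout, not a documented guarantee); the function name `load_dpo_pairs` is hypothetical.

```python
import json


def load_dpo_pairs(pos_path: str, neg_path: str) -> list[dict]:
    """Pair preferred and dispreferred responses by index.

    Assumes both files hold JSON lists of equal length, where entry i in
    each list refers to the same prompt.
    """
    with open(pos_path) as f:
        pos = json.load(f)
    with open(neg_path) as f:
        neg = json.load(f)
    if len(pos) != len(neg):
        raise ValueError("positive/negative files are not aligned")
    return [{"chosen": p, "rejected": n} for p, n in zip(pos, neg)]


# Example usage with the files in this repo:
# pairs = load_dpo_pairs("ultrachat_dpo_pos.json", "ultrachat_dpo_neg.json")
```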
|
|
| ## File Format Notes |
|
|
- `model_base.pth` and `model_chat.pth` are PyTorch checkpoint dictionaries.
- Attention weights are stored in the homework-compatible unpacked format.
- All exported weights are stored as `bfloat16`.
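A quick way to inspect one of these checkpoints, assuming they are plain `state_dict`-style dictionaries saved with `torch.save` (the helper name below is illustrative, not part of the homework code):

```python
import torch


def inspect_checkpoint(path: str) -> dict:
    """Load a checkpoint on CPU and summarize each tensor's shape and dtype."""
    # map_location="cpu" avoids needing a GPU just to inspect the file.
    state = torch.load(path, map_location="cpu")
    return {name: (tuple(t.shape), t.dtype) for name, t in state.items()}


# Example usage with a file from this repo:
# summary = inspect_checkpoint("model_base.pth")
# for name, (shape, dtype) in summary.items():
#     print(name, shape, dtype)  # dtypes should be torch.bfloat16
```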
|
|
|
|