| # Install and Run Guide | |
| This guide explains how to install dependencies and run the IMDB Transformer experiments in `assignment_llm_1/assignment_text`. | |
| First, change into the project directory: | |
| ```bash | |
| cd assignment_llm_1/assignment_text | |
| ``` | |
| ## What the code adds | |
| - Model-size experiment support in `assignment_text/code/c1.py`: | |
|   - `small`: `d_model=64`, `num_heads=4`, `num_layers=1`, `d_ff=128` | |
|   - `medium`: `d_model=128`, `num_heads=8`, `num_layers=2`, `d_ff=256` | |
|   - `large`: `d_model=256`, `num_heads=8`, `num_layers=4`, `d_ff=512` | |
| - Automatic experiment report generation: | |
|   - `assignment_text/saved_model/transformer_imdb_experiment_report.md` | |
| - Model-size selection in analysis script: | |
|   - `python code/c1_analysis.py --model_size small|medium|large ...` | |
| - Qualitative error-analysis examples are available in: | |
|   - `assignment_text/documentation/error_analysis.json` | |
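The three size presets listed above can be sketched as a plain dictionary. This is only an illustration: the names `MODEL_CONFIGS` and `head_dim` are hypothetical and are not taken from `c1.py`; only the numeric values come from this guide. The check reflects the usual multi-head attention requirement that `d_model` be divisible by `num_heads`.

```python
# Hypothetical names: MODEL_CONFIGS and head_dim are illustrative and are
# not taken from c1.py. The numeric values come from the list in this guide.
MODEL_CONFIGS = {
    "small": {"d_model": 64, "num_heads": 4, "num_layers": 1, "d_ff": 128},
    "medium": {"d_model": 128, "num_heads": 8, "num_layers": 2, "d_ff": 256},
    "large": {"d_model": 256, "num_heads": 8, "num_layers": 4, "d_ff": 512},
}


def head_dim(size: str) -> int:
    """Return the per-head dimension d_model // num_heads for a preset."""
    cfg = MODEL_CONFIGS[size]
    if cfg["d_model"] % cfg["num_heads"] != 0:
        raise ValueError(f"d_model must be divisible by num_heads for {size!r}")
    return cfg["d_model"] // cfg["num_heads"]


if __name__ == "__main__":
    for name in MODEL_CONFIGS:
        print(name, head_dim(name))
```

With these values, `small` and `medium` both use 16 dimensions per head, while `large` uses 32.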
| ## 1) Go to the project folder | |
| ```bash | |
| cd ./assignment_llm_1/assignment_text | |
| ``` | |
| ## 2) Create and activate environment | |
| ### Option A: Conda (recommended if you use Conda) | |
| ```bash | |
| conda create -n transformer_hw python=3.10 -y | |
| conda activate transformer_hw | |
| python -m pip install --upgrade pip | |
| ``` | |
| ## 3) Install dependencies | |
| If there is a `requirements.txt` file in this folder, run: | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
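After installing, a quick sanity check can confirm that the packages are importable in the active environment. The sketch below is an assumption-laden helper, not part of the project: `check_packages` is a hypothetical name, and `torch` is only a guess based on the `.pt` checkpoint files; substitute the package names that actually appear in `requirements.txt`.

```python
import importlib.util
import sys


def check_packages(names):
    """Map each top-level package name to True if it can be found for import."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    # "torch" is an assumption; replace it with the names in requirements.txt.
    for name, ok in check_packages(["torch"]).items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```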
| ## 4) Train all model sizes (small, medium, large) | |
| Run training from the project folder (`assignment_text`): | |
| ```bash | |
| python code/c1.py | |
| ``` | |
| This will: | |
| - train `small`, `medium`, and `large` Transformer models, | |
| - save checkpoints under `assignment_llm_1/assignment_text/saved_model/`, | |
| - create a Markdown experiment report at: | |
|   - `assignment_llm_1/assignment_text/saved_model/transformer_imdb_experiment_report.md` | |
| ## 5) Evaluate and analyze a selected model size | |
| From the project folder (`assignment_text`): | |
| ```bash | |
| python code/c1_analysis.py --split test --model_size small --num_examples 5 | |
| python code/c1_analysis.py --split test --model_size medium --num_examples 5 | |
| python code/c1_analysis.py --split test --model_size large --num_examples 5 | |
| ``` | |
| Arguments: | |
| - `--split`: dataset split to evaluate (`test` or `train`) | |
| - `--model_size`: one of `small`, `medium`, `large` | |
| - `--num_examples`: number of misclassified examples to print | |
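For orientation, the three arguments above could be declared with `argparse` roughly as follows. This is a sketch under stated assumptions: the real parser in `c1_analysis.py` may differ, `build_parser` is a hypothetical name, and the default values shown here are guesses rather than documented behavior.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the arguments described in this guide."""
    parser = argparse.ArgumentParser(description="Analyze a trained IMDB Transformer.")
    # Defaults below are assumptions, not values confirmed by the project.
    parser.add_argument("--split", choices=["test", "train"], default="test",
                        help="dataset split to evaluate")
    parser.add_argument("--model_size", choices=["small", "medium", "large"],
                        default="medium", help="which trained model size to load")
    parser.add_argument("--num_examples", type=int, default=5,
                        help="number of misclassified examples to print")
    return parser


if __name__ == "__main__":
    # An explicit argument list keeps the example independent of sys.argv.
    args = build_parser().parse_args(
        ["--split", "test", "--model_size", "small", "--num_examples", "5"]
    )
    print(args.split, args.model_size, args.num_examples)
```

Using `choices` makes the parser reject unsupported values such as `--model_size huge` before any model is loaded.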
| ## 6) (Optional) Use a custom checkpoint path directly | |
| If you want to bypass `--model_size`, pass an explicit checkpoint: | |
| ```bash | |
| python code/c1_analysis.py \ | |
| --split test \ | |
| --checkpoint saved_model/transformer_imdb_large.pt \ | |
| --num_examples 5 | |
| ``` | |
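One plausible way for an explicit checkpoint to take precedence over `--model_size` is sketched below. The function `resolve_checkpoint` and the `saved_model_dir` default are hypothetical; only the `transformer_imdb_<size>.pt` naming pattern comes from this guide, and the actual script may resolve paths differently.

```python
from pathlib import Path
from typing import Optional


def resolve_checkpoint(
    model_size: str,
    checkpoint: Optional[str] = None,
    saved_model_dir: str = "saved_model",  # assumed location, relative to the project folder
) -> Path:
    """Return the explicit checkpoint if given, else the per-size default path."""
    if checkpoint is not None:
        return Path(checkpoint)
    return Path(saved_model_dir) / f"transformer_imdb_{model_size}.pt"


if __name__ == "__main__":
    print(resolve_checkpoint("small"))
    print(resolve_checkpoint("small", checkpoint="saved_model/transformer_imdb_large.pt"))
```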
| ## 7) Expected output files | |
| After running `c1.py`, these files should exist in `assignment_llm_1/assignment_text/saved_model/`: | |
| - `transformer_imdb_small.pt` | |
| - `transformer_imdb_medium.pt` | |
| - `transformer_imdb_large.pt` | |
| - `transformer_imdb.pt` (summary/compatibility checkpoint) | |
| - `transformer_imdb_experiment_report.md` (human-readable report) | |
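To verify the list above after training, a small script can report which files are missing. This is a minimal sketch: `EXPECTED_FILES` and `missing_outputs` are hypothetical names, and the script assumes it is run from the project folder so that `saved_model` resolves correctly.

```python
from pathlib import Path

# File names copied from the list in this guide.
EXPECTED_FILES = [
    "transformer_imdb_small.pt",
    "transformer_imdb_medium.pt",
    "transformer_imdb_large.pt",
    "transformer_imdb.pt",
    "transformer_imdb_experiment_report.md",
]


def missing_outputs(saved_model_dir):
    """Return the expected file names that do not exist in saved_model_dir."""
    root = Path(saved_model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumes the current directory is the project folder.
    missing = missing_outputs("saved_model")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected output files are present.")
```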