# Install and Run Guide

This guide explains how to install dependencies and run the IMDB Transformer experiments in `assignment_llm_1/assignment_text`.

All commands in this guide are run from that folder, so change into it first:

```bash
cd assignment_llm_1/assignment_text
```

## What was added to the code

- Model-size experiment support in `assignment_text/code/c1.py`:
  - `small`: `d_model=64`, `num_heads=4`, `num_layers=1`, `d_ff=128`
  - `medium`: `d_model=128`, `num_heads=8`, `num_layers=2`, `d_ff=256`
  - `large`: `d_model=256`, `num_heads=8`, `num_layers=4`, `d_ff=512`
- Automatic experiment report generation:
  - `assignment_text/saved_model/transformer_imdb_experiment_report.md`
- Model-size selection in the analysis script:
  - `python code/c1_analysis.py --model_size small|medium|large ...`
- Qualitative error-analysis examples:
  - `assignment_text/documentation/error_analysis.json`
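
The three presets listed above can be written down as a small lookup table. The following is a minimal sketch, not the contents of `c1.py`: the names `MODEL_SIZES`, `head_dim`, and `approx_encoder_params` are hypothetical, and the size estimate assumes a standard Transformer encoder layer (four `d_model x d_model` attention projections plus a two-layer feed-forward block), ignoring biases, layer norms, and embeddings.

```python
# Hypothetical lookup table mirroring the presets listed in this guide.
MODEL_SIZES = {
    "small": {"d_model": 64, "num_heads": 4, "num_layers": 1, "d_ff": 128},
    "medium": {"d_model": 128, "num_heads": 8, "num_layers": 2, "d_ff": 256},
    "large": {"d_model": 256, "num_heads": 8, "num_layers": 4, "d_ff": 512},
}


def head_dim(cfg):
    """Per-head width; d_model must divide evenly across the heads."""
    if cfg["d_model"] % cfg["num_heads"] != 0:
        raise ValueError("d_model must be divisible by num_heads")
    return cfg["d_model"] // cfg["num_heads"]


def approx_encoder_params(cfg):
    """Rough weight count for the encoder stack (assumed standard layer layout)."""
    attention = 4 * cfg["d_model"] ** 2              # Q, K, V, and output projections
    feed_forward = 2 * cfg["d_model"] * cfg["d_ff"]  # two linear layers
    return cfg["num_layers"] * (attention + feed_forward)


for name, cfg in MODEL_SIZES.items():
    print(name, head_dim(cfg), approx_encoder_params(cfg))
```

Under these assumptions the encoder weight count grows by a factor of eight from one preset to the next, which gives a rough idea of how much longer each size may take to train.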

## 1) Go to the project folder

```bash
cd ./assignment_llm_1/assignment_text
```

## 2) Create and activate environment

### Using Conda

```bash
conda create -n transformer_hw python=3.10 -y
conda activate transformer_hw
python -m pip install --upgrade pip
```

## 3) Install dependencies

If there is a `requirements.txt` file in this folder, run:

```bash
pip install -r requirements.txt
```


## 4) Train all model sizes (small, medium, large)

Run training from the project folder (`assignment_llm_1/assignment_text`); the script path is given relative to it:

```bash
python code/c1.py
```

This will:
- train `small`, `medium`, and `large` Transformer models,
- save checkpoints under `assignment_llm_1/assignment_text/saved_model/`,
- create a Markdown experiment report at:
  - `assignment_llm_1/assignment_text/saved_model/transformer_imdb_experiment_report.md`

## 5) Evaluate and analyze a selected model size

From the project folder (`assignment_llm_1/assignment_text`):

```bash
python code/c1_analysis.py --split test --model_size small --num_examples 5
python code/c1_analysis.py --split test --model_size medium --num_examples 5
python code/c1_analysis.py --split test --model_size large --num_examples 5
```

Arguments:
- `--split`: dataset split to evaluate (`test` or `train`)
- `--model_size`: one of `small`, `medium`, `large`
- `--num_examples`: number of misclassified examples to print
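
For readers who want to see how such flags are typically declared, here is a minimal `argparse` sketch covering only the three arguments described above. It is an illustration, not the parser from `c1_analysis.py`: the function name `build_parser` and all default values are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser covering only the flags described in this guide."""
    parser = argparse.ArgumentParser(description="Analyze a trained IMDB Transformer")
    # Dataset split to evaluate; the default is an assumption.
    parser.add_argument("--split", choices=["test", "train"], default="test")
    # Which trained model size to load; the default is an assumption.
    parser.add_argument("--model_size", choices=["small", "medium", "large"], default="small")
    # Number of misclassified examples to print; the default is an assumption.
    parser.add_argument("--num_examples", type=int, default=5)
    return parser


args = build_parser().parse_args(
    ["--split", "test", "--model_size", "medium", "--num_examples", "5"]
)
print(args.split, args.model_size, args.num_examples)
```

Declaring `choices` makes the parser reject an unknown split or model size before any checkpoint is loaded.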

## 6) (Optional) Use a custom checkpoint path directly

If you want to bypass `--model_size`, pass an explicit checkpoint:

```bash
python code/c1_analysis.py \
  --split test \
  --checkpoint saved_model/transformer_imdb_large.pt \
  --num_examples 5
```
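
One plausible way for a script to let an explicit checkpoint take priority over the size preset is sketched below. This is an assumption about the behavior, not code taken from `c1_analysis.py`: `resolve_checkpoint` and its parameters are hypothetical, and only the `transformer_imdb_<size>.pt` file-name pattern comes from this guide.

```python
from pathlib import Path


def resolve_checkpoint(model_size="small", checkpoint=None, saved_dir="saved_model"):
    """Return the checkpoint path, preferring an explicit path over the size preset."""
    if checkpoint is not None:
        # An explicit path bypasses the model-size lookup entirely.
        return Path(checkpoint)
    # Otherwise fall back to the per-size file name used in this guide.
    return Path(saved_dir) / f"transformer_imdb_{model_size}.pt"


print(resolve_checkpoint("large"))
print(resolve_checkpoint("small", checkpoint="my_runs/custom.pt"))  # hypothetical path
```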

## 7) Expected output files

After running `c1.py`, these files should exist in `assignment_llm_1/assignment_text/saved_model/`:
- `transformer_imdb_small.pt`
- `transformer_imdb_medium.pt`
- `transformer_imdb_large.pt`
- `transformer_imdb.pt` (summary/compatibility checkpoint)
- `transformer_imdb_experiment_report.md` (human-readable report)
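
To confirm that training produced everything in the list above, a short check like the following may help. It is a sketch that assumes it is run from the project folder so that `saved_model` resolves correctly; `EXPECTED_FILES` and `missing_outputs` are hypothetical names, not part of the assignment code.

```python
from pathlib import Path

# File names taken from the list in this guide.
EXPECTED_FILES = [
    "transformer_imdb_small.pt",
    "transformer_imdb_medium.pt",
    "transformer_imdb_large.pt",
    "transformer_imdb.pt",
    "transformer_imdb_experiment_report.md",
]


def missing_outputs(saved_dir="saved_model"):
    """Return the expected file names that are not present in saved_dir."""
    root = Path(saved_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_outputs()
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all expected files present")
```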