### Environment Setup

Download this directory to a local machine and set up [`uv`](https://docs.astral.sh/uv/).

1. **Install `uv`** (if you haven't already):
   ```bash
   curl -LsSf https://astral.sh/uv/install.sh | sh
   ```

2. **Sync the environment:**
   ```bash
   uv sync
   ```
   *(This creates a virtual environment at `.venv` and installs exactly the dependencies locked in `uv.lock`.)*

3. **Activate the environment:**
   ```bash
   source .venv/bin/activate
   ```

### Evaluation Script

Run:

```bash
accelerate launch eval.py \
    --model cloverlm \
    --model_args "pretrained=daslab-testing/CloverLM,dtype=bfloat16,quartet_2_impl=quartet2,attn_backend=pytorch" \
    --tasks "arc_easy_mi,arc_challenge_mi,hellaswag,piqa" \
    --num_fewshot 0 \
    --include_path ./ \
    --trust_remote_code \
    --confirm_run_unsafe_code \
    --batch_size auto
```

### Expected Evaluation Results

```
|     Tasks      |Version|Filter|n-shot|    Metric     |   |Value |   |Stderr|
|----------------|------:|------|-----:|---------------|---|-----:|---|-----:|
|arc_challenge_mi|      1|none  |     0|acc            |↑  |0.4625|±  |0.0146|
|                |       |none  |     0|acc_mutual_info|↑  |0.5094|±  |0.0146|
|                |       |none  |     0|acc_norm       |↑  |0.4923|±  |0.0146|
|arc_easy_mi     |      1|none  |     0|acc            |↑  |0.7997|±  |0.0082|
|                |       |none  |     0|acc_mutual_info|↑  |0.7239|±  |0.0092|
|                |       |none  |     0|acc_norm       |↑  |0.7731|±  |0.0086|
|hellaswag       |      1|none  |     0|acc            |↑  |0.5392|±  |0.0050|
|                |       |none  |     0|acc_norm       |↑  |0.7167|±  |0.0045|
|piqa            |      1|none  |     0|acc            |↑  |0.7922|±  |0.0095|
|                |       |none  |     0|acc_norm       |↑  |0.8058|±  |0.0092|
```

### Alternative Backends

On non-Blackwell GPUs, replace `quartet_2_impl=quartet2` with `quartet_2_impl=pseudoquant`.
You can also set `attn_backend` to `pytorch`, `flash2`, `flash3`, or `flash4`, provided the corresponding library is installed.
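
To make switching backends less error-prone, the `--model_args` string can be assembled from two shell variables. This is only a sketch: the variable names `QUARTET_IMPL`, `ATTN_BACKEND`, and `MODEL_ARGS` and the `run_eval` wrapper are hypothetical helpers, not part of the repository; the flags and `key=value` pairs are copied from the evaluation command above.

```bash
# Hypothetical helper variables (not defined by the repository).
QUARTET_IMPL="pseudoquant"   # use "quartet2" on Blackwell GPUs
ATTN_BACKEND="pytorch"       # or flash2 / flash3 / flash4 if the library is installed

# Same key=value pairs as in the evaluation command, with the two backends substituted.
MODEL_ARGS="pretrained=daslab-testing/CloverLM,dtype=bfloat16,quartet_2_impl=${QUARTET_IMPL},attn_backend=${ATTN_BACKEND}"

# Hypothetical wrapper around the documented command; defining it does not start a run.
run_eval() {
    accelerate launch eval.py \
        --model cloverlm \
        --model_args "${MODEL_ARGS}" \
        --tasks "arc_easy_mi,arc_challenge_mi,hellaswag,piqa" \
        --num_fewshot 0 \
        --include_path ./ \
        --trust_remote_code \
        --confirm_run_unsafe_code \
        --batch_size auto
}

# Print the assembled arguments so they can be checked before launching.
echo "${MODEL_ARGS}"
```

Call `run_eval` from the activated environment to start the evaluation. Note that the expected results above were reported for the `quartet2` command, so scores obtained with other backends may not match them exactly.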