### Environment Setup
Download this directory to a local machine and set up [`uv`](https://docs.astral.sh/uv/).
1. **Install `uv`** (if you haven't already):
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
2. **Sync the environment:**
```bash
uv sync
```
*(This automatically creates a virtual environment at `.venv` and strictly installs the dependencies locked in `uv.lock`.)*
3. **Activate the environment:**
```bash
source .venv/bin/activate
```
### Evaluation Script
Run:
```bash
accelerate launch eval.py \
--model cloverlm \
--model_args "pretrained=daslab-testing/CloverLM,dtype=bfloat16,quartet_2_impl=quartet2,attn_backend=pytorch" \
--tasks "arc_easy_mi,arc_challenge_mi,hellaswag,piqa" \
--num_fewshot 0 \
--include_path ./ \
--trust_remote_code \
--confirm_run_unsafe_code \
--batch_size auto
```
### Expected Evaluation Results
```
|     Tasks      |Version|Filter|n-shot|    Metric     |   |Value |   |Stderr|
|----------------|------:|------|-----:|---------------|---|-----:|---|-----:|
|arc_challenge_mi|      1|none  |     0|acc            |↑  |0.4625|±  |0.0146|
|                |       |none  |     0|acc_mutual_info|↑  |0.5094|±  |0.0146|
|                |       |none  |     0|acc_norm       |↑  |0.4923|±  |0.0146|
|arc_easy_mi     |      1|none  |     0|acc            |↑  |0.7997|±  |0.0082|
|                |       |none  |     0|acc_mutual_info|↑  |0.7239|±  |0.0092|
|                |       |none  |     0|acc_norm       |↑  |0.7731|±  |0.0086|
|hellaswag       |      1|none  |     0|acc            |↑  |0.5392|±  |0.0050|
|                |       |none  |     0|acc_norm       |↑  |0.7167|±  |0.0045|
|piqa            |      1|none  |     0|acc            |↑  |0.7922|±  |0.0095|
|                |       |none  |     0|acc_norm       |↑  |0.8058|±  |0.0092|
```
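To compare a reproduced score against the table, one rough check is whether it falls within a few standard errors of the listed value. The sketch below assumes a threshold of 2 standard errors, which is a convention chosen here and not one stated by this repository. The helper name `within_stderr` and the measured value `0.80` are hypothetical.

```bash
# Hypothetical helper (not part of this repo).
# Usage: within_stderr EXPECTED STDERR MEASURED [K]
# Exits 0 when |MEASURED - EXPECTED| <= K * STDERR (K defaults to 2, an assumption).
within_stderr() {
  awk -v e="$1" -v s="$2" -v m="$3" -v k="${4:-2}" \
    'BEGIN { d = m - e; if (d < 0) d = -d; exit !(d <= k * s) }'
}

# Example: piqa acc is listed as 0.7922 with stderr 0.0095;
# 0.80 stands in for a score from your own run.
if within_stderr 0.7922 0.0095 0.80; then
  echo "within tolerance"
else
  echo "outside tolerance"
fi
```

A score outside this band does not by itself indicate a broken setup, since the choice of backend and batch size may shift results slightly.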
### Alternative Backends
Replace `quartet_2_impl=quartet2` with `quartet_2_impl=pseudoquant` on non-Blackwell GPUs.
You can set `attn_backend` to `pytorch`, `flash2`, `flash3`, or `flash4`, provided the corresponding library is installed.
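As a minimal sketch, the backend choices can be parameterized in the shell so that only two values change between machines. The helper name `build_model_args` is hypothetical and not part of this repository; it only assembles the `--model_args` string used in the evaluation command.

```bash
# Hypothetical helper (not part of this repo): assemble the --model_args string.
# $1: quartet_2_impl value (quartet2 or pseudoquant), defaults to quartet2.
# $2: attn_backend value (pytorch, flash2, flash3, or flash4), defaults to pytorch.
build_model_args() {
  quartet_impl="${1:-quartet2}"
  attn_backend="${2:-pytorch}"
  printf 'pretrained=daslab-testing/CloverLM,dtype=bfloat16,quartet_2_impl=%s,attn_backend=%s\n' \
    "$quartet_impl" "$attn_backend"
}

# Example: a non-Blackwell GPU with the default PyTorch attention backend.
MODEL_ARGS="$(build_model_args pseudoquant pytorch)"
echo "$MODEL_ARGS"
```

Pass the result as `--model_args "$MODEL_ARGS"` in the `accelerate launch eval.py` command, leaving the other flags unchanged.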