Environment Setup

Download this directory to a local machine and set up uv.

  1. Install uv (if you haven't already):

    curl -LsSf https://astral.sh/uv/install.sh | sh
    
  2. Sync the environment:

    uv sync
    

    (This creates a virtual environment at .venv and installs exactly the dependency versions locked in uv.lock.)

  3. Activate the environment:

    source .venv/bin/activate
    

Evaluation Script

Run:

    accelerate launch eval.py \
        --model cloverlm \
        --model_args "pretrained=daslab-testing/CloverLM,dtype=bfloat16,quartet_2_impl=quartet2,attn_backend=pytorch" \
        --tasks "arc_easy_mi,arc_challenge_mi,hellaswag,piqa" \
        --num_fewshot 0 \
        --include_path ./ \
        --trust_remote_code \
        --confirm_run_unsafe_code \
        --batch_size auto

Expected Evaluation Results

|     Tasks      |Version|Filter|n-shot|    Metric     |   |Value |   |Stderr|
|----------------|------:|------|-----:|---------------|---|-----:|---|-----:|
|arc_challenge_mi|      1|none  |     0|acc            |↑  |0.4625|±  |0.0146|
|                |       |none  |     0|acc_mutual_info|↑  |0.5094|±  |0.0146|
|                |       |none  |     0|acc_norm       |↑  |0.4923|±  |0.0146|
|arc_easy_mi     |      1|none  |     0|acc            |↑  |0.7997|±  |0.0082|
|                |       |none  |     0|acc_mutual_info|↑  |0.7239|±  |0.0092|
|                |       |none  |     0|acc_norm       |↑  |0.7731|±  |0.0086|
|hellaswag       |      1|none  |     0|acc            |↑  |0.5392|±  |0.0050|
|                |       |none  |     0|acc_norm       |↑  |0.7167|±  |0.0045|
|piqa            |      1|none  |     0|acc            |↑  |0.7922|±  |0.0095|
|                |       |none  |     0|acc_norm       |↑  |0.8058|±  |0.0092|
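To compare your own run against the table, one rough heuristic is to check whether each observed value lies within a few standard errors of the expected one. The sketch below is only an illustration: the helper name within_tolerance, the default multiplier of 2, and the observed value 0.4630 are assumptions made up for this example, not part of the repository or of lm_eval.

```shell
# Hypothetical helper: succeed if |observed - expected| <= k * stderr.
# Arguments: observed expected stderr [k], where k defaults to 2.
within_tolerance() {
    awk -v obs="$1" -v want="$2" -v se="$3" -v k="${4:-2}" 'BEGIN {
        diff = obs - want
        if (diff < 0) diff = -diff
        # awk exits with status 0 (success) when the value is in range.
        exit !(diff <= k * se)
    }'
}

# Example with a made-up observed accuracy for arc_challenge_mi (acc row).
if within_tolerance 0.4630 0.4625 0.0146; then
    echo "arc_challenge_mi acc: within tolerance"
else
    echo "arc_challenge_mi acc: outside tolerance"
fi
```

The multiplier is a judgment call. A smaller value flags smaller deviations, which may matter when switching backends.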

Alternative Backends

On non-Blackwell GPUs, replace quartet_2_impl=quartet2 with quartet_2_impl=pseudoquant. For attn_backend, you can choose pytorch, flash2, flash3, or flash4, provided the corresponding library is installed.
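As a minimal sketch, the two backend settings can be assembled into the --model_args string with a small shell helper. The function name build_model_args and the variable MODEL_ARGS are hypothetical and not part of the repository. The key=value pairs are copied from the command in Evaluation Script.

```shell
# Hypothetical helper: print the --model_args string for a backend pair.
# $1: quartet_2_impl (quartet2 or pseudoquant), defaults to quartet2.
# $2: attn_backend (pytorch, flash2, flash3, or flash4), defaults to pytorch.
build_model_args() {
    printf 'pretrained=daslab-testing/CloverLM,dtype=bfloat16,quartet_2_impl=%s,attn_backend=%s' \
        "${1:-quartet2}" "${2:-pytorch}"
}

# Example: a non-Blackwell GPU with the PyTorch attention backend.
MODEL_ARGS="$(build_model_args pseudoquant pytorch)"
echo "$MODEL_ARGS"

# Pass the result to the launch command from Evaluation Script, e.g.:
#   accelerate launch eval.py --model cloverlm --model_args "$MODEL_ARGS" ...
```

The remaining flags (--tasks, --num_fewshot, and so on) stay the same as in the original command.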