Improve model card

This PR improves the model card by adding the `library_name` and `pipeline_tag` metadata, which makes the model easier to discover and use. It also includes a brief model description and expands the usage example with an import statement. Additionally, the installation and evaluation instructions from the Github README are incorporated for better accessibility.

Files changed (1) hide show

README.md +52 -33

README.md CHANGED Viewed

@@ -1,6 +1,11 @@
 ---
 license: apache-2.0
 ---
 **Paper**: [https://arxiv.org/pdf/2502.07780](https://arxiv.org/pdf/2502.07780)
 **Code**: https://github.com/IST-DASLab/DarwinLM
 **Models**: [DarwinLM-2.7B](https://huggingface.co/Shengkun/DarwinLM-2.7B), [DarwinLM-4.6B](https://huggingface.co/Shengkun/DarwinLM-4.6B), [DarwinLM-8.4B](https://huggingface.co/Shengkun/DarwinLM-8.4B)
@@ -8,54 +13,68 @@ license: apache-2.0
 ---
-This repository contains the weights of DarwinLM, an evolutionary structured pruning methods for large language models, as introduced in our paper. DarwinLM builds upon an evolutionary search process, generating multiple offspring models in each generation through mutation, and selecting the fittest for survival.
-```
 # Please add trust_remote_code=True as the repo includes custom code to load and run DarwinLM
 model = AutoModelForCausalLM.from_pretrained("Shengkun/DarwinLM-2.7B-Pruned", trust_remote_code=True)
 ```
 ## Downstream Tasks
 **2.7B**
 | Method                     | Param. | SciQ | PIQA | WG   | ArcE | ArcC | HS   | LogiQA | BoolQ | Avg  |
 |----------------------------|--------|------|------|------|------|------|------|--------|-------|------|
 | **Dense**                  | 6.7B   | 93.7 | 78.1 | 69.3 | 76.4 | 53.0 | 78.6 | 30.7   | 77.7  | 69.2 |
-| **Uniform**                | 3.4B   | 44.1 | 57.1 | 53.3 | 33.5 | 32.2 | 27.3 | 25.0   | 49.0  | 40.1 |
-| **ZipLM**                  | 4.0B   | 87.4 | 64.4 | 58.3 | 53.2 | 33.6 | 50.1 | 25.5   | 63.6  | 54.5 |
-| **ShearedLLama**           | 2.7B   | 84.5 | 66.4 | 53.4 | 49.8 | 28.4 | 47.6 | 27.6   | 50.9  | 51.0 |
-| *DarwinLM (one-shot)*      | 2.7B   | 85.6 | 70.8 | 55.8 | 63.3 | 38.1 | 53.2 | 28.5   | 62.7  | 57.2 |
-| **ShearedLLama (50B)**     | 2.7B   | 90.8 | 75.8 | 64.2 | 67.0 | 41.2 | 70.8 | 28.2   | 63.0  | 62.6 |
-| **ShearedLLama (10B†)**    | 2.7B   | 92.0 | 73.6 | 63.1 | 69.8 | 42.0 | 64.4 | 29.0   | 62.1  | 61.9 |
-| *DarwinLM (10B)*           | 2.6B   | 90.8 | 72.2 | 65.1 | 68.5 | 45.0 | 67.2 | 28.5   | 64.6  | 62.8 |
-**4.6B**
-| Model            | Method                  | Param. | SciQ | PIQA | WG   | ArcE | ArcC | HS   | LogiQA | BoolQ | MMLU | Avg  |
-|-----------------|------------------------|--------|------|------|------|------|------|------|--------|-------|------|------|
-| **Llama-3.1-8B** | **Dense**               | 8B     | 96.3 | 81.2 | 74.3 | 81.4 | 58.2 | 81.7 | 31.1   | 84.0  | 65.2 | 72.8 |
-|                 | **Uniform**             | 4.5B   | 29.1 | 53.6 | 51.7 | 26.0 | 23.6 | 27.1 | 25.5   | 62.1  | 25.7 | 36.1 |
-|                 | **ZipLM**               | 6B     | 65.5 | 60.6 | 56.0 | 40.2 | 34.4 | 34.4 | 28.1   | 63.0  | 27.9 | 45.7 |
-|                 | *DarwinLM (one-shot)*   | 4.6B   | 84.9 | 69.4 | 57.3 | 59.6 | 34.2 | 44.6 | 24.1   | 62.2  | 28.5 | 51.6 |
-|                 | **OLMO (2.5T)**         | 7B     | 92.8 | 79.4 | 70.4 | 73.3 | 44.9 | 77.1 | 27.9   | 72.5  | 28.3 | 62.9 |
-|                 | *DarwinLM (10.0B)*      | 4.6B   | 93.2 | 74.8 | 67.4 | 73.2 | 51.6 | 71.3 | 30.7   | 71.1  | 40.6 | 63.7 |
-**8.4B**
-| Model                      | Method                  | Param. | SciQ | PIQA | WG   | ArcE | ArcC | HS   | LogiQA | BoolQ | MMLU | Avg  |
-|---------------------------|------------------------|--------|------|------|------|------|------|------|--------|-------|------|------|
-| **Qwen-2.5-14B-Instruct** | **Dense**               | 14B    | 96.8 | 81.9 | 79.1 | 85.7 | 72.8 | 85.1 | 38.5   | 87.9  | 80.0 | 78.6 |
-|                           | **Uniform**             | 8.6B   | 78.2 | 72.7 | 57.6 | 76.1 | 45.6 | 47.0 | 28.1   | 61.6  | 45.5 | 56.9 |
-|                           | **ZipLM**               | 8.5B   | 69.0 | 66.4 | 52.8 | 60.1 | 38.3 | 43.3 | 29.6   | 60.2  | 25.0 | 49.4 |
-|                           | *DarwinLM (one-shot)*   | 8.4B   | 84.3 | 73.9 | 60.5 | 75.7 | 48.0 | 53.3 | 29.3   | 66.9  | 43.1 | 59.4 |
-|                           | **OLMO-0424 (2.05T)**   | 7B     | 96.1 | 80.1 | 72.1 | 73.8 | 49.2 | 78.0 | 29.3   | 80.8  | 52.1 | 67.9 |
-|                           | *DarwinLM (10.0B)*      | 8.4B   | 89.5 | 78.1 | 70.7 | 79.6 | 57.6 | 74.9 | 33.5   | 73.9  | 57.9 | 68.4 |
-## Bibtex
 ```
 @article{tang2025darwinlm,
   title={DarwinLM: Evolutionary Structured Pruning of Large Language Models},
   author={Tang, Shengkun and Sieberling, Oliver and Kurtic, Eldar and Shen, Zhiqiang and Alistarh, Dan},

 ---
 license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
+DarwinLM is an evolutionary structured pruning method for large language models. It builds upon an evolutionary search process, generating multiple offspring models in each generation through mutation, and selecting the fittest for survival.  This significantly reduces the computational costs of LLMs, especially for real-time applications.
 **Paper**: [https://arxiv.org/pdf/2502.07780](https://arxiv.org/pdf/2502.07780)
 **Code**: https://github.com/IST-DASLab/DarwinLM
 **Models**: [DarwinLM-2.7B](https://huggingface.co/Shengkun/DarwinLM-2.7B), [DarwinLM-4.6B](https://huggingface.co/Shengkun/DarwinLM-4.6B), [DarwinLM-8.4B](https://huggingface.co/Shengkun/DarwinLM-8.4B)
 ---
+This repository contains the weights of DarwinLM, as introduced in our paper.
+```python
 # Please add trust_remote_code=True as the repo includes custom code to load and run DarwinLM
+from transformers import AutoModelForCausalLM
 model = AutoModelForCausalLM.from_pretrained("Shengkun/DarwinLM-2.7B-Pruned", trust_remote_code=True)
 ```
 ## Downstream Tasks
 **2.7B**
 | Method                     | Param. | SciQ | PIQA | WG   | ArcE | ArcC | HS   | LogiQA | BoolQ | Avg  |
 |----------------------------|--------|------|------|------|------|------|------|--------|-------|------|
 | **Dense**                  | 6.7B   | 93.7 | 78.1 | 69.3 | 76.4 | 53.0 | 78.6 | 30.7   | 77.7  | 69.2 |
+| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
+**(Results for 4.6B and 8.4B)**
+## Installation
+```bash
+conda env create -f environment.yml
+conda activate darwinlm
+```
+## Database Preparation
+```bash
+# For llama-2-7B
+bash scripts/ziplm_llama2-7B.sh
+# ... other model examples
+```
+## Evolutionary Search
+```bash
+bash scripts/struct_prune_search.sh
 ```
+## Post-Training
+After pruning, you can further fine-tune the model with the [Fineweb-edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu) dataset using the [llm-foundry](https://github.com/mosaicml/llm-foundry) repository. Refer to our paper for parameter settings.
+## Evaluation
+Install the [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness).
+**Option 1: Using pre-trained weights:**
+```bash
+bash scripts/run_lmeval_hf.sh
+```
+**Option 2: Evaluating your searched structure:**
+```bash
+bash scripts/run_lmeval_config.sh
+```
+## Bibtex
+```bibtex
 @article{tang2025darwinlm,
   title={DarwinLM: Evolutionary Structured Pruning of Large Language Models},
   author={Tang, Shengkun and Sieberling, Oliver and Kurtic, Eldar and Shen, Zhiqiang and Alistarh, Dan},