---
tags:
  - sketchtune
  - sketch to adapt
library_name: transformers
---

# Fine-Tuned Model Checkpoints for *Sketch to Adapt: Fine-Tunable Sketches for Efficient LLM Adaptation* (ICML 2025)

This repository contains the fine-tuned model checkpoints used in our ICML 2025 paper: **Sketch to Adapt: Fine-Tunable Sketches for Efficient LLM Adaptation**.

The table below lists the available models along with their fine-tuning datasets, bit widths, groups per row, and training epochs.

| Model      | Dataset     | Bits | Groups Per Row (GPR) | Epochs     |
| ---------- | ----------- | ---- | -------------------- | ---------- |
| Llama-3-8B | Commonsense | INT4 | 4                    | 1, 2       |
| Llama-3-8B | Math        | INT4 | 1, 2, 4, 8           | 1, 2, 3, 4 |
| Llama-2-7B | Commonsense | INT4 | 4                    | 1, 2       |
| Llama-2-7B | Math        | INT4 | 1, 2, 4, 8           | 1, 2, 3, 4 |
| Llama-7B   | Commonsense | INT4 | 4                    | 1, 2       |
| Llama-7B   | Math        | INT4 | 1, 2, 4, 8           | 1, 2, 3, 4 |
| Llama-13B  | Commonsense | INT4 | 4                    | 1, 2       |
| Llama-13B  | Math        | INT4 | 1, 2, 4, 8           | 1, 2, 3, 4 |
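
To build intuition for the **Bits** and **Groups Per Row (GPR)** columns, the toy snippet below (illustrative only, not taken from the SketchTune codebase) shows one common way such settings are realized: each weight row is split into GPR groups, and each group is quantized to 4-bit integers with its own scale and offset. The function names and the asymmetric uniform quantization scheme here are our own illustrative choices.

```python
import numpy as np

def quantize_row(row: np.ndarray, gpr: int, bits: int = 4):
    """Group-wise asymmetric uniform quantization of one weight row.

    Splits `row` into `gpr` equal-sized groups; each group stores
    integer codes in [0, 2**bits - 1] plus a per-group scale/offset.
    """
    groups = np.split(row, gpr)
    levels = 2 ** bits - 1  # 15 levels for INT4
    codes, scales, offsets = [], [], []
    for g in groups:
        lo, hi = g.min(), g.max()
        scale = (hi - lo) / levels if hi > lo else 1.0
        codes.append(np.round((g - lo) / scale).astype(np.uint8))
        scales.append(scale)
        offsets.append(lo)
    return codes, scales, offsets

def dequantize_row(codes, scales, offsets):
    """Reconstruct the row from codes and per-group parameters."""
    return np.concatenate([q * s + z for q, s, z in zip(codes, scales, offsets)])

rng = np.random.default_rng(0)
row = rng.standard_normal(64).astype(np.float32)
codes, scales, offsets = quantize_row(row, gpr=4)  # INT4, GPR = 4
recon = dequantize_row(codes, scales, offsets)
print(np.max(np.abs(row - recon)))  # small per-element reconstruction error
```

A larger GPR means more (scale, offset) pairs per row, i.e. finer-grained quantization at a small storage cost, which is the trade-off the GPR column in the table varies.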

For full details on how to reproduce the experiments, please refer to our GitHub repository:

👉 [https://github.com/LeanModels/SketchTune](https://github.com/LeanModels/SketchTune) 👈

### What is SketchTune?

SketchTune is a novel method for adapting large language models (LLMs) that focuses on reducing memory usage and improving speed while fine-tuning. Instead of adding low-rank adapters like LoRA or DoRA, it compresses the model's weights into compact, trainable "sketches" for downstream adaptation.

**Key benefits:**

* **Combines compression and adaptation** - SketchTune trains directly on compressed representations, removing the need for separate adapters. This saves memory while improving both model quality and speed.
* **Avoids low-rank limits** - Low-rank adapters assume weight updates follow a low-rank structure. SketchTune makes no such assumption, using sketching to better capture complex changes in model weights.
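
The low-rank limitation can be seen in a small numpy experiment (a toy demo of the general Eckart-Young fact, not an experiment from the paper): the best rank-r approximation of a full-rank weight update discards most of it when r is small relative to the matrix dimension.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4
full_rank_delta = rng.standard_normal((d, d))  # a dense, full-rank weight update

# Best rank-r approximation via truncated SVD (Eckart-Young theorem):
# this is the closest any rank-r adapter update B @ A can get.
U, S, Vt = np.linalg.svd(full_rank_delta)
low_rank = (U[:, :r] * S[:r]) @ Vt[:r, :]

rel_err = np.linalg.norm(full_rank_delta - low_rank) / np.linalg.norm(full_rank_delta)
print(f"rank-{r} relative error: {rel_err:.2f}")  # most of the update is lost
```

Because SketchTune updates the compressed weights themselves rather than a rank-r factorization, its updates are not confined to this low-rank subspace.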

**Performance highlights:**

* Even with base models that are **2.6–3.5× smaller**, SketchTune **outperforms LoRA, DoRA, and S2FT** on commonsense and math reasoning benchmarks.
* On the GSM8K math dataset, SketchTune achieves a **14.48% higher accuracy than LoftQ**, while training **7.3× fewer parameters**.

For a deep dive into how sketching works, including math details and extensive test results, check out our full paper: [https://arxiv.org/abs/2410.06364](https://arxiv.org/abs/2410.06364).

### Citation

If you find this work helpful, please consider citing our paper:
```bibtex
@inproceedings{zhang2025sketch,
  title={Sketch to Adapt: Fine-Tunable Sketches for Efficient {LLM} Adaptation},
  author={Tianyi Zhang and Junda Su and Aditya Desai and Oscar Wu and Zhaozhuo Xu and Anshumali Shrivastava},
  booktitle={Forty-second International Conference on Machine Learning},
  year={2025},
  url={https://openreview.net/forum?id=zZXOXhxO6I}
}
```