Upload folder using huggingface_hub
- README.md +57 -0
- commonsense/llama-13b_gpr4/epoch_0/sketched_params.pkl +3 -0
- commonsense/llama-13b_gpr4/epoch_1/sketched_params.pkl +3 -0
- commonsense/llama-2-7b_gpr4/epoch_0/sketched_params.pkl +3 -0
- commonsense/llama-2-7b_gpr4/epoch_1/sketched_params.pkl +3 -0
- commonsense/llama-3-8b_gpr4/epoch_0/sketched_params.pkl +3 -0
- commonsense/llama-3-8b_gpr4/epoch_1/sketched_params.pkl +3 -0
- commonsense/llama-7b_gpr4/epoch_0/sketched_params.pkl +3 -0
- commonsense/llama-7b_gpr4/epoch_1/sketched_params.pkl +3 -0
- config.json +3 -0
README.md
ADDED
@@ -0,0 +1,57 @@
---
tags:
- sketchtune
- sketch to adapt
library_name: transformers
---

# Fine-Tuned Model Checkpoints for *(ICML 2025) Sketch to Adapt: Fine-Tunable Sketches for Efficient LLM Adaptation*

This repository contains the fine-tuned model checkpoints used in our ICML 2025 paper: **Sketch to Adapt: Fine-Tunable Sketches for Efficient LLM Adaptation**.

The table below lists the available models along with their fine-tuning datasets, bit widths, groups per row, and training epochs.

| Model | Dataset | Bits | Groups Per Row (GPR) | Epochs |
| ---------- | ----------- | --------- | -------------------- | ------- |
| Llama-3-8B | Commonsense | INT4 | 4 | 1,2 |
| Llama-3-8B | Math | INT4 | 1,2,4,8 | 1,2,3,4 |
| Llama-2-7B | Commonsense | INT4 | 4 | 1,2 |
| Llama-2-7B | Math | INT4 | 1,2,4,8 | 1,2,3,4 |
| Llama-7B | Commonsense | INT4 | 4 | 1,2 |
| Llama-7B | Math | INT4 | 1,2,4,8 | 1,2,3,4 |
| Llama-13B | Commonsense | INT4 | 4 | 1,2 |
| Llama-13B | Math | INT4 | 1,2,4,8 | 1,2,3,4 |

For full details on how to reproduce the experiments, please refer to our GitHub repository:

👉 [https://github.com/LeanModels/SketchTune](https://github.com/LeanModels/SketchTune) 👈
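
Each checkpoint is stored as a `sketched_params.pkl` file under `<dataset>/<model>_gpr<gpr>/epoch_<n>/`. As a minimal sketch of how one might fetch and inspect a checkpoint with `huggingface_hub` (the repo id below is a placeholder for this repository, and the pickle's internal layout is defined by the SketchTune codebase rather than documented here):

```python
import pickle

from huggingface_hub import hf_hub_download

# Placeholder: substitute the id of this repository.
repo_id = "<this-repo-id>"

# Paths follow the layout <dataset>/<model>_gpr<gpr>/epoch_<n>/sketched_params.pkl
local_path = hf_hub_download(
    repo_id=repo_id,
    filename="commonsense/llama-3-8b_gpr4/epoch_1/sketched_params.pkl",
)

with open(local_path, "rb") as f:
    sketched_params = pickle.load(f)

# Inspect the loaded object; see the GitHub repository above for the code that
# actually consumes these files during evaluation.
print(type(sketched_params))
```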

### What is SketchTune?

SketchTune is a novel method for adapting large language models (LLMs) that reduces memory usage and improves speed during fine-tuning. Instead of adding low-rank adapters such as LoRA or DoRA, it compresses the model's weights into compact, trainable "sketches" that are updated directly for downstream adaptation.
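
For intuition only, the snippet below is a hypothetical illustration of training on a compressed weight representation: frozen INT4 codes plus small trainable per-group lookup tables (here with 4 groups per row, matching the GPR=4 checkpoints above). It is not the exact SketchTune parameterization; see the paper and GitHub repository for the real method.

```python
import torch

# Hypothetical shapes for illustration only.
out_features, in_features, gpr = 4, 16, 4        # 4 groups per row, as in the table above
group_size = in_features // gpr

codes = torch.randint(0, 16, (out_features, in_features))         # frozen INT4 codes
tables = torch.randn(out_features, gpr, 16, requires_grad=True)   # small trainable tables

# Reconstruct a dense weight by looking up each code in its group's table;
# only `tables` would receive gradients during fine-tuning.
grouped_codes = codes.view(out_features, gpr, group_size)
weight = torch.gather(tables, dim=2, index=grouped_codes).view(out_features, in_features)
```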

**Key benefits:**

* **Combines compression and adaptation** - SketchTune trains directly on compressed representations, removing the need for separate adapters. This saves memory while improving model performance and speed.
* **Avoids low-rank limits** - Low-rank adapters assume weight updates follow a low-rank structure. SketchTune does not make this assumption, using sketching to better capture complex changes in model weights.

**Performance highlights:**

* Even with base models that are **2.6–3.5× smaller**, SketchTune **outperforms LoRA, DoRA, and S2FT** on commonsense and math reasoning benchmarks.
* On the GSM8K math dataset, SketchTune achieves **14.48% higher accuracy than LoftQ** while training **7.3× fewer parameters**.

For a deep dive into how sketching works, including mathematical details and extensive experimental results, check out our full paper: [https://arxiv.org/abs/2410.06364](https://arxiv.org/abs/2410.06364).

### Citation

If you find this work helpful, please consider citing our paper:

```bibtex
@inproceedings{zhang2025sketch,
  title={Sketch to Adapt: Fine-Tunable Sketches for Efficient {LLM} Adaptation},
  author={Tianyi Zhang and Junda Su and Aditya Desai and Oscar Wu and Zhaozhuo Xu and Anshumali Shrivastava},
  booktitle={Forty-second International Conference on Machine Learning},
  year={2025},
  url={https://openreview.net/forum?id=zZXOXhxO6I}
}
```
commonsense/llama-13b_gpr4/epoch_0/sketched_params.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bd2914b27aa416d37a9566176599676d60ed801929d533d6b2cc9653cd1bd327
size 272723337
commonsense/llama-13b_gpr4/epoch_1/sketched_params.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:49922fd7928ceeca7d412fceb058b32feec6795249349fdd982ea08cd2e63ab7
size 272723351
commonsense/llama-2-7b_gpr4/epoch_0/sketched_params.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:640451787ea6fc22e0104da712cda1e0e23fd901bf5f5573827e01dcdf0059dc
size 174138797
commonsense/llama-2-7b_gpr4/epoch_1/sketched_params.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b1f5b34baf35bdfc67cf615f0ab9ca72c16fc13a3730905e2f8d5998cc142f7f
size 174138797
commonsense/llama-3-8b_gpr4/epoch_0/sketched_params.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:25f63c1c7c0cb97b74400d0ef60783e412f258fca2d96f7046725cfccd3df18f
size 176235895
commonsense/llama-3-8b_gpr4/epoch_1/sketched_params.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9a65d96b0c5181d1534e335c22e63a6c187436e16103b87ff85e995737a772bc
size 176235909
commonsense/llama-7b_gpr4/epoch_0/sketched_params.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e519c145a4e0c436581368e0a44c02a1796bd5954bf38b60810771f2ec449df8
size 174138695
commonsense/llama-7b_gpr4/epoch_1/sketched_params.pkl
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c5d93ba18d23d4a215449bb053aad69ceb9a3a4daddeeb0ad9fa34a1a399f483
size 174138661
config.json
ADDED
@@ -0,0 +1,3 @@
{
  "model_type": "llama"
}