# Export and Push

## Merge LoRA

- See [here](https://github.com/modelscope/ms-swift/blob/main/examples/export/merge_lora.sh).

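As a sketch of what the linked `merge_lora.sh` script does (the flags mirror that script at the time of writing; the checkpoint path is a placeholder, so verify against your installed SWIFT version):

```shell
# Merge the trained LoRA adapter in the checkpoint directory back into the
# base model so it can be deployed or quantized as a standalone model.
# Recent SWIFT versions write the merged weights to a sibling
# `checkpoint-xxx-merged` directory.
swift export \
    --adapters output/vx-xxx/checkpoint-xxx \
    --merge_lora true
```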
## Quantization

SWIFT supports quantization export for AWQ, GPTQ, and BNB. AWQ and GPTQ require a calibration dataset, which yields better quantization quality but makes quantization slower; BNB needs no calibration dataset and is faster to quantize.

| Quantization Technique | Multimodal | Inference Acceleration | Continued Training |
| ---------------------- | ---------- | ---------------------- | ------------------ |
| GPTQ                   | ✅         | ✅                     | ✅                 |
| AWQ                    | ✅         | ✅                     | ✅                 |
| BNB                    | ❌         | ✅                     | ✅                 |

In addition to installing SWIFT, the following extra dependencies are required:

```shell
# For AWQ quantization:
# The autoawq version is tied to your CUDA version; choose one according to https://github.com/casper-hansen/AutoAWQ.
# If there are dependency conflicts with torch, add the `--no-deps` option.
pip install autoawq -U

# For GPTQ quantization:
# The auto_gptq version is tied to your CUDA version; choose one according to https://github.com/PanQiWei/AutoGPTQ#quick-installation.
pip install auto_gptq optimum -U

# For BNB quantization:
pip install bitsandbytes -U
```
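A quick sanity check (not part of the SWIFT docs) to confirm which of these optional backends ended up installed in your environment:

```shell
# Print the install status of each optional quantization backend.
# Packages you did not install will simply report "missing".
for pkg in autoawq auto_gptq bitsandbytes optimum; do
    pip show "$pkg" >/dev/null 2>&1 && echo "$pkg: installed" || echo "$pkg: missing"
done
```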

We provide a series of scripts demonstrating SWIFT's quantization export capabilities:

- [AWQ](https://github.com/modelscope/ms-swift/blob/main/examples/export/quantize/awq.sh)/[GPTQ](https://github.com/modelscope/ms-swift/blob/main/examples/export/quantize/gptq.sh)/[BNB](https://github.com/modelscope/ms-swift/blob/main/examples/export/quantize/bnb.sh) quantization export.
- Multimodal quantization: multimodal models can be quantized with GPTQ and AWQ, though AWQ supports only a limited set of multimodal models; see [here](https://github.com/modelscope/ms-swift/tree/main/examples/export/quantize/mllm).
- Additional model series: quantization export is also supported for [BERT](https://github.com/modelscope/ms-swift/tree/main/examples/export/quantize/bert) and [Reward Model](https://github.com/modelscope/ms-swift/tree/main/examples/export/quantize/reward_model)s.
- Models exported through SWIFT quantization support inference acceleration with vLLM/LMDeploy, as well as further SFT/RLHF using QLoRA.

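To illustrate the workflow the linked scripts implement, a 4-bit GPTQ export might look like the following. This is a sketch: the model ID, dataset, and output path are placeholders, and the flag names mirror the linked `gptq.sh`, which remains the authoritative reference.

```shell
# Quantize a model to 4-bit GPTQ using a calibration dataset and write
# the quantized weights to --output_dir. GPTQ calibration requires a GPU.
CUDA_VISIBLE_DEVICES=0 swift export \
    --model Qwen/Qwen2.5-1.5B-Instruct \
    --quant_method gptq \
    --quant_bits 4 \
    --dataset 'AI-ModelScope/alpaca-gpt4-data-en' \
    --output_dir ./qwen2.5-1.5b-gptq-int4
```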
## Push Model

SWIFT supports pushing trained or quantized models to ModelScope or Hugging Face. It pushes to ModelScope by default; specify `--use_hf true` to push to Hugging Face instead.

```shell
swift export \
    --model output/vx-xxx/checkpoint-xxx \
    --push_to_hub true \
    --hub_model_id '<model-id>' \
    --hub_token '<sdk-token>' \
    --use_hf false
```

Tips:

- Use `--model <checkpoint-dir>` or `--adapters <checkpoint-dir>` to specify the checkpoint directory to push; in the model-pushing scenario the two options behave identically.
- To push to ModelScope, you must first register a ModelScope account. Your SDK token can be obtained from [this page](https://www.modelscope.cn/my/myaccesstoken). Make sure the account associated with the SDK token has edit permission for the organization in the model_id. Pushing automatically creates a model repository for the model_id if one does not already exist; use `--hub_private_repo true` to create it as a private repository.