| <!--Copyright 2023 The HuggingFace Team. All rights reserved. | |
| Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with | |
| the License. You may obtain a copy of the License at | |
| http://www.apache.org/licenses/LICENSE-2.0 | |
| Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on | |
| an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the | |
| specific language governing permissions and limitations under the License. | |
| ⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be | |
| rendered properly in your Markdown viewer. | |
| --> | |
| # Quantization[[quantization]] | |
| Quantization techniques reduce memory and computational costs by representing weights and activations with lower-precision data types such as 8-bit integers (int8). This makes it possible to load larger models that would not normally fit into memory, and to speed up inference. Transformers supports the AWQ and GPTQ quantization algorithms, and it supports 8-bit and 4-bit quantization through bitsandbytes. | |
| Quantization techniques that are not supported in Transformers can be added with the [`HfQuantizer`] class. | |
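To make the memory saving concrete, here is a minimal sketch of symmetric (absmax) int8 quantization using only the Python standard library. It is an illustration of the general idea of storing weights in a lower-precision type, under the assumption of a single per-tensor scale; it is not how AWQ, GPTQ, or bitsandbytes are implemented, and the helper names `quantize_absmax_int8` and `dequantize` are hypothetical.

```python
from array import array


def quantize_absmax_int8(weights):
    """Map floats to int8 with one per-tensor scale (simplified absmax scheme)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp into the symmetric int8 range.
    quantized = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 codes."""
    return [q * scale for q in quantized]


# Illustrative weights only; real models hold millions of values per tensor.
weights = [0.12, -0.5, 0.33, 1.27, -1.0, 0.0]
fp32 = array("f", weights)  # 4 bytes per value
q, scale = quantize_absmax_int8(weights)  # 1 byte per value
restored = dequantize(q, scale)

fp32_bytes = fp32.itemsize * len(fp32)
int8_bytes = q.itemsize * len(q)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"fp32: {fp32_bytes} bytes, int8: {int8_bytes} bytes")
print(f"scale: {scale:.6f}, max round-trip error: {max_error:.6f}")
```

The int8 copy takes a quarter of the float32 storage, at the cost of a rounding error of at most half a quantization step per value. Production methods refine this basic trade-off, for example with per-group scales or calibration data.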
| <Tip> | |
| Learn how to quantize models in the [Quantization](../quantization) guide. | |
| </Tip> | |
| ## QuantoConfig[[transformers.QuantoConfig]] | |
| [[autodoc]] QuantoConfig | |
| ## AqlmConfig[[transformers.AqlmConfig]] | |
| [[autodoc]] AqlmConfig | |
| ## VptqConfig[[transformers.VptqConfig]] | |
| [[autodoc]] VptqConfig | |
| ## AwqConfig[[transformers.AwqConfig]] | |
| [[autodoc]] AwqConfig | |
| ## EetqConfig[[transformers.EetqConfig]] | |
| [[autodoc]] EetqConfig | |
| ## GPTQConfig[[transformers.GPTQConfig]] | |
| [[autodoc]] GPTQConfig | |
| ## BitsAndBytesConfig[[#transformers.BitsAndBytesConfig]] | |
| [[autodoc]] BitsAndBytesConfig | |
| ## HfQuantizer[[transformers.quantizers.HfQuantizer]] | |
| [[autodoc]] quantizers.base.HfQuantizer | |
| ## HqqConfig[[transformers.HqqConfig]] | |
| [[autodoc]] HqqConfig | |
| ## FbgemmFp8Config[[transformers.FbgemmFp8Config]] | |
| [[autodoc]] FbgemmFp8Config | |
| ## CompressedTensorsConfig[[transformers.CompressedTensorsConfig]] | |
| [[autodoc]] CompressedTensorsConfig | |
| ## TorchAoConfig[[transformers.TorchAoConfig]] | |
| [[autodoc]] TorchAoConfig | |