# Prepare Models
OpenCompass supports the evaluation of new models in several ways:

1. HuggingFace-based models
2. API-based models
3. Custom models
## HuggingFace-based Models

OpenCompass supports constructing evaluation models directly through HuggingFace's
`AutoModel.from_pretrained` and `AutoModelForCausalLM.from_pretrained` interfaces. If the model to be
evaluated follows the typical generation interface of HuggingFace models, no code needs to be written;
you can simply specify the relevant settings in the configuration file.

Here is an example configuration for a HuggingFace-based model:
```python
# Use `HuggingFace` to evaluate models supported by AutoModel.
# Use `HuggingFaceCausalLM` to evaluate models supported by AutoModelForCausalLM.
from opencompass.models import HuggingFaceCausalLM

models = [
    dict(
        type=HuggingFaceCausalLM,
        # Parameters for `HuggingFaceCausalLM` initialization.
        path='huggyllama/llama-7b',
        tokenizer_path='huggyllama/llama-7b',
        tokenizer_kwargs=dict(padding_side='left', truncation_side='left'),
        max_seq_len=2048,
        batch_padding=False,
        # Common parameters shared by various models, not specific to `HuggingFaceCausalLM` initialization.
        abbr='llama-7b',           # Model abbreviation used for result display.
        max_out_len=100,           # Maximum number of generated tokens.
        batch_size=16,             # Batch size during inference.
        run_cfg=dict(num_gpus=1),  # Run configuration specifying resource requirements.
    )
]
```
Explanation of some of the parameters:

- `batch_padding=False`: If set to False, each sample in a batch is inferred individually. If set to True,
  a batch of samples is padded and inferred together. For some models, such padding may lead to unexpected
  results; if the model being evaluated supports padded batches, you can set this parameter to True to
  speed up inference.
- `padding_side='left'`: Pad on the left side. Not all models support padding, and padding on the right
  side may interfere with the model's output.
- `truncation_side='left'`: Truncate on the left side. The input prompt for evaluation usually consists of
  both the in-context examples and the actual input. If the right side of the prompt is truncated, the
  generation model's input may become inconsistent with the expected format. Therefore, if truncation is
  necessary, it should be performed on the left side.
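To illustrate why left-side truncation is preferred, consider a tokenized few-shot prompt where the
in-context examples come first and the actual question sits at the end. The token IDs and length limit
below are made up purely for demonstration and do not come from any real tokenizer:

```python
# Hypothetical token IDs for a few-shot prompt: in-context examples first,
# the actual question at the end.
prompt_tokens = [101, 102, 103, 104, 105, 106, 107, 108]
max_seq_len = 5  # hypothetical length limit

# Left truncation keeps the tail of the prompt, so the question survives.
left_truncated = prompt_tokens[-max_seq_len:]

# Right truncation keeps the head and would drop the question entirely.
right_truncated = prompt_tokens[:max_seq_len]

print(left_truncated)   # [104, 105, 106, 107, 108]
print(right_truncated)  # [101, 102, 103, 104, 105]
```

Left truncation only discards the oldest in-context examples, which is usually the least harmful loss.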
During evaluation, OpenCompass instantiates the evaluation model from the `type` and the initialization
parameters specified in the configuration file. The remaining parameters are used during inference,
summarization, and other model-related stages. For example, with the configuration above, the model is
instantiated during evaluation as follows:
```python
model = HuggingFaceCausalLM(
    path='huggyllama/llama-7b',
    tokenizer_path='huggyllama/llama-7b',
    tokenizer_kwargs=dict(padding_side='left', truncation_side='left'),
    max_seq_len=2048,
)
```
## API-based Models

Currently, OpenCompass supports API-based inference for the following models:

- OpenAI (`opencompass.models.OpenAI`)
- ChatGLM (`opencompass.models.ZhiPuAI`)
- ABAB-Chat from MiniMax (`opencompass.models.MiniMax`)
- XunFei (`opencompass.models.XunFei`)

Let's take the OpenAI configuration as an example to see how API-based models are used in the
configuration file:
```python
from opencompass.models import OpenAI

models = [
    dict(
        type=OpenAI,               # Use the OpenAI model.
        # Parameters for `OpenAI` initialization.
        path='gpt-4',              # Specify the model type.
        key='YOUR_OPENAI_KEY',     # OpenAI API key.
        max_seq_len=2048,          # Maximum number of input tokens.
        # Common parameters shared by various models, not specific to `OpenAI` initialization.
        abbr='GPT-4',              # Model abbreviation used for result display.
        max_out_len=512,           # Maximum number of generated tokens.
        batch_size=1,              # Batch size during inference.
        run_cfg=dict(num_gpus=0),  # Resource requirements (no GPU needed).
    ),
]
```
We have provided several examples for API-based models. Please refer to:

```bash
configs
├── eval_zhipu.py
├── eval_xunfei.py
└── eval_minimax.py
```
## Custom Models

If the above methods do not support your model evaluation requirements, you can refer to
[Supporting New Models](../advanced_guides/new_model.md) to add support for new models in OpenCompass.