Any params/containers to deploy it from HF Inference Endpoints?

#7
by fabriciocarraro - opened

I've tried deploying it with the default, vllm.0.18, and vllm.nightly containers, and the endpoint fails before starting every time. Do I need a special container?

Endpoint failed to start | Check Logs
Exit code: 1. Reason: ^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 124, in build_async_engine_client_from_engine_args
(APIServer pid=1) vllm_config = engine_args.create_engine_config(usage_context=usage_context)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1549, in create_engine_config
(APIServer pid=1) model_config = self.create_model_config()
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 1398, in create_model_config
(APIServer pid=1) return ModelConfig(
(APIServer pid=1) ^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/pydantic/_internal/_dataclasses.py", line 121, in __init__
(APIServer pid=1) s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
(APIServer pid=1) pydantic_core._pydantic_core.ValidationError: 1 validation error for ModelConfig
(APIServer pid=1) Value error, The checkpoint you are trying to load has model type gemma4 but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
(APIServer pid=1)
(APIServer pid=1) You can update Transformers with the command pip install --upgrade transformers. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command pip install git+https://github.com/huggingface/transformers.git [type=value_error, input_value=ArgsKwargs((), {'model': ...nderer_num_workers': 1}), input_type=ArgsKwargs]
(APIServer pid=1) For further information visit https://errors.pydantic.dev/2.12/v/value_error
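
For anyone trying to reproduce this: if the log is taken at face value, the Transformers build inside the container simply does not register the gemma4 model type yet. Here is a minimal sketch of how that could be checked from a Python shell in the same image. The helper name `transformers_supports` is made up for illustration, and `CONFIG_MAPPING` is an internal Transformers registry rather than a documented stable API, so treat this as a rough diagnostic and not an official check.

```python
import importlib
from typing import Optional, Tuple


def transformers_supports(model_type: str) -> Tuple[Optional[str], bool]:
    """Return (installed transformers version or None, whether model_type is registered).

    Hypothetical helper: it only inspects the locally installed package and
    does not download or load any checkpoint.
    """
    try:
        transformers = importlib.import_module("transformers")
        auto_cfg = importlib.import_module(
            "transformers.models.auto.configuration_auto"
        )
    except ImportError:
        # transformers is not installed in this environment
        return None, False
    # CONFIG_MAPPING maps model_type strings to their config classes.
    return transformers.__version__, model_type in auto_cfg.CONFIG_MAPPING


if __name__ == "__main__":
    version, supported = transformers_supports("gemma4")
    print(f"transformers={version} gemma4_supported={supported}")
```

If this reports that gemma4 is not registered, the container needs a newer Transformers than the one it ships with, which matches what the error message itself suggests.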
