Image URI to deploy Ministral-3-3B on SageMaker?

#12
by tloiseausia - opened

Hi,
I am trying to deploy this model on SageMaker, but whichever image URI I use I run into the following errors.

This script:

import sagemaker
from sagemaker.huggingface import get_huggingface_llm_image_uri
from sagemaker.huggingface.model import HuggingFaceModel

sess = sagemaker.Session()
role = sagemaker.get_execution_role()

image_uri = get_huggingface_llm_image_uri(
    backend="huggingface",  # TGI container
    session=sess,
    region=sess.boto_region_name,
)

huggingface_model = HuggingFaceModel(
    role=role,
    image_uri=image_uri,
    env={
        'HF_MODEL_ID': 'mistralai/Ministral-3-3B-Instruct-2512',  # The Hugging Face model ID
        'HF_TASK': 'text-generation'  # The task type
    }
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    endpoint_name="ministral-3-3B",
    container_startup_health_check_timeout=200,
)

results in:

File "/usr/src/server/text_generation_server/server.py", line 266, in serve_inner
    model = get_model_with_lora_adapters(
File "/usr/src/server/text_generation_server/models/__init__.py", line 1816, in get_model_with_lora_adapters
    model = get_model(
File "/usr/src/server/text_generation_server/models/__init__.py", line 1797, in get_model
    raise ValueError(f"Unsupported model type {model_type}")
ValueError: Unsupported model type mistral3
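As far as I can tell, this error comes from the container's TGI build rather than from the model repo: TGI reads `model_type` out of `config.json` and dispatches on a fixed registry of known architectures, and the build baked into the default SageMaker image predates `mistral3`. A minimal sketch of that dispatch (illustrative names, not the actual TGI code; the real registry lives in `text_generation_server/models/__init__.py`):

```python
import json

# Illustrative subset of architectures an older TGI build knows about.
SUPPORTED_MODEL_TYPES = {"llama", "mistral", "mixtral", "gemma"}

def get_model(config_text: str) -> str:
    """Mimic TGI's dispatch: read model_type and fail on unknown values."""
    model_type = json.loads(config_text)["model_type"]
    if model_type not in SUPPORTED_MODEL_TYPES:
        raise ValueError(f"Unsupported model type {model_type}")
    return f"loaded a {model_type} model"

try:
    get_model('{"model_type": "mistral3"}')
except ValueError as err:
    print(err)  # Unsupported model type mistral3
```

So if this reading is right, the fix is in the image, not the script: the endpoint needs a container whose TGI release already registers the `mistral3` architecture.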

and this script:

from sagemaker.huggingface import get_huggingface_llm_image_uri
from sagemaker.huggingface.model import HuggingFaceModel

image_uri = get_huggingface_llm_image_uri(
    backend="lmi",
    session=sess,
    region=sess.boto_region_name,
)

huggingface_model = HuggingFaceModel(
    role=role,
    image_uri=image_uri,
    env={
        'HF_MODEL_ID': 'mistralai/Ministral-3-3B-Instruct-2512',
        'HF_TASK': 'text-generation'
    }
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    endpoint_name="ministral-3-3B",
    container_startup_health_check_timeout=200,
)

results in:

[WARN ] PyProcess - W-72-42e698c2a5917b4-stderr: File "/usr/local/lib/python3.9/dist-packages/transformers/models/auto/configuration_auto.py", line 748, in __getitem__
[WARN ] PyProcess - W-72-42e698c2a5917b4-stderr:     raise KeyError(key)
[WARN ] PyProcess - W-72-42e698c2a5917b4-stderr: KeyError: 'mistral3'
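The LMI failure looks like the same root cause one layer down: transformers' `AutoConfig` maps `model_type` strings to config classes via `CONFIG_MAPPING`, and the transformers release inside that container (note the Python 3.9 dist-packages path in the trace) predates the `mistral3` entry. A stand-in sketch of that lookup (illustrative, not the actual transformers source):

```python
# Stand-in for CONFIG_MAPPING in transformers
# models/auto/configuration_auto.py: a dict-like registry whose
# __getitem__ raises KeyError for architectures the release predates.
class ConfigMapping(dict):
    def __getitem__(self, key):
        if key not in self:
            raise KeyError(key)  # surfaces in the logs as KeyError: 'mistral3'
        return super().__getitem__(key)

CONFIG_MAPPING = ConfigMapping(mistral="MistralConfig", llama="LlamaConfig")

try:
    CONFIG_MAPPING["mistral3"]
except KeyError as err:
    print(f"KeyError: {err}")  # KeyError: 'mistral3'
```

If that is what is happening, the model files themselves are fine; only a container with a newer transformers pin would resolve it.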

Do you know if there is any container that handles this architecture? Or is there another way to deploy such a model on SageMaker?

Thanks in advance

I have also found another error, logged just above these ones:

[WARN ] PyProcess - W-65-42e698c2a5917b4-stderr: Downloading config.json: 0.00B [00:00, ?B/s]
[WARN ] PyProcess - W-65-42e698c2a5917b4-stderr: Downloading config.json: 1.90kB [00:00, 11.2MB/s]
[INFO ] PyProcess - W-65-42e698c2a5917b4-stdout: mistralai/Ministral-3-3B-Instruct-2512 does not contain a config.json or adapter_config.json for lora models. This is required for loading huggingface models

This is odd, as there is a config.json file in the repository.
