SageMaker Deployment Error

#53
by seabasshn - opened

I tried deploying this model on an ml.g5.48xlarge instance in SageMaker, but I keep running into this error:

```
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 161, in serve_inner
    model = get_model(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 310, in get_model
    return FlashMixtral(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mixtral.py", line 21, in __init__
    super(FlashMixtral, self).__init__(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mistral.py", line 318, in __init__
    SLIDING_WINDOW_BLOCKS = math.ceil(config.sliding_window / BLOCK_SIZE)

TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'

Error: ShardCannotStart
```

Any ideas as to what might be causing this issue?
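For reference, the failure is easy to reproduce in isolation: Mixtral's `config.json` sets `sliding_window` to `null`, so the attribute arrives as `None` and the division in `flash_mistral.py` raises. A minimal sketch of the crash and of a `None` guard (this is an illustration of the bug class, not the actual upstream TGI patch; `BLOCK_SIZE = 16` is an assumed value):

```python
import math

BLOCK_SIZE = 16  # paging block size (value assumed for illustration)

# Mixtral ships with "sliding_window": null, so the config attribute is None.
sliding_window = None

try:
    math.ceil(sliding_window / BLOCK_SIZE)
except TypeError as e:
    print(e)  # unsupported operand type(s) for /: 'NoneType' and 'int'

# A guard like this avoids the crash when no sliding window is configured
# (sketch only - not the exact code that landed upstream):
if sliding_window is not None:
    sliding_window_blocks = math.ceil(sliding_window / BLOCK_SIZE)
else:
    sliding_window_blocks = None

print(sliding_window_blocks)  # None
```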

I think this was fixed in the latest release of TGI 🤗

@ArthurZ What's the latest version? I'm using 1.3.1

Thank you. It looks like I'll just have to wait until the SageMaker SDK (v2.200.1) supports the newest TGI version:

"Unsupported huggingface-llm version: 1.3.3. You may need to upgrade your SDK version (pip install -U sagemaker) for newer huggingface-llm versions. Supported huggingface-llm version(s): 0.6.0, 0.8.2, 0.9.3, 1.0.3, 1.1.0, 1.2.0, 1.3.1, 0.6, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3."
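That error comes from the SDK itself: each SageMaker SDK release carries a fixed allow-list of `huggingface-llm` (TGI) container versions, which is why `pip install -U sagemaker` is the fix once a matching release ships. A rough sketch of that gate (the function name and list here are illustrative, not the SDK's internal API):

```python
# Versions supported by the installed SDK release, per the error message above.
SUPPORTED = ["0.6.0", "0.8.2", "0.9.3", "1.0.3", "1.1.0", "1.2.0", "1.3.1"]

def check_llm_version(version: str) -> str:
    """Mimic the SDK's version gate: reject versions not in the allow-list."""
    if version not in SUPPORTED:
        raise ValueError(
            f"Unsupported huggingface-llm version: {version}. "
            "You may need to upgrade your SDK version (pip install -U sagemaker)."
        )
    return version

print(check_llm_version("1.3.1"))  # 1.3.1 - accepted
```

Requesting `1.3.3` against this list raises the same `ValueError` quoted above.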

Sorry for that 😥

@ArthurZ All good! Appreciate the help

Same problem here.

The new container was deployed today; I was able to deploy the model with it.

Can you share which SageMaker SDK and TGI versions you used? I'm still getting the same error even after upgrading to SDK v2.201.0 and TGI v1.3.3.

I'm on SageMaker SDK 2.202.1 and it still doesn't support 1.3.3 or 1.3.4. Can someone point me to the container, or a link where to find it?

I was able to temporarily work around this issue by reverting to an older commit. Check https://github.com/huggingface/text-generation-inference/issues/1342 for more information.
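Another way to sidestep the SDK's allow-list is to pass an explicit container URI to `HuggingFaceModel(image_uri=...)` instead of a `huggingface-llm` version. The snippet below only assembles such a URI following the AWS Deep Learning Containers naming scheme; the account ID, repository name, and especially the tag are assumptions here, so verify them against the official DLC image list before using:

```python
# Assemble a TGI container URI by hand (sketch; verify the tag against the
# published AWS Deep Learning Containers list before deploying).
region = "us-east-1"
tag = "2.1.1-tgi1.3.4-gpu-py310-cu121-ubuntu20.04"  # assumed tag

image_uri = (
    f"763104351884.dkr.ecr.{region}.amazonaws.com/"
    f"huggingface-pytorch-tgi-inference:{tag}"
)
print(image_uri)
```

The resulting string would then be passed as `image_uri` when constructing the model, bypassing the SDK's version check entirely.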

Hey @seabasshn, you might find Impulse AI (https://www.impulselabs.ai/) useful. We make it super easy to fine-tune and deploy open-source models. I know it's not relevant to your problem above, but it might be easier to fine-tune and deploy through us.

docs: https://docs.impulselabs.ai/introduction
python sdk: https://pypi.org/project/impulse-api-sdk-python/
