Can't run model on the Kinara ara-2

#1
by mpro77 - opened

Hello,
I bought the Kinara Ara-2 a few weeks ago and had a tough time getting the driver running. I got the drivers and SDK from Geniatech, and there were so many errors: I had to recompile the driver many times and finally found a newer Makefile on the web. With a lot of work I got the card up and running. Now I can't run any of the models, because I can't convert them; the script that does it, llm_model_gen.py, is missing, and no other scripts work. Does anyone know if Kinara has a newer SDK that's actually correct? I really want to get something running on the accelerator and see how fast it runs local LLMs. Anyone have any ideas?
Thanks!

Can you confirm the SDK version you are working with? The SDK release R1.3 includes an "llm_model_gen" executable, which you can use for compiling the LLM model. You can follow "README_LLM_generic_compilation.md", which lists the steps for LLM model compilation. Please reach out to the Geniatech support team for any queries.

Yes, it's V1.3. I feel silly; I was searching for anything named model, llm, etc. with a .py extension, which is what the README says, but from your response I saw the script doesn't have .py, and then I found it right away. Thank you for that, I appreciate it. When I tried to run it, it failed with the error "No module named docker", and looking at the Python script I see it runs a number of Docker commands. Does this need to be run in a Docker container? I have sent Geniatech many emails and they never answer.
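For anyone hitting the same thing: "No module named docker" usually means the Docker SDK for Python (the `docker` package on PyPI) is not installed in the environment running the script, which is separate from the Docker engine itself. A quick hedged check, assuming `python3` is on the PATH:

```shell
# Sketch: check whether the Python "docker" module the compilation script
# imports is available; if not, it can typically be installed with pip.
if python3 -c "import docker" 2>/dev/null; then
    echo "python docker module: present"
else
    echo "python docker module: missing (try: pip install docker)"
fi
```

Note that installing the Python module still requires the Docker engine to be installed and running for the script's container commands to work.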
Thank you for your help!
Regards,
Mark

I think I figured out Docker; it looks like it just needs to be installed, which I have done, and I ran the script again. The script failed at resolving "https://gitea.kinara.ai/eng/sw:r1.3-17thSept-final-14963e6d7c". Does this site still exist on Gitea? It doesn't seem to be a network problem: I gave my system full access through my Firewalla firewall, but I still can't ping it. I know the site is still active because a lot of people use it.
Thank you,
Mark

Hi @mpro77,

Please run the below commands to load the Docker image gitea.kinara.ai/eng/sw:r1.3-17thSept-final-14963e6d7c (docker load reads it from a local tar file in the SDK package, so nothing is pulled over the network):

cd ara2-sdk-r1.3-lic

docker load -i dvdocker/ara2.tar

After the above command completes, use the below command to verify:

docker image ls | grep gitea.kinara.ai/eng/sw
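A hedged follow-up check: once the image is loaded, `docker image inspect` can report which platform it was built for (the `Os` and `Architecture` fields are standard inspect output; the guard below just keeps the sketch safe on a machine without Docker):

```shell
# Sketch: report the OS/architecture the loaded SDK image targets.
IMAGE="gitea.kinara.ai/eng/sw:r1.3-17thSept-final-14963e6d7c"
if command -v docker >/dev/null 2>&1; then
    docker image inspect --format '{{.Os}}/{{.Architecture}}' "$IMAGE" \
        || echo "image not loaded yet"
else
    echo "docker CLI not found on this machine"
fi
```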

Hello,

I ran the docker load command and it looked like it ran fine. When I tried to run a container from the image, I received the below error:

WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested

*** UPDATE ***
I also tried the below for arm64:

docker load --input dvdocker/ara2.tar --platform linux/arm64
requested platform(s) ([linux/arm64]) not found: image might be filtered out

docker load --input dvdocker/ara2.tar --platform linux/arm/v8
requested platform(s) ([linux/arm64/v8]) not found: image might be filtered out
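One way to see which platform(s) a saved image tar actually contains, without loading it at all, is to read its manifest (a sketch, assuming the tar is in `docker save` format with a top-level `manifest.json`; the guard handles the file being absent):

```shell
# Sketch: dump the manifest of a docker-save tarball to see what it holds.
TARBALL="dvdocker/ara2.tar"
if [ -f "$TARBALL" ]; then
    tar -xOf "$TARBALL" manifest.json
else
    echo "tarball not found: $TARBALL"
fi
```

If the manifest only lists a single amd64 image, no `--platform` flag will surface an arm64 variant; it simply isn't in the archive.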

Is this image built for amd64? Is there one for arm64, which is the platform I am running on? When I tried to run the command "./llm_model_gen --config llm_model_config.yaml --mode 1", I received the below error:

ERROR:main:convert to onnx failed with error: command failed with exit code: 255
INFO:main:llmrun execution failed

I believe this is because the image is built for amd64.

I also receive the below errors when I run this script:

command to execute: /dv2/utils/llm_compiler/convert_to_onnx_llm/create_env_and_run.sh -m /home/mpro77/.cache/huggingface/hub/models--KinaraInc--llama3o1_8b_instruct_all4bit_blockAP/blobs -o /home/mpro77/kinara/models/llama3o1_8b/output -t gptq -f /home/mpro77/.cache/huggingface/hub/models--KinaraInc--llama3o1_8b_instruct_all4bit_blockAP/blobs/dequantized -g /home/mpro77/.cache/huggingface/hub/models--KinaraInc--llama3o1_8b_instruct_all4bit_blockAP/blobs
exec /dv2/utils/llm_compiler/convert_to_onnx_llm/create_env_and_run.sh: exec format error
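For what it's worth, "exec format error" on an arm64 host is exactly what an amd64-only binary produces when no x86_64 emulation is registered. A small host-side sketch to check (assuming a Linux host; the binfmt_misc path below is where qemu-user-static registers itself when installed):

```shell
# Sketch: decide whether amd64 binaries/images can run on this host.
host_arch=$(uname -m)
if [ "$host_arch" = "x86_64" ]; then
    echo "native amd64 host: amd64 images should run directly"
elif [ -e /proc/sys/fs/binfmt_misc/qemu-x86_64 ]; then
    echo "non-amd64 host with qemu-x86_64 registered: emulation possible (slow)"
else
    echo "non-amd64 host without qemu-x86_64: exec format error is expected"
fi
```

Even with emulation registered, running an LLM compiler under qemu would be very slow, so a native amd64 machine is likely the practical route if no arm64 image exists.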

I also wanted to give you a little background on what I am trying to do. I work for the Defense Health Agency, and we use AI in a cloud, which costs a lot of money. I am doing a proof of concept to see whether we can run local medical LLMs on each computer with the Kinara Ara-2 for less money; we have over 32 medical clinics that I would like to do this for. Once I learn how to convert and run one of your LLMs, I can look into running a medical LLM locally. I am currently running this on Linux arm64/aarch64, and if I get it to work I would like to port it over to Windows for the Dell or HP computers we use.

Thank you for your time and help,
Mark
