Runtime error

Exit code: 1. Reason:

INFO 01-24 03:00:48 [importing.py:44] Triton is installed but 0 active driver(s) found (expected 1). Disabling Triton to prevent runtime errors.
INFO 01-24 03:00:48 [importing.py:68] Triton not installed or not compatible; certain GPU-related functions will not be available.
W0124 03:00:49.394000 1 torch/utils/cpp_extension.py:117] No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
INFO 01-24 03:00:49 [utils.py:263] non-default args: {'dtype': 'half', 'disable_log_stats': True, 'model': 'meta-llama/Llama-3.2-1B-Instruct'}
Traceback (most recent call last):
  File "/app/app.py", line 14, in <module>
    llama = LLM(
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/llm.py", line 338, in __init__
    self.llm_engine = LLMEngine.from_engine_args(
  File "/usr/local/lib/python3.10/dist-packages/vllm/v1/engine/llm_engine.py", line 168, in from_engine_args
    vllm_config = engine_args.create_engine_config(usage_context)
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/arg_utils.py", line 1351, in create_engine_config
    device_config = DeviceConfig(device=cast(Device, current_platform.device_type))
  File "/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_dataclasses.py", line 121, in __init__
    s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
  File "/usr/local/lib/python3.10/dist-packages/vllm/config/device.py", line 75, in __post_init__
    self.device = torch.device(self.device_type)
RuntimeError: Device string must not be empty
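The log lines before the traceback show why the error occurs: no CUDA runtime (and no other accelerator) is detected in the container, so the platform vLLM resolves reports an empty `device_type`, and passing that empty string to `torch.device(...)` raises `RuntimeError: Device string must not be empty`. The sketch below illustrates that chain with simplified, hypothetical stand-ins (these are not vLLM's or torch's actual classes):

```python
# Hypothetical, simplified stand-ins for the failure path in the traceback:
# with no GPU detected, the resolved platform's device_type is "", and the
# device constructor rejects an empty device string.

class UnspecifiedPlatform:
    # Stand-in for the fallback platform vLLM uses when no accelerator
    # (CUDA, ROCm, ...) is found; its device type is empty.
    device_type = ""


def make_device(device_type: str) -> str:
    # Stand-in for torch.device(...): torch raises this exact message
    # when given an empty device string.
    if not device_type:
        raise RuntimeError("Device string must not be empty")
    return device_type


current_platform = UnspecifiedPlatform()

try:
    make_device(current_platform.device_type)
except RuntimeError as e:
    print(e)  # Device string must not be empty
```

In other words, the `RuntimeError` at the bottom of the traceback is a symptom; the root cause is the earlier `No CUDA runtime is found` warning, so the fix is to run the container on a host with a visible GPU driver (or use a vLLM build targeting the hardware actually present).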
