runtime error
Exit code: 1. Reason:
model-00001-of-000001.safetensors: 100%|██████████| 6.67G/6.67G [00:07<00:00, 842MB/s]
Downloading shards: 100%|██████████| 1/1 [00:07<00:00, 7.97s/it]
Traceback (most recent call last):
  File "/home/user/app/app.py", line 20, in <module>
    model = AutoModel.from_pretrained(MODEL_NAME, _attn_implementation='flash_attention_2', torch_dtype=torch.bfloat16, trust_remote_code=True, use_safetensors=True)
  File "/usr/local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 559, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4091, in from_pretrained
    config = cls._autoset_attn_implementation(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1617, in _autoset_attn_implementation
    cls._check_and_enable_flash_attn_2(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1747, in _check_and_enable_flash_attn_2
    raise ImportError(f"{preface} the package flash_attn seems to be not installed. {install_message}")
ImportError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: the package flash_attn seems to be not installed. Please refer to the documentation of https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2 to install Flash Attention 2.
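The ImportError above is raised because app.py requests 'flash_attention_2' while the flash_attn package is not present in the container. Besides installing flash_attn as the message suggests, one possible workaround is to choose the attention backend at startup. The sketch below is a hedged illustration, not the app's actual code: pick_attn_implementation is a hypothetical helper, and the "sdpa" fallback assumes the model's remote code supports that backend, which this log does not confirm.

```python
import importlib.util


def pick_attn_implementation(preferred="flash_attention_2", fallback="sdpa"):
    """Return `preferred` unless it needs flash_attn and that package is missing.

    Hypothetical helper. It only checks whether the package can be found;
    it does not verify CUDA availability or the flash_attn version.
    """
    needs_flash_attn = preferred == "flash_attention_2"
    if needs_flash_attn and importlib.util.find_spec("flash_attn") is None:
        return fallback
    return preferred


attn = pick_attn_implementation()
print(f"attention backend: {attn}")

# Assumed usage in app.py (not executed here; `attn_implementation` is the
# documented keyword, whereas the log shows `_attn_implementation`):
# model = AutoModel.from_pretrained(
#     MODEL_NAME,
#     attn_implementation=attn,
#     torch_dtype=torch.bfloat16,
#     trust_remote_code=True,
#     use_safetensors=True,
# )
```

With this guard the app would start on images without flash_attn, at the cost of possibly slower attention. If the model's custom code only works with FlashAttention 2, installing flash_attn remains the fix.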