wincentIsMe committed · verified · Commit 6b3d970 · 1 Parent(s): bdf9518

fix: import `flash_attn_varlen_func` from `flash_attn` instead of `transformers.modeling_flash_attention_utils`


When I load this model with `transformers`:
```python
from transformers import AutoTokenizer, AutoProcessor, AutoModelForCausalLM

model_path = "lmms-lab/LLaVA-OneVision-1.5-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype="auto", device_map="auto", trust_remote_code=True
)
processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True)
```
The following error occurs.
```bash
ImportError: cannot import name 'flash_attn_varlen_func' from 'transformers.modeling_flash_attention_utils'
```
This happens because the current `transformers` library no longer exposes `flash_attn_varlen_func` in the `transformers.modeling_flash_attention_utils` module.
The fix is to import `flash_attn_varlen_func` directly from the `flash_attn` package instead.
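
Until the repository file is updated, one local workaround is to make the import tolerant of both library layouts. This is a minimal sketch, not the change committed here, and it assumes the `flash_attn` package is installed in the environment:

```python
# Sketch of a version-tolerant import (assumes the flash_attn package is installed).
# Older transformers versions re-export flash_attn_varlen_func from
# modeling_flash_attention_utils; newer ones do not, so fall back to flash_attn itself.
try:
    from transformers.modeling_flash_attention_utils import flash_attn_varlen_func
except ImportError:
    from flash_attn import flash_attn_varlen_func
```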


In fact, this bug has already been fixed in the [LLaVA-OneVision-1.5 GitHub repo (fix_issue#31)](https://github.com/EvolvingLMMs-Lab/LLaVA-OneVision-1.5/pull/33), but the fix has not yet been synchronized to the Hugging Face repository.

Files changed (1): modeling_llavaonevision1_5.py (+2 -1)

```diff
--- a/modeling_llavaonevision1_5.py
+++ b/modeling_llavaonevision1_5.py
@@ -46,7 +46,8 @@ from .configuration_llavaonevision1_5 import Llavaonevision1_5Config, LLaVAOneVi
 
 
 if is_flash_attn_available():
-    from transformers.modeling_flash_attention_utils import _flash_attention_forward, flash_attn_varlen_func
+    from transformers.modeling_flash_attention_utils import _flash_attention_forward
+    from flash_attn import flash_attn_varlen_func
 
 if is_torch_flex_attn_available():
     from torch.nn.attention.flex_attention import BlockMask
```
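
With the updated `modeling_llavaonevision1_5.py` in place, the original loading snippet should run without the `ImportError`. A quick sanity check, assuming the `flash-attn` package is installed:

```python
from transformers import AutoProcessor, AutoModelForCausalLM

model_path = "lmms-lab/LLaVA-OneVision-1.5-8B-Instruct"

# trust_remote_code=True pulls modeling_llavaonevision1_5.py from the Hub;
# with the import fix applied, this no longer raises the
# flash_attn_varlen_func ImportError.
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype="auto", device_map="auto", trust_remote_code=True
)
processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True)
print(type(model).__name__)  # prints the loaded model class if the fix worked
```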