Bug fix for Transformers v4.49.0 per https://huggingface.co/microsoft/Phi-3.5-vision-instruct/discussions/39/files; added a note to README.md identifying this repo as a fork.
- README.md +2 -0
- modeling_phi3_v.py +1 -1
README.md
CHANGED
@@ -19,6 +19,8 @@ library_name: transformers
 ---
 ## Model Summary
 
+n.b., This is a fork of [microsoft/Phi-3.5-vision-instruct](https://huggingface.co/microsoft/Phi-3.5-vision-instruct) that fixes the bug in modeling_phi3_v.py per [this Discussion here.](https://huggingface.co/microsoft/Phi-3.5-vision-instruct/discussions/39/files)
+
 Phi-3.5-vision is a lightweight, state-of-the-art open multimodal model built upon datasets which include - synthetic data and filtered publicly available websites - with a focus on very high-quality, reasoning dense data both on text and vision. The model belongs to the Phi-3 model family, and the multimodal version comes with 128K context length (in tokens) it can support. The model underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
 
 🏡 [Phi-3 Portal](https://azure.microsoft.com/en-us/products/phi-3) <br>
modeling_phi3_v.py
CHANGED
@@ -1658,7 +1658,7 @@ class Phi3VForCausalLM(Phi3VPreTrainedModel):
 if isinstance(past_key_values, Cache):
     cache_length = past_key_values.get_seq_length()
     past_length = past_key_values.seen_tokens
-    max_cache_length = past_key_values.
+    max_cache_length = past_key_values.get_max_cache_shape()
 else:
     cache_length = past_length = past_key_values[0][0].shape[2]
     max_cache_length = None
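The one-line change above swaps the cache accessor for `get_max_cache_shape()`. For code that has to run against more than one Transformers release, a version-tolerant lookup might look like the sketch below. This is an illustration, not part of the commit. The helper `resolve_max_cache_length` and the two stand-in classes are hypothetical. The fallback name `get_max_length` is an assumption, since the removed call is truncated in the diff above.

```python
def resolve_max_cache_length(past_key_values):
    """Return the cache's maximum length, or None if it is unbounded or unknown."""
    # Accessor used by the fixed line in the diff above.
    getter = getattr(past_key_values, "get_max_cache_shape", None)
    if getter is None:
        # Assumed older accessor name; the removed line is truncated in the diff.
        getter = getattr(past_key_values, "get_max_length", None)
    # Objects exposing neither accessor are treated as having no fixed limit.
    return getter() if callable(getter) else None


# Hypothetical stand-ins for cache objects, used only to exercise the helper.
class NewStyleCache:
    def get_max_cache_shape(self):
        return 4096


class OldStyleCache:
    def get_max_length(self):
        return None  # a cache that reports no fixed limit
```

In the patched method, such a helper would stand in for the direct method call inside the `isinstance(past_key_values, Cache)` branch. The `else` branch for tuple-style caches would stay unchanged.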