microsoft/Phi-4-multimodal-instruct
Tags: Automatic Speech Recognition, Transformers, Safetensors, 24 languages, phi4mm, text-generation, nlp, code, audio, speech-summarization, speech-translation, visual-question-answering, phi-4-multimodal, phi, phi-4-mini, custom_code
arxiv: 2503.01743, 2407.13833
License: mit
Files and versions at refs/pr/20
Phi-4-multimodal-instruct: 12.9 GB, 14 contributors
History: 18 commits
Latest commit: phh, "Make Phi4MMForCausalLM.forward's num_logits_to_keep actually optional" (bc541d0, verified), 10 months ago
Name | Size | Last commit | Updated
examples | - | Add examples | 10 months ago
figures | - | Added model files | 10 months ago
speech-lora | - | Added model files | 10 months ago
vision-lora | - | Added model files | 10 months ago
.gitattributes | 1.61 kB | added technical report | 10 months ago
CODE_OF_CONDUCT.md | 444 Bytes | Added model files | 10 months ago
LICENSE | 1.14 kB | Added model files | 10 months ago
README.md | 54.8 kB | Update readme | 10 months ago
SECURITY.md | 2.66 kB | Added model files | 10 months ago
SUPPORT.md | 1.24 kB | Added model files | 10 months ago
added_tokens.json | 249 Bytes | Added model files | 10 months ago
config.json | 4.63 kB | Added model files | 10 months ago
configuration_phi4mm.py | 11 kB | Added model files | 10 months ago
generation_config.json | 190 Bytes | Added model files | 10 months ago
merges.txt | 2.42 MB | Added model files | 10 months ago
model-00001-of-00003.safetensors | 5 GB (xet) | Added model files | 10 months ago
model-00002-of-00003.safetensors | 4.95 GB (xet) | Added model files | 10 months ago
model-00003-of-00003.safetensors | 1.2 GB (xet) | Added model files | 10 months ago
model.safetensors.index.json | 240 kB | Added model files | 10 months ago
modeling_phi4mm.py | 116 kB | Make Phi4MMForCausalLM.forward's num_logits_to_keep actually optional | 10 months ago
phi_4_mm.tech_report.02252025.pdf | 5.3 MB (xet) | added technical report | 10 months ago
preprocessor_config.json | 482 Bytes | Added model files | 10 months ago
processing_phi4mm.py | 32.8 kB | Added model files | 10 months ago
processor_config.json | 121 Bytes | Added model files | 10 months ago
sample_finetune_speech.py | 16.7 kB | Added model files | 10 months ago
sample_finetune_vision.py | 19.6 kB | Added model files | 10 months ago
sample_inference_phi4mm.py | 10.5 kB | Added model files | 10 months ago
special_tokens_map.json | 473 Bytes | Added model files | 10 months ago
speech_conformer_encoder.py | 111 kB | Added model files | 10 months ago
tokenizer.json | 15.5 MB (xet) | Added model files | 10 months ago
tokenizer_config.json | 3.25 kB | Added model files | 10 months ago
vision_siglip_navit.py | 78.2 kB | Added model files | 10 months ago
vocab.json | 3.91 MB | Added model files | 10 months ago