Any-to-Any
Transformers
Safetensors
multilingual
minicpmo
feature-extraction
minicpm-o
omni
vision
ocr
multi-image
video
custom_code
audio
speech
voice cloning
live Streaming
realtime speech conversation
asr
tts
Instructions to use openbmb/MiniCPM-o-2_6 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use openbmb/MiniCPM-o-2_6 with Transformers:
# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("openbmb/MiniCPM-o-2_6", trust_remote_code=True, dtype="auto") - Notebooks
- Google Colab
- Kaggle
Is there a official script for inference or finetuning with audio modality?
#50
by JasonLee996 - opened
As title. Meanwhile, I find it said in README at github repo that the model could be tuned with Align-Anything, but in the funetune.py in offical repo, MiniCPM-o/finetune/finetune.py set "init_audio=False", so will it be OK if I use this script to tuning the audio pathway with this parameter "True"?
We have supported audio modality fine-tuning for MiniCPM-o on LLaMA-Factory. You can refer to the LLaMA-Factory documentation for processing audio datasets.π