Any-to-Any
Transformers
Safetensors
multilingual
minicpmo
feature-extraction
minicpm-o
omni
vision
ocr
multi-image
video
custom_code
audio
speech
voice cloning
live Streaming
realtime speech conversation
asr
tts
Instructions to use openbmb/MiniCPM-o-2_6 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use openbmb/MiniCPM-o-2_6 with Transformers:
# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("openbmb/MiniCPM-o-2_6", trust_remote_code=True, dtype="auto") - Notebooks
- Google Colab
- Kaggle
Update README.md
Browse files
README.md
CHANGED
|
@@ -63,7 +63,7 @@ MiniCPM-o 2.6 can be easily used in various ways: (1) [llama.cpp](https://github
|
|
| 63 |
- **Configurable Speech Modeling Design.** We devise a multimodal system prompt, including traditional text system prompt, and **a new audio system prompt to determine the assistant voice**. This enables flexible voice configurations in inference time, and also facilitates end-to-end voice cloning and description-based voice creation.
|
| 64 |
|
| 65 |
<div align="center">
|
| 66 |
-
<img src="https://github.com/OpenBMB/MiniCPM-o/raw/main/assets/minicpm-o-26-framework-v2.png" , width=
|
| 67 |
</div>
|
| 68 |
|
| 69 |
|
|
|
|
| 63 |
- **Configurable Speech Modeling Design.** We devise a multimodal system prompt, including traditional text system prompt, and **a new audio system prompt to determine the assistant voice**. This enables flexible voice configurations in inference time, and also facilitates end-to-end voice cloning and description-based voice creation.
|
| 64 |
|
| 65 |
<div align="center">
|
| 66 |
+
<img src="https://github.com/OpenBMB/MiniCPM-o/raw/main/assets/minicpm-o-26-framework-v2.png" , width=100%>
|
| 67 |
</div>
|
| 68 |
|
| 69 |
|