---
library_name: transformers
license: apache-2.0
tags:
- vision
- multimodal
- tiny-model
- minicpm
pipeline_tag: image-to-text
---

# Tiny MiniCPM-o-2_6 Model

A minimal, optimized version of MiniCPM-o-2_6 for testing and development purposes.

## Model Details

- **Model Size**: ~54 MB (PyTorch safetensors format)
- **Format**: PyTorch safetensors (not OpenVINO IR)
- **Vocabulary Size**: 50,000 tokens (reduced from 151,700)
- **Architecture**: MiniCPM-o-2_6 with reduced dimensions (see Model Configuration)

## Model Configuration

- **hidden_size**: 128 (reduced from 168)
- **intermediate_size**: 8 (reduced from 16)
- **num_hidden_layers**: 2
- **num_attention_heads**: 2 (reduced from 28)
- **query_num**: 64
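As a rough sanity check on the dimensions listed above, the sketch below derives the per-head dimension and estimates the size of one token-embedding table. The dictionary only restates values from this card (`vocab_size` comes from Model Details); nothing is read from the checkpoint. The 4-bytes-per-parameter (float32) storage is an assumption, not something this card confirms.

```python
# Values restated from this model card (not read from the checkpoint).
config = {
    "vocab_size": 50_000,
    "hidden_size": 128,
    "intermediate_size": 8,
    "num_hidden_layers": 2,
    "num_attention_heads": 2,
    "query_num": 64,
}

# hidden_size must divide evenly across the attention heads.
assert config["hidden_size"] % config["num_attention_heads"] == 0
head_dim = config["hidden_size"] // config["num_attention_heads"]

# One embedding row of hidden_size values per vocabulary entry.
embedding_params = config["vocab_size"] * config["hidden_size"]

# Assumption: float32 storage, i.e. 4 bytes per parameter.
BYTES_PER_PARAM = 4
embedding_mb = embedding_params * BYTES_PER_PARAM / 1e6

print(f"head_dim = {head_dim}")
print(f"embedding parameters = {embedding_params:,}")
print(f"embedding table ~ {embedding_mb:.1f} MB at float32")
```

Under these assumptions, a single embedding table would take up a sizeable share of the ~54 MB total, which suggests the vocabulary is a major contributor to file size. Whether the input and output embeddings are tied is not stated in this card.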

## Usage

```python
from transformers import AutoProcessor, AutoModelForCausalLM
from PIL import Image

# Load processor and model; trust_remote_code=True allows the custom code in the repo to run
model_id = "M-Ziyo/tiny-random-MiniCPM-o-2_6-mini"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Prepare inputs: the (<image>./</image>) placeholder marks where the image goes in the prompt
prompt = "<|im_start|>user\n(<image>./</image>)\nWhat is in the image?<|im_end|>\n<|im_start|>assistant\n"
image = Image.open("your_image.jpg").convert("RGB")

inputs = processor([prompt], [image], return_tensors="pt")

# Generate, then decode only the tokens that follow the prompt
result = model.generate(**inputs, max_new_tokens=50)
decoded = processor.tokenizer.batch_decode(result[:, inputs["input_ids"].shape[1]:])
print(decoded)
```
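The prompt string in the usage example follows a fixed pattern: a user turn containing the image placeholder and the question, followed by an opening assistant turn. A minimal sketch of a helper that builds it is shown here. `build_prompt` and `IMAGE_PLACEHOLDER` are hypothetical names introduced for illustration, not part of the model's API, and the sketch covers only the single-image case shown above.

```python
# Placeholder copied from the usage example; marks where the image is inserted.
IMAGE_PLACEHOLDER = "(<image>./</image>)"


def build_prompt(question: str) -> str:
    """Hypothetical helper: wrap a question in the chat markup used in the usage example."""
    user_turn = f"{IMAGE_PLACEHOLDER}\n{question}"
    return f"<|im_start|>user\n{user_turn}<|im_end|>\n<|im_start|>assistant\n"


prompt = build_prompt("What is in the image?")
print(repr(prompt))
```

The resulting string can be passed to the processor in place of the hand-written `prompt` literal.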

## Model Features

- ✅ **PyTorch format** with safetensors (not OpenVINO IR)
- ✅ **Compact size** (~54 MB, much smaller than the original MiniCPM-o-2_6)
- ✅ **Weight copying** from the original model for better output quality
- ✅ **Diverse output** (not just repetitive characters)

## Notes

- This is a minimal test model for development purposes
- Model weights are copied from the original model for better initialization
- Designed for testing Optimum-Intel integration

## Citation

Based on MiniCPM-o-2_6 from OpenBMB.