---
license: apache-2.0
base_model: openbmb/MiniCPM-o-2_6
tags:
- vision
- text-generation
- multimodal
- minicpm
- tiny-model
- testing
- optimum-intel
pipeline_tag: text-generation
library_name: transformers
---
# tiny-random-MiniCPM-o-2_6
A minimal, randomly initialized version of MiniCPM-o-2_6 designed for testing and development purposes. This model maintains the same architecture as the original MiniCPM-o-2_6 but with drastically reduced dimensions to create a lightweight test model.
## Model Details
### Model Description
This is a tiny, randomly initialized version of the MiniCPM-o-2_6 multimodal model. It was created by scaling down the original model's dimensions while preserving the architecture structure. The model is intended for:
- Testing and development workflows
- Integration testing with Optimum-Intel
- Quick prototyping and experimentation
- CI/CD pipelines requiring lightweight models
**⚠️ Important:** This model is randomly initialized and should NOT be used for production inference. It is designed solely for testing purposes.
### Model Architecture
The model maintains the same architecture as MiniCPM-o-2_6 but with reduced dimensions:
**Language Model (LLM):**
- `hidden_size`: 40
- `num_hidden_layers`: 1
- `num_attention_heads`: 4
- `num_key_value_heads`: 2
- `intermediate_size`: 16
- `max_position_embeddings`: 128
- `vocab_size`: 151,700
**Vision Component:**
- `hidden_size`: 16
- `num_hidden_layers`: 1
- `num_attention_heads`: 4
- `intermediate_size`: 8
- `patch_size`: 14
**Audio/TTS Components:**
- Audio: Disabled (`init_audio: false`)
- TTS: Disabled (`init_tts: false`)
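The reduced dimensions above can be collected into plain Python dicts for quick sanity checks (a sketch only — key names mirror the fields listed above, and the remote modeling code may nest them differently):

```python
# Scaled-down dimensions from the sections above, as config-override dicts.
tiny_llm = {
    "hidden_size": 40,
    "num_hidden_layers": 1,
    "num_attention_heads": 4,
    "num_key_value_heads": 2,
    "intermediate_size": 16,
    "max_position_embeddings": 128,
    "vocab_size": 151_700,
}
tiny_vision = {
    "hidden_size": 16,
    "num_hidden_layers": 1,
    "num_attention_heads": 4,
    "intermediate_size": 8,
    "patch_size": 14,
}

# Grouped-query attention requires these divisibility constraints to hold.
assert tiny_llm["hidden_size"] % tiny_llm["num_attention_heads"] == 0
assert tiny_llm["num_attention_heads"] % tiny_llm["num_key_value_heads"] == 0
```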
### Model Size
- **Total Parameters**: ~6.17M
- **Model Size**: ~12.4 MB (on disk)
- **Precision**: bfloat16
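A back-of-envelope check from the dimensions above (pure Python; layer-level details like biases and norms are ignored, so treat it as an order-of-magnitude sketch) shows the token embedding table dominates the parameter count:

```python
# Rough parameter estimate from the config values listed above.
hidden_size = 40
vocab_size = 151_700

# Token embedding table: vocab_size x hidden_size
embed_params = vocab_size * hidden_size

# The single tiny transformer layer and the tiny vision tower account
# for the remaining ~0.1M of the reported ~6.17M total.
total_params = 6_170_000  # reported total
print(f"embedding params: {embed_params:,}")
print(f"share of total:  {embed_params / total_params:.1%}")
```

This also explains the ~12.4 MB disk size: ~6.17M parameters × 2 bytes each in bfloat16.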
## Usage
### Basic Usage
```python
from transformers import AutoModel, AutoTokenizer, AutoProcessor
import torch
from PIL import Image
# Load model and tokenizer
model_id = "notlikejoe/tiny-random-MiniCPM-o-2_6"
model = AutoModel.from_pretrained(model_id, trust_remote_code=True, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
# Prepare inputs
text = "Hello, how are you?"
image = Image.new('RGB', (224, 224), color='red') # Dummy image
# Process inputs
inputs = processor(text=text, images=image, return_tensors="pt")
# Forward pass
model.eval()
with torch.no_grad():
    outputs = model(**inputs)
```
### With Optimum-Intel
This model is compatible with Optimum-Intel for OpenVINO optimization:
```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer
model_id = "notlikejoe/tiny-random-MiniCPM-o-2_6"
# Export to OpenVINO format
ov_model = OVModelForCausalLM.from_pretrained(
    model_id,
    export=True,
    trust_remote_code=True,
)
# Use for inference
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
```
## Model Validation
The model has been validated to ensure:
- ✅ Model loads successfully from Hugging Face
- ✅ Config, tokenizer, and processor load correctly
- ✅ Model structure matches expected architecture
- ✅ Compatible with Optimum-Intel export
- ✅ Forward pass completes without errors
- ✅ **OpenVINO compatibility fix applied**: Resampler `num_heads=0` issue resolved
### OpenVINO Compatibility Fix
This model includes a fix for the OpenVINO loading issue where `num_heads=0` would occur with small `embed_dim` values. The resampler's `num_heads` calculation has been patched to ensure it's always at least 1:
```python
# Original: num_heads = embed_dim // 128 # Would be 0 when embed_dim=40
# Fixed: num_heads = 1 if embed_dim < 128 else max(1, embed_dim // 128)
```
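To see why the patch matters, the two formulas can be compared directly (a standalone sketch of the arithmetic, not the actual resampler code):

```python
def num_heads_original(embed_dim: int) -> int:
    # Upstream heuristic: one attention head per 128 embedding dims.
    # Integer division yields 0 for any embed_dim below 128.
    return embed_dim // 128

def num_heads_fixed(embed_dim: int) -> int:
    # Patched version: clamp to at least one head.
    return 1 if embed_dim < 128 else max(1, embed_dim // 128)

for dim in (40, 128, 512):
    print(dim, num_heads_original(dim), num_heads_fixed(dim))
```

With `embed_dim=40` (this model's LLM hidden size), the original formula yields 0 heads, which breaks attention construction; the fixed version yields 1.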
The `modeling_minicpmo.py` file included with this model contains this fix, ensuring compatibility with Optimum-Intel OpenVINO export and loading.
## Limitations
1. **Random Initialization**: This model is randomly initialized and will not produce meaningful outputs
2. **Reduced Dimensions**: The model dimensions are minimal and may not capture complex patterns
3. **Testing Only**: This model is intended for testing and development, not production use
4. **Embedding-Dominated Parameters**: The full 151,700-token vocabulary is retained from the base model, so the embedding table accounts for the vast majority of the ~6.17M parameters
## Training Details
This model was not trained. It is a randomly initialized, dimensionally-reduced version of MiniCPM-o-2_6 created for testing purposes.
### Training Data
N/A - Model is randomly initialized.
## Evaluation
This model is not intended for evaluation on standard benchmarks as it is randomly initialized.
## Citation
If you use this model, please cite the original MiniCPM-o-2_6 model:
```bibtex
@misc{minicpm-o-2_6,
  title={MiniCPM-o-2_6},
  author={OpenBMB},
  year={2024},
  howpublished={\url{https://huggingface.co/openbmb/MiniCPM-o-2_6}}
}
```
## Model Card Contact
For questions or issues related to this model, please open an issue in the repository.
## License
This model is licensed under the Apache 2.0 License, same as the base model.