---

license: apache-2.0
base_model: openbmb/MiniCPM-o-2_6
tags:
- vision
- text-generation
- multimodal
- minicpm
- tiny-model
- testing
- optimum-intel
pipeline_tag: text-generation
library_name: transformers
---


# tiny-random-MiniCPM-o-2_6

A minimal, randomly initialized version of MiniCPM-o-2_6 for testing and development. It keeps the architecture of the original MiniCPM-o-2_6 but with drastically reduced dimensions, making it a lightweight test model.

## Model Details

### Model Description

This is a tiny, randomly initialized version of the MiniCPM-o-2_6 multimodal model, created by scaling down the original model's dimensions while preserving the architecture. The model is intended for:

- Testing and development workflows
- Integration testing with Optimum-Intel
- Quick prototyping and experimentation
- CI/CD pipelines requiring lightweight models

**⚠️ Important:** This model is randomly initialized and should NOT be used for production inference. It is designed solely for testing purposes.

### Model Architecture

The model maintains the same architecture as MiniCPM-o-2_6 but with reduced dimensions:

**Language Model (LLM):**

- `hidden_size`: 40
- `num_hidden_layers`: 1
- `num_attention_heads`: 4
- `num_key_value_heads`: 2
- `intermediate_size`: 16
- `max_position_embeddings`: 128
- `vocab_size`: 151,700
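
The attention geometry implied by these numbers can be checked with simple arithmetic: each of the 4 attention heads has dimension 40 / 4 = 10, and the 4 query heads share 2 key/value heads (grouped-query attention with groups of 2). A minimal sketch:

```python
# Consistency check of the LLM attention configuration listed above.
hidden_size, num_heads, num_kv_heads = 40, 4, 2

head_dim = hidden_size // num_heads     # per-head dimension: 10
assert hidden_size % num_heads == 0     # heads must divide hidden_size evenly

gqa_group = num_heads // num_kv_heads   # query heads per KV head: 2
print(head_dim, gqa_group)  # 10 2
```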

**Vision Component:**
- `hidden_size`: 16
- `num_hidden_layers`: 1
- `num_attention_heads`: 4
- `intermediate_size`: 8
- `patch_size`: 14

**Audio/TTS Components:**
- Audio: Disabled (`init_audio: false`)
- TTS: Disabled (`init_tts: false`)

### Model Size

- **Total Parameters**: ~6.17M
- **Model Size**: ~12.4 MB (on disk)
- **Precision**: bfloat16
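
The parameter count can be sanity-checked from the configuration above: because the full vocabulary is retained, the token embedding table alone accounts for most of the ~6.17M parameters. A back-of-the-envelope sketch:

```python
# Embedding table size from the config values listed above.
hidden_size = 40
vocab_size = 151_700

embedding_params = vocab_size * hidden_size
print(f"{embedding_params:,}")  # 6,068,000 -- the bulk of the ~6.17M total
```

The remaining ~0.1M parameters come from the single transformer layer, the tiny vision tower, and the resampler.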

## Usage

### Basic Usage

```python
from transformers import AutoModel, AutoTokenizer, AutoProcessor
import torch
from PIL import Image

# Load model, tokenizer, and processor
model_id = "notlikejoe/tiny-random-MiniCPM-o-2_6"
model = AutoModel.from_pretrained(model_id, trust_remote_code=True, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

# Prepare inputs
text = "Hello, how are you?"
image = Image.new('RGB', (224, 224), color='red')  # Dummy image
inputs = processor(text=text, images=image, return_tensors="pt")

# Forward pass
model.eval()
with torch.no_grad():
    outputs = model(**inputs)
```

### With Optimum-Intel

This model is compatible with Optimum-Intel for OpenVINO optimization:

```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "notlikejoe/tiny-random-MiniCPM-o-2_6"

# Export to OpenVINO format
ov_model = OVModelForCausalLM.from_pretrained(
    model_id,
    export=True,
    trust_remote_code=True
)

# Use for inference
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
```

## Model Validation

The model has been validated to ensure:

✅ Model loads successfully from Hugging Face  
✅ Config, tokenizer, and processor load correctly  
✅ Model structure matches expected architecture  
✅ Compatible with Optimum-Intel export  
✅ Forward pass completes without errors  
✅ **OpenVINO compatibility fix applied**: Resampler `num_heads=0` issue resolved

### OpenVINO Compatibility Fix

This model includes a fix for the OpenVINO loading issue where `num_heads=0` would occur with small `embed_dim` values. The resampler's `num_heads` calculation has been patched to ensure it's always at least 1:

```python
# Original: num_heads = embed_dim // 128  # Would be 0 when embed_dim=40
# Fixed:    num_heads = 1 if embed_dim < 128 else max(1, embed_dim // 128)
```

The `modeling_minicpmo.py` file included with this model contains this fix, ensuring compatibility with Optimum-Intel OpenVINO export and loading.  
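
As an illustration, the patched calculation can be sketched as a standalone function (the name `resampler_num_heads` is illustrative, not the actual identifier in `modeling_minicpmo.py`):

```python
def resampler_num_heads(embed_dim: int) -> int:
    # Plain integer division embed_dim // 128 yields 0 for embed_dim < 128,
    # which torch.nn.MultiheadAttention rejects; clamp to at least 1.
    return 1 if embed_dim < 128 else max(1, embed_dim // 128)

print(resampler_num_heads(40))   # 1 (this model's tiny hidden size)
print(resampler_num_heads(256))  # 2
```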

## Limitations

1. **Random Initialization**: This model is randomly initialized and will not produce meaningful outputs
2. **Reduced Dimensions**: The model dimensions are minimal and may not capture complex patterns
3. **Testing Only**: This model is intended for testing and development, not production use
4. **Full Vocabulary**: The vocabulary is not reduced (`vocab_size` is 151,700, as listed above), so the token embedding table accounts for most of the model's parameters

## Training Details

This model was not trained. It is a randomly initialized, dimensionally reduced version of MiniCPM-o-2_6 created for testing purposes.

### Training Data

N/A - Model is randomly initialized.

## Evaluation

This model is not intended for evaluation on standard benchmarks, as it is randomly initialized.

## Citation

If you use this model, please cite the original MiniCPM-o-2_6 model:

```bibtex
@misc{minicpm-o-2_6,
  title={MiniCPM-o-2_6},
  author={OpenBMB},
  year={2024},
  howpublished={\url{https://huggingface.co/openbmb/MiniCPM-o-2_6}}
}
```

## Model Card Contact

For questions or issues related to this model, please open an issue in the repository.

## License

This model is licensed under the Apache 2.0 License, same as the base model.