---
license: apache-2.0
library_name: transformers
tags:
  - vision
  - image-text-to-text
  - multimodal
  - test-model
  - tiny-model
  - openvino
  - optimum-intel
pipeline_tag: image-text-to-text
---

# Tiny Random MiniCPM-o-2_6

## Model Description

This is a tiny, randomly initialized version of the openbmb/MiniCPM-o-2_6 multimodal vision-language model, built specifically for testing and CI/CD purposes in the `optimum-intel` library.

**⚠️ Important:** This model has randomly initialized weights and is NOT intended for actual inference. It is designed solely for:

- Testing model loading and export functionality
- CI/CD pipeline validation
- OpenVINO conversion testing
- Quantization workflow testing

## Model Specifications

- **Architecture:** MiniCPM-o-2_6 (multimodal: vision + text + audio + TTS)
- **Parameters:** 1,477,376 (~1.48M)
- **Model Binary Size:** 5.64 MB
- **Total Repository Size:** ~21 MB
- **Original Model:** openbmb/MiniCPM-o-2_6 (~18 GB)
- **Size Reduction:** 853× smaller than the full model

## Architecture Details

### Language Model (LLM) Component

- `num_hidden_layers`: 2 (reduced from 40)
- `hidden_size`: 256 (reduced from 2048)
- `intermediate_size`: 512 (reduced from 8192)
- `num_attention_heads`: 4 (reduced from 32)
- `vocab_size`: 320 (reduced from 151,700)
- `max_position_embeddings`: 128 (reduced from 8192)

### Vision Component (SigLIP-based)

- `hidden_size`: 8
- `num_hidden_layers`: 1

### Audio Component (Whisper-based)

- `d_model`: 64
- `encoder_layers`: 1
- `decoder_layers`: 1

### TTS Component

- `hidden_size`: 8
- `num_layers`: 1

All architectural components are present but miniaturized to ensure API compatibility while drastically reducing compute requirements.
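As a quick arithmetic sanity check on the reduced LLM dimensions above (pure illustration, not part of the model's code): `hidden_size` must divide evenly across the attention heads, and it does, giving a per-head dimension of 64.

```python
# Values quoted in the "Language Model (LLM) Component" section above.
hidden_size = 256
num_attention_heads = 4

# Multi-head attention requires hidden_size to split evenly across heads.
assert hidden_size % num_attention_heads == 0
head_dim = hidden_size // num_attention_heads
print(head_dim)  # 64
```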

## Usage

### Loading with Transformers

```python
from transformers import AutoModelForCausalLM, AutoProcessor
import torch

model_id = "arashkermani/tiny-random-MiniCPM-o-2_6"

# Load model (custom code lives in the repo, hence trust_remote_code)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float32,
    device_map="cpu",
)

# Load processor
processor = AutoProcessor.from_pretrained(
    model_id,
    trust_remote_code=True,
)

# Test forward pass (vocab_size is 320, so token ids must stay below 320)
input_ids = torch.randint(0, 320, (1, 5))
position_ids = torch.arange(5).unsqueeze(0)

# The custom MiniCPM-o forward takes a single `data` dict; empty image
# fields make this a text-only pass.
data = {
    "input_ids": input_ids,
    "pixel_values": [[]],
    "tgt_sizes": [[]],
    "image_bound": [[]],
    "position_ids": position_ids,
}

with torch.no_grad():
    outputs = model(data=data)

print(f"Logits shape: {outputs.logits.shape}")  # (1, 5, 320)
```

### Using with Optimum-Intel (OpenVINO)

```python
from optimum.intel.openvino import OVModelForVisualCausalLM
from transformers import AutoProcessor

model_id = "arashkermani/tiny-random-MiniCPM-o-2_6"

# Load model for OpenVINO
model = OVModelForVisualCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
)

processor = AutoProcessor.from_pretrained(
    model_id,
    trust_remote_code=True,
)
```

### Export to OpenVINO

```bash
optimum-cli export openvino \
  -m arashkermani/tiny-random-MiniCPM-o-2_6 \
  minicpm-o-openvino \
  --task=image-text-to-text \
  --trust-remote-code
```
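Since this model also serves quantization workflow testing, the same export can be combined with weight compression. A hedged variant, assuming an `optimum-cli` version that supports the `--weight-format` option (the output directory name is illustrative):

```bash
optimum-cli export openvino \
  -m arashkermani/tiny-random-MiniCPM-o-2_6 \
  minicpm-o-openvino-int8 \
  --task=image-text-to-text \
  --weight-format int8 \
  --trust-remote-code
```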

## Intended Use

This model is intended exclusively for:

- ✅ Testing `optimum-intel` OpenVINO export functionality
- ✅ CI/CD pipeline validation
- ✅ Model loading and compatibility testing
- ✅ Quantization workflow testing
- ✅ Fast prototyping and debugging

Not intended for:

- ❌ Production inference
- ❌ Actual image-text-to-text tasks
- ❌ Model quality evaluation
- ❌ Benchmarking performance metrics

## Training Details

This model was generated by:

1. Loading the config from `optimum-intel-internal-testing/tiny-random-MiniCPM-o-2_6`
2. Reducing all dimensions to minimal viable values
3. Initializing weights randomly using `AutoModelForCausalLM.from_config()`
4. Copying all necessary tokenizer, processor, and custom code files

No training was performed; all weights are randomly initialized.
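Steps 2–3 above amount to overriding the relevant config fields with smaller values before random initialization. A minimal sketch of just the dimension-reduction step, using plain dicts (the override values are the ones quoted in this card; `shrink_config` and the sample base config are illustrative, not the actual generation script):

```python
# LLM overrides quoted in this card's "Architecture Details" section.
TINY_LLM_OVERRIDES = {
    "num_hidden_layers": 2,
    "hidden_size": 256,
    "intermediate_size": 512,
    "num_attention_heads": 4,
    "vocab_size": 320,
    "max_position_embeddings": 128,
}

def shrink_config(config: dict) -> dict:
    """Return a copy of a model config dict with tiny-model overrides applied."""
    out = dict(config)
    out.update(TINY_LLM_OVERRIDES)
    return out

# Illustrative base config: only the dimensions change, everything else is kept.
full = {"hidden_size": 2048, "num_hidden_layers": 40, "model_type": "minicpmo"}
tiny = shrink_config(full)
print(tiny["hidden_size"], tiny["model_type"])  # 256 minicpmo
```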

## Validation Results

The model has been validated to ensure:

- ✅ Loads with `trust_remote_code=True`
- ✅ Compatible with `transformers` AutoModel APIs
- ✅ Supports a forward pass with the expected input format
- ✅ Compatible with OpenVINO export via `optimum-intel`
- ✅ Includes all required custom modules and artifacts

See the validation report for detailed technical analysis.

## Files Included

- `config.json` - Model configuration
- `pytorch_model.bin` - Model weights (5.64 MB)
- `generation_config.json` - Generation parameters
- `preprocessor_config.json` - Preprocessor configuration
- `processor_config.json` - Processor configuration
- `tokenizer_config.json` - Tokenizer configuration
- `tokenizer.json` - Fast tokenizer
- `vocab.json` - Vocabulary
- `merges.txt` - BPE merges
- Custom Python modules:
  - `modeling_minicpmo.py`
  - `configuration_minicpm.py`
  - `processing_minicpmo.py`
  - `image_processing_minicpmv.py`
  - `tokenization_minicpmo_fast.py`
  - `modeling_navit_siglip.py`
  - `resampler.py`
  - `utils.py`

## License

This model follows the same license as the original MiniCPM-o-2_6 model (Apache 2.0).

## Citation

If you use this test model in your CI/CD or testing infrastructure, please reference:

```bibtex
@misc{tiny-minicpm-o-2_6,
  author = {Arash Kermani},
  title = {Tiny Random MiniCPM-o-2_6 for Testing},
  year = {2026},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/arashkermani/tiny-random-MiniCPM-o-2_6}}
}
```

## Contact

For issues or questions about this test model, please open an issue in the optimum-intel repository.