---
license: apache-2.0
library_name: transformers
tags:
- vision
- image-text-to-text
- multimodal
- test-model
- tiny-model
- openvino
- optimum-intel
pipeline_tag: image-text-to-text
---

# Tiny Random MiniCPM-o-2_6

## Model Description

This is a **tiny random-initialized version** of the [openbmb/MiniCPM-o-2_6](https://huggingface.co/openbmb/MiniCPM-o-2_6) multimodal vision-language model, designed specifically for **testing and CI/CD purposes** in the [optimum-intel](https://github.com/huggingface/optimum-intel) library.

**⚠️ Important**: This model has randomly initialized weights and is NOT intended for actual inference. It is designed solely for:
- Testing model loading and export functionality
- CI/CD pipeline validation
- OpenVINO conversion testing
- Quantization workflow testing

## Model Specifications

- **Architecture**: MiniCPM-o-2_6 (multimodal: vision + text + audio + TTS)
- **Parameters**: 1,477,376 (~1.48M parameters)
- **Model Binary Size**: 5.64 MB
- **Total Repository Size**: ~21 MB
- **Original Model**: [openbmb/MiniCPM-o-2_6](https://huggingface.co/openbmb/MiniCPM-o-2_6) (~18 GB)
- **Size Reduction**: 853× smaller than the full model

## Architecture Details

### Language Model (LLM) Component
- `num_hidden_layers`: 2 (reduced from 40)
- `hidden_size`: 256 (reduced from 2048)
- `intermediate_size`: 512 (reduced from 8192)
- `num_attention_heads`: 4 (reduced from 32)
- `vocab_size`: 320 (reduced from 151,700)
- `max_position_embeddings`: 128 (reduced from 8192)
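
As a quick consistency check on the attention dimensions above, `hidden_size` must divide evenly by `num_attention_heads`. A minimal sketch using the values from the list above:

```python
# Attention head dimension implied by the tiny LLM config above.
hidden_size = 256
num_attention_heads = 4

assert hidden_size % num_attention_heads == 0, "hidden_size must be divisible by head count"
head_dim = hidden_size // num_attention_heads
print(head_dim)  # 64
```

Note that the full-size model preserves the same relationship (2048 / 32 = 64), so attention shapes stay structurally comparable to the original.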

### Vision Component (SigLIP-based)
- `hidden_size`: 8
- `num_hidden_layers`: 1

### Audio Component (Whisper-based)
- `d_model`: 64
- `encoder_layers`: 1
- `decoder_layers`: 1

### TTS Component
- `hidden_size`: 8
- `num_layers`: 1

All architectural components are present but miniaturized to ensure API compatibility while drastically reducing compute requirements.
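
As a rough sanity check, the LLM dimensions above account for most of the ~1.48M total. The sketch below is an approximation, not an exact breakdown: it assumes a LLaMA-style gated MLP and tied input/output embeddings, and it ignores biases, norm weights, and the vision/audio/TTS towers.

```python
# Rough LLM parameter estimate from the config values above.
vocab_size, hidden, inter, layers = 320, 256, 512, 2

embed = vocab_size * hidden            # token embedding (assumed tied with lm_head)
attn_per_layer = 4 * hidden * hidden   # q, k, v, o projections
mlp_per_layer = 3 * hidden * inter     # gate, up, down (LLaMA-style, an assumption)

llm_total = embed + layers * (attn_per_layer + mlp_per_layer)
print(llm_total)  # 1392640 -- close to the reported 1,477,376 total
```

The remaining ~85K parameters would then sit in the miniaturized vision, audio, and TTS components plus the terms the estimate ignores.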

## Usage

### Loading with Transformers

```python
from transformers import AutoModelForCausalLM, AutoProcessor
import torch

model_id = "arashkermani/tiny-random-MiniCPM-o-2_6"

# Load model (random weights, so CPU is sufficient)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.float32,
    device_map="cpu"
)

# Load processor
processor = AutoProcessor.from_pretrained(
    model_id,
    trust_remote_code=True
)

# Test forward pass with dummy token ids (vocab_size is 320)
input_ids = torch.randint(0, 320, (1, 5))
position_ids = torch.arange(5).unsqueeze(0)

data = {
    "input_ids": input_ids,
    "pixel_values": [[]],
    "tgt_sizes": [[]],
    "image_bound": [[]],
    "position_ids": position_ids,
}

with torch.no_grad():
    outputs = model(data=data)

print(f"Logits shape: {outputs.logits.shape}")  # (1, 5, 320)
```

### Using with Optimum-Intel (OpenVINO)

```python
from optimum.intel.openvino import OVModelForVisualCausalLM
from transformers import AutoProcessor

model_id = "arashkermani/tiny-random-MiniCPM-o-2_6"

# Load model for OpenVINO
model = OVModelForVisualCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True
)

processor = AutoProcessor.from_pretrained(
    model_id,
    trust_remote_code=True
)
```

### Export to OpenVINO

```bash
optimum-cli export openvino \
  -m arashkermani/tiny-random-MiniCPM-o-2_6 \
  minicpm-o-openvino \
  --task=image-text-to-text \
  --trust-remote-code
```
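
Since quantization workflow testing is one of this card's stated purposes, the same export can request weight-only quantization. A hedged sketch that assembles such a command for `subprocess` (the `--weight-format int8` flag and the `minicpm-o-openvino-int8` output directory are assumptions; check `optimum-cli export openvino --help` for the flags your version supports):

```python
import shlex

# Build the export command with weight-only INT8 quantization requested.
cmd = [
    "optimum-cli", "export", "openvino",
    "-m", "arashkermani/tiny-random-MiniCPM-o-2_6",
    "minicpm-o-openvino-int8",   # hypothetical output directory
    "--task", "image-text-to-text",
    "--trust-remote-code",
    "--weight-format", "int8",   # verify against your optimum-cli version
]
print(shlex.join(cmd))  # paste into a shell, or pass cmd to subprocess.run()
```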

## Intended Use

This model is intended **exclusively** for:
- ✅ Testing optimum-intel OpenVINO export functionality
- ✅ CI/CD pipeline validation
- ✅ Model loading and compatibility testing
- ✅ Quantization workflow testing
- ✅ Fast prototyping and debugging

**Not intended for**:
- ❌ Production inference
- ❌ Actual image-text-to-text tasks
- ❌ Model quality evaluation
- ❌ Benchmarking performance metrics

## Training Details

This model was generated by:
1. Loading the config from `optimum-intel-internal-testing/tiny-random-MiniCPM-o-2_6`
2. Reducing all dimensions to minimal viable values
3. Initializing weights randomly using `AutoModelForCausalLM.from_config()`
4. Copying all necessary tokenizer, processor, and custom code files
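
Step 2 above amounts to overriding a handful of config fields before re-initializing. A hypothetical sketch of that override with plain dicts (the field names and values mirror the Architecture Details section; the real script operates on a `PretrainedConfig` object rather than a dict):

```python
# Step 2 in miniature: shrink the full-size LLM config values to tiny ones.
full_llm_config = {
    "num_hidden_layers": 40,
    "hidden_size": 2048,
    "intermediate_size": 8192,
    "num_attention_heads": 32,
    "vocab_size": 151700,
    "max_position_embeddings": 8192,
}

tiny_overrides = {
    "num_hidden_layers": 2,
    "hidden_size": 256,
    "intermediate_size": 512,
    "num_attention_heads": 4,
    "vocab_size": 320,
    "max_position_embeddings": 128,
}

# Later keys win, so every dimension is replaced by its tiny value.
tiny_llm_config = {**full_llm_config, **tiny_overrides}
print(tiny_llm_config["hidden_size"])  # 256
```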

**No training was performed**; all weights are randomly initialized.

## Validation Results

The model has been validated to ensure that it:
- ✅ Loads with `trust_remote_code=True`
- ✅ Is compatible with transformers AutoModel APIs
- ✅ Supports a forward pass with the expected input format
- ✅ Is compatible with OpenVINO export via optimum-intel
- ✅ Includes all required custom modules and artifacts

See the [validation report](https://github.com/arashkermani/tiny-minicpm-o) for detailed technical analysis.

## Files Included

- `config.json` - Model configuration
- `pytorch_model.bin` - Model weights (5.64 MB)
- `generation_config.json` - Generation parameters
- `preprocessor_config.json` - Preprocessor configuration
- `processor_config.json` - Processor configuration
- `tokenizer_config.json` - Tokenizer configuration
- `tokenizer.json` - Fast tokenizer
- `vocab.json` - Vocabulary
- `merges.txt` - BPE merges
- Custom Python modules:
  - `modeling_minicpmo.py`
  - `configuration_minicpm.py`
  - `processing_minicpmo.py`
  - `image_processing_minicpmv.py`
  - `tokenization_minicpmo_fast.py`
  - `modeling_navit_siglip.py`
  - `resampler.py`
  - `utils.py`

## Related Models

- Original model: [openbmb/MiniCPM-o-2_6](https://huggingface.co/openbmb/MiniCPM-o-2_6)
- Previous test model: [optimum-intel-internal-testing/tiny-random-MiniCPM-o-2_6](https://huggingface.co/optimum-intel-internal-testing/tiny-random-MiniCPM-o-2_6)

## License

This model follows the same license as the original MiniCPM-o-2_6 model (Apache 2.0).

## Citation

If you use this test model in your CI/CD or testing infrastructure, please reference:

```bibtex
@misc{tiny-minicpm-o-2_6,
  author = {Arash Kermani},
  title = {Tiny Random MiniCPM-o-2_6 for Testing},
  year = {2026},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/arashkermani/tiny-random-MiniCPM-o-2_6}}
}
```

## Contact

For issues or questions about this test model, please open an issue in the [optimum-intel repository](https://github.com/huggingface/optimum-intel/issues).