# Model Card — Qwen2-VL-ImgChat-2B

## Model Details

- **Model Name:** Qwen2-VL-ImgChat-2B
- **Model Type:** Vision-language model fine-tuned for multimodal dialog auto-completion
- **Language(s):** English
- **Base Model:** Qwen2-VL-2B
- **Fine-tuning Dataset:** ImageChat
- **License:** Same as the base model (Qwen2-VL license)
- **Repository:** https://github.com/devichand579/MAC

---
## Intended Use

### Direct Use

This model generates conversational responses conditioned on both textual and visual context. It is suitable for:

- Multimodal dialog systems
- Image-grounded conversational agents
- Research on multimodal auto-completion

### Out-of-Scope Use

The model is not intended for:

- Medical, legal, or financial advice
- Safety-critical decision-making
- Autonomous systems requiring guaranteed correctness

---
## Limitations and Risks

- Model outputs may contain inaccuracies or biases inherited from the training data.
- Performance depends on the relevance of the input image and the quality of the dialogue context.
- The model is not explicitly safety-filtered.

---

## How to Use

Example usage with Hugging Face Transformers:
```python
from transformers import AutoProcessor, AutoModelForVision2Seq
from PIL import Image

# Substitute the Hugging Face repo ID for this checkpoint.
model_id = "<qwen2-vl-imgchat-2b-repo-id>"

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id)

image = Image.open("example.jpg")
inputs = processor(images=image,
                   text="Describe the image.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(outputs[0], skip_special_tokens=True))
```