devichand committed 27858f2 (verified) · 1 parent: 6709e55

Create README.md
# Model Card — Qwen2-VL-ImgChat-2B

## Model Details
- **Model Name:** Qwen2-VL-ImgChat-2B
- **Model Type:** Vision-language model fine-tuned for multimodal dialog auto-completion
- **Language(s):** English
- **Base Model:** Qwen2-VL-2B
- **Fine-tuning Dataset:** ImageChat
- **License:** Same as base model (Qwen2-VL license)
- **Repository:** https://github.com/devichand579/MAC

---

## Intended Use

### Direct Use
This model generates conversational responses conditioned on both textual and visual context. It is suitable for:
- Multimodal dialog systems
- Image-grounded conversational agents
- Research on multimodal auto-completion
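
For dialog auto-completion, conversation history has to be packed into a prompt alongside the image. This card does not specify the exact prompt format, so the sketch below only illustrates one plausible packing, assuming the standard Transformers chat-message schema (role plus a list of content parts); the `build_messages` helper and the `"human"`/`"model"` speaker labels are hypothetical, not part of the released model.

```python
# Sketch: packing ImageChat-style dialog turns into the chat-message format
# used by Hugging Face multimodal processors. The content-part schema is the
# generic Transformers convention, assumed here for illustration; this model
# card does not document the actual prompt template.

def build_messages(dialog_turns):
    """Convert alternating (speaker, text) turns into processor messages.

    The image is attached to the first user turn as a content part; the
    model is then asked to complete the next turn of the conversation.
    """
    messages = []
    for i, (speaker, text) in enumerate(dialog_turns):
        role = "user" if speaker == "human" else "assistant"
        content = [{"type": "text", "text": text}]
        if i == 0:
            # Ground the whole conversation in the image from the start.
            content.insert(0, {"type": "image"})
        messages.append({"role": role, "content": content})
    return messages

turns = [
    ("human", "What a gloomy beach."),
    ("model", "The fog makes it feel mysterious, though."),
    ("human", "Would you walk there at night?"),
]
messages = build_messages(turns)
print(messages[0]["content"][0])  # {'type': 'image'}
```

A structure like this can then be rendered into model input with the processor's `apply_chat_template` before generation.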

### Out-of-Scope Use
The model is not intended for:
- Medical, legal, or financial advice
- Safety-critical decision-making
- Autonomous systems requiring guaranteed correctness

---

## Limitations and Risks
- Model outputs may contain inaccuracies or biases inherited from the training data.
- Performance depends on image relevance and dialogue-context quality.
- The model has not been explicitly safety-filtered.

---

## How to Use

Example usage with Hugging Face Transformers:

```python
from transformers import AutoProcessor, AutoModelForVision2Seq

# NOTE: an earlier revision of this card loaded "devichand/MiniCPM_V_ImgChat-7B",
# which is a different model. The identifier below assumes this card's model is
# published under the author's namespace; adjust it to the actual repository ID.
model_id = "devichand/Qwen2-VL-ImgChat-2B"

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id)

# `your_image` is a PIL.Image supplied by the caller.
inputs = processor(images=your_image,
                   text="Describe the image.",
                   return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(outputs[0], skip_special_tokens=True))
```