---
license: apache-2.0
language:
- fr
pipeline_tag: image-text-to-text
tags:
- multimodal
library_name: transformers
metrics:
- cer
- wer
- f1
base_model:
- Qwen/Qwen2.5-VL-7B-Instruct
---

# Qwen2.5-VL-7B-Instruct Index Cards Nested

## Introduction

This version of Qwen2.5-VL-7B-Instruct is specialized in document parsing of French index cards: given a card image, it extracts the key information as a nested XML document.

## Training

This model is [Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct) fine-tuned on French index cards using LoRA.

Training parameters:
- Image width: 800 pixels
- LoRA rank: 8
- LoRA alpha: 32
- Epochs: 10 (about 4k steps)

Wandb run: https://wandb.ai/starride-teklia/DAI-CReTDHI/runs/hk78u308

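For reference, the hyperparameters above map onto a `peft` LoRA configuration roughly as follows. This is a sketch only: the rank and alpha come from this card, while the target modules and dropout are assumptions, since the card does not document them.

```python
from peft import LoraConfig

# Hypothetical reconstruction of the LoRA setup described above.
lora_config = LoraConfig(
    r=8,                       # LoRA rank (from this card)
    lora_alpha=32,             # LoRA alpha (from this card)
    lora_dropout=0.05,         # assumed, not documented here
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
```
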
## Evaluation

Evaluation of the nested variant (this model) against a flat variant:

| Model                | CER (%) | WER (%) | F1 @ 0 | F1 @ 0.3 | N images | N entities |
| -------------------- | ------- | ------- | ------ | -------- | -------- | ---------- |
| Qwen2.5-VL-7B Flat   | 10.23   | 18.07   | 83.6   | 91.96    | 55       | 808        |
| Qwen2.5-VL-7B Nested | 7.87    | 14.90   | 86.3   | 94.36    | 55       | 808        |

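CER and WER are edit-distance metrics: the number of character (or word) insertions, deletions, and substitutions needed to turn the prediction into the reference, divided by the reference length. In practice a library such as `jiwer` computes these; purely for illustration (this is not the evaluation code used for this card), a minimal stdlib version could look like:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance via dynamic programming (works on strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (r != h),    # substitution (0 if equal)
            ))
        prev = curr
    return prev[-1]

def cer(reference, hypothesis):
    """Character error rate: character edits / reference length."""
    return edit_distance(reference, hypothesis) / len(reference)

def wer(reference, hypothesis):
    """Word error rate: same computation over word tokens."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

For example, `cer("chat", "chot")` is 0.25 (one substitution over four reference characters).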
### Usage

The following snippet shows how to use the model with `transformers` and `qwen_vl_utils`:

* Prediction script

```python
import torch
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

# Load the fine-tuned model (requires flash-attn; drop attn_implementation otherwise)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "starride-teklia/qwen2.5vl-index-cards-nested",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)

processor = AutoProcessor.from_pretrained("starride-teklia/qwen2.5vl-index-cards-nested")

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "https://europe.iiif.teklia.com/iiif/2/dai-cretdhi%2FTours%2FAMT-LOTS_EC_NMD%2FEC_LOT_0281%2FFRAC037261_EC_LOT_0281_0097.JPG/full/800,/0/default.jpg",
            },
            # Prompt in French: "Extract the information as XML."
            {"type": "text", "text": "Extrait les informations en XML."},
        ],
    }
]

# Preparation for inference
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
)
inputs = inputs.to("cuda")

# Inference: generation of the output
generated_ids = model.generate(**inputs, max_new_tokens=1024)
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text[0])
```

* Output

```xml
<root>
  <Décès>
    <Défunt>
      <Nom>Choisnard</Nom>
      <Prénom>Marie Madelaine</Prénom>
      <Sexe>F</Sexe>
      <DateDeNaissance>23 juillet 1753</DateDeNaissance>
      <LieuDeNaissance>Ambroise (Indre-et-Loire)</LieuDeNaissance>
      <Profession>journalière</Profession>
      <Statut>veuf(ve)</Statut>
    </Défunt>
    <Conjoint>
      <Nom>Rocheriou</Nom>
      <Prénom>Pierre</Prénom>
      <Statut>décédé(e)</Statut>
    </Conjoint>
    <Père>
      <Nom>Choisnard</Nom>
      <Prénom>Michel</Prénom>
    </Père>
    <Mère>
      <Nom>Dubeuf</Nom>
      <Prénom>Louise</Prénom>
    </Mère>
  </Décès>
  <Date>
    <Année>1826</Année>
    <Mois>septembre</Mois>
    <Jour>5</Jour>
  </Date>
</root>
```

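Since the model returns its prediction as an XML string, the entities can be extracted with the standard library. A minimal sketch (note that generated output is not guaranteed to be well-formed XML, so parsing should be guarded):

```python
import xml.etree.ElementTree as ET

# A fragment of the sample output shown above
output = """<root>
  <Décès>
    <Défunt>
      <Nom>Choisnard</Nom>
      <Prénom>Marie Madelaine</Prénom>
    </Défunt>
  </Décès>
</root>"""

try:
    root = ET.fromstring(output)
except ET.ParseError:
    root = None  # the model may occasionally emit malformed XML

nom = root.findtext("Décès/Défunt/Nom")        # "Choisnard"
prenom = root.findtext("Décès/Défunt/Prénom")  # "Marie Madelaine"
```
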
## Citation

To cite the original Qwen2.5-VL model:

```bibtex
@misc{qwen2.5-VL,
    title = {Qwen2.5-VL},
    url = {https://qwenlm.github.io/blog/qwen2.5-vl/},
    author = {Qwen Team},
    month = {January},
    year = {2025}
}
```