Commit d464935
Parent(s): a2ee636
small wording issue
README.md
CHANGED
@@ -195,7 +195,7 @@ We use the following hyper and training parameters:
 
 We start from the base IDEFICS models and fine-tune the models by unfreezing all the parameters (vision encoder, language model, cross-attentions). The mixture is composed of following English datasets:
 
-| Data Source | Data Description | Number of
+| Data Source | Data Description | Number of Unique Samples | Sampling ratio |
 |-------------|----------------------------------------------|------------------------------|----------------|
 | [M3IT](https://huggingface.co/datasets/MMInstruction/M3IT) | Prompted image-text academic datasets | 1.5M | 7.7% |
 | [LRV-Instruction](https://huggingface.co/datasets/VictorSanh/LrvInstruction) | Triplets of image/question/answer | 155K | 1.7% |