Authors: Jainaveen Sundaram, Ravishankar Iyer

### Training details and Evaluation

We follow the two-step training pipeline outlined in the LLaVa-1.5 paper: (1) a pre-training phase for feature alignment, followed by (2) end-to-end instruction fine-tuning.

The pre-training phase involves 1 epoch on a filtered subset of 595K Conceptual Captions [2], with only the projection layer weights updated. For instruction fine-tuning, we use 1 epoch of the LLaVa-Instruct-150K dataset, with both the projection layer and LLM weights updated.

For model evaluation, please refer to the linked technical report (coming soon!).
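The two phases above differ only in which weight groups are trainable. A minimal PyTorch sketch of that freezing pattern (illustrative only — the module names are hypothetical and not taken from this repository):

```python
# Illustrative sketch only (not the repo's code): a two-phase recipe that
# freezes different parameter groups per phase. Module names are hypothetical.
import torch.nn as nn

class TinyMultimodalModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.projection = nn.Linear(512, 256)  # vision-to-LLM projector
        self.llm = nn.Linear(256, 256)         # stand-in for the LLM backbone

def set_phase(model: nn.Module, phase: int) -> None:
    # Phase 1 (pre-training): only the projection layer is trainable.
    # Phase 2 (instruction fine-tuning): projection layer and LLM both train.
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith("projection") or phase == 2

model = TinyMultimodalModel()
set_phase(model, phase=1)
trainable = sorted({n.split(".")[0] for n, p in model.named_parameters() if p.requires_grad})
print(trainable)  # ['projection']
```

Switching to phase 2 simply re-enables gradients on the LLM parameters as well, which matches the description above of updating both the projection layer and LLM weights during instruction fine-tuning.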
### How to use

Start off by cloning the repository:

```shell
git clone https://huggingface.co/IntelLabs/LlavaOLMoBitnet1B
cd LlavaOLMoBitnet1B
```

Install the requirements by following the instructions in requirements.txt (typically `pip install -r requirements.txt`).
You are all set! Run inference by calling:

```shell
python llava_olmo.py
```

To pass in your own query, modify the following lines within the file:

```python
# Define image and text inputs
text = "Be concise. What are the four major tournaments of the sport shown in the image?"
url = "https://farm3.staticflickr.com/2157/2439959136_d932f4e816_z.jpg"
```
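If editing the file is inconvenient, the same query could instead be supplied from the command line. The wrapper below is a hypothetical sketch only — llava_olmo.py itself does not necessarily expose these flags:

```python
# Hypothetical command-line wrapper (not part of the repo): parse --text and
# --url instead of hard-coding them inside llava_olmo.py.
import argparse

def parse_query(argv=None):
    parser = argparse.ArgumentParser(description="LlavaOLMoBitnet1B demo query")
    parser.add_argument(
        "--text",
        default="Be concise. What are the four major tournaments of the sport shown in the image?",
        help="Text prompt to pair with the image",
    )
    parser.add_argument(
        "--url",
        default="https://farm3.staticflickr.com/2157/2439959136_d932f4e816_z.jpg",
        help="URL of the input image",
    )
    return parser.parse_args(argv)

args = parse_query([])  # no CLI args -> defaults from the README example
print(args.text.startswith("Be concise."))  # True
```

The parsed `args.text` and `args.url` would then replace the hard-coded `text` and `url` variables shown above.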
## Model Sources

Arxiv link for technical report coming soon!
## Ethical Considerations

Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights.

| Ethical Considerations | Description |
| ----------- | ----------- |
| Data | The model was trained using the LLaVA-v1.5 data mixture as described above. |
| Human life | The model is not intended to inform decisions central to human life or flourishing. |
| Mitigations | No additional risk mitigation strategies were considered during model development. |
| Risks and harms | This model has not been assessed for harm or biases, and should not be used for sensitive applications where it may cause harm. |
| Use cases | - |
## Citation

Coming soon.

## License