---
license: apache-2.0
language:
- en
base_model:
- allenai/Molmo-7B-D-0924
pipeline_tag: text-generation
library_name: peft
tags:
- lora
- finetune
- agent
---
Testing a QLoRA adapter for [allenai/Molmo-7B-D-0924](https://huggingface.co/allenai/Molmo-7B-D-0924).
The adapter targets the attention layers of the Transformer backbone and the image pooling and projection layers of the vision backbone.
It was trained on 47 screenshots of a low-poly video game with ragdoll casualties and evaluated on 44 screenshots of the same game.
Molmo has an edge case where it declares there are no humans in an image:

This custom QLoRA adapter successfully reduces the occurrence of these cases.

However, pointing to non-human objects is observed to increase.
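When checking for the "no humans" edge case, it helps to count how many points the model actually emitted. Molmo expresses pointing answers as XML-like `<point>` tags in its text output; the sketch below is a minimal parser for that format (the exact attribute layout is an assumption based on Molmo's published output style), where an empty result corresponds to the edge case above.

```python
import re

def parse_points(text: str) -> list[tuple[float, float]]:
    """Extract (x, y) coordinates from Molmo-style <point x="..." y="..."> tags.

    Returns an empty list when the model declares there are no humans
    (i.e. emits no point tags at all).
    """
    pattern = r'<point[^>]*\bx="([\d.]+)"[^>]*\by="([\d.]+)"'
    return [(float(x), float(y)) for x, y in re.findall(pattern, text)]

# Example outputs (illustrative, not taken from the eval set):
parse_points('<point x="61.5" y="40.6" alt="person">person</point>')  # one hit
parse_points("There are no humans in this image.")                    # edge case: []
```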
Comparison of model performance with and without the QLoRA adapter on the eval dataset:
| Metric | Molmo-7B-D | Molmo-7B-D w/ QLoRA |
|---|---|---|
| Precision (%) | 92.1 | 80.5 |
| Recall (%) | 70.4 | 88.5 |
Dataset: [reubk/RavenfieldDataset](https://huggingface.co/datasets/reubk/RavenfieldDataset)
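The table shows a precision/recall trade-off rather than a uniform win, so the F1 score (harmonic mean of the two) is a useful single-number summary; computed from the figures above:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Figures taken from the evaluation table above
base_f1 = f1(92.1, 70.4)     # base Molmo-7B-D  -> ~79.8
adapter_f1 = f1(80.5, 88.5)  # with the QLoRA adapter -> ~84.3
```

By this measure the adapter comes out ahead overall, at the cost of more false positives (pointing at non-human objects).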