---
base_model:
- meta-llama/Llama-2-13b-chat-hf
datasets:
- NingLab/MMECInstruct
license: cc-by-4.0
library_name: transformers
pipeline_tag: image-text-to-text
---
# CASLIE-L
This repository contains the CASLIE-L model from "Captions Speak Louder than Images: Generalizing Foundation Models for E-commerce from High-quality Multimodal Instruction Data".
- Project Page: https://ninglab.github.io/CASLIE/
- Code Repository: https://github.com/ninglab/CASLIE
## Introduction
Leveraging multimodal data through Multimodal Foundation Models (MFMs) to drive breakthroughs in e-commerce applications is gaining increasing attention. This work introduces MMECInstruct, the first large-scale, high-quality multimodal instruction dataset for e-commerce. We also develop CASLIE, a simple, lightweight, yet effective framework for integrating multimodal information for e-commerce. Using MMECInstruct, we fine-tune a series of e-commerce MFMs within CASLIE, denoted as CASLIE models.
## CASLIE Models
The CASLIE-L model is instruction-tuned from the large base model Llama-2-13b-chat.
## Sample Usage (Modality-unified Inference)
To conduct inference with the CASLIE models, run `inference.py` from the official GitHub repository, as in the example below. The arguments are:
- `$model_path` is the path of the instruction-tuned model.
- `$task` specifies the task to be tested.
- `$output_path` specifies the path where you want to save the inference output.
Example:

```bash
python inference.py --model_path NingLab/CASLIE-L --task answerability_prediction --output_path ap.json
```
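If you prefer to load the checkpoint directly rather than going through `inference.py`, the sketch below assumes the standard `transformers` causal-LM loading path (CASLIE-L is tuned from Llama-2-13b-chat-hf). The instruction string is illustrative only, not the exact prompt template used by the official script.

```python
# Minimal sketch: loading CASLIE-L with the standard transformers classes.
# Assumption: the checkpoint follows the usual Llama-2 causal-LM layout;
# the prompt below is illustrative, not the official CASLIE template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NingLab/CASLIE-L"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 13B parameters fit on a single large GPU in fp16
    device_map="auto",
)

# Illustrative answerability-prediction-style instruction.
prompt = (
    "Given a product description and a question, predict whether the "
    "question is answerable from the description. Answer yes or no.\n"
    "Description: Stainless-steel water bottle, 750 ml, vacuum insulated.\n"
    "Question: Does it keep drinks cold?\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=16, do_sample=False)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```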
## Citation
```bibtex
@article{ling2024captions,
  title={Captions Speak Louder than Images (CASLIE): Generalizing Foundation Models for E-commerce from High-quality Multimodal Instruction Data},
  author={Ling, Xinyi and Peng, Bo and Du, Hanwen and Zhu, Zhihui and Ning, Xia},
  journal={arXiv preprint arXiv:2410.17337},
  year={2024}
}
```