NingLab/CASLIE-L


Improve model card: Add pipeline tag, library name, links, and sample usage

by nielsr HF Staff - opened Nov 14, 2025

base: refs/heads/main ← from: refs/pr/1


README.md +26 -4


@@ -1,18 +1,40 @@
 ---
-license: cc-by-4.0
-datasets:
-- NingLab/MMECInstruct
 base_model:
 - meta-llama/Llama-2-13b-chat-hf
 ---
 # CASLIE-L
-This repo contains the models for "Captions Speak Louder than Images (CASLIE): Generalizing Foundation Models for E-commerce from High-quality Multimodal Instruction Data"
 ## CASLIE Models
 The CASLIE-L model is instruction-tuned from the large base model [Llama-2-13b-chat](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf).
 ## Citation
 ```bibtex
 @article{ling2024captions,

 ---
 base_model:
 - meta-llama/Llama-2-13b-chat-hf
+datasets:
+- NingLab/MMECInstruct
+license: cc-by-4.0
+library_name: transformers
+pipeline_tag: image-text-to-text
 ---
 # CASLIE-L
+This repository contains the models for "[Captions Speak Louder than Images: Generalizing Foundation Models for E-commerce from High-quality Multimodal Instruction Data](https://huggingface.co/papers/2410.17337)".
+**Project Page**: [https://ninglab.github.io/CASLIE/](https://ninglab.github.io/CASLIE/)
+**Code Repository**: [https://github.com/ninglab/CASLIE](https://github.com/ninglab/CASLIE)
+## Introduction
+Leveraging multimodal data to drive breakthroughs in e-commerce applications through Multimodal Foundation Models (MFMs) is gaining increasing attention. This work introduces [MMECInstruct](https://huggingface.co/datasets/NingLab/MMECInstruct), the first-ever, large-scale, and high-quality multimodal instruction dataset for e-commerce. We also develop CASLIE, a simple, lightweight, yet effective framework for integrating multimodal information for e-commerce. Leveraging MMECInstruct, we fine-tune a series of e-commerce MFMs within CASLIE, denoted as CASLIE models.
 ## CASLIE Models
 The CASLIE-L model is instruction-tuned from the large base model [Llama-2-13b-chat](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf).
+## Sample Usage (Modality-unified Inference)
+To conduct inference with the CASLIE models, refer to the following example directly from the [official GitHub repository](https://github.com/ninglab/CASLIE#modality-unified-inference).
+`$model_path` is the path of the instruction-tuned model.
+`$task` specifies the task to be tested.
+`$output_path` specifies the path where you want to save the inference output.
+Example:
+```
+python inference.py --model_path NingLab/CASLIE-M --task answerability_prediction --output_path ap.json
+```
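For readers who want to script the command above, here is a minimal Python sketch that assembles the same invocation. It assumes `inference.py` from the CASLIE GitHub repository is in the current working directory and that it accepts only the three flags shown in the example. The helper names `build_inference_command` and `run_inference` are hypothetical and not part of the repository. Passing `NingLab/CASLIE-L` instead of the `NingLab/CASLIE-M` used in the example is also an assumption that the source does not confirm.

```python
import shlex
import subprocess
import sys


def build_inference_command(model_path, task, output_path, script="inference.py"):
    """Return the argument list for the inference script (hypothetical helper).

    Only the flags shown in the model card example are used:
    --model_path, --task and --output_path.
    """
    return [
        sys.executable, script,
        "--model_path", model_path,
        "--task", task,
        "--output_path", output_path,
    ]


def run_inference(model_path, task, output_path):
    """Print and run the command; raises CalledProcessError on a non-zero exit."""
    cmd = build_inference_command(model_path, task, output_path)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)


# Example call (assumes inference.py is present in the working directory):
# run_inference("NingLab/CASLIE-L", "answerability_prediction", "ap.json")
```

Building the command as a list rather than a single string avoids shell quoting issues if a path contains spaces.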
 ## Citation
 ```bibtex
 @article{ling2024captions,