LlamaLens / README.md

Update README.md

f85b1bf verified 11 months ago

16.7 kB

	---
	license: cc-by-nc-sa-4.0
	datasets:
	- QCRI/LlamaLens-English
	- QCRI/LlamaLens-Arabic
	- QCRI/LlamaLens-Hindi
	language:
	- ar
	- en
	- hi
	base_model:
	- meta-llama/Llama-3.1-8B-Instruct
	pipeline_tag: text-generation
	tags:
	- Social-Media
	- Hate-Speech
	- Summarization
	- offensive-language
	- News-Genre
	metrics:
	- accuracy
	- f1
	- rouge
	---
	# LlamaLens: Specialized Multilingual LLM forAnalyzing News and Social Media Content

	## Overview
	LlamaLens is a specialized multilingual LLM designed for analyzing news and social media content. It focuses on 18 NLP tasks, leveraging 52 datasets across Arabic, English, and Hindi.

	<p align="center">
	<picture>
	<img width="352" alt="capablities_tasks_datasets" src="./llamalens-avatar.png">
	</picture>
	</p>

	## Dataset
	The model was trained on the [LlamaLens dataset](https://huggingface.co/collections/QCRI/llamalens-672f7e0604a0498c6a2f0fe9).

	## To Replicate the Experiments
	The code to replicate the experiments is available on [GitHub](https://github.com/firojalam/LlamaLens).


	## Model Inference

	To utilize the LlamaLens model for inference, follow these steps:

	1. Install the Required Libraries:

	Ensure you have the necessary libraries installed. You can do this using pip:

	```bash
	pip install transformers torch
	```
	2. Load the Model and Tokenizer::
	Use the transformers library to load the LlamaLens model and its tokenizer:

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Define model path
	MODEL_PATH = "QCRI/LlamaLens"

	# Load model and tokenizer
	device_map = "auto"
	model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, device_map=device_map)
	tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
	tokenizer.pad_token = tokenizer.eos_token

	```
	3. Prepare the Input::
	Tokenize your input text:
	```python
	# Define task and input text
	task = "classification" # Change to "summarization" for summarization tasks
	instruction = (
	"Analyze the text and indicate if it shows an emotion, then label it as joy, love, fear,"
	" anger, sadness, or surprise. Return only the label without any explanation, justification, or additional text."
	)
	input_text = "I am not creating anything I feel satisfied with."
	output_prefix = "Summary: " if task == "summarization" else "Label: "

	# Define messages for chat-based prompt format
	messages = [
	{"role": "system", "content": "You are a social media expert providing accurate analysis and insights."},
	{"role": "user", "content": f"{instruction}\nInput: {input_text}"},
	{"role": "assistant", "content": output_prefix}
	]

	# Tokenize input
	input_ids = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=False,
	continue_final_message=True,
	tokenize=True,
	padding=True,
	return_tensors="pt"
	).to(model.device)



	```
	4. Generate the Output::
	Generate a response using the model:
	```python
	# Generate response
	outputs = model.generate(
	input_ids,
	max_new_tokens=128,
	do_sample=False,
	eos_token_id=tokenizer.eos_token_id,
	pad_token_id=tokenizer.eos_token_id,
	temperature=0.001
	)

	# Decode and print response
	response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
	print(response)
	```

	## Results

	Below, we present the performance of L-Lens: LlamaLens , where "Eng" refers to the English-instructed model and "Native" refers to the model trained with native language instructions. The results are compared against the SOTA (where available) and the Base: Llama-Instruct 3.1 baseline. The Δ (Delta) column indicates the difference between LlamaLens and the SOTA performance, calculated as (LlamaLens – SOTA).

	---

	## Arabic

	\| Task \| Dataset \| Metric \| SOTA \| Base \| L-Lens-Eng \| L-Lens-Native \| Δ (L-Lens (Eng) - SOTA) \|
	\|:----------------------------------:\|:--------------------------------------------:\|:----------:\|:--------:\|:---------------------:\|:---------------------:\|:--------------------:\|:------------------------:\|
	\| Attentionworthiness Detection \| CT22Attentionworthy \| W-F1 \| 0.412 \| 0.158 \| 0.425 \| 0.454 \| 0.013 \|
	\| Checkworthiness Detection \| CT24_checkworthy \| F1_Pos \| 0.569 \| 0.610 \| 0.502 \| 0.509 \| -0.067 \|
	\| Claim Detection \| CT22Claim \| Acc \| 0.703 \| 0.581 \| 0.734 \| 0.756 \| 0.031 \|
	\| Cyberbullying Detection \| ArCyc_CB \| Acc \| 0.863 \| 0.766 \| 0.870 \| 0.833 \| 0.007 \|
	\| Emotion Detection \| Emotional-Tone \| W-F1 \| 0.658 \| 0.358 \| 0.705 \| 0.736 \| 0.047 \|
	\| Emotion Detection \| NewsHeadline \| Acc \| 1.000 \| 0.406 \| 0.480 \| 0.458 \| -0.520 \|
	\| Factuality \| Arafacts \| Mi-F1 \| 0.850 \| 0.210 \| 0.771 \| 0.738 \| -0.079 \|
	\| Factuality \| COVID19Factuality \| W-F1 \| 0.831 \| 0.492 \| 0.800 \| 0.840 \| -0.031 \|
	\| Harmfulness Detection \| CT22Harmful \| F1_Pos \| 0.557 \| 0.507 \| 0.523 \| 0.535 \| -0.034 \|
	\| Hate Speech Detection \| annotated-hatetweets-4-classes \| W-F1 \| 0.630 \| 0.257 \| 0.526 \| 0.517 \| -0.104 \|
	\| Hate Speech Detection \| OSACT4SubtaskB \| Mi-F1 \| 0.950 \| 0.819 \| 0.955 \| 0.955 \| 0.005 \|
	\| News Categorization \| ASND \| Ma-F1 \| 0.770 \| 0.587 \| 0.919 \| 0.929 \| 0.149 \|
	\| News Categorization \| SANADAkhbarona-news-categorization \| Acc \| 0.940 \| 0.784 \| 0.954 \| 0.953 \| 0.014 \|
	\| News Categorization \| SANADAlArabiya-news-categorization \| Acc \| 0.974 \| 0.893 \| 0.987 \| 0.985 \| 0.013 \|
	\| News Categorization \| SANADAlkhaleej-news-categorization \| Acc \| 0.986 \| 0.865 \| 0.984 \| 0.982 \| -0.002 \|
	\| News Categorization \| UltimateDataset \| Ma-F1 \| 0.970 \| 0.376 \| 0.865 \| 0.880 \| -0.105 \|
	\| News Credibility \| NewsCredibilityDataset \| Acc \| 0.899 \| 0.455 \| 0.935 \| 0.933 \| 0.036 \|
	\| News Summarization \| xlsum \| R-2 \| 0.137 \| 0.034 \| 0.129 \| 0.130 \| -0.009 \|
	\| Offensive Language Detection \| ArCyc_OFF \| Ma-F1 \| 0.878 \| 0.489 \| 0.877 \| 0.879 \| -0.001 \|
	\| Offensive Language Detection \| OSACT4SubtaskA \| Ma-F1 \| 0.905 \| 0.782 \| 0.896 \| 0.882 \| -0.009 \|
	\| Propaganda Detection \| ArPro \| Mi-F1 \| 0.767 \| 0.597 \| 0.747 \| 0.731 \| -0.020 \|
	\| Sarcasm Detection \| ArSarcasm-v2 \| F1_Pos \| 0.584 \| 0.477 \| 0.520 \| 0.542 \| -0.064 \|
	\| Sentiment Classification \| ar_reviews_100k \| F1_Pos \| -- \| 0.681 \| 0.785 \| 0.779 \| -- \|
	\| Sentiment Classification \| ArSAS \| Acc \| 0.920 \| 0.603 \| 0.800 \| 0.804 \| -0.120 \|
	\| Stance Detection \| stance \| Ma-F1 \| 0.767 \| 0.608 \| 0.926 \| 0.881 \| 0.159 \|
	\| Stance Detection \| Mawqif-Arabic-Stance-main \| Ma-F1 \| 0.789 \| 0.764 \| 0.853 \| 0.826 \| 0.065 \|
	\| Subjectivity Detection \| ThatiAR \| f1_pos \| 0.800 \| 0.562 \| 0.441 \| 0.383 \| -0.359 \|

	---

	## English

	\| Task \| Dataset \| Metric \| SOTA \| Base \| L-Lens-Eng \| L-Lens-Native \| Δ (L-Lens (Eng) - SOTA) \|
	\|:----------------------------------:\|:--------------------------------------------:\|:----------:\|:--------:\|:---------------------:\|:---------------------:\|:--------------------:\|:------------------------:\|
	\| Checkworthiness Detection \| CT24_checkworthy \| f1_pos \| 0.753 \| 0.404 \| 0.942 \| 0.942 \| 0.189 \|
	\| Claim Detection \| claim-detection \| Mi-F1 \| -- \| 0.545 \| 0.864 \| 0.889 \| -- \|
	\| Cyberbullying Detection \| Cyberbullying \| Acc \| 0.907 \| 0.175 \| 0.836 \| 0.855 \| -0.071 \|
	\| Emotion Detection \| emotion \| Ma-F1 \| 0.790 \| 0.353 \| 0.803 \| 0.808 \| 0.013 \|
	\| Factuality \| News_dataset \| Acc \| 0.920 \| 0.654 \| 1.000 \| 1.000 \| 0.080 \|
	\| Factuality \| Politifact \| W-F1 \| 0.490 \| 0.121 \| 0.287 \| 0.311 \| -0.203 \|
	\| News Categorization \| CNN_News_Articles_2011-2022 \| Acc \| 0.940 \| 0.644 \| 0.970 \| 0.970 \| 0.030 \|
	\| News Categorization \| News_Category_Dataset \| Ma-F1 \| 0.769 \| 0.970 \| 0.824 \| 0.520 \| 0.055 \|
	\| News Genre Categorisation \| SemEval23T3-subtask1 \| Mi-F1 \| 0.815 \| 0.687 \| 0.241 \| 0.253 \| -0.574 \|
	\| News Summarization \| xlsum \| R-2 \| 0.152 \| 0.074 \| 0.182 \| 0.181 \| 0.030 \|
	\| Offensive Language Detection \| Offensive_Hateful_Dataset_New \| Mi-F1 \| -- \| 0.692 \| 0.814 \| 0.813 \| -- \|
	\| Offensive Language Detection \| offensive_language_dataset \| Mi-F1 \| 0.994 \| 0.646 \| 0.899 \| 0.893 \| -0.095 \|
	\| Offensive Language and Hate Speech \| hate-offensive-speech \| Acc \| 0.945 \| 0.602 \| 0.931 \| 0.935 \| -0.014 \|
	\| Propaganda Detection \| QProp \| Ma-F1 \| 0.667 \| 0.759 \| 0.963 \| 0.973 \| 0.296 \|
	\| Sarcasm Detection \| News-Headlines-Dataset-For-Sarcasm-Detection \| Acc \| 0.897 \| 0.668 \| 0.936 \| 0.947 \| 0.039 \|
	\| Sentiment Classification \| NewsMTSC-dataset \| Ma-F1 \| 0.817 \| 0.628 \| 0.751 \| 0.748 \| -0.066 \|
	\| Subjectivity Detection \| clef2024-checkthat-lab \| Ma-F1 \| 0.744 \| 0.535 \| 0.642 \| 0.628 \| -0.102 \|
	\|

	---

	## Hindi

	\| Task \| Dataset \| Metric \| SOTA \| Base \| L-Lens-Eng \| L-Lens-Native \| Δ (L-Lens (Eng) - SOTA) \|
	\|:----------------------------------:\|:--------------------------------------------:\|:----------:\|:--------:\|:---------------------:\|:---------------------:\|:--------------------:\|:------------------------:\|
	\| Factuality \| fake-news \| Mi-F1 \| -- \| 0.759 \| 0.994 \| 0.993 \| -- \|
	\| Hate Speech Detection \| hate-speech-detection \| Mi-F1 \| 0.639 \| 0.750 \| 0.963 \| 0.963 \| 0.324 \|
	\| Hate Speech Detection \| Hindi-Hostility-Detection-CONSTRAINT-2021 \| W-F1 \| 0.841 \| 0.469 \| 0.753 \| 0.753 \| -0.088 \|
	\| Natural Language Inference \| Natural Language Inference \| W-F1 \| 0.646 \| 0.633 \| 0.568 \| 0.679 \| -0.078 \|
	\| News Summarization \| xlsum \| R-2 \| 0.136 \| 0.078 \| 0.171 \| 0.170 \| 0.035 \|
	\| Offensive Language Detection \| Offensive Speech Detection \| Mi-F1 \| 0.723 \| 0.621 \| 0.862 \| 0.865 \| 0.139 \|
	\| Cyberbullying Detection \| MC_Hinglish1 \| Acc \| 0.609 \| 0.233 \| 0.625 \| 0.627 \| 0.016 \|
	\| Sentiment Classification \| Sentiment Analysis \| Acc \| 0.697 \| 0.552 \| 0.647 \| 0.654 \| -0.050

	## Paper
	For an in-depth understanding, refer to our paper: [LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content](https://arxiv.org/pdf/2410.15308).




	# License
	This model is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).


	# Citation
	Please cite [our paper](https://arxiv.org/pdf/2410.15308) when using this model:

	```
	@article{kmainasi2024llamalensspecializedmultilingualllm,
	title={LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content},
	author={Mohamed Bayan Kmainasi and Ali Ezzat Shahroor and Maram Hasanain and Sahinur Rahman Laskar and Naeemul Hassan and Firoj Alam},
	year={2024},
	journal={arXiv preprint arXiv:2410.15308},
	volume={},
	number={},
	pages={},
	url={https://arxiv.org/abs/2410.15308},
	eprint={2410.15308},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```