# Evaluation pipeline on MMBench
## Intro to each data sample in MMBench
MMBench is split into **dev** and **test** splits, and each data sample in each split contains the following fields:
```
img: the raw data of an image
question: the question
options: the concatenated options
category: the leaf category
l2-category: the l2-level category
options_dict: the dict that contains all options
index: the unique identifier of the current question
context (optional): the optional context to a question
answer: the target answer to the current question (only exists in the dev split; kept confidential for the test split on our evaluation server)
```
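Concretely, a single dev-split record could look like the following dict (all values here are illustrative, not drawn from the actual dataset):

```python
# Hypothetical dev-split sample; a real `img` field holds base64-encoded image data.
sample = {
    "index": 0,
    "img": "<base64-encoded image bytes>",
    "question": "Which category does this image belong to?",
    "options": "A. Oil Painting\nB. Sketch\nC. Digital art\nD. Photo",
    "options_dict": {"A": "Oil Painting", "B": "Sketch", "C": "Digital art", "D": "Photo"},
    "category": "image_style",
    "l2-category": "finegrained_perception",
    "context": None,   # optional hint; None when absent
    "answer": "D",     # present in the dev split only
}
```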
## Load MMBench
We provide a code snippet as an example of loading MMBench:
```python
import base64
import io

import pandas as pd
from PIL import Image
from torch.utils.data import Dataset


def decode_base64_to_image(base64_string):
    # Decode the base64 string stored in the TSV back into a PIL image.
    image_data = base64.b64decode(base64_string)
    image = Image.open(io.BytesIO(image_data))
    return image


class MMBenchDataset(Dataset):
    def __init__(self,
                 data_file,
                 sys_prompt='There are several options:'):
        self.df = pd.read_csv(data_file, sep='\t')
        self.sys_prompt = sys_prompt

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        index = self.df.iloc[idx]['index']
        image = self.df.iloc[idx]['image']
        image = decode_base64_to_image(image)
        question = self.df.iloc[idx]['question']
        # The 'answer' column only exists in the dev split.
        answer = self.df.iloc[idx]['answer'] if 'answer' in self.df.iloc[0].keys() else None
        category = self.df.iloc[idx]['category']
        l2_category = self.df.iloc[idx]['l2-category']
        option_candidate = ['A', 'B', 'C', 'D', 'E']
        options = {
            cand: self.load_from_df(idx, cand)
            for cand in option_candidate
            if self.load_from_df(idx, cand) is not None
        }
        options_prompt = f'{self.sys_prompt}\n'
        for key, item in options.items():
            options_prompt += f'{key}. {item}\n'
        hint = self.load_from_df(idx, 'hint')
        data = {
            'img': image,
            'question': question,
            'answer': answer,
            'options': options_prompt,
            'category': category,
            'l2-category': l2_category,
            'options_dict': options,
            'index': index,
            'context': hint,
        }
        return data

    def load_from_df(self, idx, key):
        if key in self.df.iloc[idx] and not pd.isna(self.df.iloc[idx][key]):
            return self.df.iloc[idx][key]
        return None
```
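To sanity-check the loading logic without downloading the full dataset, you can round-trip a fake row through the same TSV-plus-base64 format. The column names below mirror the fields described earlier; the image bytes are a placeholder (just a PNG signature), not a decodable image:

```python
import base64
import io

import pandas as pd

# Stand-in image bytes: the PNG signature plus padding, not a full image.
# Real MMBench `image` cells hold complete base64-encoded images.
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
b64_cell = base64.b64encode(fake_png).decode()

# A hypothetical one-row TSV mimicking the MMBench dev layout.
tsv = (
    "index\timage\tquestion\tA\tB\tcategory\tl2-category\tanswer\n"
    f"0\t{b64_cell}\tIs this a photo?\tYes\tNo\timage_style\tperception\tB\n"
)
df = pd.read_csv(io.StringIO(tsv), sep="\t")

# Round-trip the image cell back to raw bytes, as decode_base64_to_image does
# before handing the buffer to PIL.
image_bytes = base64.b64decode(df.iloc[0]["image"])
assert image_bytes == fake_png
```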
## How to construct the inference prompt
```python
if data_sample['context'] is not None:
    prompt = data_sample['context'] + ' ' + data_sample['question'] + ' ' + data_sample['options']
else:
    prompt = data_sample['question'] + ' ' + data_sample['options']
```
For example:

Question: Which category does this image belong to?
A. Oil Painting
B. Sketch
C. Digital art
D. Photo

<div align=center>
<img src="https://github-production-user-asset-6210df.s3.amazonaws.com/34324155/255581681-1364ef43-bd27-4eb5-b9e5-241327b1f920.png" width="50%"/>
</div>

```python
prompt = """
###Human: Question: Which category does this image belong to?
There are several options: A. Oil Painting, B. Sketch, C. Digital art, D. Photo
###Assistant:
"""
```
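The template above can be produced with a small helper. `build_prompt` is a hypothetical name, and the `###Human`/`###Assistant` markers are taken from this example; adapt them to whatever chat format your model expects:

```python
def build_prompt(question, options, context=None):
    # Prepend the optional context, then wrap the question and options in the
    # Human/Assistant template shown above.
    body = f"{context} {question}" if context else question
    return f"###Human: Question: {body}\n{options}\n###Assistant:"

prompt = build_prompt(
    "Which category does this image belong to?",
    "There are several options: A. Oil Painting, B. Sketch, C. Digital art, D. Photo",
)
```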
You can make custom modifications to the prompt.
## How to save results
You should dump your model's predictions into an Excel (`.xlsx`) file, and this file should contain the following fields:
```
question: the question
A: the first choice
B: the second choice
C: the third choice
D: the fourth choice
prediction: your model's prediction for the current question
category: the leaf category
l2_category: the l2-level category
index: the question index
```
If a question has fewer than four options, simply leave the unused option fields blank.
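A minimal sketch of producing such a file with pandas follows; the record values are placeholders, and `DataFrame.to_excel` requires an Excel engine such as `openpyxl` to be installed:

```python
import pandas as pd

# Hypothetical predictions collected during inference, one dict per question.
records = [
    {
        "index": 0,
        "question": "Which category does this image belong to?",
        "A": "Oil Painting",
        "B": "Sketch",
        "C": "Digital art",
        "D": "Photo",
        "prediction": "D",
        "category": "image_style",
        "l2_category": "finegrained_perception",
    },
]
out = pd.DataFrame(records)
# out.to_excel("predictions.xlsx", index=False)  # requires openpyxl
```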