# LMMs Eval Documentation

Welcome to the documentation for `lmms-eval` - a unified evaluation framework for Large Multimodal Models!

This framework enables consistent and reproducible evaluation of multimodal models across tasks and modalities, including images, videos, and audio.

## Overview

`lmms-eval` provides:
- Standardized evaluation protocols for multimodal models
- Support for image, video, and audio tasks
- Easy integration of new models and tasks
- Reproducible benchmarking with shareable configurations

The majority of this documentation is adapted from [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/).

## Table of Contents

* **[Commands Guide](commands.md)** - Learn about command line flags and options
* **[Model Guide](model_guide.md)** - How to add and integrate new models
* **[Task Guide](task_guide.md)** - Create custom evaluation tasks
* **[Current Tasks](current_tasks.md)** - List of all supported evaluation tasks
* **[Run Examples](run_examples.md)** - Example commands for running evaluations
* **[Caching](caching.md)** - Enable and reload results from the JSONL cache
* **[Version 0.3 Features](lmms-eval-0.3.md)** - Audio evaluation and new features
* **[Throughput Metrics](throughput_metrics.md)** - Understanding performance metrics

## Additional Resources

* For dataset formatting tools, see [lmms-eval tools](https://github.com/EvolvingLMMs-Lab/lmms-eval/tree/main/tools)
* For the latest updates, visit our [GitHub repository](https://github.com/EvolvingLMMs-Lab/lmms-eval)