# LMMs Eval Documentation
Welcome to the documentation for `lmms-eval` - a unified evaluation framework for Large Multimodal Models!
This framework enables consistent and reproducible evaluation of multimodal models across various tasks and modalities including images, videos, and audio.
## Overview
`lmms-eval` provides:
- Standardized evaluation protocols for multimodal models
- Support for image, video, and audio tasks
- Easy integration of new models and tasks
- Reproducible benchmarking with shareable configurations
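A typical evaluation is launched from the command line. The sketch below is illustrative only: the model name, task name, and flag spellings follow the lm-eval-harness conventions this framework is adapted from and may differ in your installed version, so consult the [Commands Guide](commands.md) for the authoritative flag list.

```shell
# Hypothetical invocation: evaluate a model on one task and log per-sample outputs.
# Model/task names and flags are assumptions; verify against `python -m lmms_eval --help`.
python -m lmms_eval \
    --model llava \
    --tasks mme \
    --batch_size 1 \
    --log_samples \
    --output_path ./logs/
```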
The majority of this documentation is adapted from [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/).
## Table of Contents
* **[Commands Guide](commands.md)** - Learn about command line flags and options
* **[Model Guide](model_guide.md)** - How to add and integrate new models
* **[Task Guide](task_guide.md)** - Create custom evaluation tasks
* **[Current Tasks](current_tasks.md)** - List of all supported evaluation tasks
* **[Run Examples](run_examples.md)** - Example commands for running evaluations
* **[Caching](caching.md)** - Enable and reload results from the JSONL cache
* **[Version 0.3 Features](lmms-eval-0.3.md)** - Audio evaluation and new features
* **[Throughput Metrics](throughput_metrics.md)** - Understanding performance metrics
## Additional Resources
* For dataset formatting tools, see [lmms-eval tools](https://github.com/EvolvingLMMs-Lab/lmms-eval/tree/main/tools)
* For the latest updates, visit our [GitHub repository](https://github.com/EvolvingLMMs-Lab/lmms-eval)