# LMMs Eval Documentation

Welcome to the documentation for `lmms-eval` - a unified evaluation framework for Large Multimodal Models!

This framework enables consistent and reproducible evaluation of multimodal models across various tasks and modalities, including images, videos, and audio.
## Overview

`lmms-eval` provides:

- Standardized evaluation protocols for multimodal models
- Support for image, video, and audio tasks
- Easy integration of new models and tasks
- Reproducible benchmarking with shareable configurations
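For a first run, the command below shows the general shape of an evaluation (a minimal sketch: the model name, `model_args` value, and task are illustrative placeholders, and available flags vary by version - see the [Commands Guide](commands.md) and [Run Examples](run_examples.md) for authoritative usage):

```bash
# Evaluate one model on one task and save per-sample logs.
# "llava", the pretrained checkpoint, and "mme" are placeholder values.
python3 -m lmms_eval \
    --model llava \
    --model_args pretrained="liuhaotian/llava-v1.5-7b" \
    --tasks mme \
    --batch_size 1 \
    --log_samples \
    --output_path ./logs/
```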
Much of this documentation is adapted from [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/).
## Table of Contents

* **[Commands Guide](commands.md)** - Learn about command-line flags and options
* **[Model Guide](model_guide.md)** - How to add and integrate new models
* **[Task Guide](task_guide.md)** - Create custom evaluation tasks
* **[Current Tasks](current_tasks.md)** - List of all supported evaluation tasks
* **[Run Examples](run_examples.md)** - Example commands for running evaluations
* **[Caching](caching.md)** - Enable and reload results from the JSONL cache
* **[Version 0.3 Features](lmms-eval-0.3.md)** - Audio evaluation and new features
* **[Throughput Metrics](throughput_metrics.md)** - Understanding performance metrics
## Additional Resources

* For dataset formatting tools, see [lmms-eval tools](https://github.com/EvolvingLMMs-Lab/lmms-eval/tree/main/tools)
* For the latest updates, visit our [GitHub repository](https://github.com/EvolvingLMMs-Lab/lmms-eval)