Evaluation of Large Language Models with NeMo 2.0
==================================================
This directory contains Jupyter Notebook tutorials using the NeMo Framework for evaluating large language models (LLMs):
1. **mmlu.ipynb**
   - Provides an overview of model deployment and the available endpoints.
   - Demonstrates how to run MMLU evaluations for both the completions and chat endpoints to assess model proficiency across diverse subjects (see the first sketch after this list).
2. **simple-evals.ipynb**
   - Shows how to enable additional evaluation frameworks with the evaluation suite.
   - Uses NVIDIA Evals Factory Simple-Evals to demonstrate how to run evaluations for the HumanEval benchmark (second sketch below).
3. **wikitext.ipynb**
   - Illustrates running evaluation tasks without predefined configurations.
   - Uses the WikiText benchmark as an example to define and execute a custom evaluation job (third sketch below).
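
The snippets below are minimal sketches of the workflow these notebooks cover, not authoritative implementations. They assume a model has already been deployed (mmlu.ipynb covers deployment) and is serving a completions endpoint; the module path `nemo.collections.llm.evaluation.api`, the class names `EvaluationConfig` and `EvaluationTarget`, the keyword arguments `target_cfg`/`eval_cfg`, and the URL and parameter names are assumptions to verify against the notebooks.

```python
# Sketch 1: run an MMLU evaluation against a deployed model.
# All module paths, class names, keyword arguments, and the endpoint URL
# below are assumptions; mmlu.ipynb is the authoritative reference.
from nemo.collections.llm import api
from nemo.collections.llm.evaluation.api import EvaluationConfig, EvaluationTarget

# Point the evaluation at an already-deployed completions endpoint
# (mmlu.ipynb also shows the equivalent flow for the chat endpoint).
target_config = EvaluationTarget(
    api_endpoint={
        "url": "http://0.0.0.0:8080/v1/completions/",  # assumed deployment URL
        "type": "completions",
    }
)

# Select the MMLU benchmark; limit_samples (assumed name) keeps the run short.
eval_config = EvaluationConfig(type="mmlu", params={"limit_samples": 10})

results = api.evaluate(target_cfg=target_config, eval_cfg=eval_config)
print(results)
```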
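simple-evals.ipynb follows the same pattern; under the assumptions above, only the task selection changes. The task identifier below is a guess at how the NVIDIA Evals Factory Simple-Evals HumanEval task might be exposed.

```python
# Sketch 2: same flow, selecting a Simple-Evals task instead of MMLU.
# The task identifier and parameter names are assumptions; check
# simple-evals.ipynb for the exact values.
from nemo.collections.llm import api
from nemo.collections.llm.evaluation.api import EvaluationConfig, EvaluationTarget

target_config = EvaluationTarget(
    api_endpoint={"url": "http://0.0.0.0:8080/v1/completions/", "type": "completions"}
)
eval_config = EvaluationConfig(
    type="humaneval",  # assumed identifier for the Simple-Evals HumanEval task
    params={"limit_samples": 5},  # assumed parameter name
)
results = api.evaluate(target_cfg=target_config, eval_cfg=eval_config)
```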
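For a task without a predefined configuration, such as WikiText in wikitext.ipynb, the main difference is that the task type and its parameters are spelled out explicitly rather than picked from a preset. The field names below (`params`, `output_dir`) are illustrative assumptions.

```python
# Sketch 3: a custom evaluation job for a task with no predefined
# configuration (WikiText). Field and parameter names are assumptions
# to verify against wikitext.ipynb.
from nemo.collections.llm import api
from nemo.collections.llm.evaluation.api import EvaluationConfig, EvaluationTarget

target_config = EvaluationTarget(
    api_endpoint={"url": "http://0.0.0.0:8080/v1/completions/", "type": "completions"}
)
eval_config = EvaluationConfig(
    type="wikitext",                # task name exposed by the harness (assumed)
    params={"limit_samples": 10},   # assumed: cap samples for a quick run
    output_dir="wikitext_results",  # assumed: where result artifacts land
)
results = api.evaluate(target_cfg=target_config, eval_cfg=eval_config)
```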