---
library_name: transformers
license: apache-2.0
language:
- en
- fr
- de
- es
- it
- pt
- ru
- zh
- ja
base_model:
- mistralai/Mistral-Nemo-Instruct-2407
---

# Model Card for educa-ai-nemo-sft

## Model Details

### Model Description

`educa-ai-nemo-sft` is our SFT fine-tune of the powerful [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407), trained on an internal dataset that contains a unique mix of German and English instruction data covering a multitude of domains. During its creation we paid special attention to data points that can improve performance in the educational field (text analysis, supporting students in completing textual tasks, and similar use cases).

This is a preliminary release and subject to changes or updates. We plan to publish a preference-aligned version of this model in the near future.

- **Developed by:** [Digital Learning GmbH](https://huggingface.co/DigitalLearningGmbH)
- **Funded by:** [Digital Learning GmbH](https://huggingface.co/DigitalLearningGmbH)
- **Shared by:** [Digital Learning GmbH](https://huggingface.co/DigitalLearningGmbH)
- **Model type:** Transformer decoder LLM
- **Language(s) (NLP):** English, French, German, Spanish, Italian, Portuguese, Russian, Chinese, Japanese
- **License:** [Apache License 2.0](https://choosealicense.com/licenses/apache-2.0/)
- **Finetuned from model:** [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407)

## Uses

As stated above, this is a preliminary release: we are still benchmarking the model and improving our datasets for possible further training. As such, we do not recommend using this model in a production setting yet, and we look forward to engaging with the community on possible downstream uses and improvements.

## Bias, Risks, and Limitations

Refer to the [original model card](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) for an overview of the general risks associated with using this model. Because this version is fine-tuned with SFT only, without any preference alignment, it may produce harmful output. Use it at your own discretion, taking the potential risks into account.

## How to Get Started with the Model

Refer to the [original model card](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) for code examples. Be aware that this model uses a slightly different chat template from the original: the system prompt is placed before the first user prompt (i.e. before the first occurrence of `[INST]`). The updated template is included in the tokenizer config, so you can use `tokenizer.apply_chat_template` as usual; see the sketch below.
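
Below is a minimal sketch of loading the model and applying the updated chat template with `transformers`. The repository id `DigitalLearningGmbH/educa-ai-nemo-sft` and the example prompts are illustrative assumptions, not taken from the original card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DigitalLearningGmbH/educa-ai-nemo-sft"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": "You are a patient tutor who explains texts step by step."},
    {"role": "user", "content": "Summarize the main argument of the following paragraph: ..."},
]

# The tokenizer config ships the updated chat template, so this call already
# places the system prompt before the first [INST] block.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
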
## Training Details

### Training Data

The model was trained on a mix of publicly available, permissively licensed data and, for the larger part, unique internal datasets that we created ourselves. The data includes examples of up to 16,384 tokens in length, further strengthening the model's long-context capability.
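
Because the fine-tuning data tops out at 16,384 tokens, it can be useful to check that a prompt stays inside that window before sending it to the model. A minimal sketch, again assuming the hypothetical `DigitalLearningGmbH/educa-ai-nemo-sft` repository id:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("DigitalLearningGmbH/educa-ai-nemo-sft")  # assumed repo id

MAX_SFT_LENGTH = 16_384  # longest examples seen during fine-tuning

def fits_sft_window(text: str) -> bool:
    """Return True if `text` fits within the token budget covered by SFT."""
    return len(tokenizer(text).input_ids) <= MAX_SFT_LENGTH

print(fits_sft_window("A long study text ..."))
```
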
## Evaluation

**IMPORTANT:** We performed our benchmarks using lighteval. The accuracy numbers obtained this way differ greatly from the base model's official benchmarks and from results produced with other benchmark suites. For a fair comparison, we therefore ran the same lighteval benchmarks on the [base model](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) under the exact same conditions.

**As of 2025-01-24, we are working on re-running these benchmarks with a different suite, as well as on running more German-specific benchmarks.**

### English Benchmarks

| Benchmark | Mistral-Nemo-Instruct-2407 | educa-ai-nemo-sft |
| --- | --- | --- |
| HellaSwag (0-shot) | **44.33%** | 38.65% |
| WinoGrande (0-shot) | 55.49% | **58.56%** |
| OpenBookQA (0-shot) | **40.60%** | 36.40% |
| CommonSenseQA (0-shot) | 37.26% | **39.31%** |
| TruthfulQA (0-shot) | 56.12% | **59.94%** |
| MMLU (5-shot) | 30.10% | **37.91%** |

### Multilingual Benchmarks (MMLU)

| Language | Mistral-Nemo-Instruct-2407 | educa-ai-nemo-sft |
| --- | --- | --- |
| French | **30.32%** | 29.05% |
| German | 27.69% | **41.82%** |
| Spanish | 24.69% | **30.25%** |
| Italian | 31.29% | **34.81%** |
| Portuguese | 24.16% | **28.81%** |
| Chinese | 34.80% | **37.85%** |
| Japanese | 34.27% | **35.18%** |

## Model Card Authors

This model card was written by [Lennard Michael Strohmeyer](https://huggingface.co/LenDigLearn).

## Model Card Contact

[Lennard Michael Strohmeyer](https://huggingface.co/LenDigLearn)