Update README.md
README.md
CHANGED
@@ -17,7 +17,7 @@ metrics:
 # Model Information
 
 
-
+Moxoff-Phi3Mini-KTO is an updated version of [Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct), aligned with KTO and QLora.
 
 - It's trained on [distilabel-intel-orca-kto](https://huggingface.co/datasets/argilla/distilabel-intel-orca-kto).
 
@@ -27,7 +27,7 @@ We evaluated the model using the same test sets as used for the [Open LLM Leader
 
 | hellaswag acc_norm | arc_challenge acc_norm | m_mmlu 5-shot acc | Average |
 |:----------------------| :--------------- | :-------------------- | :------- |
-| 0.7915 | 0.5606 | 0.6939 | 0.
+| 0.7915 | 0.5606 | 0.6939 | 0.682 |
 
 
 ## Usage
@@ -43,8 +43,8 @@ from transformers import AutoModelForCausalLM, AutoTokenizer
 
 device = "cpu" # if you want to use the gpu make sure to have cuda toolkit installed and change this to "cuda"
 
-model = AutoModelForCausalLM.from_pretrained("MoxoffSpA/
-tokenizer = AutoTokenizer.from_pretrained("MoxoffSpA/
+model = AutoModelForCausalLM.from_pretrained("MoxoffSpA/Moxoff-Phi3Mini-KTO")
+tokenizer = AutoTokenizer.from_pretrained("MoxoffSpA/Moxoff-Phi3Mini-KTO")
 
 question = """Quanto è alta la torre di Pisa?"""
 context = """
@@ -78,7 +78,7 @@ print(trimmed_output)
 
 ## Bias, Risks and Limitations
 
-
+Moxoff-Phi3Mini-KTO has not been aligned to human preferences for safety within the RLHF phase or deployed with in-the-loop filtering of
 responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so). It is also unknown what the size and composition
 of the corpus was used to train the base model, however it is likely to have included a mix of Web data and technical sources
 like books and code.
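The `Average` cell completed by this change can be sanity-checked against the three benchmark scores in the same row. The sketch below assumes the column is the unweighted mean of those scores, rounded to three decimals; the README does not state this explicitly, and the variable names are illustrative only.

```python
# Scores copied from the results row in the diff.
scores = {
    "hellaswag acc_norm": 0.7915,
    "arc_challenge acc_norm": 0.5606,
    "m_mmlu 5-shot acc": 0.6939,
}

# Assumption: "Average" is the plain arithmetic mean, rounded to 3 decimals.
average = round(sum(scores.values()) / len(scores), 3)
print(average)  # 0.682
```

Under that assumption the result matches the `0.682` value added to the table.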
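The comment on the `device` line in the Usage hunk asks the reader to switch to `"cuda"` by hand. A minimal sketch of picking the device at runtime instead, assuming PyTorch is the backend. `pick_device` and `load_model` are hypothetical helper names that do not appear in the README, and `load_model` downloads the checkpoint when it is called.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and reports a usable GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so there is nothing to probe
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


def load_model(model_id: str = "MoxoffSpA/Moxoff-Phi3Mini-KTO"):
    """Load model and tokenizer as in the README, then move the model to the chosen device."""
    # Imported lazily so pick_device() works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = pick_device()
    model = AutoModelForCausalLM.from_pretrained(model_id).to(device)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    return model, tokenizer, device
```

Calling `load_model()` with no arguments uses the repository id introduced by this commit. The returned `device` string can be reused when moving tokenized inputs to the same device.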