---
pipeline_tag: text-generation
---

# Orca 2

<!-- Provide a quick summary of what the model is/does. -->

In Orca 2, we continue exploring how improved training signals can give smaller LMs enhanced reasoning abilities, typically found only in much larger models. We seek to teach small LMs to employ different solution strategies for different tasks, potentially different from the one used by the larger model. For example, while larger models might provide a direct answer to a complex task, smaller models may not have the same capacity. In Orca 2, we teach the model various reasoning techniques (step-by-step, recall then generate, recall-reason-generate, direct answer, etc.). More crucially, we aim to help the model learn to determine the most effective solution strategy for each task. Orca 2 models were trained by continual training of LLaMA-2 base models of the same size.

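To illustrate the idea of strategy-conditioned prompting (this is only a toy sketch, not the actual training setup; the strategy names come from the paragraph above, while the instruction strings and the dispatcher are hypothetical paraphrases, not the instructions used to train Orca 2):

```python
# Illustrative only: how different system instructions could steer a model
# toward different solution strategies. These strings are hypothetical
# paraphrases, NOT the actual Orca 2 training instructions.
REASONING_STRATEGIES = {
    "step-by-step": "Think through the problem step by step before answering.",
    "recall-then-generate": "First recall the relevant facts, then use them to compose your answer.",
    "recall-reason-generate": "Recall relevant facts, reason over them explicitly, then give the final answer.",
    "direct-answer": "Answer the question directly, without showing intermediate reasoning.",
}


def pick_strategy(task_type: str) -> str:
    """Toy dispatcher: Orca 2 is trained to make this choice implicitly."""
    if task_type in ("math", "multi-step"):
        return "step-by-step"
    if task_type == "knowledge":
        return "recall-then-generate"
    return "direct-answer"
```

The point of Orca 2 is that this dispatch is not hand-coded: the model itself learns which strategy suits a given task.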
## Model Details

Refer to LLaMA-2 for details on model architectures.

## Uses

## Bias, Risks, and Limitations

Orca 2, built upon the LLaMA 2 model family, retains many of its limitations, as well as the common limitations of other large language models and limitations arising from its training process, including:

**Data Biases**: Large language models, trained on extensive data, can inadvertently carry biases present in the source data. Consequently, the models may generate outputs that could be potentially biased or unfair.

**Lack of Contextual Understanding**: Despite their impressive capabilities in language understanding and generation, these models exhibit limited real-world understanding, resulting in potential inaccuracies or nonsensical responses.

**Lack of Transparency**: Due to their complexity and size, large language models can act as “black boxes”, making it difficult to comprehend the rationale behind specific outputs or decisions. We recommend reviewing transparency notes from Azure for more information.

**Content Harms**: There are various types of content harms that large language models can cause. It is important to be aware of them when using these models and to take action to prevent them. We recommend leveraging the content moderation services provided by various companies and institutions. On an important note, we hope for better regulations and standards from governments and technology leaders around content harms for AI technologies in the future. We value and acknowledge the important role that the research and open-source community can play in this direction.

**Hallucination**: It is important to be aware and cautious of relying entirely on a given language model for critical decisions or information that might have a deep impact, as it is not obvious how to prevent these models from fabricating content. Moreover, it is not clear whether small models may be more susceptible to hallucination in ungrounded generation use cases due to their smaller sizes and hence reduced memorization capacities. This is an active research topic, and we hope there will be more rigorous measurement, understanding, and mitigation around this topic.

**Potential for Misuse**: Without suitable safeguards, there is a risk that these models could be maliciously used for generating disinformation or harmful content.

**Data Distribution**: Orca 2’s performance is likely to correlate strongly with the distribution of the tuning data. This correlation might limit its accuracy in areas underrepresented in the training dataset, such as math, coding, and reasoning.

**System messages**: Orca 2 demonstrates variance in performance depending on the system instructions. Additionally, the stochasticity introduced by the model size may lead to generation of non-deterministic responses to different system instructions.

**Zero-Shot Settings**: Orca 2 was trained on data that mostly simulates zero-shot settings. While the model demonstrates very strong performance in zero-shot settings, it does not show the same gains from few-shot learning as other, especially larger, models.

**Synthetic data**: As Orca 2 is trained on synthetic data, it could inherit both the advantages and shortcomings of the models and methods used for data generation. We posit that Orca 2 benefits from the safety measures incorporated during training and the safety guardrails (e.g., content filter) within the Azure OpenAI API. However, detailed studies are required for better quantification of such risks.

This model is solely designed for research settings, and its testing has only been carried out in such environments. It should not be used in downstream applications, as additional analysis is needed to assess potential harm or bias in the proposed application.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
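Until the official snippet is added above, the following is a minimal sketch using the Hugging Face `transformers` library. It assumes the checkpoint is published under a hub id such as `microsoft/Orca-2-13b` and that Orca 2 expects a ChatML-style prompt; verify both assumptions against the released checkpoints before relying on them.

```python
# Minimal sketch, not an official usage snippet. Assumptions (verify before
# use): the hub id "microsoft/Orca-2-13b" and the ChatML-style prompt format.


def build_prompt(system_message: str, user_message: str) -> str:
    """Assemble a ChatML-style prompt (assumed format for Orca 2)."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant"
    )


if __name__ == "__main__":
    # Heavy imports are kept here so the prompt helper stays dependency-free.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "microsoft/Orca-2-13b"  # assumed hub id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    prompt = build_prompt(
        "You are Orca, an AI language model. Answer carefully, step by step.",
        "How can you determine if a restaurant is popular among locals?",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.batch_decode(output_ids)[0])
```

Note that, per the limitations above, generations can vary noticeably with the chosen system message.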