Update README.md

README.md (changed)
@@ -14,20 +14,22 @@ Instruction-tuned version of the fully trained Open LLama 7B v2 model. The mode

- This model performs better on code compared to v1, due to the improvements made to the base model by the openlm-research team.
- The instruction model is trained on an improved instruction-tuning dataset compared to v1.

**NOTE**: The model was trained using the Alpaca prompt template.

**NOTE**: The fast tokenizer results in incorrect encoding; set the `use_fast = False` parameter when instantiating the tokenizer.

## License

- **Commercially Viable**

## Datasets used for Fine-Tuning

**Open-instruct**

**Open-instruct-v1**
- Mosaic/Dolly-HHRLHF + filtered OASST1 - cc-by-3.0

**Subset of COT SUBMIX (from FLAN v2) zero-shot examples**
- ESNLI - MIT
- ECQA - CDLA 1.0 - Sharing
- Strategy - MIT

@@ -35,7 +37,6 @@

- gsm8k - MIT
- aqua - MIT
- qasc - Apache 2.0

<br>
- Language model ([openlm-research/open_llama_v2_7b](https://huggingface.co/openlm-research/open_llama_v2_7b)) is under apache-2.0
- Dataset ([VMware/open-instruct](https://huggingface.co/datasets/VMware/open-instruct)) is under cc-by-sa-3.0

@@ -46,6 +47,7 @@

- Model Size: 7B parameters
- Dataset: Open-instruct

## Use in Transformers

```
…
output = tokenizer.decode(output1[0])
print(output)
```
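
Most of the snippet above is collapsed in this diff view. As a reference, here is a minimal sketch of equivalent usage; the repo id, the exact Alpaca template wording, and the generation settings are assumptions, not quoted from the card:

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "VMware/open-llama-7b-v2-open-instruct"  # assumed repo id

# The fast tokenizer mis-encodes for this model (see NOTE above), hence use_fast=False.
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

# Alpaca-style prompt template (assumed wording; see NOTE above).
prompt_template = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)
prompt = prompt_template.format(
    instruction="How do attention mechanisms work in transformer models?"
)

input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
output1 = model.generate(input_ids, max_new_tokens=512)

# Decode only the newly generated tokens, not the echoed prompt.
output = tokenizer.decode(output1[0][input_ids.shape[1]:])
print(output)
```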

### Output

Sure, I can help you with that!

Attention mechanisms in transformer models are typically implemented using the attention mechanism in the self-attention layer. Self-attention allows the model to focus on different parts of the input sequence when processing it. This is achieved by computing a set of attention weights, which are used to weigh the contribution of each input element to the output.

@@ -129,8 +132,11 @@ The output of the `attention_weights` function is a NumPy tensor that represents

I hope this helps!</s>

<hr>
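
The sample answer above refers to an `attention_weights` function that is not shown in this view. For illustration only (this is not the code the model generated), a self-contained scaled-dot-product attention sketch in NumPy:

```
import numpy as np

def attention_weights(query, key):
    # Similarity of every query to every key, scaled by sqrt of the key dimension.
    d_k = key.shape[-1]
    scores = query @ key.T / np.sqrt(d_k)
    # Softmax over the key axis, with max-subtraction for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

# Four tokens with 8-dimensional embeddings; weights[i, j] says how much
# token i attends to token j, and the output is a weighted sum of the values.
rng = np.random.default_rng(0)
q, k, v = rng.random((3, 4, 8))
w = attention_weights(q, k)
print(w.shape, (w @ v).shape)  # (4, 4) (4, 8)
```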
## Finetuning details
The finetuning scripts will be available in our [RAIL Github Repository](https://github.com/vmware-labs/research-and-development-artificial-intelligence-lab/tree/main/instruction-tuning)
## Evaluation

**TODO**