neoncortex/mini-mistral-openhermes-2.5-chatml-test

@@ -1,5 +1,5 @@
 ---
-license: openrail
 datasets:
 - teknium/OpenHermes-2.5
 language:
@@ -9,8 +9,7 @@ pipeline_tag: text-generation
 ---
 # Model Card for neoncortex/mini-mistral-openhermes-2.5-chatml-test
-A tiny Mistral model trained on teknium/OpenHermes-2.5.
-This is epoch 5/9, so still some training to go.
 ## Model Details
@@ -40,7 +39,7 @@ So, here's the bits:
     {%- if message['role'] == 'system' -%}
         {{- '<|im_start|>system\n' + message['content'].rstrip() + '<|im_end|>\n' -}}
     {%- else -%}
-        {%- if message['role'] == 'user' -%}
             {{-'<|im_start|>human\n' + message['content'].rstrip() + '<|im_end|>\n'-}}
         {%- else -%}
             {{-'<|im_start|>assistant\n' + message['content'] + '<|im_end|>\n' -}}
@@ -71,36 +70,10 @@ Exclusively available right here on HuggingFace!
 If you wanna have a laugh at how bad it is then go ahead, but I wouldn't expect much from it.
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
 ### Out-of-Scope Use
 This model won't work well for pretty much everything, probably.
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 ## How to Get Started with the Model
 Use the code below to get started with the model.
@@ -121,11 +94,11 @@ Use the code below to get started with the model.
 #### Preprocessing
-I took the OpenHermes 2.5 dataset formatted it with ChatML.
 #### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 #### Speeds, Sizes, Times
@@ -134,10 +107,6 @@ steps: 140976
 batches per device: 6
 1.04it/s
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
 ## Evaluation
 I tried to run evals but the eval suite just laughed at me.
@@ -148,21 +117,11 @@ Don't be rude.
 ## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
 - **Hardware Type:** I already told you. Try and keep up.
 - **Hours used:** ~45 x 2 I guess.
 - **Cloud Provider:** gronkomatic
 - **Compute Region:** myob
-- **Carbon Emitted:** Probably
-## Technical Specifications
-### Model Architecture and Objective
-[More Information Needed]
 ### Compute Infrastructure
@@ -176,14 +135,6 @@ I trained it on my PC with no side on it because I like to watch the GPUs do the
 The wonderful free stuff at HuggingFace (https://huggingface.co)[https://huggingface.co]: transformers, datasets, trl
-## Glossary
-IDGAF - I don't give a fuck
-## More Information
-[More Information Needed]
 ## Model Card Authors
 gronkomatic, unless you're offended by something, in which case it was hacked by hackers.

 ---
+license: apache-2.0
 datasets:
 - teknium/OpenHermes-2.5
 language:
 ---
 # Model Card for neoncortex/mini-mistral-openhermes-2.5-chatml-test
+A tiny Mistral model trained as an experiment on teknium/OpenHermes-2.5.
 ## Model Details
     {%- if message['role'] == 'system' -%}
         {{- '<|im_start|>system\n' + message['content'].rstrip() + '<|im_end|>\n' -}}
     {%- else -%}
+        {%- if message['role'] == 'human' -%}
             {{-'<|im_start|>human\n' + message['content'].rstrip() + '<|im_end|>\n'-}}
         {%- else -%}
             {{-'<|im_start|>assistant\n' + message['content'] + '<|im_end|>\n' -}}
 If you wanna have a laugh at how bad it is then go ahead, but I wouldn't expect much from it.
 ### Out-of-Scope Use
 This model won't work well for pretty much everything, probably.
 ## How to Get Started with the Model
 Use the code below to get started with the model.
 #### Preprocessing
+I took the OpenHermes 2.5 dataset and formatted it with ChatML.
 #### Training Hyperparameters
+- **Training regime:** bf16 mixed precision
 #### Speeds, Sizes, Times
 batches per device: 6
 1.04it/s
 ## Evaluation
 I tried to run evals but the eval suite just laughed at me.
 ## Environmental Impact
 - **Hardware Type:** I already told you. Try and keep up.
 - **Hours used:** ~45 x 2 I guess.
 - **Cloud Provider:** gronkomatic
 - **Compute Region:** myob
+- **Carbon Emitted:** Yes, definitely
 ### Compute Infrastructure
 The wonderful free stuff at HuggingFace (https://huggingface.co)[https://huggingface.co]: transformers, datasets, trl
 ## Model Card Authors
 gronkomatic, unless you're offended by something, in which case it was hacked by hackers.
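
Note on the chat template hunk: the condition changed from `message['role'] == 'user'` to `message['role'] == 'human'`, while the emitted tag was already `<|im_start|>human`. A minimal Python sketch of just the branches visible in the hunk may help show the effect. The function name `render_visible_branches`, its `user_role` parameter, and the surrounding loop are assumptions for illustration; the rest of the real template is not shown in the diff and is not reproduced here.

```python
def render_visible_branches(messages, user_role="human"):
    """Mirror only the if/else branches visible in the diff hunk.

    Hypothetical helper: the loop over messages is assumed, and
    `user_role` stands in for the role name the template compares against
    ("user" before the change, "human" after it).
    """
    parts = []
    for message in messages:
        role, content = message["role"], message["content"]
        if role == "system":
            # System turns have trailing whitespace stripped.
            parts.append("<|im_start|>system\n" + content.rstrip() + "<|im_end|>\n")
        elif role == user_role:
            # User turns are always emitted under the "human" tag.
            parts.append("<|im_start|>human\n" + content.rstrip() + "<|im_end|>\n")
        else:
            # Any other role falls through to the assistant branch, unstripped.
            parts.append("<|im_start|>assistant\n" + content + "<|im_end|>\n")
    return "".join(parts)


chat = [
    {"role": "system", "content": "Be brief. "},
    {"role": "human", "content": "Hi\n"},
    {"role": "assistant", "content": "Hello"},
]
print(render_visible_branches(chat))
```

In this sketch, a message whose role is `user` falls through to the assistant branch once the condition compares against `human`, so callers would presumably need to pass `human` as the role name after this change.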