four-two-labs/lynx-micro

alimosavian committed on Jun 6, 2024

Commit 1097ce5 · verified · 1 Parent(s): 1ca8c20

Update README.md

Files changed (1)

README.md +8 -5

@@ -67,7 +67,7 @@ r = pipe(
     messages,
     max_length=4096,
     do_sample=False,
-    eos_token_id=tokenizer.vocab['<end_of_turn>']
 )
 ```
@@ -77,17 +77,20 @@ r = pipe(
 ### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
 ### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 #### Preprocessing [optional]
-[More Information Needed]
 #### Training Hyperparameters

     messages,
     max_length=4096,
     do_sample=False,
+    eos_token_id=[tokenizer.vocab['<end_of_turn>'], tokenizer.eos_token_id],
 )
 ```
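
The change above passes a list of token IDs as `eos_token_id`, so generation can stop on either the end-of-turn token or the tokenizer's regular EOS token. The stopping rule can be sketched in plain Python as follows. This is an illustration of the semantics only, not the `transformers` implementation; the helper name `truncate_at_eos` and the IDs `107` and `1` are hypothetical placeholders.

```python
from typing import Iterable, List, Sequence


def truncate_at_eos(token_ids: Iterable[int], eos_token_ids: Sequence[int]) -> List[int]:
    """Return tokens up to and including the first one found in `eos_token_ids`.

    Hypothetical helper: it mimics how a decoding loop stops as soon as
    ANY of several end-of-sequence IDs is produced.
    """
    stop_ids = set(eos_token_ids)
    output: List[int] = []
    for token in token_ids:
        output.append(token)
        if token in stop_ids:
            break  # first stop token of either kind ends the sequence
    return output


# Placeholder IDs (assumptions): 107 stands in for '<end_of_turn>', 1 for the EOS token.
END_OF_TURN_ID, EOS_ID = 107, 1
stopped_on_turn = truncate_at_eos([5, 6, END_OF_TURN_ID, 7, EOS_ID], [END_OF_TURN_ID, EOS_ID])
stopped_on_eos = truncate_at_eos([5, 6, 7, EOS_ID, 9], [END_OF_TURN_ID, EOS_ID])
```

With only a single stop ID, as in the previous version of the snippet, a sequence that ends with the other token would keep going until `max_length`.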
 ### Training Data
+The model has been trained on a proprietary dataset of ~1.35M examples consisting of:
+ * High-quality Swedish instruct data
+   * Single turn
+   * Multi-turn
+ * High-quality Swedish <-> English translations
 ### Training Procedure
+For training we used Hugging Face Accelerate and TRL.
 #### Preprocessing [optional]
+For efficiency, we packed all the examples into 8K context windows, reducing the number of examples to ~12% of their original count.
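
Packing of this kind can be sketched as a greedy concatenation of tokenized examples into fixed-size windows. The sketch below is a minimal illustration under stated assumptions, not the authors' pipeline or the TRL implementation: the function name `pack_examples`, the 8192-token window, the separator ID, and the truncation of oversized examples are all hypothetical choices.

```python
from typing import Iterable, List


def pack_examples(
    tokenized: Iterable[List[int]],
    window: int = 8192,   # assumed size of an "8K" context window
    sep_id: int = 1,      # assumed separator/EOS token ID appended to each example
) -> List[List[int]]:
    """Greedily pack tokenized examples into windows of at most `window` tokens."""
    packed: List[List[int]] = []
    current: List[int] = []
    for ids in tokenized:
        ids = list(ids) + [sep_id]
        if len(ids) > window:
            # Assumption: oversized examples are truncated (this drops the separator).
            ids = ids[:window]
        if len(current) + len(ids) > window:
            # The example does not fit: close the current window and start a new one.
            packed.append(current)
            current = []
        current.extend(ids)
    if current:
        packed.append(current)
    return packed


# Four toy "examples" become two packed sequences.
examples = [[5] * 3000, [6] * 3000, [7] * 3000, [8] * 1000]
windows = pack_examples(examples)
```

Because several short examples share one window, the number of training sequences drops while the total token count stays roughly the same, which is the effect described in the line above.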
 #### Training Hyperparameters