Lucie-7B-Instruct-human-data is fine-tuned on human-produced instructions, collected either from open annotation campaigns or by applying templates to existing datasets. Its performance falls below that of [Lucie-7B-Instruct](https://huggingface.co/OpenLLM-France/Lucie-7B-Instruct); the interest of the model is to show what can be achieved when fine-tuning an LLM to follow instructions without relying on data generated by third-party LLMs.
While Lucie-7B-Instruct-human-data is trained on sequences of 4096 tokens, its base model, Lucie-7B, has a context size of 32K tokens. Based on needle-in-a-haystack evaluations, Lucie-7B-Instruct-human-data maintains the capacity of the base model to handle 32K-token context windows.
## Training details
### Training data
### Training procedure
The model architecture and hyperparameters are the same as those of [Lucie-7B](https://huggingface.co/OpenLLM-France/Lucie-7B) during its annealing phase, with the following exceptions:
* context length: 4096<sup>*</sup>
* batch size: 1024
* max learning rate: 3e-5
* min learning rate: 3e-6
<sup>*</sup>As noted above, while Lucie-7B-Instruct-human-data is trained on sequences of 4096 tokens, it maintains the capacity of the base model, Lucie-7B, to handle context sizes of up to 32K tokens.
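The learning-rate bounds listed above can be illustrated with a cosine decay schedule, a common choice for annealing phases. The schedule shape, the step count, and the `cosine_lr` helper are illustrative assumptions; only the 3e-5 / 3e-6 bounds come from the list above.

```python
import math

def cosine_lr(step, total_steps, max_lr=3e-5, min_lr=3e-6):
    """Cosine decay from max_lr down to min_lr (assumed schedule shape).

    The bounds match the hyperparameters listed above; the cosine shape
    and total_steps are illustrative, not taken from the model card.
    """
    progress = step / total_steps
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))

# Endpoints match the listed bounds: max_lr at step 0, min_lr at the end.
start = cosine_lr(0, 1000)      # ~3e-5
end = cosine_lr(1000, 1000)     # ~3e-6
```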
## Testing the model
## Acknowledgements
This work was performed using HPC resources from GENCI–IDRIS (Grant 2024-GC011015444). We gratefully acknowledge support from GENCI and IDRIS and from Pierre-François Lavallée (IDRIS) and Stephane Requena (GENCI) in particular.
Lucie-7B was created by members of [LINAGORA](https://labs.linagora.com/) and the [OpenLLM-France](https://www.openllm-france.fr/) community, including in alphabetical order:
Olivier Gouvert (LINAGORA),

Olivier Ferret (CEA)
for their helpful input.
Finally, we thank the entire OpenLLM-France community, whose members have helped in diverse ways.
## Contact
contact@openllm-france.fr