---
library_name: transformers
license: mit
language:
- fr
- en
tags:
- french
- chocolatine
datasets:
- jpacifico/french-orca-dpo-pairs-revised
pipeline_tag: text-generation
---

# Based on Original Model
https://huggingface.co/jpacifico/Chocolatine-3B-Instruct-DPO-v1.2
* Duplicated here for testing and training purposes.
* Thanks to the original author.

# Original Model Card for Chocolatine-3B-Instruct-DPO-v1.2

### Chocolatine-3B-Instruct-DPO-v1.2

Best version of Chocolatine-3B for French.
*The model supports a 128K context length*.

DPO fine-tune of [microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) (3.82B params)
using the [jpacifico/french-orca-dpo-pairs-revised](https://huggingface.co/datasets/jpacifico/french-orca-dpo-pairs-revised) RLHF dataset.
Training in French also improves the model in English, surpassing the performance of its base model.

### MT-Bench-French

Chocolatine-3B-Instruct-DPO-v1.2 outperforms Phi-3-medium-4k-instruct (14B) and its base model Phi-3.5-mini-instruct in average score on [MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french), evaluated with [multilingual-mt-bench](https://github.com/Peter-Devine/multilingual_mt_bench) and GPT-4-Turbo as LLM judge.

```
########## First turn ##########
                                           score
model                               turn
gpt-4o-mini                         1     9.2875
Chocolatine-14B-Instruct-4k-DPO     1     8.6375
Chocolatine-14B-Instruct-DPO-v1.2   1     8.6125
Phi-3.5-mini-instruct               1     8.5250
Chocolatine-3B-Instruct-DPO-v1.2    1     8.3750
Phi-3-medium-4k-instruct            1     8.2250
gpt-3.5-turbo                       1     8.1375
Chocolatine-3B-Instruct-DPO-Revised 1     7.9875
Daredevil-8B                        1     7.8875
Meta-Llama-3.1-8B-Instruct          1     7.0500
vigostral-7b-chat                   1     6.7875
Mistral-7B-Instruct-v0.3            1     6.7500
gemma-2-2b-it                       1     6.4500
French-Alpaca-7B-Instruct_beta      1     5.6875
vigogne-2-7b-chat                   1     5.6625

########## Second turn ##########
                                             score
model                               turn
gpt-4o-mini                         2     8.912500
Chocolatine-14B-Instruct-DPO-v1.2   2     8.337500
Chocolatine-3B-Instruct-DPO-Revised 2     7.937500
Chocolatine-3B-Instruct-DPO-v1.2    2     7.862500
Phi-3-medium-4k-instruct            2     7.750000
Chocolatine-14B-Instruct-4k-DPO     2     7.737500
gpt-3.5-turbo                       2     7.679167
Phi-3.5-mini-instruct               2     7.575000
Daredevil-8B                        2     7.087500
Meta-Llama-3.1-8B-Instruct          2     6.787500
Mistral-7B-Instruct-v0.3            2     6.500000
vigostral-7b-chat                   2     6.162500
gemma-2-2b-it                       2     6.100000
French-Alpaca-7B-Instruct_beta      2     5.487395
vigogne-2-7b-chat                   2     2.775000

########## Average ##########
                                         score
model
gpt-4o-mini                           9.100000
Chocolatine-14B-Instruct-DPO-v1.2     8.475000
Chocolatine-14B-Instruct-4k-DPO       8.187500
Chocolatine-3B-Instruct-DPO-v1.2      8.118750
Phi-3.5-mini-instruct                 8.050000
Phi-3-medium-4k-instruct              7.987500
Chocolatine-3B-Instruct-DPO-Revised   7.962500
gpt-3.5-turbo                         7.908333
Daredevil-8B                          7.487500
Meta-Llama-3.1-8B-Instruct            6.918750
Mistral-7B-Instruct-v0.3              6.625000
vigostral-7b-chat                     6.475000
gemma-2-2b-it                         6.275000
French-Alpaca-7B-Instruct_beta        5.587866
vigogne-2-7b-chat                     4.218750
```

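The average scores above appear to be the plain arithmetic mean of the two per-turn scores. The sketch below checks that assumption for three rows, with values copied from the tables; the names `turn_scores` and `average_score` are illustrative and not part of the benchmark tooling.

```python
# Per-turn MT-Bench-French scores copied from the tables above.
turn_scores = {
    "Chocolatine-3B-Instruct-DPO-v1.2": (8.3750, 7.8625),
    "Phi-3.5-mini-instruct": (8.5250, 7.5750),
    "Phi-3-medium-4k-instruct": (8.2250, 7.7500),
}


def average_score(first_turn: float, second_turn: float) -> float:
    """Arithmetic mean of the two turns (assumed aggregation, not confirmed by the source)."""
    return (first_turn + second_turn) / 2


# Compute the average per model and rank from best to worst.
averages = {model: average_score(*scores) for model, scores in turn_scores.items()}
ranking = sorted(averages, key=averages.get, reverse=True)

for model in ranking:
    print(f"{model:<36} {averages[model]:.6f}")
```

Under this assumption the computed values match the reported averages (8.118750, 8.050000, 7.987500), and the ranking puts Chocolatine-3B-Instruct-DPO-v1.2 ahead of both Phi models even though Phi-3.5-mini-instruct scores higher on the first turn.
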
### Usage

You can run this model using my [Colab notebook](https://github.com/jpacifico/Chocolatine-LLM/blob/main/Chocolatine_3B_inference_test_colab.ipynb).

You can also run Chocolatine using the following code:

```python
import transformers
from transformers import AutoTokenizer

# Hugging Face Hub ID of the model to load
new_model = "jpacifico/Chocolatine-3B-Instruct-DPO-v1.2"

# Format prompt
message = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"}
]
tokenizer = AutoTokenizer.from_pretrained(new_model)
prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)

# Create pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=new_model,
    tokenizer=tokenizer
)

# Generate text
sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_length=200,
)
print(sequences[0]['generated_text'])
```

* A **4-bit quantized version** is available here: [jpacifico/Chocolatine-3B-Instruct-DPO-v1.2-Q4_K_M-GGUF](https://huggingface.co/jpacifico/Chocolatine-3B-Instruct-DPO-v1.2-Q4_K_M-GGUF)

### Limitations

The Chocolatine model is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance.
It does not have any moderation mechanism.

- **Developed by:** Jonathan Pacifico, 2024
- **Model type:** LLM
- **Language(s) (NLP):** French, English
- **License:** MIT