| | --- |
| | license: apache-2.0 |
| | library_name: peft |
| | tags: |
| | - code |
| | datasets: |
| | - mlabonne/CodeLlama-2-20k |
| | base_model: meta-llama/Llama-2-7b-hf |
| | --- |
| | |
| | # 🦙💻 CodeLlama |
| |
|
| | 📝 [Article](https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32) | |
| | 💻 [Colab](https://colab.research.google.com/drive/1PEQyJO1-f6j0S_XJ8DV50NkpzasXkrzd?usp=sharing) | |
| | 📄 [Script](https://gist.github.com/mlabonne/b5718e1b229ce6553564e3f56df72c5c) |
| |
|
| | <center><img src="https://i.imgur.com/yTPNIZj.png" width="300"></center> |
| |
|
| | `CodeLlama-7b` is a Llama 2 version of [**CodeAlpaca**](https://github.com/sahil280114/codealpaca). |
| |
|
| | ## 🔧 Training |
| |
|
| | This model is based on the `llama-2-7b-chat-hf` model, fine-tuned using QLoRA on the [`mlabonne/CodeLlama-2-20k`](https://huggingface.co/datasets/mlabonne/CodeLlama-2-20k) dataset. It was trained on an RTX 3090 and can be used for inference. |
| |
|
| | It was trained using this custom [`finetune_llama2.py`](https://gist.github.com/mlabonne/b5718e1b229ce6553564e3f56df72c5c) script as follows: |
| |
|
| | ``` bash |
| | python finetune_llama2.py --dataset_name=mlabonne/CodeLlama-2-20k --new_model=mlabonne/codellama-2-7b --bf16=True --learning_rate=2e-5 |
| | ``` |
| |
|
| | <center><img src="https://i.imgur.com/5Qx7Kzo.png"></center> |
| |
|
| | ## 💻 Usage |
| |
|
| | ``` python |
| | # pip install transformers accelerate |
| | |
| | from transformers import AutoTokenizer |
| | import transformers |
| | import torch |
| | |
| | model = "mlabonne/codellama-2-7b" |
| | prompt = "Write Python code to generate an array with all the numbers from 1 to 100" |
| | |
| | # Load the tokenizer and build a half-precision text-generation pipeline |
| | tokenizer = AutoTokenizer.from_pretrained(model) |
| | pipeline = transformers.pipeline( |
| |     "text-generation", |
| |     model=model, |
| |     torch_dtype=torch.float16, |
| |     device_map="auto", |
| | ) |
| | |
| | # Wrap the prompt in [INST] tags and sample a single completion |
| | sequences = pipeline( |
| |     f'<s>[INST] {prompt} [/INST]', |
| |     do_sample=True, |
| |     top_k=10, |
| |     num_return_sequences=1, |
| |     eos_token_id=tokenizer.eos_token_id, |
| |     max_length=200, |
| | ) |
| | for seq in sequences: |
| |     print(f"Result: {seq['generated_text']}") |
| | ``` |
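
The snippet above wraps the instruction in `[INST]` tags by hand. As a minimal sketch, the same formatting can be factored into two small helpers. The names `build_prompt` and `extract_answer` are hypothetical and not part of `transformers`. The second helper assumes that `generated_text` contains the prompt followed by the completion, which is the pipeline's usual behaviour when the full text is returned.

```python
# Hypothetical helpers for illustration; they are not part of transformers.
INST_OPEN = "[INST]"
INST_CLOSE = "[/INST]"


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the same tags used in the usage snippet."""
    return f"<s>{INST_OPEN} {instruction.strip()} {INST_CLOSE}"


def extract_answer(generated_text: str) -> str:
    """Return the text after the last [/INST] tag.

    If the tag is absent, the whole (stripped) text is returned.
    """
    return generated_text.rpartition(INST_CLOSE)[2].strip()


if __name__ == "__main__":
    prompt = build_prompt(
        "Write Python code to generate an array with all the numbers from 1 to 100"
    )
    # Stand-in for seq["generated_text"]; a real call would come from the pipeline.
    simulated = prompt + " numbers = list(range(1, 101))"
    print(extract_answer(simulated))
```

With the pipeline from the usage snippet, `build_prompt(prompt)` could replace the inline f-string, and `extract_answer(seq["generated_text"])` could be used to print only the model's reply.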
| |
|
| | Output of the pipeline call: |
| | ```` |
| | Here is a Python code to generate an array with all the numbers from 1 to 100: |
| | |
| | ``` |
| | numbers = [] |
| | for i in range(1,101): |
| |     numbers.append(i) |
| | ``` |
| | |
| | This code generates an array with all the numbers from 1 to 100 in Python. It uses a loop that iterates over the range of numbers from 1 to 100, and for each number, it appends that number to the array 'numbers'. The variable 'numbers' is initialized to a list, and its length is set to 101 by using the range of numbers (0-99). |
| | ```` |
| | |
| | ## Training procedure |
| | |
| | |
| | The following `bitsandbytes` quantization config was used during training: |
| | - load_in_8bit: False |
| | - load_in_4bit: True |
| | - llm_int8_threshold: 6.0 |
| | - llm_int8_skip_modules: None |
| | - llm_int8_enable_fp32_cpu_offload: False |
| | - llm_int8_has_fp16_weight: False |
| | - bnb_4bit_quant_type: nf4 |
| | - bnb_4bit_use_double_quant: True |
| | - bnb_4bit_compute_dtype: bfloat16 |
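
For readers who want to reproduce this quantization setup, the settings listed above map onto the keyword arguments of `BitsAndBytesConfig` in `transformers`. The following is a sketch under that assumption, not the training script itself. The names `QUANT_SETTINGS` and `make_bnb_config` are illustrative, and the imports are deferred so the settings can be inspected without `torch` installed.

```python
# Values copied from the list above; the dictionary and function names are illustrative.
QUANT_SETTINGS = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "bfloat16",
}


def make_bnb_config():
    """Build a BitsAndBytesConfig from QUANT_SETTINGS.

    Requires torch and transformers; they are imported here so that
    the module itself loads without them.
    """
    import torch
    from transformers import BitsAndBytesConfig

    kwargs = dict(QUANT_SETTINGS)
    # Turn the dtype name into the corresponding torch dtype object.
    kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    return BitsAndBytesConfig(**kwargs)
```

The resulting object would typically be passed as `quantization_config=make_bnb_config()` to `AutoModelForCausalLM.from_pretrained`.
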
| | ### Framework versions |
| | |
| | - PEFT 0.5.0.dev0 |
| | |