---
tags:
- generated_from_trainer
- code
- coding
- gemma
model-index:
- name: gemma-2b-coder
  results: []
license: apache-2.0
language:
- code
thumbnail: https://huggingface.co/mrm8488/gemma-2b-coder/resolve/main/logo.png
datasets:
- HuggingFaceH4/CodeAlpaca_20K
pipeline_tag: text-generation
---

<div style="text-align:center;width:250px;height:250px;">
<img src="https://huggingface.co/mrm8488/gemma-2b-coder/resolve/main/logo.png" alt="gemma coder logo">
</div>

# Gemma Coder 👩‍💻
**Gemma 2B** fine-tuned on the **CodeAlpaca 20k instructions dataset** using **QLoRA** with the [PEFT](https://github.com/huggingface/peft) library.
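
As a point of reference, a QLoRA setup along these lines can be reproduced with `bitsandbytes` and PEFT. This is only a minimal sketch: the LoRA rank, alpha, dropout, and target modules below are illustrative assumptions, not the exact values used to train this checkpoint.

```py
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model in 4-bit NF4 precision (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b",
    quantization_config=bnb_config,
    device_map="auto",
)
base_model = prepare_model_for_kbit_training(base_model)

# Attach low-rank adapters; r, alpha, and target_modules are assumed values
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```

`prepare_model_for_kbit_training` freezes the quantized base weights, so only the adapter parameters (well under 1% of the model) are updated during fine-tuning.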

## Model description 🧠

[Gemma-2b](https://huggingface.co/google/gemma-2b)

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.

## Training and evaluation data 📚

[CodeAlpaca_20K](https://huggingface.co/datasets/HuggingFaceH4/CodeAlpaca_20K): contains 20K instruction-following examples, originally used for fine-tuning the Code Alpaca model.
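
The data can be inspected directly with the 🤗 `datasets` library; printing a sample shows the exact splits and columns (the `train` split name below is an assumption about how the dataset is published):

```py
from datasets import load_dataset

# Download the 20K CodeAlpaca instruction-following pairs
dataset = load_dataset("HuggingFaceH4/CodeAlpaca_20K")

print(dataset)              # available splits and column names
print(dataset["train"][0])  # a single instruction/response pair
```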

### Training hyperparameters ⚙

Training took 1h 40min on a free Colab T4 GPU (16 GB VRAM) with the following params:

```py
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma-2b-coder",  # assumed output path (required by TrainingArguments)
    num_train_epochs=2,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=32,
    learning_rate=2.5e-5,
    optim="paged_adamw_8bit",
    logging_steps=5,
    seed=66,
    load_best_model_at_end=True,
    save_strategy="steps",
    save_steps=50,
    evaluation_strategy="steps",
    eval_steps=50,
    save_total_limit=2,
    remove_unused_columns=True,
    fp16=True,
    bf16=False,
)
```
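
For completeness, here is a rough sketch of how these arguments could be wired into a standard `Trainer` loop, reusing the `model` from the QLoRA sketch and the `dataset` loaded above. The column names (`prompt`, `completion`), the chat template, the `max_length`, and the `test` eval split are assumptions for illustration, not a record of the actual training script.

```py
from transformers import AutoTokenizer, DataCollatorForLanguageModeling, Trainer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")

# Assumed schema: adjust the field names to the dataset's actual columns
def to_features(example):
    text = (
        "<bos><|system|>\nYou are a helpful coding assistant.<eos>\n"
        f"<|user|>\n{example['prompt']}<eos>\n"
        f"<|assistant|>\n{example['completion']}<eos>"
    )
    return tokenizer(text, truncation=True, max_length=1024)

tokenized = dataset.map(to_features, remove_columns=dataset["train"].column_names)

trainer = Trainer(
    model=model,          # PEFT-wrapped model from the QLoRA sketch
    args=training_args,   # TrainingArguments shown above
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```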

### Training results 🗒️

| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 50   | 1.467800      | 1.450770        |
| 100  | 1.060000      | 1.064840        |
| 150  | 0.900200      | 0.922290        |
| 200  | 0.848400      | 0.879911        |
| 250  | 0.838100      | 0.867354        |


### Eval results 📊

WIP

### Example of usage 👩‍💻
```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_id = "mrm8488/gemma-2b-coder"

tokenizer = AutoTokenizer.from_pretrained(model_id)

model = AutoModelForCausalLM.from_pretrained(model_id).to("cuda")

def generate(
    instruction,
    max_new_tokens=256,
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=2,
    **kwargs,
):
    # Chat-style prompt template the model was fine-tuned with
    system = "<bos><|system|>\nYou are a helpful coding assistant.<eos>\n"
    prompt = f"{system}<|user|>\n{instruction}<eos>\n<|assistant|>\n"
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].to("cuda")
    attention_mask = inputs["attention_mask"].to("cuda")
    generation_config = GenerationConfig(
        temperature=temperature,
        top_p=top_p,
        top_k=top_k,
        num_beams=num_beams,
        **kwargs,
    )
    with torch.no_grad():
        generation_output = model.generate(
            input_ids=input_ids,
            attention_mask=attention_mask,
            generation_config=generation_config,
            return_dict_in_generate=True,
            max_new_tokens=max_new_tokens,
            early_stopping=True,
        )
    s = generation_output.sequences[0]
    output = tokenizer.decode(s, skip_special_tokens=True)
    # Keep only the assistant's completion
    return output.split("<|assistant|>")[1]

instruction = """
Edit the following XML code to add a navigation bar to the top of a web page
<html>
<head>
    <title>Maisa</title>
</head>
"""
print(generate(instruction))
```
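
For a quick smoke test, the same checkpoint can also be driven through the high-level `pipeline` API; the sampling settings below are illustrative, and the prompt reuses the chat format shown above:

```py
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mrm8488/gemma-2b-coder",
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = (
    "<bos><|system|>\nYou are a helpful coding assistant.<eos>\n"
    "<|user|>\nWrite a Python function that checks if a string is a palindrome.<eos>\n"
    "<|assistant|>\n"
)
result = generator(prompt, max_new_tokens=128, do_sample=True, temperature=0.1, top_p=0.75)
print(result[0]["generated_text"])
```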

### Citation

WIP