Instructions to use mingyue0101/codellama-7b-matplotlib-assistant with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use mingyue0101/codellama-7b-matplotlib-assistant with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-Instruct-hf") model = PeftModel.from_pretrained(base_model, "mingyue0101/codellama-7b-matplotlib-assistant") - Notebooks
- Google Colab
- Kaggle
| library_name: peft | |
| base_model: codellama/CodeLlama-7b-Instruct-hf | |
| tags: | |
| - instruction-tuning | |
| - qlora | |
| - code-llama | |
| - text-generation | |
| language: | |
| - en | |
| datasets: | |
| - mingyue0101/prompt_code_parquet | |
| - mingyue0101/prompts_modi | |
| license: apache-2.0 | |
| # Model Card for codellama-7b-matplotlib-assistant | |
| This model is a fine-tuned version of `codellama/CodeLlama-7b-Instruct-hf` designed to enhance instruction-following capabilities. It was developed as part of a Master's thesis project. | |
| ## Model Details | |
| ### Model Description | |
| The `codellama-7b-matplotlib-assistant` model is a large language model fine-tuned using the QLoRA (4-bit Quantization + LoRA) technique. The goal of this model was to adapt the base CodeLlama model to better follow user instructions while maintaining its coding and reasoning capabilities. | |
| - **Developed by:** mingyue0101 | |
| - **Model type:** Causal Language Model (Fine-tuned with PEFT/LoRA) | |
| - **Language(s) (NLP):** English, Chinese | |
| - **License:** Apache-2.0 (inherited from CodeLlama) | |
| - **Finetuned from model:** codellama/CodeLlama-7b-Instruct-hf | |
| ### Model Sources | |
| - **Repository:** https://huggingface.co/mingyue0101/codellama-7b-matplotlib-assistant | |
| - **Dataset:** https://huggingface.co/datasets/mingyue0101/prompt_code_parquet | |
| ## Uses | |
| ### Direct Use | |
| The model can be used for text generation, code assistance, and general-purpose instruction following. It is particularly suited for tasks where a balance of technical coding knowledge and conversational instruction following is required. | |
| ### Out-of-Scope Use | |
| The model should not be used for high-stakes decision-making, generating malicious code, or any application that violates the safety guidelines of the base CodeLlama model. | |
| ## Bias, Risks, and Limitations | |
| This model may inherit biases present in the training data or the base model. Since it was fine-tuned on a specific dataset (`parquet02`), it might exhibit limitations when handling domains outside of its training distribution. Users should expect potential hallucinations in complex reasoning tasks. | |
| ### Recommendations | |
| Users are encouraged to use safety filters when deploying this model in production and to perform domain-specific evaluation before use. | |
| ## How to Get Started with the Model | |
| Use the code below to load the model in 4-bit precision: | |
| ```python | |
| import torch | |
| from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig | |
| from peft import PeftModel | |
| model_id = "codellama/CodeLlama-7b-Instruct-hf" | |
| peft_model_id = "mingyue0101/codellama-7b-matplotlib-assistant" | |
| # Load 4-bit configuration | |
| bnb_config = BitsAndBytesConfig( | |
| load_in_4bit=True, | |
| bnb_4bit_quant_type="nf4", | |
| bnb_4bit_compute_dtype=torch.float16, | |
| ) | |
| # Load base model and tokenizer | |
| tokenizer = AutoTokenizer.from_pretrained(model_id) | |
| base_model = AutoModelForCausalLM.from_pretrained( | |
| model_id, | |
| quantization_config=bnb_config, | |
| device_map="auto" | |
| ) | |
| # Load the fine-tuned adapter | |
| model = PeftModel.from_pretrained(base_model, peft_model_id) | |
| # Inference | |
| prompt = "Write a Python function to sort a list." | |
| inputs = tokenizer(prompt, return_tensors="pt").to("cuda") | |
| outputs = model.generate(**inputs, max_new_tokens=128) | |
| print(tokenizer.decode(outputs[0], skip_special_tokens=True)) | |
| ``` | |
| ## Training Details | |
| ### Training Data | |
| The model was trained on the `mingyue0101/parquet02` dataset. This dataset contains instruction-response pairs formatted for Supervised Fine-Tuning (SFT). | |
| ### Training Procedure | |
| **Training Hyperparameters** | |
| - Training regime: QLoRA 4-bit (NF4) mixed precision (fp16) | |
| - Learning rate: 2e-4 | |
| - Optimizer: paged_adamw_32bit | |
| - Batch size: 4 | |
| - Epochs: 1 | |
| - LoRA Rank (r): 64 | |
| - LoRA Alpha: 16 | |
| - LoRA Dropout: 0.1 | |
| - LR Scheduler: constant | |
| - Warmup Ratio: 0.03 | |
| ## Technical Specifications | |
| ### Model Architecture and Objective | |
| Based on the Llama 2 architecture, this model utilizes grouped-query attention (GQA) and rotary positional embeddings (RoPE), fine-tuned with a causal language modeling objective. | |
| ### Compute Infrastructure | |
| ### Software | |
| - PEFT 0.10.0 | |
| - Transformers | |
| - Bitsandbytes | |
| - TRL (SFTTrainer) |