---
base_model: meta-llama/Llama-2-7b-hf
library_name: peft
license: llama2
datasets:
- timdettmers/openassistant-guanaco
language:
- en
- th
- zh
metrics:
- accuracy
pipeline_tag: text-generation
---

## Model Details

### Model Description

A PEFT (LoRA) adapter for LLaMA-2 7B, fine-tuned for backward text generation (instruction backtranslation): given a response, the model generates the prompt that could have produced it.

- **Developed by:** Jixin Yang (HKUST)
- **Model type:** PEFT (LoRA) fine-tuned LLaMA-2 7B for backward text generation
- **Finetuned from model:** meta-llama/Llama-2-7b-hf

## Uses

This model is designed for backward text generation: given an output text (for example, an assistant response), it generates the corresponding input prompt. This is the Answer → Prompt direction used for instruction backtranslation.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "jasperyeoh2/llama2-7b-backward-model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Given an output text, generate the corresponding input (prompt)
input_text = "Output text to reverse"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
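
If your environment does not resolve this adapter repository directly (loading PEFT adapter repos through `AutoModelForCausalLM` requires a recent `transformers` with `peft` installed), the adapter can be attached explicitly. This sketch assumes the repo hosts a LoRA adapter for `meta-llama/Llama-2-7b-hf`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model, then attach the LoRA adapter from this repo
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", device_map="auto")
model = PeftModel.from_pretrained(base, "jasperyeoh2/llama2-7b-backward-model")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
```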

## Training Details

### Training Data

- Dataset: [OpenAssistant-Guanaco](https://huggingface.co/datasets/timdettmers/openassistant-guanaco)
- Number of examples used: ~3,200
- Task: Instruction backtranslation (Answer → Prompt); a sketch of how such reversed pairs can be built is shown below
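
The exact preprocessing script is not documented in this card. The sketch below is an illustration only, assuming each Guanaco record's `text` field (with `### Human:` / `### Assistant:` turns) is split into one exchange and swapped so the assistant answer becomes the model input and the human prompt becomes the target:

```python
from datasets import load_dataset

# Illustration only: the actual preprocessing used for this model is not documented here.
ds = load_dataset("timdettmers/openassistant-guanaco", split="train")

def to_backward_pair(example):
    text = example["text"]
    # Take the first Human/Assistant exchange (assumption: single-turn slice).
    human_part, _, rest = text.partition("### Assistant:")
    prompt = human_part.replace("### Human:", "").strip()
    answer = rest.split("### Human:")[0].strip()
    # Backward direction: the answer is the input, the prompt is the target.
    return {"input": answer, "target": prompt}

backward_ds = ds.map(to_backward_pair, remove_columns=ds.column_names)
```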

### Training Procedure

#### Training Hyperparameters

- Method: PEFT with LoRA (Low-Rank Adaptation); see the configuration sketch after this list
- Quantization: 4-bit (NF4)
- LoRA config:
  - `r`: 8
  - `alpha`: 16
  - `target_modules`: ["q_proj", "v_proj"]
  - `dropout`: 0.05
- Max sequence length: 512 tokens
- Epochs: 10
- Batch size: 2
- Gradient accumulation steps: 8
- Effective batch size: 16
- Learning rate: 2e-5
- Scheduler: linear with warmup
- Optimizer: AdamW
- Early stopping: enabled (patience=2)
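
A minimal sketch of wiring this configuration together with `bitsandbytes`, `peft`, and `transformers` is shown below; the compute dtype, warmup size, and output directory are assumptions, since they are not stated in this card.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization of the frozen base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: compute dtype not stated in the card
)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
base_model = prepare_model_for_kbit_training(base_model)

# LoRA adapter matching the configuration listed above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)

# Trainer arguments matching the listed hyperparameters
# (inputs are truncated to 512 tokens during tokenization, not shown here)
training_args = TrainingArguments(
    output_dir="llama2-7b-backward",   # assumption: output path not stated in the card
    num_train_epochs=10,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,     # effective batch size 16
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,                 # assumption: warmup size not stated in the card
    optim="adamw_torch",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,       # required for early stopping
)
```

Early stopping with patience 2 corresponds to passing `EarlyStoppingCallback(early_stopping_patience=2)` to the `Trainer`.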

#### Metrics

Training and evaluation loss are tracked on [Weights & Biases](https://wandb.ai/jyang577-hong-kong-university-of-science-and-technology/huggingface?nw=nwuserjyang577).

### Results

- Final eval loss: ~1.436
- Final train loss: ~1.4
- Training completed in ~8 epochs

### Compute Infrastructure

#### Hardware

- GPU: 1× NVIDIA A800 (80 GB)
- CUDA Version: 12.1

#### Software

- OS: Ubuntu 20.04
- Python: 3.10
- Transformers: 4.38.2
- PEFT: 0.15.1
- Accelerate: 0.28.0
- BitsAndBytes: 0.41.2

### Framework versions

- PEFT 0.15.1