| --- |
| library_name: peft |
| datasets: |
| - b-mc2/sql-create-context |
| language: |
| - en |
| metrics: |
| - rouge |
| --- |
| |
| # GPT-2 Medium |
|
|
| ## Model Details |
|
|
| **Model Description:** GPT-2 Medium is the **355M parameter** version of GPT-2, a transformer-based language model created and released by OpenAI. It was pretrained on English text using a causal language modeling (CLM) objective. |
|
|
|
|
| ## Training Data |
| The model was trained on up to 5,000 rows of the `b-mc2/sql-create-context` dataset. |
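
As a rough illustration of how such a subset might be prepared, the sketch below loads a capped slice of the dataset with the `datasets` library and turns one row into a single training string. The assumptions are labeled in the code: the card does not say which 5,000 rows were used, so taking the first rows of the `train` split is a guess, and the `format_example` helper and its prompt layout are hypothetical rather than the template actually used for this model.

```python
def format_example(row):
    """Join one dataset row into a single training string.

    Hypothetical prompt layout, not taken from the model card. It assumes
    the row has 'question', 'context' and 'answer' fields.
    """
    return (
        f"Question: {row['question']}\n"
        f"Context: {row['context']}\n"
        f"Answer: {row['answer']}"
    )


def load_training_rows(max_rows=5000):
    """Load at most `max_rows` rows from the start of the train split.

    Assumption: the card only says "up to 5000 rows", so slicing from the
    start of the split is a guess. `datasets` is imported lazily so that
    `format_example` can be used without it installed.
    """
    from datasets import load_dataset

    return load_dataset("b-mc2/sql-create-context", split=f"train[:{max_rows}]")


if __name__ == "__main__":
    # Made-up row for illustration; real rows come from load_training_rows().
    sample = {
        "question": "How many users are there?",
        "context": "CREATE TABLE users (id INTEGER)",
        "answer": "SELECT COUNT(*) FROM users",
    }
    print(format_example(sample))
```

Calling `load_training_rows()` needs network access to the Hugging Face Hub; the formatting helper works on any dictionary with the three fields.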
|
|
| ## Training procedure |
|
|
|
|
| The following `bitsandbytes` quantization config was used during training: |
| - quant_method: bitsandbytes |
| - load_in_8bit: False |
| - load_in_4bit: True |
| - llm_int8_threshold: 6.0 |
| - llm_int8_skip_modules: None |
| - llm_int8_enable_fp32_cpu_offload: False |
| - llm_int8_has_fp16_weight: False |
| - bnb_4bit_quant_type: nf4 |
| - bnb_4bit_use_double_quant: True |
| - bnb_4bit_compute_dtype: float16 |
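
The settings listed above can be expressed in code roughly as follows. This is a minimal sketch, not the original training script: it assumes the `transformers`, `peft`, and `bitsandbytes` packages are installed, that the base checkpoint is the `gpt2-medium` Hub ID, and that `adapter_id` (a hypothetical parameter) points at this adapter's repository or local folder. `quant_method` is left out because it is recorded in the saved config rather than passed by the caller.

```python
# Values copied from the quantization config listed in the model card.
QUANT_CONFIG = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "float16",
}


def build_bnb_config(values=QUANT_CONFIG):
    """Build a transformers BitsAndBytesConfig from the values above.

    transformers is imported lazily so the dict can be inspected without it.
    """
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**values)


def load_adapter(adapter_id, base_id="gpt2-medium"):
    """Load the 4-bit base model and attach a PEFT adapter to it.

    `adapter_id` is a placeholder supplied by the caller; `base_id` is an
    assumed Hub ID for GPT-2 Medium.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        quantization_config=build_bnb_config(),
        device_map="auto",
    )
    return PeftModel.from_pretrained(base, adapter_id)
```

Loading in 4-bit through `bitsandbytes` generally requires a CUDA-capable GPU, so `load_adapter` is only expected to work in such an environment.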
| |
| ### Framework versions |
|
|
| - PEFT 0.5.0 |