---
license: mit
language:
- ko
base_model:
- K-intelligence/Midm-2.0-Base-Instruct
tags:
- Korean
- Culture
---
# Midm-KCulture-2.0-Base-Instruct
- This model is fine-tuned from [K-intelligence/Midm-2.0-Base-Instruct](https://huggingface.co/K-intelligence/Midm-2.0-Base-Instruct) on the Korean Culture Q&A Corpus using LoRA (Low-Rank Adaptation).
## GitHub
Check out the full training code [here](https://github.com/dahlia52/KR-Culture-QA/tree/main).
## Training Hyperparameters
| Hyperparameter | Value |
| :---------------------------- | :---------------------------- |
| **SFTConfig** | |
| `torch_dtype` | `bfloat16` |
| `seed` | `42` |
| `epoch` | `3` |
| `per_device_train_batch_size` | `2` |
| `per_device_eval_batch_size` | `2` |
| `learning_rate` | `0.0002` |
| `lr_scheduler_type` | `"linear"` |
| `max_grad_norm` | `1.0` |
| `neftune_noise_alpha` | `None` |
| `gradient_accumulation_steps` | `1` |
| `gradient_checkpointing` | `False` |
| `max_seq_length` | `1024` |
| **LoraConfig** | |
| `r` | `16` |
| `lora_alpha` | `16` |
| `lora_dropout` | `0.1` |
| `target_modules` | `["q_proj", "v_proj"]` |
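
For reference, here is a minimal sketch of how the table above maps onto `peft`/`trl` config objects. It is not the exact training script (see the GitHub repo for that); argument names such as `max_seq_length` vary slightly between `trl` versions, and `output_dir` is a placeholder.

```python
from peft import LoraConfig
from trl import SFTConfig

# LoRA adapter configuration (LoraConfig rows of the table).
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# Trainer configuration (SFTConfig rows of the table).
sft_config = SFTConfig(
    output_dir="outputs",          # placeholder
    seed=42,
    num_train_epochs=3,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    learning_rate=2e-4,
    lr_scheduler_type="linear",
    max_grad_norm=1.0,
    neftune_noise_alpha=None,      # NEFTune disabled
    gradient_accumulation_steps=1,
    gradient_checkpointing=False,
    max_seq_length=1024,
    bf16=True,                     # corresponds to torch_dtype=bfloat16
)
```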
## Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "jjae/Midm-KCulture-2.0-Base-Instruct"

# Load the model in bfloat16 and shard it across available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
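
Below is a minimal generation example building on the snippet above. It assumes the tokenizer ships the base model's chat template; the Korean prompt and the generation settings are illustrative, not part of the original card.

```python
# Illustrative prompt: "Please explain Seollal, the traditional Korean New Year holiday."
messages = [{"role": "user", "content": "한국의 전통 명절인 설날에 대해 설명해 주세요."}]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)

# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```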