---
license: mit
language:
- ko
base_model:
- K-intelligence/Midm-2.0-Base-Instruct
tags:
- Korean
- Culture
---
|
|
|
|
|
|
|
|
# Midm-KCulture-2.0-Base-Instruct |
|
|
- This model is fine-tuned from KT's [K-intelligence/Midm-2.0-Base-Instruct](https://huggingface.co/K-intelligence/Midm-2.0-Base-Instruct) on the Korean Culture Q&A Corpus using LoRA (Low-Rank Adaptation).
|
|
|
|
|
## GitHub |
|
|
Check out the full training code [here](https://github.com/dahlia52/KR-Culture-QA/tree/main). |
|
|
|
|
|
## Training Hyperparameters |
|
|
|
|
|
| Hyperparameter                | Value                         |
| :---------------------------- | :---------------------------- |
| **SFTConfig**                 |                               |
| `torch_dtype`                 | `bfloat16`                    |
| `seed`                        | `42`                          |
| `epoch`                       | `3`                           |
| `per_device_train_batch_size` | `2`                           |
| `per_device_eval_batch_size`  | `2`                           |
| `learning_rate`               | `0.0002`                      |
| `lr_scheduler_type`           | `"linear"`                    |
| `max_grad_norm`               | `1.0`                         |
| `neftune_noise_alpha`         | `None`                        |
| `gradient_accumulation_steps` | `1`                           |
| `gradient_checkpointing`      | `False`                       |
| `max_seq_length`              | `1024`                        |
| **LoraConfig**                |                               |
| `r`                           | `16`                          |
| `lora_alpha`                  | `16`                          |
| `lora_dropout`                | `0.1`                         |
| `target_modules`              | `["q_proj", "v_proj"]`        |
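
For reference, here is a minimal sketch of how the hyperparameters above could map onto PEFT's `LoraConfig` and TRL's `SFTConfig`/`SFTTrainer`. The dataset path, output directory, and the exact argument names for your installed TRL version are assumptions; the actual training script is in the GitHub repository linked above.

```python
# A minimal training sketch, not the exact script from the repository.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

base_model = "K-intelligence/Midm-2.0-Base-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Hypothetical local copy of the Korean Culture Q&A Corpus.
dataset = load_dataset("json", data_files="korean_culture_qa.jsonl", split="train")

# LoRA adapters on the attention query/value projections, matching the table above.
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

args = SFTConfig(
    output_dir="midm-kculture-lora",  # hypothetical output directory
    num_train_epochs=3,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    learning_rate=2e-4,
    lr_scheduler_type="linear",
    max_grad_norm=1.0,
    gradient_accumulation_steps=1,
    gradient_checkpointing=False,
    max_seq_length=1024,  # renamed to `max_length` in recent TRL releases
    seed=42,
)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,  # `tokenizer=` in older TRL versions
    peft_config=peft_config,
)
trainer.train()
```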
|
|
|
|
|
## Usage |
|
|
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "jjae/Midm-KCulture-2.0-Base-Instruct"

# Load the fine-tuned model in bfloat16 and shard it across available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
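
A minimal generation sketch, assuming the tokenizer inherits a chat template from the base instruct model; the Korean prompt below is purely illustrative:

```python
# Illustrative prompt: "What foods are eaten on Chuseok (the Korean harvest festival)?"
messages = [{"role": "user", "content": "한국의 추석에는 어떤 음식을 먹나요?"}]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```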
|
|
|
|
|
|