Developed by:
- Changgil Song
Model Number:
- k2s3_test_24001
Base Model:
- meta-llama/Llama-2-13b-chat-hf
Training Data
- The model was trained on a diverse dataset comprising approximately 800 million tokens, including the Standard Korean Dictionary, KULLM training data from Korea University, dissertation abstracts from master's and doctoral theses, and Korean language samples from AI Hub.
Training Method
- This model was fine-tuned on the "meta-llama/Llama-2-13b-chat-hf" base model using LoRA (Low-Rank Adaptation) via PEFT (Parameter-Efficient Fine-Tuning); a setup sketch follows below.
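A minimal sketch of that setup with the peft library is shown here. The r and alpha values come from this card; target_modules and lora_dropout are illustrative assumptions, not documented settings.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "meta-llama/Llama-2-13b-chat-hf"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# r and alpha come from this card; target_modules and dropout are assumptions.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # assumed; a common choice for Llama models
    lora_dropout=0.05,                    # assumed
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
```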
Hardware and Software
- Hardware: Two NVIDIA A100 80 GB GPUs were used for training.
- Training Factors: This model was fine-tuned with PEFT LoRA using the Hugging Face SFTTrainer with FSDP applied. Key hyperparameters: LoRA r = 8, LoRA alpha = 16, 2 training epochs, per-device batch size 1, and gradient accumulation steps 32 (see the sketch below).
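A sketch of how those hyperparameters might be wired into the TRL SFTTrainer follows. The dataset, precision, sequence length, and text column name are stand-ins, and the keyword arguments match older trl releases (newer releases move dataset_text_field and max_seq_length into SFTConfig).

```python
from datasets import load_dataset
from transformers import TrainingArguments
from peft import LoraConfig
from trl import SFTTrainer

# Stand-in dataset: the card's actual corpus is not bundled with the model.
dataset = load_dataset("json", data_files="train.json", split="train")

args = TrainingArguments(
    output_dir="k2s3_test_24001",
    num_train_epochs=2,                # 2 epochs
    per_device_train_batch_size=1,     # batch size 1
    gradient_accumulation_steps=32,    # gradient accumulation 32
    bf16=True,                         # assumed precision, not stated on the card
    logging_steps=10,
)

trainer = SFTTrainer(
    model="meta-llama/Llama-2-13b-chat-hf",
    args=args,
    train_dataset=dataset,
    peft_config=LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"),
    dataset_text_field="text",         # assumed column name
    max_seq_length=2048,               # assumed
)
trainer.train()
```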
Caution
- When fine-tuning this model further, consider the specific parameters used during training, such as the LoRA r and LoRA alpha values, to ensure compatibility and optimal performance (see the sketch below).
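Concretely, inference or further tuning should attach the published adapter to the same base model, which restores the saved LoRA settings automatically. The adapter path below is a hypothetical placeholder.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-13b-chat-hf"   # must match the training base model
adapter_id = "path/to/k2s3_test_24001"       # hypothetical local path or repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)

# The saved adapter carries its LoRA config (r=8, alpha=16), so it loads as trained.
model = PeftModel.from_pretrained(base, adapter_id)

prompt = "안녕하세요, 자기소개를 해 주세요."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```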
Additional Information
- Training leveraged FSDP (Fully Sharded Data Parallel) through the Hugging Face SFTTrainer for efficient memory usage and accelerated training; a configuration sketch follows.
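One way to reproduce that setup is the fsdp options on TrainingArguments under a distributed launcher; the exact fsdp_config keys vary across transformers versions, so treat this as a sketch rather than the card's exact configuration.

```python
from transformers import TrainingArguments

# Launch under a distributed runner, e.g.:
#   torchrun --nproc_per_node=2 train.py
args = TrainingArguments(
    output_dir="k2s3_test_24001",
    num_train_epochs=2,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=32,
    fsdp="full_shard auto_wrap",  # shard params, grads, and optimizer state
    fsdp_config={
        # Wrap at the decoder-layer boundary; key names differ across versions.
        "transformer_layer_cls_to_wrap": ["LlamaDecoderLayer"],
    },
)
```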