Interested in how to fine-tune typhoon-7b with dolly-th

by atipasvanund - opened Feb 22, 2024

Discussion

atipasvanund

Feb 22, 2024

The results turned out to work really well. Could you explain how you did it, or point me to where I can read up on it?

ping98k

Owner Feb 23, 2024 (edited Feb 23, 2024)

This model was fine-tuned with https://github.com/OpenAccess-AI-Collective/axolotl
It uses a config.yml to set all the parameters, so you don't have to write any code yourself.
An example is available from axolotl.

The config used to fine-tune this model is in the README, under "See axolotl config".

The dataset is ping98k/dolly-th, but it was remixed, and markers such as START OF DOCUMENT and NEXT DOCUMENT were added to it.

The remixed dataset is ping98k/dolly-rag-instruct-th
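To make the remixing step more concrete, here is a minimal sketch of how several source documents might be joined with those markers. Only the marker strings come from the post above. The layout (one marker per line), the function names, and the Dolly-style field names are assumptions for illustration, not the actual format of ping98k/dolly-rag-instruct-th.

```python
# Hypothetical sketch: the marker strings are quoted from the thread, but the
# exact layout used in the real dataset is an assumption.
START_MARKER = "START OF DOCUMENT"
NEXT_MARKER = "NEXT DOCUMENT"


def build_rag_context(documents: list[str]) -> str:
    """Join documents into one context string, separated by markers."""
    if not documents:
        return ""
    parts = [START_MARKER, documents[0].strip()]
    for doc in documents[1:]:
        parts.append(NEXT_MARKER)
        parts.append(doc.strip())
    return "\n".join(parts)


def build_example(documents: list[str], question: str, answer: str) -> dict:
    """Assemble one training record (field names are hypothetical)."""
    return {
        "context": build_rag_context(documents),
        "instruction": question.strip(),
        "response": answer.strip(),
    }


example = build_example(
    ["Bangkok is the capital of Thailand.", "Chiang Mai is in the north."],
    "What is the capital of Thailand?",
    "Bangkok",
)
print(example["context"])
```

Mixing in documents that are unrelated to the question is one plausible way to teach a model to pick the relevant passage, but the thread does not say how the documents were chosen.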

atipasvanund

Feb 27, 2024

Thank you very much. I'm a complete beginner at this.

I tried it on my own machine, which has an RTX 4090 with 24GB of VRAM.

The VRAM filled up, so I'd like to ask whether ping98k/dolly-rag-instruct-th was trained with quantization or QLoRA. Also, which GPU did you (ping98k) use, and how much VRAM does it have?

ping98k

Owner Mar 2, 2024 (edited Mar 2, 2024)

This one is a full fine-tune, done on an A100 80GB.

To fit in 24GB of VRAM, it has to be 4-bit QLoRA. Have a look at ping98k/typhoon-thai-food-lora: that one was trained on an RTX 3090 24GB, and its base model is also typhoon.

The main config settings are these three:
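A rough back-of-the-envelope calculation shows why a full fine-tune overflows a 24GB card while a 4-bit base model fits. The sketch below assumes about 7 billion parameters (suggested by the "7b" in the model name) and counts only weights and gradients. Optimizer states, activations, and quantization overhead are left out, so real usage will be higher.

```python
# Rough VRAM estimate for weights and gradients only.
# Assumption: about 7 billion parameters, suggested by the "7b" in the name.
PARAMS = 7e9
GIB = 1024 ** 3


def to_gib(num_bytes: float) -> float:
    """Convert a byte count to GiB."""
    return num_bytes / GIB


weights_16bit = to_gib(PARAMS * 2)    # 2 bytes per parameter (fp16/bf16)
grads_16bit = to_gib(PARAMS * 2)      # full fine-tune: one gradient per parameter
weights_4bit = to_gib(PARAMS * 0.5)   # 4 bits = 0.5 bytes per parameter

print(f"16-bit weights:             {weights_16bit:.1f} GiB")
print(f"16-bit weights + gradients: {weights_16bit + grads_16bit:.1f} GiB")
print(f"4-bit weights:              {weights_4bit:.1f} GiB")
```

Under these assumptions, 16-bit weights plus gradients already come to roughly 26 GiB before any optimizer state is counted. With QLoRA the base weights stay frozen in 4-bit and only the small adapter matrices receive gradients, which leaves room for activations on a 24GB card.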

sequence_len: 4096
load_in_4bit: true
micro_batch_size: 2
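As a sketch of how these three settings might sit in an axolotl config.yml, the snippet below renders them as YAML together with two extra keys. The extra keys are assumptions, not taken from this thread: `base_model` is guessed to be `scb10x/typhoon-7b`, and `adapter: qlora` is the axolotl setting that usually accompanies `load_in_4bit`. A working config needs more than this (datasets, LoRA hyperparameters, optimizer settings), so treat it as a fragment.

```python
# The three settings quoted in the thread.
from_thread = {
    "sequence_len": 4096,
    "load_in_4bit": True,
    "micro_batch_size": 2,
}

# Assumed additions, NOT from the thread; adjust them to your own setup.
assumed = {
    "base_model": "scb10x/typhoon-7b",  # assumption: the typhoon base repo
    "adapter": "qlora",                 # assumption: pairs with load_in_4bit
}


def to_yaml(settings: dict) -> str:
    """Render a flat dict of scalars as simple YAML key/value lines."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{key}: {value}")
    return "\n".join(lines)


config_text = to_yaml({**assumed, **from_thread})
print(config_text)
```

The printed text can be saved as config.yml and extended with the remaining required keys. The full configs in the READMEs of ping98k/typhoon-7b-rag-instruct-th and ping98k/typhoon-thai-food-lora are the authoritative references.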

ping98k changed discussion status to closed Apr 6, 2024
