How to use Ellight/Llama-3.2-1B-Customer-support-Chatbot with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="Ellight/Llama-3.2-1B-Customer-support-Chatbot")
# Load model directly
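# AutoModelForCausalLM loads the language-modeling head needed for text generation
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("Ellight/Llama-3.2-1B-Customer-support-Chatbot", dtype="auto")
How to use Ellight/Llama-3.2-1B-Customer-support-Chatbot with vLLM: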
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Ellight/Llama-3.2-1B-Customer-support-Chatbot"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Ellight/Llama-3.2-1B-Customer-support-Chatbot",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Alternatively, run the model with Docker Model Runner:
docker model run hf.co/Ellight/Llama-3.2-1B-Customer-support-Chatbot
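The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server from the previous step is listening on `http://localhost:8000`; the helper names `build_completion_request` and `complete` are hypothetical and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "Ellight/Llama-3.2-1B-Customer-support-Chatbot"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text of the first choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible completions responses carry the text in choices[0].text
    return body["choices"][0]["text"]
```

With the server running, `complete("I want assistance to cancel order")` should return the generated continuation. Passing `base_url="http://localhost:30000"` would target the SGLang server instead, since it exposes the same endpoint path.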
How to use Ellight/Llama-3.2-1B-Customer-support-Chatbot with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "Ellight/Llama-3.2-1B-Customer-support-Chatbot" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Ellight/Llama-3.2-1B-Customer-support-Chatbot",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "Ellight/Llama-3.2-1B-Customer-support-Chatbot" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Ellight/Llama-3.2-1B-Customer-support-Chatbot",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use Ellight/Llama-3.2-1B-Customer-support-Chatbot with Unsloth Studio:
# Install with the shell script:
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Ellight/Llama-3.2-1B-Customer-support-Chatbot to start chatting

# Or install with PowerShell:
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Ellight/Llama-3.2-1B-Customer-support-Chatbot to start chatting

# Or use the hosted Space (no setup required):
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Ellight/Llama-3.2-1B-Customer-support-Chatbot to start chatting
pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="Ellight/Llama-3.2-1B-Customer-support-Chatbot",
    max_seq_length=2048,
)
How to use Ellight/Llama-3.2-1B-Customer-support-Chatbot with Docker Model Runner:
docker model run hf.co/Ellight/Llama-3.2-1B-Customer-support-Chatbot
This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
The model was fine-tuned on the bitext/Bitext-customer-support-llm-chatbot-training-dataset dataset to generate more accurate responses to customer queries. It can be fine-tuned further on a specific company's data.
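To prepare company-specific data for further fine-tuning, each record would typically be rendered into the same prompt format the model expects. The sketch below assumes the Alpaca-style template shown in the inference example later in this card, records with hypothetical `instruction` and `response` fields, and the `<|end_of_text|>` token seen in the sample output; in a real training script the end-of-sequence token would normally come from `tokenizer.eos_token`. It is an illustration, not the author's training code.

```python
# Template copied from the inference example in this card.
ALPACA_PROMPT = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Response:
{}"""

# Assumption: taken from the sample output; prefer tokenizer.eos_token in practice.
EOS_TOKEN = "<|end_of_text|>"


def format_example(instruction, response, eos_token=EOS_TOKEN):
    """Render one (instruction, response) pair as a single training string."""
    # The EOS token marks where the answer stops, so generation can terminate.
    return ALPACA_PROMPT.format(instruction.strip(), response.strip()) + eos_token


def format_records(records):
    """Render a list of dicts with 'instruction' and 'response' keys."""
    return [format_example(r["instruction"], r["response"]) for r in records]


# Hypothetical company data, for illustration only.
company_records = [
    {"instruction": "How do I change my delivery address?",
     "response": "You can update it under Account Settings before the order ships."},
]
training_texts = format_records(company_records)
```

The resulting strings could then be passed to a supervised fine-tuning trainer as the text field of a dataset.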
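from unsloth import FastLanguageModel

# Loading options. These values are assumptions (the original snippet leaves
# them undefined); adjust them for your hardware.
max_seq_length = 2048  # same value as in the FastModel snippet above
dtype = None           # None lets Unsloth pick the dtype automatically
load_in_4bit = True    # 4-bit quantization to reduce memory use

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Ellight/Llama-3.2-1B-Customer-support-Chatbot",  # the fine-tuned model
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference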
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Response:
{}"""
inputs = tokenizer(
    [
        alpaca_prompt.format(
            "I want assistance to cancel order",  # instruction
            "",  # output - leave this blank for generation!
        )
    ],
    return_tensors = "pt",
).to("cuda")
from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 512)
Model Response:
<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
I want assistance to cancel order
### Response:
I understand that you need assistance in canceling your order. I'm here to guide you through the process. To cancel your order, you can follow these steps:
1. Log in to your account on our website.
2. Navigate to the "Orders" or "My Account" section.
3. Locate your order and click on it.
4. On the order details page, locate the "Cancel Order" or "Cancel Order" option.
5. Click on it to initiate the cancellation process.
If you encounter any difficulties or have any questions during the cancellation process, please don't hesitate to reach out to us. We're here to assist you every step of the way.<|end_of_text|>
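As the output above shows, the generated text echoes the full prompt and includes special tokens. An application would usually want only the answer. A minimal sketch of that post-processing follows, assuming the `### Response:` marker and the two special tokens seen in the sample output; `extract_response` is a hypothetical helper, not part of Unsloth or Transformers.

```python
RESPONSE_MARKER = "### Response:"
SPECIAL_TOKENS = ("<|begin_of_text|>", "<|end_of_text|>")


def extract_response(generated_text):
    """Return only the model's answer from a decoded generation."""
    # Keep everything after the first response marker; if the marker is
    # missing, fall back to the whole text.
    _, marker, tail = generated_text.partition(RESPONSE_MARKER)
    if not marker:
        tail = generated_text
    # Drop the special tokens that appear in the decoded output.
    for token in SPECIAL_TOKENS:
        tail = tail.replace(token, "")
    return tail.strip()
```

The helper could be applied to text decoded with `tokenizer.batch_decode(...)` when the output is collected rather than streamed.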