socratesft/SocSci210
How to use socratesft/socrates-llama3-8b-sft with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="socratesft/socrates-llama3-8b-sft")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("socratesft/socrates-llama3-8b-sft")
model = AutoModelForCausalLM.from_pretrained("socratesft/socrates-llama3-8b-sft")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
How to use socratesft/socrates-llama3-8b-sft with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "socratesft/socrates-llama3-8b-sft"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "socratesft/socrates-llama3-8b-sft",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
docker model run hf.co/socratesft/socrates-llama3-8b-sft
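The curl call to the vLLM server can also be made from Python. The following is a minimal sketch using only the standard library, assuming the server started with `vllm serve` is listening on `localhost:8000`. `build_chat_request` and `ask` are hypothetical helper names introduced for illustration; they are not part of vLLM.

```python
import json
import urllib.request

# Assumed values: the port and model name used in the commands above.
BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "socratesft/socrates-llama3-8b-sft"


def build_chat_request(prompt, url=BASE_URL, model=MODEL):
    """Build (but do not send) a POST request equivalent to the curl call."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]


# Example (requires a running server):
# print(ask("What is the capital of France?"))
```

Passing `url="http://localhost:30000/v1/chat/completions"` to `ask` should target a server on port 30000 instead, since the request body has the same shape.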
How to use socratesft/socrates-llama3-8b-sft with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "socratesft/socrates-llama3-8b-sft" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "socratesft/socrates-llama3-8b-sft",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "socratesft/socrates-llama3-8b-sft" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "socratesft/socrates-llama3-8b-sft",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
How to use socratesft/socrates-llama3-8b-sft with Docker Model Runner:
docker model run hf.co/socratesft/socrates-llama3-8b-sft
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "socratesft/socrates-llama3-8b-sft"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
# Example usage
prompt_system = "You are simulating a survey respondent. Answer exactly as instructed, following the specified response format without additional commentary."
prompt_user = """You are a survey respondent with the following demographic profile:
- Age: 31
- Gender: Male
- Education: Post grad study/professional degree
- Employment: Employed as paid employee
- Marital Status: Living with partner
- Housing Ownership: Rented for cash
- Housing Type: A one-family house detached from any other house
- Location: Kentucky
- Metro Status: Metro Area
- Income: 100-124K
- Internet Access: Internet Household
- Household Size: 2
- Phone Service: Cellphone only
Read the question below and answer exactly as this person would. Follow the response instructions precisely.
You read “There is a new halfway house opening in your neighborhood where recently released felons will live. The director is letting neighbors select applicants to live at the house, and you have the choice between the following two candidates:” Candidate 1: Sex: Male; Crime: Nonviolent burglary; Education: Vocational training; Race: Latino; Age: 35 years old; Previous work: Seasonal/part-time employment; Job seeking: Going to temp agencies; Family status: Divorced, no children. Candidate 2: Sex: Male; Crime: Nonviolent burglary; Education: GED; Race: Black; Age: 22 years old; Previous work: Steady full-time employment; Job seeking: Submitting resumes; Family status: Divorced, no children. You were then asked: “If you had to choose between them, which of the two candidates should be admitted to the halfway house in your neighborhood?” Only return 1 to choose the first candidate; 2 to choose the second candidate, nothing else."""
messages = [
    {"role": "system", "content": prompt_system},
    {"role": "user", "content": prompt_user}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=500)
generated_ids = outputs[0][input_ids.shape[-1]:]
response = tokenizer.decode(generated_ids, skip_special_tokens=True)
print(response)
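When running the example over many respondents, it can help to assemble the user prompt from a profile and to check that the reply is one of the allowed options. The sketch below mirrors the layout of the example prompt and the "only return 1 or 2" instruction. `build_profile_prompt` and `parse_choice` are hypothetical helpers, not part of the model or of Transformers, and the profile keys are assumed to be whatever labels your survey uses.

```python
import re


def build_profile_prompt(profile, question):
    """Format a demographic profile and a question like the example prompt."""
    lines = ["You are a survey respondent with the following demographic profile:"]
    # One "- Key: value" bullet per profile field, in insertion order.
    lines += [f"- {key}: {value}" for key, value in profile.items()]
    lines.append(
        "Read the question below and answer exactly as this person would. "
        "Follow the response instructions precisely."
    )
    lines.append(question)
    return "\n".join(lines)


def parse_choice(response, allowed=("1", "2")):
    """Return the reply as an int if it is exactly one allowed option, else None."""
    match = re.fullmatch(r"\s*(\d+)\s*", response)
    if match and match.group(1) in allowed:
        return int(match.group(1))
    return None
```

A `None` result from `parse_choice(response)` signals that the model added commentary or returned an option outside the allowed set, so the caller can decide whether to retry or discard the sample.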
Built with Meta Llama 3
This is a Llama 3 8B model fine-tuned on survey response data using supervised fine-tuning (SFT).
This model is a derivative of Meta Llama 3 and is subject to the Meta Llama 3 Community License Agreement.
Notice: Llama 3 is licensed under the Llama 3 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.
Base model: meta-llama/Meta-Llama-3-8B-Instruct