anwllama 1
How to use anwgpt/anwllama-1-base with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="anwgpt/anwllama-1-base")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("anwgpt/anwllama-1-base")
model = AutoModelForCausalLM.from_pretrained("anwgpt/anwllama-1-base")
messages = [
{"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
How to use anwgpt/anwllama-1-base with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "anwgpt/anwllama-1-base"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "anwgpt/anwllama-1-base",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Alternatively, run the model with Docker Model Runner:
docker model run hf.co/anwgpt/anwllama-1-base
How to use anwgpt/anwllama-1-base with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "anwgpt/anwllama-1-base" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "anwgpt/anwllama-1-base",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "anwgpt/anwllama-1-base" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "anwgpt/anwllama-1-base",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
How to use anwgpt/anwllama-1-base with Docker Model Runner:
docker model run hf.co/anwgpt/anwllama-1-base
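The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names (build_chat_request, extract_reply, ask) are hypothetical and not part of any library, and it assumes a server is already running at the given base URL (port 8000 for the vLLM example, port 30000 for the SGLang examples) and returns the standard OpenAI-style chat completions JSON.

```python
import json
import urllib.request

MODEL_ID = "anwgpt/anwllama-1-base"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build the same POST request that the curl examples send."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant text from a chat completions JSON response."""
    return json.loads(response_body)["choices"][0]["message"]["content"]


def ask(prompt, base_url="http://localhost:8000"):
    """Send the prompt to a running server and return the reply text."""
    request = build_chat_request(prompt, base_url=base_url)
    with urllib.request.urlopen(request) as response:
        return extract_reply(response.read())
```

With a server running, ask("What is the capital of France?") targets the vLLM port, and ask("What is the capital of France?", base_url="http://localhost:30000") targets the SGLang port.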
Step    Training Loss
100     9.552530
200     7.434567
300     6.307720
400     5.781603
500     5.459788
600     5.196931
700     4.950012
800     4.757678
900     4.617741
1000    4.509363
1100    4.377576
1200    4.273639
1300    4.202172
1400    4.124585
1500    4.049357
1600    3.946516
1700    3.853138
1800    3.818091
1900    3.790880
2000    3.793084
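One way to read the loss log above is to convert each value to perplexity and to look at the change between consecutive checkpoints. The sketch below does both. It assumes the reported loss is a mean per-token cross-entropy in nats, which the page does not state; the function names are hypothetical.

```python
import math

# (step, training loss) pairs copied from the table above
LOSS_LOG = [
    (100, 9.552530), (200, 7.434567), (300, 6.307720), (400, 5.781603),
    (500, 5.459788), (600, 5.196931), (700, 4.950012), (800, 4.757678),
    (900, 4.617741), (1000, 4.509363), (1100, 4.377576), (1200, 4.273639),
    (1300, 4.202172), (1400, 4.124585), (1500, 4.049357), (1600, 3.946516),
    (1700, 3.853138), (1800, 3.818091), (1900, 3.790880), (2000, 3.793084),
]


def perplexity(loss):
    """Perplexity for a cross-entropy loss, assuming the loss is in nats."""
    return math.exp(loss)


def loss_deltas(log):
    """Change in loss between each checkpoint and the one before it."""
    return [(step, loss - prev) for (_, prev), (step, loss) in zip(log, log[1:])]


def rising_steps(log):
    """Steps at which the loss went up relative to the previous checkpoint."""
    return [step for step, delta in loss_deltas(log) if delta > 0]
```

On this log, rising_steps(LOSS_LOG) flags only step 2000, where the loss ticks up slightly after falling at every earlier checkpoint.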