roneneldan/TinyStories
How to use unamedai/KateAI with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="unamedai/KateAI")
# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("unamedai/KateAI", dtype="auto")
How to use unamedai/KateAI with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "unamedai/KateAI"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "unamedai/KateAI",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Or run the model with Docker:
docker model run hf.co/unamedai/KateAI
How to use unamedai/KateAI with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "unamedai/KateAI" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "unamedai/KateAI",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Or start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "unamedai/KateAI" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "unamedai/KateAI",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use unamedai/KateAI with Docker Model Runner:
docker model run hf.co/unamedai/KateAI
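The curl calls to the vLLM and SGLang servers shown earlier can also be made from Python. The following is a minimal sketch using only the standard library. The helper names build_completion_request and complete are hypothetical, and the code assumes the server returns the usual OpenAI-style completions response with a "choices" list.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             model="unamedai/KateAI",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions route.

    Hypothetical helper: the defaults mirror the curl examples (vLLM on port 8000).
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first choice.

    Assumes a running server and an OpenAI-style response body.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, complete("Once upon a time,") would target vLLM on port 8000, and passing base_url="http://localhost:30000" would target the SGLang server instead.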
This model is being migrated from the GPT-2 architecture to a custom architecture, so it may not work at certain times during the migration. Thank you for understanding.
This is a custom model for text generation. Money wasted: 11€
model_type: KateAIForCasualLM
Please use the API of the Space.
# Install the Gradio client from pip:
pip install gradio_client
# Call the Space from Python:
from gradio_client import Client
client = Client("unamedai/Kate")
result = client.predict(
message="Once,",
max_tokens=512,
temperature=0.8,
top_p=0.95,
api_name="/predict"
)
print(result)