payload = { "model": MODEL, "messages": messages, "max_tokens": 128, "temperature": 0.3, "stop": ["\nUser:"], "repetition_penalty": 1.1 }
python -m vllm.entrypoints.openai.api_server
--model mkd-ai/keural-alpha-d
--trust-remote-code
--dtype bfloat16
--max-model-len 2048
--port 8000
- Downloads last month
- -
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support
Model tree for mkd-ai/keural-alpha-working-well
Base model
mkd-hossain/keural-alpha