Instructions for using Edentns/Worktro-S2-q0f16-MLC with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use Edentns/Worktro-S2-q0f16-MLC with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Edentns/Worktro-S2-q0f16-MLC")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("Edentns/Worktro-S2-q0f16-MLC", dtype="auto")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Edentns/Worktro-S2-q0f16-MLC with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Edentns/Worktro-S2-q0f16-MLC"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Edentns/Worktro-S2-q0f16-MLC",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
Use Docker
```shell
docker model run hf.co/Edentns/Worktro-S2-q0f16-MLC
```
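The OpenAI-compatible endpoint shown above can also be called from Python. A minimal sketch using only the standard library; it assumes a vLLM server is already running at `localhost:8000`, and the helper names (`build_payload`, `chat`) are illustrative, not part of any library:

```python
import json
import urllib.request

def build_payload(model: str, user_content: str) -> dict:
    # Same JSON body as the curl example: one user message.
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }

def chat(base_url: str, model: str, user_content: str) -> str:
    # POST the payload to the OpenAI-compatible chat-completions endpoint.
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(model, user_content)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The assistant's reply sits in the first choice's message.
    return body["choices"][0]["message"]["content"]

# Example usage (requires a running server):
# print(chat("http://localhost:8000", "Edentns/Worktro-S2-q0f16-MLC",
#            "What is the capital of France?"))
```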
- SGLang
How to use Edentns/Worktro-S2-q0f16-MLC with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "Edentns/Worktro-S2-q0f16-MLC" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Edentns/Worktro-S2-q0f16-MLC",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "Edentns/Worktro-S2-q0f16-MLC" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Edentns/Worktro-S2-q0f16-MLC",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
- Docker Model Runner
How to use Edentns/Worktro-S2-q0f16-MLC with Docker Model Runner:
```shell
docker model run hf.co/Edentns/Worktro-S2-q0f16-MLC
```
Worktro-S2-q0f16-MLC
Model Description
The Worktro-S2-q0f16-MLC model is a small language model (sLM) that has been converted from the Worktro-S2 model to operate within the MLC-LLM library. For more detailed information, please refer to the description of the Worktro-S2 model.
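Since this card describes an MLC-LLM conversion, the weights can presumably also be run directly with the MLC-LLM engine. A hedged sketch only: the `mlc_llm` package name, the `MLCEngine` class, and the `HF://` model reference follow the MLC-LLM project's documented Python API and are not stated in this card, and the `chat_with_mlc` helper is hypothetical:

```python
def chat_with_mlc(model_ref: str, prompt: str) -> str:
    """Generate one chat reply with the MLC-LLM engine (hypothetical helper)."""
    # Deferred import: mlc_llm is a heavy optional dependency.
    from mlc_llm import MLCEngine

    engine = MLCEngine(model_ref)
    try:
        # MLCEngine exposes an OpenAI-style chat-completions interface.
        response = engine.chat.completions.create(
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content
    finally:
        engine.terminate()

# Example usage (downloads the weights from the Hub on first run):
# print(chat_with_mlc("HF://Edentns/Worktro-S2-q0f16-MLC", "Who are you?"))
```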
License
This model is provided under the cc-by-nc-4.0 license. This license allows others to share and adapt the model for non-commercial purposes and research use.
© 2024 EDEN T&S. All rights reserved.
Model tree for Edentns/Worktro-S2-q0f16-MLC
Base model
Edentns/Worktro-S2