metadata
license: apache-2.0
base_model:
- Qwen/Qwen3-14B
Nano-Raccoon-Preview-1104
Prototyping checkpoint for NeAR-specialized SLM. Deployment friendly to single consumer GPU.
This model is a light SFT version from Qwen/Qwen3-14B, aimed at stable generative behavior on NeAR agent scaffold.
Serve with vllm
Single GPU
vllm serve billxbf/Nano-Raccoon-Preview-1104 \
--trust-remote-code \
--host 0.0.0.0 \
--port 8000
Use Tensor Parallel on 8xGPU
vllm serve billxbf/Nano-Raccoon-Preview-1104 \
--tensor-parallel-size 8 \
--trust-remote-code \
--host 0.0.0.0 \
--port 8000