|
|
--- |
|
|
license: apache-2.0 |
|
|
base_model: |
|
|
- Qwen/Qwen3-14B |
|
|
--- |
|
|
|
|
|
# Nano-Raccoon-Preview-1104 |
|
|
|
|
|
Prototyping checkpoint for NeAR-specialized SLM. Deployment friendly to single consumer GPU. |
|
|
|
|
|
This model is a light SFT version from Qwen/Qwen3-14B, aimed at stable generative behavior on NeAR agent scaffold. |
|
|
|
|
|
<p align="center"> |
|
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/645b0cb3333fb18357875c96/EH53TeulV7rBYjf0TF_nc.png" width="400" height="400"> |
|
|
</p> |
|
|
|
|
|
|
|
|
## Serve with vllm |
|
|
|
|
|
**Single GPU** |
|
|
``` |
|
|
vllm serve billxbf/Nano-Raccoon-Preview-1104 \ |
|
|
--trust-remote-code \ |
|
|
--host 0.0.0.0 \ |
|
|
--port 8000 |
|
|
``` |
|
|
|
|
|
|
|
|
**Use Tensor Parallel on 8xGPU** |
|
|
``` |
|
|
vllm serve billxbf/Nano-Raccoon-Preview-1104 \ |
|
|
--tensor-parallel-size 8 \ |
|
|
--trust-remote-code \ |
|
|
--host 0.0.0.0 \ |
|
|
--port 8000 |
|
|
``` |
|
|
|