宋小猫

SongXiaoMao

40 27

AI & ML interests

None yet

Recent Activity

liked a model 3 days ago

Qwen/Qwen3.6-27B-FP8

liked a model 3 days ago

Qwen/Qwen3.6-27B

new activity 5 days ago

nerkyor/Qwen3.6-27B-DSV4Pro-Thinking-Distill: vLLM startup error


Organizations

None yet

New activity in nerkyor/Qwen3.6-27B-DSV4Pro-Thinking-Distill 5 days ago

vLLM startup error

#2 opened 6 days ago by

SongXiaoMao

New activity in havenoammo/Qwen3.6-27B-INT8-MTP about 1 month ago

MTP efficiency without official FP8 ha

👍 1

#1 opened about 1 month ago by

SongXiaoMao

New activity in nameistoken/Qwen3.6-27B-Quark-W8A8-INT8 about 1 month ago

MTP cannot be accelerated

#1 opened about 1 month ago by

SongXiaoMao

New activity in inclusionAI/Ling-2.6-flash-int4 2 months ago

The official vLLM example starts normally, but inference fails with an error

#3 opened 2 months ago by

SongXiaoMao

New activity in groxaxo/Qwen3.6-27B-GPTQ-Pro-4bit 2 months ago

This model cannot use MTP

#2 opened 2 months ago by

SongXiaoMao

Modify the configuration file

🔥 1

#1 opened 2 months ago by

SongXiaoMao

New activity in z-lab/Qwen3.5-27B-DFlash 3 months ago

FP8 work for base model or is 16-bit of 27B required?

#2 opened 3 months ago by

unoid

New activity in Jackrong/Qwopus3.5-27B-v3 3 months ago

Is there anyone who can tell me how to run this model with vllm correctly?

😔 3

#8 opened 3 months ago by

beginor

New activity in olka-fi/Qwen3.5-27B-MXFP4 3 months ago

Could you quantize this model to MXFP4? Thank you!!

#3 opened 3 months ago by

SongXiaoMao

New activity in Jackrong/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled 3 months ago

How do I start this model with vLLM?

#4 opened 3 months ago by

SongXiaoMao

New activity in olka-fi/Qwen3.5-122B-A10B-MXFP4 3 months ago

This quantized model is amazing

❤️👍 2

#1 opened 4 months ago by

hyunw55

New activity in olka-fi/Qwen3.5-27B-MXFP4 3 months ago

Why is the file size of 4bit similar to FP8?

#2 opened 3 months ago by

SongXiaoMao

New activity in edp1096/Huihui-Qwen3.5-27B-abliterated-FP8 3 months ago

Sensitive information is not a question

#3 opened 3 months ago by

SongXiaoMao

New activity in win10/Huihui-Qwen3.5-27B-abliterated-FP8 3 months ago

vLLM 0.18.0 fails with an error

#2 opened 3 months ago by

SongXiaoMao

New activity in groxaxo/Huihui-Qwen3.5-27B-W8A8-INT8 3 months ago

I get an error using vLLM 0.18.0

#1 opened 3 months ago by

SongXiaoMao

New activity in huihui-ai/Huihui-Qwen3.5-27B-Claude-4.6-Opus-abliterated 3 months ago

Starting with vLLM raises an error

#3 opened 3 months ago by

SongXiaoMao

New activity in Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled 3 months ago

Tokenizer class TokenizersBackend does not exist in vllm v0.17.1

#26 opened 4 months ago by

putcn

New activity in huihui-ai/Huihui-Qwen3.5-122B-A10B-abliterated-GGUF 4 months ago

Can you make a quantized model? Qwen3.5-122B-A10B-GPTQ-Int4

#2 opened 4 months ago by

SongXiaoMao

New activity in Qwen/QwQ-32B over 1 year ago

When will you fix the model replies missing</think>\n start tags

#19 opened over 1 year ago by

xldistance

When answering questions in Chinese, the model frequently terminates prematurely (outputs the end token). Is this a common problem?

#40 opened over 1 year ago by

zhangw355
