Duplicated from deepseek-ai/DeepSeek-V4-Pro

	# Inference code for DeepSeek models

	First, convert the Hugging Face model weight files to this project's format.
	```bash
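	# HF_CKPT_PATH (source Hugging Face checkpoint) and SAVE_PATH (output directory) must already be set.
	export EXPERTS=384         # passed to --n-experts
	export MP=8                # passed to --model-parallel
	export CONFIG=config.json  # used by generate.py in the steps below
	python convert.py --hf-ckpt-path "${HF_CKPT_PATH}" --save-path "${SAVE_PATH}" --n-experts "${EXPERTS}" --model-parallel "${MP}"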
	```

	Then chat with the DeepSeek model interactively:
	```bash
	torchrun --nproc-per-node ${MP} generate.py --ckpt-path ${SAVE_PATH} --config ${CONFIG} --interactive
	```

	Or run batch inference on an input file:
	```bash
	torchrun --nproc-per-node ${MP} generate.py --ckpt-path ${SAVE_PATH} --config ${CONFIG} --input-file ${FILE}
	```

	Or run inference across multiple nodes (run the command on every node, with `RANK` set to that node's rank):
	```bash
	torchrun --nnodes ${NODES} --nproc-per-node $((MP / NODES)) --node-rank $RANK --master-addr $ADDR generate.py --ckpt-path ${SAVE_PATH} --config ${CONFIG} --input-file ${FILE}
	```
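
Shell arithmetic truncates, so `$((MP / NODES))` silently rounds down when `MP` is not a multiple of `NODES` (for example, `MP=8` with `NODES=3` launches 2 processes per node, 6 in total). The sketch below is one way to catch that before launching. `procs_per_node` is a hypothetical helper, not part of this repository, and it assumes the total process count is meant to equal `MP`.

```bash
# Hypothetical helper (not part of this repo): print the per-node process
# count, or fail if MP cannot be split evenly across the nodes.
procs_per_node() {
    # $1 = total model-parallel size (MP), $2 = number of nodes (NODES)
    if [ "$2" -le 0 ] || [ $(( $1 % $2 )) -ne 0 ]; then
        echo "MP=$1 cannot be split evenly across NODES=$2" >&2
        return 1
    fi
    echo $(( $1 / $2 ))
}

procs_per_node 8 2   # prints 4
```

On each node, the launch line could then pass `--nproc-per-node "$(procs_per_node "${MP}" "${NODES}")"` in place of the inline arithmetic.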

	To use FP8 experts instead of FP4, remove the `"expert_dtype": "fp4"` entry from `config.json` and pass `--expert-dtype fp8` to `convert.py`.
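
As a rough sketch of the FP8 path, the functions below refuse to convert until the `expert_dtype` entry is gone from the config file, then run the same conversion command with `--expert-dtype fp8` added. The function names `expert_dtype_removed` and `convert_fp8` are hypothetical and not part of this repository. The check is a plain text search for the key, which is an assumption about how the config is written, not a real JSON parse.

```bash
# Hypothetical guard (not part of this repo): succeed only if the file exists
# and no longer mentions an "expert_dtype" key.
expert_dtype_removed() {
    [ -f "$1" ] && ! grep -q '"expert_dtype"' "$1"
}

# Hypothetical wrapper: run the FP8 conversion only after the config was edited.
# Relies on CONFIG, HF_CKPT_PATH, SAVE_PATH, EXPERTS and MP being set as above.
convert_fp8() {
    if ! expert_dtype_removed "${CONFIG}"; then
        echo "remove the \"expert_dtype\" entry from ${CONFIG} first" >&2
        return 1
    fi
    python convert.py --hf-ckpt-path "${HF_CKPT_PATH}" --save-path "${SAVE_PATH}" \
        --n-experts "${EXPERTS}" --model-parallel "${MP}" --expert-dtype fp8
}
```

Calling `convert_fp8` after editing `config.json` then takes the place of the plain `convert.py` invocation from the first step.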