This model was trained from the CareBot-instruct model using the DPO (Direct Preference Optimization) strategy. Details are available at MonteXiaofeng/CareBot_Medical_multi-llama3-8b-instruct.
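For readers unfamiliar with DPO, the per-example objective can be sketched as below. This is a minimal illustration of the standard DPO loss, not the training code used for this model. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions chosen for illustration. The inputs are summed log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses under the model being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative sketch).

    beta is a hypothetical default; the value used for this model is not stated.
    """
    # Implicit rewards: scaled log-ratio of policy to reference model.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # Loss is -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss falls as the policy assigns relatively more probability to the chosen response than the reference model does, and it equals log 2 when the policy and reference agree exactly.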

Acknowledgements

This work is supported by the National Science and Technology Major Project on New Generation Artificial Intelligence (No. 2022ZD0116314).

Model size: 8B params (Safetensors)

Tensor type: BF16


Model ID: MonteXiaofeng/CareBot_Medical_multi-llama3-8b-rl