Upload README.md with huggingface_hub
README.md
CHANGED
@@ -30,13 +30,13 @@ datasets:
 
 ## BioReason-Pro RL
 
-Reinforcement learning (GRPO) optimized checkpoint of BioReason-Pro
+Reinforcement learning (GRPO) optimized checkpoint of BioReason-Pro, a multimodal reasoning LLM for protein function prediction. This model builds on the SFT checkpoint and is further optimized through group relative policy optimization to improve reasoning quality and GO term prediction accuracy.
 
-**Training data:**
+**Training data:** [wanglab/bioreason-pro-rl-reasoning-data](https://huggingface.co/datasets/wanglab/bioreason-pro-rl-reasoning-data)
 
 See also:
-- [BioReason-Pro SFT](https://huggingface.co/wanglab/bioreason-pro-sft)
-- [GO-GPT](https://huggingface.co/wanglab/gogpt)
+- [BioReason-Pro SFT](https://huggingface.co/wanglab/bioreason-pro-sft) - supervised fine-tuned checkpoint
+- [GO-GPT](https://huggingface.co/wanglab/gogpt) - autoregressive GO term predictor
 
 
 ## Citation
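The updated description names group relative policy optimization (GRPO) without spelling out what "group relative" means. As a rough illustration only, the sketch below shows the usual GRPO advantage step: several completions are sampled for one prompt, and each reward is normalized against the mean and standard deviation of its own group. The function name, the `eps` value, the use of the population standard deviation, and the example rewards are all assumptions made for illustration; the README does not describe the actual reward function or training setup used for this checkpoint.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    Hypothetical helper: returns (r - group mean) / (group std + eps)
    for every reward r in the group.
    """
    if not rewards:
        return []
    mu = mean(rewards)
    # Population std is an assumption; some implementations use sample std.
    sigma = pstdev(rewards)
    # eps keeps the division defined when all rewards in the group are equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four completions sampled from a single prompt.
rewards = [0.2, 0.9, 0.5, 0.4]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's average get a positive advantage and those below get a negative one, so no separate value network is needed to provide a baseline.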