arxiv:2603.20724

Multi-RF Fusion with Multi-GNN Blending for Molecular Property Prediction

Published on Mar 21

Authors:

Abstract

Multi-RF Fusion achieves top performance on molecular property prediction through ensemble methods combining random forests and graph neural networks with optimized hyperparameters and seed averaging.

AI-generated summary

Multi-RF Fusion achieves a test ROC-AUC of 0.8476 +/- 0.0002 on ogbg-molhiv (10 seeds), placing #1 on the OGB leaderboard ahead of HyperFusion (0.8475 +/- 0.0003). The core of the method is a rank-averaged ensemble of 12 Random Forest models trained on concatenated molecular fingerprints (FCFP, ECFP, MACCS, atom pairs -- 4,263 dimensions total), blended with deep-ensembled GNN predictions at 12% weight. Two findings drive the result: (1) setting max_features to 0.20 instead of the default sqrt(d) gives a +0.008 AUC gain on this scaffold split, and (2) averaging GNN predictions across 10 seeds before blending with the RF eliminates GNN seed variance entirely, dropping the final standard deviation from 0.0008 to 0.0002. No external data or pre-training is used.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Get this paper in your agent:

hf papers read 2603.20724

Don't have the latest CLI?

curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2603.20724 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2603.20724 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2603.20724 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.