
Mol-LLM: Multimodal Generalist Molecular LLM

Mol-LLM is a multimodal generalist molecular large language model for chemistry that jointly uses 1D molecular sequences and 2D molecular graphs to solve a wide range of molecular tasks in a single unified framework. It introduces Molecular structure Preference Optimization (MolPO), which trains the LLM to prefer correct molecular graphs over perturbed ones, mitigating the "graph-bypass" issue (ignoring the graph input and answering from the sequence alone) common in prior multimodal molecular LLMs.
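
The paper's exact MolPO objective is not reproduced in this card; the sketch below is a minimal DPO-style pairwise preference loss, under the assumption that MolPO contrasts sequence log-likelihoods conditioned on the correct graph versus a perturbed one. The function name, `beta`, and tensor shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def molpo_style_loss(logp_correct, logp_perturbed,
                     ref_logp_correct, ref_logp_perturbed, beta=0.1):
    """DPO-style pairwise preference loss (illustrative, not the paper's exact form).

    Each argument is a tensor of per-example sequence log-likelihoods:
    the policy model and a frozen reference model, each scored with the
    correct graph and with a structurally perturbed graph as conditioning.
    """
    # Reward margins relative to the reference model.
    chosen = beta * (logp_correct - ref_logp_correct)
    rejected = beta * (logp_perturbed - ref_logp_perturbed)
    # Maximize the probability that the correct-graph answer is preferred.
    return -F.logsigmoid(chosen - rejected).mean()

# Toy usage with random log-probabilities for a batch of 4 examples.
lp_c, lp_p = torch.randn(4), torch.randn(4)
rlp_c, rlp_p = torch.randn(4), torch.randn(4)
print(molpo_style_loss(lp_c, lp_p, rlp_c, rlp_p))
```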

Model summary

  • Backbone: Mistral-7B-Instruct-v0.3.
  • Modalities:
    • Text (natural language instructions).
    • 1D molecular sequences (SELFIES; SMILES supported via translation; see the encoding sketch after this list).
    • 2D molecular graphs encoded by a hybrid GNN (GINE + TokenGT).
  • Architecture: BLIP-2–style; a Q-Former (32 query tokens) projects graph embeddings into the LLM token space (a projector sketch follows this list).
    • LLM:
      • Mistral-7B-Instruct-v0.3 as the text backbone.
      • Extended tokenizer with SELFIES and numeric tokens, plus task tags for heterogeneous outputs (discrete labels, floats, descriptions).
    • Hybrid graph encoder:
      • GINE for local structural patterns.
      • TokenGT (transformer-based) for global structural dependencies and large graphs.
      • Both encoders produce graph-, node-, and edge-level embeddings; the concatenated embeddings are fed into the Q-Former.
    • Q-Former:
      • 5-layer SciBERT-style transformer with 32 learnable queries.
      • Cross-attends to graph embeddings and outputs fixed-length tokens appended after the SELFIES tokens in the LLM input.
      • Selected over an MLP projector due to better alignment and graph-token efficiency.
  • Tokenizer extensions: 3K SELFIES tokens, numeric tokens, task tags ([SELFIES], [BOOLEAN], [FLOAT], [DESCRIPTION]), and reaction-direction symbols (see the tokenizer sketch after this list).
  • Training data: ~3.3M instruction-tuning examples over 27 tasks, with ~40K held-out test instances.
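
As a hedged illustration of the 1D pipeline above, the sketch below encodes SMILES to SELFIES with the open-source `selfies` package and extends a Mistral tokenizer with SELFIES symbols and task tags. The generic SELFIES alphabet and the tag list are stand-ins; the released checkpoint's actual 3K-token vocabulary is not reproduced here.

```python
# Sketch only: the concrete Mol-LLM vocabulary is not reproduced here.
import selfies as sf
from transformers import AutoTokenizer

# 1) SMILES -> SELFIES translation for 1D inputs.
smiles = "CCO"                      # ethanol
selfies_str = sf.encoder(smiles)    # "[C][C][O]"
assert sf.decoder(selfies_str) == "CCO"

# 2) Extend a base tokenizer with SELFIES symbols and task tags.
#    The alphabet and tag names below are illustrative stand-ins.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")
selfies_tokens = sorted(sf.get_semantic_robust_alphabet())  # generic SELFIES symbols
task_tags = ["[SELFIES]", "[BOOLEAN]", "[FLOAT]", "[DESCRIPTION]"]
num_added = tokenizer.add_tokens(selfies_tokens + task_tags)
print(f"added {num_added} tokens")
# The LLM's embedding matrix must then be resized to match:
# model.resize_token_embeddings(len(tokenizer))
```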
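
To make the projector idea concrete, here is a minimal sketch of a Q-Former-like module: 32 learnable queries cross-attend to graph embeddings and emit fixed-length tokens for the LLM. The layer count matches the card (5), but the dimensions, head count, and module layout are illustrative, not the released weights.

```python
# Minimal Q-Former-style projector sketch; dims and layout are illustrative.
import torch
import torch.nn as nn

class TinyQFormer(nn.Module):
    def __init__(self, d_model=768, n_queries=32, n_layers=5, n_heads=12, llm_dim=4096):
        super().__init__()
        # 32 learnable query embeddings, as described in the card.
        self.queries = nn.Parameter(torch.randn(n_queries, d_model) * 0.02)
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=n_layers)
        self.to_llm = nn.Linear(d_model, llm_dim)  # project into the LLM token space

    def forward(self, graph_embeds):  # (batch, n_graph_tokens, d_model)
        q = self.queries.unsqueeze(0).expand(graph_embeds.size(0), -1, -1)
        out = self.decoder(tgt=q, memory=graph_embeds)  # queries cross-attend to graph
        # Fixed-length tokens, appended after the SELFIES tokens in the LLM input.
        return self.to_llm(out)  # (batch, 32, llm_dim)

# Toy usage: concatenated GINE + TokenGT embeddings for a batch of 2 graphs.
graph_embeds = torch.randn(2, 50, 768)
print(TinyQFormer()(graph_embeds).shape)  # torch.Size([2, 32, 4096])
```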

Mol-LLM is reported to be state-of-the-art or comparable among generalist molecular LLMs on the most comprehensive benchmark suite evaluated to date, including out-of-distribution (OOD) settings.

Intended use

Mol-LLM is intended to solve molecular tasks via a single multitask model; a hypothetical inference sketch follows the supported task list below.

Supported task families:

  • Reaction prediction:
    • Forward synthesis (product prediction, FS)
    • Retrosynthesis (reactant prediction, RS)
    • Reagent prediction (RP)
  • Property prediction:
    • Regression: LogS, LogD, HOMO, LUMO, HOMO–LUMO gap
    • Classification: BACE, BBBP, ClinTox, HIV, SIDER
  • Text–molecule tasks:
    • Description-guided molecule generation
    • Molecule captioning
    • IUPAC/SELFIES/formula translation as auxiliary tasks
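
The exact inference API for the released checkpoint is not documented in this card, so the following is a hypothetical text-only sketch assuming the repository loads through `transformers` with `trust_remote_code=True`. The graph branch (which requires a featurized 2D molecule) is omitted, and the prompt format is illustrative.

```python
# Hypothetical sketch: the actual loading code and prompt template may differ.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("KU-AGI/Mol-LLM", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("KU-AGI/Mol-LLM", trust_remote_code=True)

# Molecule-captioning style prompt with a SELFIES input (ethanol).
prompt = "Describe the following molecule: [C][C][O]"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```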