Derify/ChemRanker-alpha-qed-sim

@@ -52,7 +52,7 @@ model-index:
 This [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) is finetuned from [Derify/ModChemBERT-IR-BASE](https://huggingface.co/Derify/ModChemBERT-IR-BASE) using hard-negative triplets derived from [Derify/pubchem_10m_genmol_similarity](https://huggingface.co/datasets/Derify/pubchem_10m_genmol_similarity). Positive SMILES pairs are first filtered by quality and similarity constraints, then reduced to one strongest positive target per anchor molecule to create a high-signal training set for reranking. The model computes relevance scores for pairs of SMILES strings, enabling SMILES reranking and molecular semantic search.
-For this variant, positives are selected with a composite ranking criterion that combines high QED and similarity without an additional similarity-contribution cutoff. The quality stage uses strict inequality filtering (`QED > 0.85`, `similarity > 0.5`, with similarity also bounded below 1.0), and then keeps the top-scoring pair per anchor molecule.
 Hard negatives are mined with [Sentence Transformers](https://www.sbert.net/) using [Derify/ChemMRL-beta](https://huggingface.co/Derify/ChemMRL-beta) as the teacher model and a TopK-PercPos-style margin setting based on [NV-Retriever](https://arxiv.org/abs/2407.15831), with `relative_margin=0.05` and `max_negative_score_threshold = pos_score * percentage_margin`. Training uses triplet-format samples with 5 mined negatives per anchor-positive pair and optimizes a multiple-negatives ranking objective, while reranking evaluation uses n-tuple samples with 30 mined negatives per query.
@@ -65,7 +65,6 @@ Hard negatives are mined with [Sentence Transformers](https://www.sbert.net/) us
 - **Number of Output Labels:** 1 label
 - **Training Dataset:**
   - [Derify/pubchem_10m_genmol_similarity](https://huggingface.co/datasets/Derify/pubchem_10m_genmol_similarity) Mined Hard Negatives
-<!-- - **Language:** Unknown -->
 - **License:** apache-2.0
 ### Model Sources
@@ -258,6 +257,7 @@ You can finetune this model on your own dataset.
 - `torch_compile`: True
 - `torch_compile_backend`: inductor
 - `torch_compile_mode`: max-autotune
 - `batch_sampler`: no_duplicates
 #### All Hyperparameters
@@ -372,7 +372,7 @@ You can finetune this model on your own dataset.
 - `neftune_noise_alpha`: None
 - `optim_target_modules`: None
 - `batch_eval_metrics`: False
-- `eval_on_start`: False
 - `use_liger_kernel`: False
 - `liger_kernel_config`: None
 - `eval_use_gather_object`: False

 This [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) is finetuned from [Derify/ModChemBERT-IR-BASE](https://huggingface.co/Derify/ModChemBERT-IR-BASE) using hard-negative triplets derived from [Derify/pubchem_10m_genmol_similarity](https://huggingface.co/datasets/Derify/pubchem_10m_genmol_similarity). Positive SMILES pairs are first filtered by quality and similarity constraints, then reduced to one strongest positive target per anchor molecule to create a high-signal training set for reranking. The model computes relevance scores for pairs of SMILES strings, enabling SMILES reranking and molecular semantic search.
+For this variant, the positives are selected with a composite ranking criterion that combines high QED and similarity without an additional similarity-contribution cutoff. The quality stage uses strict inequality filtering (`QED > 0.85`, `similarity > 0.5`, with similarity also bounded below 1.0), and then keeps the top-scoring pair per anchor molecule.
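
The selection described above can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the `Pair` record, the `select_positives` helper, and especially the `composite_score` formula (an unweighted sum of QED and similarity) are assumptions, since the card does not state how the two values are combined. Only the thresholds (`QED > 0.85`, `0.5 < similarity < 1.0`) and the one-positive-per-anchor rule come from the text.

```python
from typing import Dict, Iterable, NamedTuple


class Pair(NamedTuple):
    """Hypothetical record for one anchor/candidate SMILES pair."""
    anchor: str
    candidate: str
    qed: float
    similarity: float


def composite_score(pair: Pair) -> float:
    # ASSUMPTION: the card does not give the ranking formula;
    # an unweighted sum is used here purely for illustration.
    return pair.qed + pair.similarity


def select_positives(pairs: Iterable[Pair]) -> Dict[str, Pair]:
    """Apply the strict quality filter, then keep one top-scoring pair per anchor."""
    best: Dict[str, Pair] = {}
    for pair in pairs:
        # Strict inequalities from the card: QED > 0.85 and 0.5 < similarity < 1.0.
        if not (pair.qed > 0.85 and 0.5 < pair.similarity < 1.0):
            continue
        current = best.get(pair.anchor)
        if current is None or composite_score(pair) > composite_score(current):
            best[pair.anchor] = pair
    return best


example_pairs = [
    Pair("CCO", "CCN", qed=0.90, similarity=0.70),
    Pair("CCO", "CCC", qed=0.95, similarity=0.60),
    Pair("CCO", "CCO", qed=0.99, similarity=1.00),             # dropped: similarity is not below 1.0
    Pair("c1ccccc1", "Cc1ccccc1", qed=0.85, similarity=0.90),  # dropped: QED is not above 0.85
]
selected = select_positives(example_pairs)
```

With these made-up values, only the anchor `CCO` survives, paired with `CCN`. A different composite formula could change which pair wins for a given anchor.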
 Hard negatives are mined with [Sentence Transformers](https://www.sbert.net/) using [Derify/ChemMRL-beta](https://huggingface.co/Derify/ChemMRL-beta) as the teacher model and a TopK-PercPos-style margin setting based on [NV-Retriever](https://arxiv.org/abs/2407.15831), with `relative_margin=0.05` and `max_negative_score_threshold = pos_score * percentage_margin`. Training uses triplet-format samples with 5 mined negatives per anchor-positive pair and optimizes a multiple-negatives ranking objective, while reranking evaluation uses n-tuple samples with 30 mined negatives per query.
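
The margin rule above can be illustrated with a small sketch. It assumes that `percentage_margin` equals `1 - relative_margin` (so `0.95` for `relative_margin=0.05`), which is one reading of the text and is not confirmed by the card. The helper names, the candidate tuples, and the inclusive boundary handling are likewise hypothetical. The sketch keeps the highest-scoring candidates that fall at or below the threshold.

```python
from typing import List, Tuple


def max_negative_score_threshold(pos_score: float, relative_margin: float = 0.05) -> float:
    # ASSUMPTION: percentage_margin = 1 - relative_margin (0.95 for the value in the card).
    percentage_margin = 1.0 - relative_margin
    return pos_score * percentage_margin


def keep_hard_negatives(
    pos_score: float,
    candidates: List[Tuple[str, float]],
    num_negatives: int = 5,
    relative_margin: float = 0.05,
) -> List[Tuple[str, float]]:
    """Drop candidates scoring above the threshold, then keep the hardest ones."""
    threshold = max_negative_score_threshold(pos_score, relative_margin)
    # Candidates too close to the positive are treated as likely false negatives.
    eligible = [(smiles, score) for smiles, score in candidates if score <= threshold]
    # Hardest negatives first: highest teacher score among the eligible ones.
    eligible.sort(key=lambda item: item[1], reverse=True)
    return eligible[:num_negatives]


# Hypothetical teacher scores for one anchor whose positive scores 0.80.
scored_candidates = [("CCN", 0.79), ("CCC", 0.75), ("c1ccccc1", 0.40), ("CCCl", 0.70)]
hard_negatives = keep_hard_negatives(0.80, scored_candidates, num_negatives=2)
```

Here the threshold is about `0.76`, so `CCN` is discarded as too close to the positive, and `CCC` and `CCCl` are kept as the two hardest remaining negatives.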
 - **Number of Output Labels:** 1 label
 - **Training Dataset:**
   - [Derify/pubchem_10m_genmol_similarity](https://huggingface.co/datasets/Derify/pubchem_10m_genmol_similarity) Mined Hard Negatives
 - **License:** apache-2.0
 ### Model Sources
 - `torch_compile`: True
 - `torch_compile_backend`: inductor
 - `torch_compile_mode`: max-autotune
+- `eval_on_start`: True
 - `batch_sampler`: no_duplicates
 #### All Hyperparameters
 - `neftune_noise_alpha`: None
 - `optim_target_modules`: None
 - `batch_eval_metrics`: False
+- `eval_on_start`: True
 - `use_liger_kernel`: False
 - `liger_kernel_config`: None
 - `eval_use_gather_object`: False