Sentence Similarity
sentence-transformers
Safetensors
modchembert
cheminformatics
smiles
molecular-similarity
feature-extraction
dense
Generated from Trainer
dataset_size:19381001
loss:Matryoshka2dLoss
loss:MatryoshkaLoss
loss:TanimotoSentLoss
custom_code
Eval Results (legacy)
Instructions to use Derify/ChemMRL with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use Derify/ChemMRL with sentence-transformers:
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("Derify/ChemMRL", trust_remote_code=True)

    sentences = [
        "COC(=O)c1sc(-c2ccc(C)cc2)c2c1NC(=O)C2(c1ccccc1)c1ccccc1",
        "COC(=O)c1sc(Nc2ccc(Br)cn2)c2c1NC(=O)C2(c1ccccc1)c1ccccc1",
        "CC[NH+]1CCOC(C(NN)c2ccccc2Br)C1",
        "CC([NH2+]C(C)c1ccccc1)C(=O)P(C)C(C)(C)C"
    ]
    embeddings = model.encode(sentences)

    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)
    # [4, 4]
- Notebooks
- Google Colab
- Kaggle
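The usage snippet above returns a 4 x 4 matrix of pairwise scores. The `loss:TanimotoSentLoss` and `loss:MatryoshkaLoss` tags suggest that the training target is a Tanimoto-style similarity and that embeddings are meant to remain useful when truncated, but this page does not say which function `model.similarity` applies. The following is a minimal sketch under those assumptions, using toy vectors in place of real embeddings; `tanimoto` and `pairwise` are hypothetical helper names, not part of sentence-transformers or the model's code.

```python
from typing import List, Optional, Sequence


def tanimoto(a: Sequence[float], b: Sequence[float]) -> float:
    """Continuous Tanimoto similarity: a.b / (|a|^2 + |b|^2 - a.b)."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    # Two all-zero vectors have no defined similarity; return 0.0 by convention here.
    return dot / denom if denom else 0.0


def pairwise(vectors: Sequence[Sequence[float]], dim: Optional[int] = None) -> List[List[float]]:
    """Pairwise similarity matrix, optionally keeping only the first `dim` components."""
    cut = [v[:dim] for v in vectors]  # dim=None keeps the full vector
    return [[tanimoto(u, v) for v in cut] for u in cut]


# Toy stand-ins for model.encode(...) output (assumption: real embeddings are dense floats).
toy = [
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]

full = pairwise(toy)          # uses all 4 dimensions
half = pairwise(toy, dim=2)   # Matryoshka-style truncation to the first 2 dimensions

for row in full:
    print([round(score, 3) for score in row])
```

The matrix is symmetric with ones on the diagonal, which matches the shape check in the usage snippet: one row and one column per input SMILES string.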
Update README.md
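The diff in this commit replaces `[NH+]` with `\[NH+\]` in the sample rows of the README table. The commit gives no rationale; a plausible reason is that a charged atom followed by a branch, as in `[NH+](C...)`, has the same shape as a Markdown link `[text](url)`. The sketch below shows one way such rows could be produced. The helper names `escape_smiles_for_markdown` and `table_row` are hypothetical and do not come from the repository.

```python
def escape_smiles_for_markdown(smiles: str) -> str:
    """Backslash-escape square brackets so Markdown does not parse them as link text."""
    return smiles.replace("[", "\\[").replace("]", "\\]")


def table_row(smiles_a: str, smiles_b: str, label: str) -> str:
    """Build one Markdown table row with each cell wrapped in <code> tags."""
    cells = [
        escape_smiles_for_markdown(smiles_a),
        escape_smiles_for_markdown(smiles_b),
        label,  # kept as a string so the printed precision is unchanged
    ]
    return "| " + " | ".join(f"<code>{cell}</code>" for cell in cells) + " |"


# Sample pair taken from the table in the diff.
row = table_row(
    "OCCN1CC[NH+](Cc2ccccc2OC2CC2)CC1",
    "OCCN1CC[NH+](Cc2ccccc2On2cccn2)CC1",
    "0.6615384817123413",
)
print(row)
```

Apart from column padding, the printed row should match the corresponding added line in the diff.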
README.md
CHANGED
@@ -227,11 +227,11 @@ print(similarities)
 | type | string | string | float |
 | details | <ul><li>min: 17 tokens</li><li>mean: 42.36 tokens</li><li>max: 122 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 40.93 tokens</li><li>max: 122 tokens</li></ul> | <ul><li>min: 0.02</li><li>mean: 0.56</li><li>max: 1.0</li></ul> |
 * Samples:
-| smiles_a
-| :--------------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------- | :------------------------------ |
-| <code>COc1ccc(NC(=O)C2CC[NH+](C(C)C(=O)Nc3ccc(C(=O)Nc4ccc(F)c(F)c4)cc3C)CC2)cc1NC(=O)C1CCCCC1</code> | <code>Cc1cc(C(=O)Nc2ccc(F)c(F)c2)ccc1NC(=O)C(C)[NH+]1CCC(C(=O)Nc2cccc(NC(=O)C3CCCCC3)c2)CC1</code> | <code>0.8495575189590454</code> |
-| <code>OCCN1CC[NH+](Cc2ccccc2OC2CC2)CC1</code> | <code>OCCN1CC[NH+](Cc2ccccc2On2cccn2)CC1</code> | <code>0.6615384817123413</code> |
-| <code>CC1CN(C(=O)C2CC[NH+](Cc3cccc(C(N)=O)c3)CC2)CC(C)O1</code> | <code>CC1CN(C(=O)C2CC[NH+](Cc3ccccc3)CC2)CC(C)O1</code> | <code>0.7123287916183472</code> |
+| smiles_a | smiles_b | label |
+| :----------------------------------------------------------------------------------------------------- | :--------------------------------------------------------------------------------------------------- | :------------------------------ |
+| <code>COc1ccc(NC(=O)C2CC\[NH+\](C(C)C(=O)Nc3ccc(C(=O)Nc4ccc(F)c(F)c4)cc3C)CC2)cc1NC(=O)C1CCCCC1</code> | <code>Cc1cc(C(=O)Nc2ccc(F)c(F)c2)ccc1NC(=O)C(C)\[NH+\]1CCC(C(=O)Nc2cccc(NC(=O)C3CCCCC3)c2)CC1</code> | <code>0.8495575189590454</code> |
+| <code>OCCN1CC\[NH+\](Cc2ccccc2OC2CC2)CC1</code> | <code>OCCN1CC\[NH+\](Cc2ccccc2On2cccn2)CC1</code> | <code>0.6615384817123413</code> |
+| <code>CC1CN(C(=O)C2CC\[NH+\](Cc3cccc(C(N)=O)c3)CC2)CC(C)O1</code> | <code>CC1CN(C(=O)C2CC\[NH+\](Cc3ccccc3)CC2)CC(C)O1</code> | <code>0.7123287916183472</code> |
 * Loss: [<code>Matryoshka2dLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshka2dloss) with these parameters:
 ```json
 {