Update README.md
README.md
CHANGED
|
@@ -6,6 +6,11 @@ language:
|
|
| 6 |
- en
|
| 7 |
---
|
| 8 |
| 9 |
**Aug 2023 Update:**
|
| 10 |
1. The SPECTER 2.0 Base and proximity adapter models have been renamed in Hugging Face based upon usage patterns as follows:
|
| 11 |
|
|
@@ -18,9 +23,7 @@ language:
|
|
| 18 |
However, for benchmarking purposes, please continue using the current version.
|
| 19 |
|
| 20 |
|
| 21 |
-
<!-- Provide a quick summary of what the model is/does. -->
|
| 22 |
|
| 23 |
-
# SPECTER 2.0 (Base)
|
| 24 |
SPECTER 2.0 is the successor to [SPECTER](https://huggingface.co/allenai/specter) and is capable of generating task-specific embeddings for scientific tasks when paired with [adapters](https://huggingface.co/models?search=allenai/specter-2_).
|
| 25 |
This is the base model to be used along with the adapters.
|
| 26 |
Given the combination of title and abstract of a scientific paper or a short textual query, the model can be used to generate effective embeddings to be used in downstream applications.
|
|
@@ -39,7 +42,7 @@ Post that it is trained with additionally attached task format specific adapter
|
|
| 39 |
Task Formats trained on:
|
| 40 |
- Classification
|
| 41 |
- Regression
|
| 42 |
-
- Proximity
|
| 43 |
- Adhoc Search
|
| 44 |
|
| 45 |
|
|
@@ -69,12 +72,12 @@ It builds on the work done in [SciRepEval: A Multi-Format Benchmark for Scientif
|
|
| 69 |
|
| 70 |
|Model|Name and HF link|Description|
|
| 71 |
|--|--|--|
|
| 72 |
-
|
|
| 73 |
-
|Adhoc Query|[allenai/specter2_adhoc_query](https://huggingface.co/allenai/specter2_adhoc_query)|Encode short raw text queries for search tasks. (Candidate papers can be encoded with proximity)|
|
| 74 |
|Classification|[allenai/specter2_classification](https://huggingface.co/allenai/specter2_classification)|Encode papers to feed into linear classifiers as features|
|
| 75 |
|Regression|[allenai/specter2_regression](https://huggingface.co/allenai/specter2_regression)|Encode papers to feed into linear regressors as features|
|
| 76 |
|
| 77 |
-
*
|
| 78 |
|
| 79 |
```python
|
| 80 |
from transformers import AutoTokenizer, AutoModel
|
|
@@ -86,7 +89,7 @@ tokenizer = AutoTokenizer.from_pretrained('allenai/specter2_base')
|
|
| 86 |
model = AutoModel.from_pretrained('allenai/specter2_base')
|
| 87 |
|
| 88 |
# Load the adapter(s) required for the task, give each an identifier via the load_as argument, and activate it
|
| 89 |
-
model.load_adapter("allenai/
|
| 90 |
#other possibilities: allenai/specter2_<classification|regression|adhoc_query>
|
| 91 |
|
| 92 |
papers = [{'title': 'BERT', 'abstract': 'We introduce a new language representation model called BERT'},
|
|
|
|
| 6 |
- en
|
| 7 |
---
|
| 8 |
|
| 9 |
+
<!-- Provide a quick summary of what the model is/does. -->
|
| 10 |
+
|
| 11 |
+
# SPECTER 2.0 (Base)
|
| 12 |
+
|
| 13 |
+
|
| 14 |
**Aug 2023 Update:**
|
| 15 |
1. The SPECTER 2.0 Base and proximity adapter models have been renamed in Hugging Face based upon usage patterns as follows:
|
| 16 |
|
|
|
|
| 23 |
However, for benchmarking purposes, please continue using the current version.
|
| 24 |
|
| 25 |
|
|
|
|
| 26 |
|
|
|
|
| 27 |
SPECTER 2.0 is the successor to [SPECTER](https://huggingface.co/allenai/specter) and is capable of generating task-specific embeddings for scientific tasks when paired with [adapters](https://huggingface.co/models?search=allenai/specter-2_).
|
| 28 |
This is the base model to be used along with the adapters.
|
| 29 |
Given the combination of title and abstract of a scientific paper or a short textual query, the model can be used to generate effective embeddings to be used in downstream applications.
|
|
|
|
| 42 |
Task Formats trained on:
|
| 43 |
- Classification
|
| 44 |
- Regression
|
| 45 |
+
- Proximity (Retrieval)
|
| 46 |
- Adhoc Search
|
| 47 |
|
| 48 |
|
|
|
|
| 72 |
|
| 73 |
|Model|Name and HF link|Description|
|
| 74 |
|--|--|--|
|
| 75 |
+
|Proximity*|[allenai/specter2](https://huggingface.co/allenai/specter2)|Encode papers as queries and candidates, e.g. Link Prediction, Nearest Neighbor Search|
|
| 76 |
+
|Adhoc Query|[allenai/specter2_adhoc_query](https://huggingface.co/allenai/specter2_adhoc_query)|Encode short raw text queries for search tasks. (Candidate papers can be encoded with the proximity adapter)|
|
| 77 |
|Classification|[allenai/specter2_classification](https://huggingface.co/allenai/specter2_classification)|Encode papers to feed into linear classifiers as features|
|
| 78 |
|Regression|[allenai/specter2_regression](https://huggingface.co/allenai/specter2_regression)|Encode papers to feed into linear regressors as features|
|
| 79 |
|
| 80 |
+
*The proximity model should suffice for downstream task types not mentioned above.
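
The adapter table and its footnote can be read as a simple lookup from task format to adapter repository, with proximity as the fallback. The sketch below illustrates that reading only: `ADAPTER_REPOS` and `adapter_repo_for` are hypothetical names introduced here, not part of the README or of any library, and the repo ids are copied from the table.

```python
# Hypothetical helper for illustration; not part of the SPECTER 2.0 README
# or of any library. Repo ids are copied from the adapter table.
ADAPTER_REPOS = {
    "proximity": "allenai/specter2",
    "adhoc_query": "allenai/specter2_adhoc_query",
    "classification": "allenai/specter2_classification",
    "regression": "allenai/specter2_regression",
}


def adapter_repo_for(task_format: str) -> str:
    """Return the adapter repo id for a task format.

    Task formats that are not in the table fall back to the proximity
    adapter, following the table footnote.
    """
    # Normalize labels such as "Adhoc Query" to the dictionary key "adhoc_query".
    key = task_format.strip().lower().replace(" ", "_")
    return ADAPTER_REPOS.get(key, ADAPTER_REPOS["proximity"])


print(adapter_repo_for("Adhoc Query"))  # allenai/specter2_adhoc_query
print(adapter_repo_for("clustering"))   # allenai/specter2 (fallback)
```

The returned id is the kind of value passed as the first argument to `model.load_adapter(...)` in the README's usage snippet.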
|
| 81 |
|
| 82 |
```python
|
| 83 |
from transformers import AutoTokenizer, AutoModel
|
|
|
|
| 89 |
model = AutoModel.from_pretrained('allenai/specter2_base')
|
| 90 |
|
| 91 |
# Load the adapter(s) required for the task, give each an identifier via the load_as argument, and activate it
|
| 92 |
+
model.load_adapter("allenai/specter2", source="hf", load_as="proximity", set_active=True)
|
| 93 |
#other possibilities: allenai/specter2_<classification|regression|adhoc_query>
|
| 94 |
|
| 95 |
papers = [{'title': 'BERT', 'abstract': 'We introduce a new language representation model called BERT'},
|