Improve model card: Add pipeline tag, library, license & usage
#1
by
nielsr
HF Staff
- opened
README.md
CHANGED
|
@@ -1,6 +1,9 @@
|
|
| 1 |
---
|
| 2 |
-
pipeline_tag:
|
|
|
|
|
|
|
| 3 |
---
|
|
|
|
| 4 |
# Model Card: Assisting Mathematical Formalization with A Learning-based Premise Retriever
|
| 5 |
|
| 6 |
## Model Description
|
|
@@ -15,7 +18,43 @@ The model implementation and code are available at:
|
|
| 15 |
|
| 16 |
[Try our model](https://premise-search.com)
|
| 17 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 18 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 19 |
|
| 20 |
## Citation
|
| 21 |
If you use this model, please cite the following paper:
|
|
|
|
| 1 |
---
|
| 2 |
+
pipeline_tag: text-ranking
|
| 3 |
+
library_name: sentence-transformers
|
| 4 |
+
license: mit
|
| 5 |
---
|
| 6 |
+
|
| 7 |
# Model Card: Assisting Mathematical Formalization with A Learning-based Premise Retriever
|
| 8 |
|
| 9 |
## Model Description
|
|
|
|
| 18 |
|
| 19 |
[Try our model](https://premise-search.com)
|
| 20 |
|
| 21 |
+
## Usage
|
| 22 |
+
|
| 23 |
+
You can use this model with the `sentence-transformers` library to embed queries and premises and then calculate their similarity for retrieval.
|
| 24 |
+
|
| 25 |
+
```python
|
| 26 |
+
from sentence_transformers import SentenceTransformer, util
|
| 27 |
+
import torch
|
| 28 |
+
|
| 29 |
+
# Load the pretrained model
|
| 30 |
+
model = SentenceTransformer('ruc-ai4math/Lean_State_Search_Random')
|
| 31 |
+
|
| 32 |
+
# Example Lean proof state (query) and a list of premises
|
| 33 |
+
query = "<GOAL> (n : \u2115), n + 0 = n </GOAL>"
|
| 34 |
+
premises = [
|
| 35 |
+
"<VAR> (n : \u2115) </VAR> <GOAL> n + 0 = n </GOAL>",
|
| 36 |
+
"<VAR> (n m : \u2115) </VAR> <GOAL> n + m = m + n </GOAL>",
|
| 37 |
+
"<VAR> (n : \u2115) </VAR> <GOAL> n = n </GOAL>",
|
| 38 |
+
"lemma add_zero (n : \u2115) : n + 0 = n := by sorry" # An actual Lean lemma
|
| 39 |
+
]
|
| 40 |
|
| 41 |
+
# Encode the query and premises into embeddings
|
| 42 |
+
query_embedding = model.encode(query, convert_to_tensor=True)
|
| 43 |
+
premise_embeddings = model.encode(premises, convert_to_tensor=True)
|
| 44 |
+
|
| 45 |
+
# Calculate cosine similarity between the query and all premises
|
| 46 |
+
cosine_scores = util.cos_sim(query_embedding, premise_embeddings)
|
| 47 |
+
|
| 48 |
+
# Print the scores for each premise
|
| 49 |
+
print("Similarity scores:")
|
| 50 |
+
for i, score in enumerate(cosine_scores[0]):
|
| 51 |
+
print(f" - Premise {i+1}: '{premises[i]}', Score: {score.item():.4f}")
|
| 52 |
+
|
| 53 |
+
# Find the index of the premise with the highest similarity score
|
| 54 |
+
best_match_idx = torch.argmax(cosine_scores).item()
|
| 55 |
+
print(f"
|
| 56 |
+
Best matching premise: '{premises[best_match_idx]}'")
|
| 57 |
+
```
|
| 58 |
|
| 59 |
## Citation
|
| 60 |
If you use this model, please cite the following paper:
|