Instructions to use andreiaalexa/scifact-relevance-classifier with libraries.
- Libraries
  - Scikit-learn
    How to use andreiaalexa/scifact-relevance-classifier with Scikit-learn:

        from huggingface_hub import hf_hub_download
        import joblib

        model = joblib.load(
            hf_hub_download("andreiaalexa/scifact-relevance-classifier", "sklearn_model.joblib")
        )
        # only load pickle files from sources you trust
        # read more about it here https://skops.readthedocs.io/en/stable/persistence.html

  - sentence-transformers
    How to use andreiaalexa/scifact-relevance-classifier with sentence-transformers:

        from sentence_transformers import SentenceTransformer

        model = SentenceTransformer("andreiaalexa/scifact-relevance-classifier")

        sentences = [
            "The weather is lovely today.",
            "It's so sunny outside!",
            "He drove to the stadium."
        ]
        embeddings = model.encode(sentences)

        similarities = model.similarity(embeddings, embeddings)
        print(similarities.shape)
        # [3, 3]
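The Scikit-learn loading snippet above stops at `joblib.load`, so it may help to see how a loaded classifier could be applied to a feature matrix. The following is only a minimal sketch: it assumes the stored object is a scikit-learn-style binary classifier exposing `predict_proba` (or, failing that, `decision_function`) and that the "relevant" class is the last probability column. Neither assumption is confirmed by the repository page, and `relevance_scores` is a hypothetical helper name.

```python
import numpy as np


def relevance_scores(classifier, features):
    """Return one relevance score per claim-document pair.

    Assumption: `classifier` follows the scikit-learn estimator interface.
    """
    features = np.asarray(features)
    if features.ndim != 2:
        raise ValueError("features must be 2-D: (n_pairs, feature_dim)")
    if hasattr(classifier, "predict_proba"):
        proba = np.asarray(classifier.predict_proba(features))
        # Assumption: the positive ("relevant") class is the last column.
        return proba[:, -1]
    # Fall back to raw decision margins when no probabilities are available.
    return np.asarray(classifier.decision_function(features))
```

The feature matrix passed in would need to be built the same way as at training time, for example with the `pair_features` function in this file.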
"""Embedding feature builders for claim-document relevance classification."""

from __future__ import annotations

import numpy as np


def e5_queries(texts: list[str]) -> list[str]:
    # E5-style encoders expect a "query: " prefix on the query side.
    return [f"query: {text}" for text in texts]


def e5_passages(texts: list[str]) -> list[str]:
    # E5-style encoders expect a "passage: " prefix on the document side.
    return [f"passage: {text}" for text in texts]


def pair_features(model, claims: list[str], documents: list[str], show_progress_bar=False):
    """Build standard sentence-pair features from two embedding vectors.

    q and d alone give the classifier raw semantic position. abs(q-d) exposes
    per-dimension distance. q*d exposes per-dimension alignment. cosine gives
    a single retrieval-style similarity signal.

    claims[i] is paired with documents[i]. The result has shape
    (n_pairs, 4 * dim + 1), where dim is the embedding size.
    """
    q = model.encode(
        e5_queries(claims),
        normalize_embeddings=True,
        show_progress_bar=show_progress_bar,
    )
    d = model.encode(
        e5_passages(documents),
        normalize_embeddings=True,
        show_progress_bar=show_progress_bar,
    )
    # Both embeddings are unit-normalized, so the row-wise dot product
    # is the cosine similarity.
    cosine = np.sum(q * d, axis=1, keepdims=True)
    return np.hstack([q, d, np.abs(q - d), q * d, cosine])
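
To make the layout of the returned matrix concrete, here is a small sketch that runs the same feature construction against a stand-in encoder. `StubEncoder` is hypothetical and exists only so the example runs without downloading a model; it produces deterministic pseudo-embeddings, not meaningful ones. The example claims and documents are likewise made up for illustration. The feature function is repeated in condensed form so the block runs on its own.

```python
import numpy as np


class StubEncoder:
    """Hypothetical stand-in for a sentence-transformers model."""

    def __init__(self, dim=8):
        self.dim = dim

    def encode(self, texts, normalize_embeddings=False, show_progress_bar=False):
        rows = []
        for text in texts:
            # Seed a generator from the text bytes so output is deterministic.
            rng = np.random.default_rng([0, *text.encode("utf-8")])
            rows.append(rng.standard_normal(self.dim))
        out = np.vstack(rows)
        if normalize_embeddings:
            out = out / np.linalg.norm(out, axis=1, keepdims=True)
        return out


def pair_features(model, claims, documents):
    # Same construction as the function above, condensed for the sketch.
    q = model.encode([f"query: {c}" for c in claims], normalize_embeddings=True)
    d = model.encode([f"passage: {t}" for t in documents], normalize_embeddings=True)
    cosine = np.sum(q * d, axis=1, keepdims=True)
    return np.hstack([q, d, np.abs(q - d), q * d, cosine])


dim = 8
claims = ["Aspirin lowers the risk of heart attack.", "Zinc shortens colds."]
documents = ["A trial of low-dose aspirin.", "A study of zinc lozenges."]
features = pair_features(StubEncoder(dim=dim), claims, documents)

print(features.shape)  # (2, 33), i.e. (n_pairs, 4 * dim + 1)

# Column blocks, in order: q, d, |q - d|, q * d, cosine.
q_block = features[:, 0:dim]
prod_block = features[:, 3 * dim:4 * dim]
cosine_col = features[:, -1]
```

Because the cosine column is just the row sum of the `q * d` block, a linear classifier could in principle recover it from that block alone; keeping it as an explicit column simply hands the model the retrieval-style signal directly.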