---
license: mit
pipeline_tag: sentence-similarity
datasets:
- dell-research-harvard/headlines-semantic-similarity
- dell-research-harvard/AmericanStories
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- transformers
language:
- en
base_model: "StoriesLM/StoriesLM-v1-1963"
---
# RepresentLM-v1
This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for tasks such as clustering and semantic search.

The model was trained on the [HEADLINES](https://huggingface.co/datasets/dell-research-harvard/headlines-semantic-similarity) semantic similarity dataset, using [StoriesLM-v1-1963](https://huggingface.co/StoriesLM/StoriesLM-v1-1963) as the base model.
## Usage

First install the [sentence-transformers](https://www.SBERT.net) package:

```
pip install -U sentence-transformers
```
The model can then be used to encode language sequences:

```python
from sentence_transformers import SentenceTransformer

sequences = ["This is an example sequence", "Each sequence is embedded"]

# Load the model from the Hugging Face Hub.
model = SentenceTransformer("RepresentLM/RepresentLM-v1")

# Encode each sequence into a 768-dimensional embedding.
embeddings = model.encode(sequences)
print(embeddings)
```
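For sentence-similarity tasks, pairs of embeddings are typically compared with cosine similarity. A minimal NumPy sketch of that comparison is below; the short vectors are illustrative placeholders, not actual 768-dimensional model outputs:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity: dot product of the vectors
    # divided by the product of their Euclidean norms.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholder vectors standing in for two embeddings.
emb_a = np.array([0.1, 0.3, 0.5])
emb_b = np.array([0.2, 0.1, 0.4])
print(cosine_similarity(emb_a, emb_b))
```

In practice, `sentence_transformers.util.cos_sim` computes the same quantity directly on the arrays returned by `model.encode`.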