---
license: apache-2.0
base_model: ibm-granite/granite-embedding-107m-multilingual
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- transformers
- granite
- embeddings
- multilingual
library_name: sentence-transformers
pipeline_tag: feature-extraction
---

# Granite Embedding 107M Multilingual

This is a copy of the [ibm-granite/granite-embedding-107m-multilingual](https://huggingface.co/ibm-granite/granite-embedding-107m-multilingual) model, republished for document encoding purposes.

## Model Summary

Granite-Embedding-107M-Multilingual is a 107M-parameter dense bi-encoder embedding model from the Granite Embeddings suite that generates high-quality text embeddings. It produces embedding vectors of size 384.

## Supported Languages

English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese.

## Usage

### With Sentence Transformers

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('RikoteMaster/MNLP_M3_document_encoder')
embeddings = model.encode(['Your text here'])
```

### With Transformers

```python
from transformers import AutoModel, AutoTokenizer
import torch

model = AutoModel.from_pretrained('RikoteMaster/MNLP_M3_document_encoder')
tokenizer = AutoTokenizer.from_pretrained('RikoteMaster/MNLP_M3_document_encoder')

inputs = tokenizer(['Your text here'], return_tensors='pt', padding=True, truncation=True)
with torch.no_grad():
    outputs = model(**inputs)
embeddings = outputs.last_hidden_state[:, 0]  # CLS pooling
embeddings = torch.nn.functional.normalize(embeddings, dim=1)
```

## Original Model

This model is based on [ibm-granite/granite-embedding-107m-multilingual](https://huggingface.co/ibm-granite/granite-embedding-107m-multilingual) by IBM.
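
## Similarity Example

The snippets above only produce raw vectors; a common next step is scoring how close two texts are. The sketch below does this with the `util.cos_sim` helper from `sentence-transformers`; the sentences and variable names are illustrative and not part of the original model card.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('RikoteMaster/MNLP_M3_document_encoder')

# Illustrative inputs; any mix of the supported languages should work.
queries = ['Which planet is known as the Red Planet?']
documents = [
    'Mars, often called the Red Planet, is the fourth planet from the Sun.',
    "Venus is sometimes described as Earth's twin.",
]

query_emb = model.encode(queries)
doc_emb = model.encode(documents)

# The model emits 384-dimensional vectors, as noted in the Model Summary.
assert doc_emb.shape[1] == 384

# Cosine similarity between each query and each document.
scores = util.cos_sim(query_emb, doc_emb)
print(scores)
```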
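
Note that the Transformers example L2-normalizes its embeddings, so cosine similarity there reduces to a plain dot product; with Sentence Transformers you can get the same behavior by passing `normalize_embeddings=True` to `model.encode`.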