---
license: apache-2.0
tags:
- text-encoder
- feature-extraction
- sentence-transformers
- contrastive-learning
base_model: mjaliz/vision-text-dual-encoder-v1
---

# Text Encoder extracted from mjaliz/vision-text-dual-encoder-v1

This is the text encoder component extracted from the VisionTextDualEncoder model [mjaliz/vision-text-dual-encoder-v1](https://huggingface.co/mjaliz/vision-text-dual-encoder-v1).

## Model Details

- **Model type:** XLMRobertaModel
- **Source model:** [mjaliz/vision-text-dual-encoder-v1](https://huggingface.co/mjaliz/vision-text-dual-encoder-v1)
- **Includes projection:** False

## Usage

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load the text encoder and its tokenizer
model = AutoModel.from_pretrained("mjaliz/siglip-text-encoder")
tokenizer = AutoTokenizer.from_pretrained("mjaliz/siglip-text-encoder")
model.eval()

# Encode a batch of texts
texts = ["Hello world", "How are you?"]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Get embeddings: prefer the pooler output; otherwise mean-pool the
# last hidden state over non-padding tokens only
if getattr(outputs, "pooler_output", None) is not None:
    embeddings = outputs.pooler_output
else:
    mask = inputs["attention_mask"].unsqueeze(-1)
    embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)

print(embeddings.shape)
```

## Citation

If you use this model, please cite the original dual encoder model.
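
## Comparing Embeddings

Since the source dual encoder was trained contrastively, text embeddings are usually compared with cosine similarity: L2-normalize each embedding, then take dot products. Below is a minimal sketch using placeholder tensors in place of real encoder output; the batch size and the hidden size of 768 are assumptions for illustration, not values read from this model's config.

```python
import torch
import torch.nn.functional as F

# Placeholder embeddings standing in for the encoder output above
# (2 texts, hidden size 768 is an assumed example value)
embeddings = torch.randn(2, 768)

# L2-normalize so that dot products equal cosine similarities
normalized = F.normalize(embeddings, p=2, dim=-1)

# Pairwise cosine similarity matrix; each diagonal entry is ~1.0
similarity = normalized @ normalized.T
print(similarity.shape)  # torch.Size([2, 2])
```

The same pattern applies to real embeddings from the usage snippet: normalize both the query and candidate embeddings, then rank candidates by their dot product with the query.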