---
language:
- pt
- multilingual
license: mit
tags:
- recommenders
- text-retrieval
- product-recommendation
- sentence-transformers
datasets:
- synthetic-ecommerce
metrics:
- auc
- ndcg
- mrr
---

# foundational-model

A semantic product recommendation model that matches user profiles (free text) to products. Uses a frozen multilingual MiniLM encoder with trainable projection heads and chunk attention for user encoding.

## Model description

- **Architecture**: Dual-encoder (user encoder + item encoder)
- **Base model**: [paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) (frozen)
- **Trainable params**: ~148k (projection head + chunk attention)
- **Input**: User profile text + product name + description
- **Output**: Cosine similarity scores for ranking

## Intended use

Product recommendation from user free-text profiles (e.g. "Marcos, gosto de videogames e de música, sou de Rio de janeiro"). Trained on synthetic e-commerce interactions in Portuguese.

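
The output listed in the model description, cosine similarity scores used for ranking, can be illustrated with a minimal, dependency-free sketch. The embeddings are made-up 3-dimensional toy vectors, and `cosine_similarity` and `rank_items` are hypothetical helper names that are not part of this repo; in practice the embeddings would come from the model's user and item encoders.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_u * norm_v)


def rank_items(user_emb, item_embs, k=10):
    """Return the top-k (item_index, score) pairs, highest score first."""
    scored = [(i, cosine_similarity(user_emb, emb)) for i, emb in enumerate(item_embs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (illustrative values only, not real model outputs).
user_emb = [1.0, 0.0, 1.0]
item_embs = [
    [1.0, 0.0, 1.0],  # same direction as the user vector
    [0.0, 1.0, 0.0],  # orthogonal to the user vector
    [1.0, 1.0, 0.0],  # partially aligned with the user vector
]

top = rank_items(user_emb, item_embs, k=2)
for index, score in top:
    print(f"item {index}: {score:.3f}")
```

If the embeddings are L2-normalized beforehand, cosine similarity reduces to a plain dot product, so a single matrix product between the user embedding and the stacked item embeddings yields the same scores for all items at once.
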
## How to use

```python
from transformers import AutoTokenizer
import torch
from huggingface_hub import hf_hub_download

# Download checkpoint
checkpoint = hf_hub_download(repo_id="oristides/foundational-model", filename="pytorch_model.bin")

# Load model (requires model_arch1.RecSysModel - see repo for architecture)
from model.model_arch1 import RecSysModel

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
model = RecSysModel()
model.load_state_dict(torch.load(checkpoint, map_location="cpu"))
model.eval()

# Encode user and items, then: scores = user_emb @ item_embs.T
```

Or use the `recommender` CLI in this repo:

`uv run projects/reneguirecsys/model/recommender.py "your profile" -k 10`

## Training

- **Loss**: In-batch multi-negative cross-entropy
- **Split**: Leave-one-out per user
- **Eval metrics**: AUC, NDCG@10, MRR
- **Max sequence length**: 256 (user chunks), 128 (items)

## Citation

```bibtex
@misc{oristides-foundational-model-2025,
  author = {oristides},
  title = {Foundational Model for Product Recommendation},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/oristides/foundational-model}
}
```

## License

MIT