---
library_name: transformers
tags:
- recommendation
- e-commerce
- sequential-modeling
- next-item-prediction
- behavioral-modeling
---

# Commerce Intent

## Model Overview

Commerce Intent is a pretrained sequential behavioral model for e-commerce session understanding. It is trained to predict the next item in a user session based on the session's historical interaction sequence.

The model learns representations from multi-modal structured signals, including:

- Item ID
- Brand
- Category
- Event type (view, cart, purchase)
- Normalized price
- Positional order within the session
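
As an illustration, one session can be pictured as parallel, position-aligned feature sequences. The field names and values below are illustrative, not the model's actual schema:

```python
# One session as parallel, position-aligned feature sequences.
# Field names and values are illustrative, not the model's actual schema.
session = {
    "item_id": [12, 45, 78],
    "brand_id": [3, 7, 2],
    "category_id": [8, 8, 15],
    "event_type": ["view", "view", "cart"],
    "price": [29.9, 35.0, 15.5],  # normalized during preprocessing
    "position": [0, 1, 2],        # order within the session
}

# Every signal must provide exactly one value per interaction.
assert len({len(v) for v in session.values()}) == 1
```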

It is designed as a foundation model for downstream recommendation and behavioral modeling tasks.

---

## Model Details

### Model Description

Commerce Intent frames user behavior within a session as an autoregressive sequence modeling problem: given a sequence of past interactions, the model predicts the most likely next item.

The architecture consists of:

- Multi-embedding token fusion (item, brand, category, event)
- Continuous price projection
- Positional encoding
- Transformer encoder with causal masking
- Linear head for next-item prediction
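
As a rough illustration, the components above could be wired together as in the sketch below. All dimensions, vocabulary sizes, layer counts, and the class name `CommerceIntentSketch` are assumptions made for the sketch, not the released configuration:

```python
import torch
import torch.nn as nn

class CommerceIntentSketch(nn.Module):
    """Rough sketch of the described components (not the released code)."""

    def __init__(self, n_items=1000, n_brands=100, n_cats=50, n_events=4,
                 d_model=64, n_heads=4, n_layers=2, max_len=128):
        super().__init__()
        # Multi-embedding token fusion
        self.item_emb = nn.Embedding(n_items, d_model)
        self.brand_emb = nn.Embedding(n_brands, d_model)
        self.cat_emb = nn.Embedding(n_cats, d_model)
        self.event_emb = nn.Embedding(n_events, d_model)
        # Continuous price projection
        self.price_proj = nn.Linear(1, d_model)
        # Positional encoding (learned here, for simplicity)
        self.pos_emb = nn.Embedding(max_len, d_model)
        # Transformer encoder; causal masking is applied in forward()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # Linear head for next-item prediction
        self.head = nn.Linear(d_model, n_items)

    def forward(self, itms, brds, cats, evts, prcs):
        seq_len = itms.shape[1]
        pos = torch.arange(seq_len, device=itms.device).unsqueeze(0)
        x = (self.item_emb(itms) + self.brand_emb(brds) + self.cat_emb(cats)
             + self.event_emb(evts) + self.price_proj(prcs.unsqueeze(-1))
             + self.pos_emb(pos))
        # Causal mask: True marks positions each step may NOT attend to
        causal = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=itms.device),
            diagonal=1,
        )
        h = self.encoder(x, mask=causal)
        return self.head(h)  # (batch, seq_len, n_items) next-item logits
```

Running this sketch on a single three-interaction session yields logits of shape `(1, 3, n_items)`.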

This model is pretrained and can be fine-tuned for recommendation, ranking, or conversion modeling tasks.

- **Developed by:** infinity6
- **Model type:** Sequential autoregressive transformer
- **Modality:** Structured e-commerce interaction data (non-NLP)
- **License:** Apache 2.0
- **Finetuned from model:** None (trained from scratch)

---

## Dependencies

This model depends on the external package:

- https://github.com/infinity6-ai/i6model_ecomm

The package contains the custom architecture required to correctly load and run the model. You must install it before using Commerce Intent.

### Installation

Clone the repository and install it:

```bash
git clone https://github.com/infinity6-ai/i6model_ecomm.git
cd i6model_ecomm
pip install .
```

---

## Intended Use

### Direct Use

The model can be used directly for:

- Next-item prediction
- Session-based recommendation
- Behavioral embedding extraction
- Purchase intent modeling
- Real-time ranking systems

Example:

```python
import torch

from i6modelecomm.model import CommerceIntent

model = CommerceIntent.from_pretrained(
    "infinity6/ecomm_shop_intent_pretrained"
)

# TODO: map your item IDs and remap categories to the model's vocabularies.
# TODO: to fine-tune, freeze layers and train with your own data.

model.eval()

device = "cpu"

# batch_size = 1, seq_len = 3
itms = torch.tensor([[12, 45, 78]], dtype=torch.long).to(device)         # items
brds = torch.tensor([[3, 7, 2]], dtype=torch.long).to(device)            # brands
cats = torch.tensor([[8, 8, 15]], dtype=torch.long).to(device)           # categories
prcs = torch.tensor([[29.9, 35.0, 15.5]], dtype=torch.float).to(device)  # prices
evts = torch.tensor([[1, 1, 2]], dtype=torch.long).to(device)            # events

# attention mask: all three positions are real interactions (no padding)
mask = torch.tensor([[1, 1, 1]], dtype=torch.bool).to(device)

with torch.no_grad():
    outputs = model(
        itms=itms,
        brds=brds,
        cats=cats,
        prcs=prcs,
        evts=evts,
        attention_mask=mask,
        labels=None,  # inference only -- no loss computation
    )

# logits have shape (B, L-1, num_itm)
logits = outputs.logits

print("Logits shape:", logits.shape)
```

Inputs must include:

- `itms`
- `brds`
- `cats`
- `evts`
- `prcs`
- `attention_mask`
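
For variable-length sessions, the input tensors are typically right-padded to a fixed length with a matching `attention_mask`. The `pad_session` helper below is an illustrative assumption, not part of the `i6model_ecomm` package:

```python
import torch

def pad_session(values, max_len, pad_value=0, dtype=torch.long):
    """Right-pad one feature sequence to max_len (illustrative helper)."""
    seq = list(values)[:max_len]
    seq = seq + [pad_value] * (max_len - len(seq))
    return torch.tensor([seq], dtype=dtype)  # add a batch dimension

max_len = 5
itms = pad_session([12, 45, 78], max_len)
brds = pad_session([3, 7, 2], max_len)
cats = pad_session([8, 8, 15], max_len)
evts = pad_session([1, 1, 2], max_len)
prcs = pad_session([29.9, 35.0, 15.5], max_len, pad_value=0.0, dtype=torch.float)

# attention_mask: True for real interactions, False for padding
attention_mask = pad_session([1, 1, 1], max_len, dtype=torch.bool)
```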

---

### Downstream Use

The model can be fine-tuned for:

- Conversion prediction
- Cart abandonment modeling
- Customer lifetime value modeling
- Cross-sell / upsell recommendation
- Personalized search ranking
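
One common fine-tuning pattern for such tasks is to freeze the pretrained encoder and train only a small task head. The sketch below uses a stand-in backbone, since the pretrained model's internal attribute names are not documented here:

```python
import torch
import torch.nn as nn

# Stand-in backbone; in practice, substitute the pretrained Commerce Intent encoder.
backbone = nn.Sequential(nn.Embedding(1000, 64), nn.Linear(64, 64))

# Freeze the pretrained weights
for p in backbone.parameters():
    p.requires_grad = False

# Task-specific head, e.g. binary conversion prediction
head = nn.Linear(64, 1)
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)

items = torch.randint(0, 1000, (2, 3))  # (batch, seq_len) item IDs
features = backbone(items).mean(dim=1)  # mean-pool the session representation
logits = head(features)                 # (batch, 1) conversion logits

loss = nn.functional.binary_cross_entropy_with_logits(logits, torch.ones(2, 1))
loss.backward()
optimizer.step()
```

Only the head receives gradients; the frozen backbone keeps its pretrained behavioral representations intact.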

---

### Out-of-Scope Use

This model is not suitable for:

- Natural language tasks
- Image tasks
- Generative text modeling
- Multi-user graph modeling without adaptation
- Cold-start scenarios without item mappings and category remapping

---

## Bias, Risks, and Limitations

- The model reflects behavioral biases present in historical e-commerce data.
- Popularity bias may emerge from the item frequency distribution.
- Performance depends on session length and interaction quality.
- Cold-start performance on unseen items is limited.
- The model does not encode demographic or identity-aware fairness constraints.

### Recommendations

- Monitor recommendation fairness and popularity skew.
- Retrain periodically to reflect new item distributions.
- Apply business constraints in production systems.
- Use A/B testing before large-scale deployment.

---

## Training Details

### Training Data

The model was trained on large-scale anonymized e-commerce interaction logs containing:

- Session-based user interactions
- Item identifiers
- Brand identifiers
- Category identifiers
- Event types
- Timestamped behavioral sequences
- Price values (log-normalized and standardized)

Sessions shorter than a minimum length threshold were filtered out.

---

### Data Sources and Preparation

The model was trained on a unified, large-scale corpus of e-commerce interaction data, aggregating and normalizing multiple public datasets to create a robust foundation for sequential behavior modeling.

The training data combines the following sources:

| Dataset | Description | Key Statistics |
|---------|-------------|----------------|
| **E-commerce behavior data from multi category store** | Real event logs from a multi-category e-commerce platform | ~285M records |
| **E-commerce Clickstream and Transaction Dataset (Kaggle)** | Sequential event data including views and clicks | ~500K+ events |
| **E-Commerce Behavior Dataset – Agents for Data** | Product interactions from ~18k users across multiple event types | ~2M interactions |
| **Retail Rocket clickstream dataset** | Industry-standard dataset with views, carts, and purchases | ~2.7M events |
| **SIGIR 2021 / Coveo Session data challenge** | Navigation sessions with clicks, cart adds, and purchases, plus metadata | ~30M events |
| **JDsearch dataset** | Real interactions with search queries from the JD.com platform | ~26M interactions |

### Data Unification and Normalization

All datasets underwent a unification and normalization process:

- **Schema Alignment**: Standardized field names and types across all sources (item_id, brand_id, category_id, event_type, timestamp, price)
- **Event Type Normalization**: Mapped each source's event nomenclature to a standardized taxonomy (view, cart, purchase)
- **ID Harmonization**: Created consistent ID spaces for items, brands, and categories through cross-dataset mapping
- **Temporal Alignment**: Unified timestamp formats and established consistent session windows
- **Price Normalization**: Applied log-normalization (`log1p`) followed by standardization using global statistics
- **Session Construction**: Reconstructed user sessions based on temporal proximity and interaction patterns
- **Quality Filtering**: Removed sessions below a minimum length threshold and filtered anomalous interactions
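
For instance, the event-type normalization step can be pictured as a simple mapping onto the shared taxonomy. The source labels below are examples, not an exhaustive list of what each dataset emits:

```python
# Illustrative event-type normalization onto the shared taxonomy
# (view, cart, purchase). Source labels are examples only.
EVENT_TAXONOMY = {
    "view": "view", "pageview": "view", "click": "view",
    "cart": "cart", "add_to_cart": "cart", "addtocart": "cart",
    "purchase": "purchase", "transaction": "purchase", "buy": "purchase",
}

def normalize_event(raw: str) -> str:
    # Fallback to "view" is an assumption; real pipelines may drop unknowns.
    return EVENT_TAXONOMY.get(raw.strip().lower(), "view")

events = [normalize_event(e) for e in ["addtocart", "Transaction", "pageview"]]
# events == ["cart", "purchase", "view"]
```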

This diverse training corpus lets the model learn robust representations of e-commerce behavior across platforms, markets, and interaction types, providing a strong foundation for downstream fine-tuning tasks.

---

### Preprocessing

- Missing categorical values replaced with `UNK`
- Price values transformed via `log1p`
- Standardization using global mean and standard deviation
- Session truncation to fixed-length sequences
- Right padding with attention masking
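
A minimal sketch of these steps for one pair of features, assuming placeholder global price statistics (the card does not publish the real mean and standard deviation):

```python
import math

UNK = "UNK"
MAX_LEN = 5
# Placeholder global price statistics; not the values used in training.
PRICE_MEAN, PRICE_STD = 3.2, 1.1

def preprocess_session(brands, prices):
    # Missing categorical values -> UNK
    brands = [b if b is not None else UNK for b in brands]
    # log1p, then standardization with global statistics
    prices = [(math.log1p(p) - PRICE_MEAN) / PRICE_STD for p in prices]
    # Truncate to the fixed sequence length
    brands, prices = brands[:MAX_LEN], prices[:MAX_LEN]
    # Right-pad, with an attention mask marking real positions
    n = len(brands)
    attn = [1] * n + [0] * (MAX_LEN - n)
    brands = brands + [UNK] * (MAX_LEN - n)
    prices = prices + [0.0] * (MAX_LEN - n)
    return brands, prices, attn

brands, prices, attn = preprocess_session(["acme", None, "zenith"],
                                          [29.9, 35.0, 15.5])
```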

---

### Training Objective

Next-item autoregressive prediction using cross-entropy loss, with padded positions ignored.
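
The objective can be sketched as a shifted cross-entropy, assuming the logits at position *t* score the item at position *t + 1* and padded targets carry a reserved ignore index (the padding index here is an assumption):

```python
import torch
import torch.nn.functional as F

PAD_ID = 0  # illustrative padding index

B, L, V = 2, 4, 10
logits = torch.randn(B, L, V)        # per-position next-item scores
items = torch.randint(1, V, (B, L))  # item IDs; 0 reserved for padding
items[1, 3] = PAD_ID                 # simulate a padded position

# Shift: position t predicts the item at position t + 1
pred = logits[:, :-1, :].reshape(-1, V)
target = items[:, 1:].reshape(-1)

# Padded targets contribute nothing to the loss
loss = F.cross_entropy(pred, target, ignore_index=PAD_ID)
```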

---

### Training Regime

- **Precision:** FP32
- **Optimizer:** AdamW
- **Learning rate:** 1e-3 with warmup
- **Gradient clipping:** 5.0
- **Causal masking:** applied throughout training

---

## Evaluation

### Metrics

- Cross-entropy loss
- Perplexity
- Recall@20
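
Recall@K here means the fraction of evaluation steps where the true next item appears among the model's top-K scored items. A minimal reference implementation, using a toy 5-item catalog and K = 2 for brevity:

```python
def recall_at_k(scores, true_item, k=20):
    """1.0 if the true next item ranks in the top-k scores, else 0.0."""
    top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return 1.0 if true_item in top_k else 0.0

# Average over evaluation examples: (scores over a toy 5-item catalog, target)
examples = [
    ([0.1, 0.9, 0.3, 0.2, 0.05], 1),   # target item ranks 1st -> hit
    ([0.8, 0.1, 0.05, 0.02, 0.03], 4), # target item outside top-2 -> miss
]
recall = sum(recall_at_k(s, t, k=2) for s, t in examples) / len(examples)
# recall == 0.5 for this toy example
```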

### Results

On the evaluation split, the model achieved:

- **Perplexity:** 24.04
- **Recall@20:** 0.6823

These results indicate strong next-item prediction performance for session-based e-commerce interaction modeling.

### Summary

The model demonstrates:

- Low predictive uncertainty (perplexity of 24.04)
- High ranking quality for next-item recommendation (Recall@20 of 68.23%)

Performance may vary with dataset distribution, session length, and preprocessing configuration.

---

## Environmental Impact

- **Hardware:** NVIDIA H100 NVL GPU (94 GB, PCIe 5.0)
- **Precision:** FP32
- **Training duration:** ≈30 hours (varies by configuration)
- **Carbon impact:** ≈45 kg CO₂e (estimated from the energy consumption of ~30 h on an H100 GPU)

---

## Limitations

- No long-term user modeling beyond session scope
- Does not include user-level embeddings
- Requires predefined categorical vocabularies
- Limited generalization to unseen item IDs
|