Commerce Intent

Model Overview

Commerce Intent is a pretrained sequential behavioral model for e-commerce session understanding. It is trained to predict the next item in a user session based on historical interaction sequences.

The model learns representations from multi-modal structured signals, including:

  • Item ID
  • Brand
  • Category
  • Event type (view, cart, or purchase)
  • Normalized price
  • Positional order within the session
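For concreteness, one session expressed through these signals might look like the following sketch. The field names and event-type codes (1 = view, 2 = cart, 3 = purchase) are illustrative assumptions, not the model's actual schema:

```python
# Hypothetical encoding of a single three-event session using the signals
# above. Field names and event-type codes are illustrative only.
session = [
    {"item_id": 12, "brand_id": 3, "category_id": 8,  "event_type": 1, "price": 29.9, "position": 0},
    {"item_id": 45, "brand_id": 7, "category_id": 8,  "event_type": 1, "price": 35.0, "position": 1},
    {"item_id": 78, "brand_id": 2, "category_id": 15, "event_type": 2, "price": 15.5, "position": 2},
]

# Column-wise view, matching the batched tensors the model consumes.
items = [e["item_id"] for e in session]
print(items)  # [12, 45, 78]
```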

It is designed as a foundation model for downstream recommendation and behavioral modeling tasks.


Model Details

Model Description

Commerce Intent models user behavior within a session as an autoregressive sequence modeling problem. Given a sequence of past interactions, the model predicts the next likely item.

The architecture consists of:

  • Multi-embedding token fusion (item, brand, category, event)
  • Continuous price projection
  • Positional encoding
  • Transformer encoder with causal masking
  • Linear head for next-item prediction
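The components above can be sketched as a toy PyTorch module. All sizes (vocabulary sizes, d_model, layer count) are placeholders, not the released checkpoint's configuration:

```python
import torch
import torch.nn as nn

# Minimal sketch of the described architecture; sizes are placeholders.
class TinyCommerceIntent(nn.Module):
    def __init__(self, n_items=1000, n_brands=50, n_cats=30, n_events=4,
                 d_model=64, max_len=128):
        super().__init__()
        # Multi-embedding token fusion: one embedding table per signal.
        self.item_emb = nn.Embedding(n_items, d_model)
        self.brand_emb = nn.Embedding(n_brands, d_model)
        self.cat_emb = nn.Embedding(n_cats, d_model)
        self.event_emb = nn.Embedding(n_events, d_model)
        # Continuous price projection.
        self.price_proj = nn.Linear(1, d_model)
        # Learned positional encoding.
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Linear head for next-item prediction.
        self.head = nn.Linear(d_model, n_items)

    def forward(self, itms, brds, cats, prcs, evts):
        B, L = itms.shape
        pos = torch.arange(L, device=itms.device).unsqueeze(0)
        x = (self.item_emb(itms) + self.brand_emb(brds) + self.cat_emb(cats)
             + self.event_emb(evts) + self.price_proj(prcs.unsqueeze(-1))
             + self.pos_emb(pos))
        # Causal mask: position t attends only to positions <= t.
        causal = torch.triu(
            torch.full((L, L), float("-inf"), device=itms.device), diagonal=1
        )
        h = self.encoder(x, mask=causal)
        return self.head(h)  # (B, L, n_items)

model = TinyCommerceIntent()
logits = model(torch.tensor([[12, 45, 78]]),
               torch.tensor([[3, 7, 2]]),
               torch.tensor([[8, 8, 15]]),
               torch.tensor([[29.9, 35.0, 15.5]]),
               torch.tensor([[1, 1, 2]]))
print(logits.shape)  # torch.Size([1, 3, 1000])
```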

This model is pretrained and can be fine-tuned for recommendation, ranking, or conversion modeling tasks.

  • Developed by: infinity6
  • Model type: Sequential autoregressive transformer
  • Language(s): Structured e-commerce interaction data (non-NLP)
  • License: Apache 2.0
  • Finetuned from model: None (trained from scratch)

Intended Use

Direct Use

The model can be used directly for:

  • Next-item prediction
  • Session-based recommendation
  • Behavioral embedding extraction
  • Purchase intent modeling
  • Real-time ranking systems

Example:

import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "i6-aiworks/ecomm_shop_intent_pretrained", trust_remote_code=True
)

# TODO: map your catalog's item IDs and remap categories to the model's vocabulary.
# TODO: to fine-tune, freeze lower layers and train on your data.

# Run inference
device = "cpu"
model.to(device)
model.eval()

# batch_size = 1 | seq_len = 3
itms = torch.tensor([[12, 45, 78]], dtype=torch.long).to(device)         # items
brds = torch.tensor([[3, 7, 2]], dtype=torch.long).to(device)            # brands
cats = torch.tensor([[8, 8, 15]], dtype=torch.long).to(device)           # categories
prcs = torch.tensor([[29.9, 35.0, 15.5]], dtype=torch.float).to(device)  # prices
evts = torch.tensor([[1, 1, 2]], dtype=torch.long).to(device)            # events

mask = torch.tensor([[1, 1, 1]], dtype=torch.bool).to(device)

with torch.no_grad():
    outputs = model(
        itms=itms,
        brds=brds,
        cats=cats,
        prcs=prcs,
        evts=evts,
        attention_mask=mask,
        labels=None,  # inference only -- no loss computed
    )

logits = outputs.logits  # (B, L-1, num_itm)
print("Logits shape:", logits.shape)

Inputs must include:

  • itms
  • brds
  • cats
  • evts
  • prcs
  • attention_mask
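The TODO in the example above about mapping items and remapping categories can be sketched as follows. The helper names and the choice of index 0 for UNK are assumptions for illustration, consistent with the UNK handling described under Preprocessing:

```python
# Hypothetical mapping from raw catalog IDs to the contiguous indices the
# embedding tables expect. Index 0 is reserved for UNK (unseen IDs).
UNK = 0

def build_vocab(raw_ids):
    # Stable, deduplicated ordering -> contiguous indices starting at 1.
    return {rid: i for i, rid in enumerate(sorted(set(raw_ids)), start=1)}

def encode(raw_ids, vocab):
    # Unseen IDs fall back to UNK.
    return [vocab.get(rid, UNK) for rid in raw_ids]

item_vocab = build_vocab(["SKU-9", "SKU-2", "SKU-7", "SKU-2"])
print(encode(["SKU-2", "SKU-7", "SKU-999"], item_vocab))  # [1, 2, 0]
```

The same scheme would apply to brand and category IDs.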

Downstream Use

The model can be fine-tuned for:

  • Conversion prediction
  • Cart abandonment modeling
  • Customer lifetime value modeling
  • Cross-sell / upsell recommendation
  • Personalized search ranking

Out-of-Scope Use

This model is not suitable for:

  • Natural language tasks
  • Image tasks
  • Generative text modeling
  • Multi-user graph modeling without adaptation
  • Cold-start scenarios without item mappings and category remapping

Bias, Risks, and Limitations

  • The model reflects behavioral biases present in historical e-commerce data.
  • Popularity bias may emerge due to item frequency distribution.
  • Model performance depends on session length and interaction quality.
  • Cold-start performance for unseen items is limited.
  • It does not encode demographic or identity-aware fairness constraints.

Recommendations

  • Monitor recommendation fairness and popularity skew.
  • Retrain periodically to reflect new item distributions.
  • Apply business constraints in production systems.
  • Use A/B testing before large-scale deployment.

Training Details

Training Data

The model was trained on large-scale anonymized e-commerce interaction logs containing:

  • Session-based user interactions
  • Item identifiers
  • Brand identifiers
  • Category identifiers
  • Event types
  • Timestamped behavioral sequences
  • Price values (log-normalized and standardized)

Sessions shorter than a minimum threshold were filtered.


Data Sources and Preparation

The model was trained on a unified, large-scale corpus of e-commerce interaction data, aggregating and normalizing multiple public datasets to create a robust foundation for sequential behavior modeling.

The training data combines the following sources:

  • E-commerce behavior data from multi category store: real event logs from a multi-category e-commerce platform (~285M records)
  • E-commerce Clickstream and Transaction Dataset (Kaggle): sequential event data including views and clicks (~500K+ events)
  • E-Commerce Behavior Dataset – Agents for Data: product interactions from ~18k users across multiple event types (~2M interactions)
  • Retail Rocket clickstream dataset: industry-standard dataset with views, carts, and purchases (~2.7M events)
  • SIGIR 2021 / Coveo Session data challenge: navigation sessions with clicks, add-to-cart, and purchase events plus metadata (~30M events)
  • JDsearch dataset: real interactions with search queries from the JD.com platform (~26M interactions)

Data Unification and Normalization

All datasets underwent a rigorous unification and normalization process:

  • Schema Alignment: Standardized field names and types across all sources (item_id, brand_id, category_id, event_type, timestamp, price)
  • Event Type Normalization: Mapped varied event nomenclature to a standardized taxonomy (view, cart, purchase)
  • ID Harmonization: Created consistent ID spaces for items, brands, and categories through cross-dataset mapping
  • Temporal Alignment: Unified timestamp formats and established consistent session windows
  • Price Normalization: Applied log-normalization (log1p) followed by standardization using global statistics
  • Session Construction: Reconstructed user sessions based on temporal proximity and interaction patterns
  • Quality Filtering: Removed sessions below minimum length threshold and filtered anomalous interactions

This diverse and comprehensive training corpus enables the model to learn robust representations of e-commerce behavior patterns across different platforms, markets, and interaction types, serving as a strong foundation for downstream fine-tuning tasks.
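The event-type normalization step might look like the following sketch. The raw event names on the left are hypothetical examples of source-dataset nomenclature; the target taxonomy (view, cart, purchase) is the one described above:

```python
# Illustrative mapping from varied source-dataset event names to the
# standardized taxonomy. The raw names are hypothetical examples.
EVENT_MAP = {
    "view": "view", "pageview": "view", "click": "view", "detail": "view",
    "add_to_cart": "cart", "addtocart": "cart", "cart": "cart",
    "purchase": "purchase", "transaction": "purchase", "buy": "purchase",
}

def normalize_event(raw):
    # Unknown names default to the weakest signal, "view".
    return EVENT_MAP.get(raw.strip().lower(), "view")

print(normalize_event("AddToCart"))  # cart
```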


Preprocessing

  • Missing categorical values replaced with UNK
  • Price values transformed via log1p
  • Standardization using global mean and standard deviation
  • Session truncation to fixed-length sequences
  • Right padding with attention masking
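A minimal sketch of these preprocessing steps, assuming placeholder global statistics and a toy maximum session length:

```python
import math

# Sketch of the preprocessing above. The global mean/std of log1p(price)
# are placeholders; in practice they come from the training corpus.
PRICE_MEAN, PRICE_STD = 3.2, 1.1
MAX_LEN, PAD = 5, 0

def preprocess_prices(prices):
    # log1p transform, then standardize with global statistics.
    return [(math.log1p(p) - PRICE_MEAN) / PRICE_STD for p in prices]

def pad_session(ids, max_len=MAX_LEN):
    # Truncate to a fixed length, then right-pad with an attention mask.
    ids = ids[:max_len]
    mask = [1] * len(ids) + [0] * (max_len - len(ids))
    return ids + [PAD] * (max_len - len(ids)), mask

ids, mask = pad_session([12, 45, 78])
print(ids, mask)  # [12, 45, 78, 0, 0] [1, 1, 1, 0, 0]
```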

Training Objective

Next-item autoregressive prediction using cross-entropy loss with padding ignored.
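In code, this objective amounts to shifting labels by one position and ignoring padding, roughly as follows (item count and logits are toy values):

```python
import torch
import torch.nn.functional as F

# Sketch of the next-item objective: logits at position t are scored
# against the item observed at position t+1; padding is ignored.
PAD = 0
num_items = 100
logits = torch.randn(1, 4, num_items)      # (B, L, num_items), toy values
items = torch.tensor([[12, 45, 78, PAD]])  # padded session

shift_logits = logits[:, :-1, :]           # predictions for steps 1..L-1
shift_labels = items[:, 1:]                # the items actually seen next
loss = F.cross_entropy(
    shift_logits.reshape(-1, num_items),
    shift_labels.reshape(-1),
    ignore_index=PAD,                      # padding does not contribute
)
print("loss:", float(loss))
```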


Training Regime

  • Precision: FP32
  • Optimizer: AdamW
  • Learning Rate: 1e-3 with warmup
  • Gradient Clipping: 5.0
  • Causal masking applied
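A minimal sketch of this regime; the warmup length and the stand-in model are assumptions for illustration:

```python
import torch

# AdamW at 1e-3 with linear warmup and gradient clipping at 5.0, as
# stated above. Warmup length and model are placeholders.
model = torch.nn.Linear(8, 8)  # stand-in for the transformer
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
warmup_steps = 100
sched = torch.optim.lr_scheduler.LambdaLR(
    opt, lambda step: min(1.0, (step + 1) / warmup_steps)
)

for step in range(3):  # toy loop
    loss = model(torch.randn(4, 8)).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    # Clip the global gradient norm before the optimizer step.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)
    opt.step()
    sched.step()

print("current lr:", sched.get_last_lr()[0])
```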

Evaluation

Metrics

  • Cross-Entropy Loss
  • Perplexity
  • Recall@20
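These metrics can be computed as in the sketch below: perplexity is the exponential of the cross-entropy loss, and Recall@K counts how often the true next item appears in the top-K scores. The toy logits are assumptions for illustration:

```python
import math
import torch

def perplexity(ce_loss):
    # Perplexity is exp of the mean cross-entropy loss.
    return math.exp(ce_loss)

def recall_at_k(logits, targets, k=20):
    # logits: (N, num_items), targets: (N,)
    topk = logits.topk(k, dim=-1).indices           # (N, k)
    hits = (topk == targets.unsqueeze(-1)).any(-1)  # (N,)
    return hits.float().mean().item()

logits = torch.zeros(2, 50)
logits[0, 7] = 5.0   # item 7 ranked first for example 0
logits[1, 3] = 5.0   # item 3 ranked first for example 1
targets = torch.tensor([7, 30])
print(recall_at_k(logits, targets, k=1))  # 0.5
```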

Results

On the evaluation split, the model achieved:

  • Perplexity: 24.04
  • Recall@20: 0.6823

These results indicate strong next-item prediction performance in session-based e-commerce interaction modeling.

Summary

The model demonstrates:

  • Low predictive uncertainty (Perplexity 24.04)
  • High ranking quality for next-item recommendation (Recall@20 of 68.23%)

Performance may vary depending on dataset distribution, session length, and preprocessing configuration.


Environmental Impact

  • Hardware: GPU NVIDIA H100 NVL (94GB PCIe 5.0)
  • Precision: FP32
  • Training Duration: ~30 hours (varies by configuration)
  • Carbon Impact: ≈45 kg CO₂e (estimated from ~30 hours of energy consumption on an H100 GPU)

Limitations

  • No long-term user modeling beyond session scope
  • Does not include user-level embeddings
  • Requires predefined categorical vocabularies
  • Limited generalization to unseen item IDs
Model Size

  • 54.1M parameters (Safetensors, F32)