---
library_name: transformers
tags:
  - recommendation
  - e-commerce
  - sequential-modeling
  - next-item-prediction
  - behavioral-modeling
---

# Commerce Intent

## Model Overview

Commerce Intent is a pretrained sequential behavioral model for e-commerce session understanding. It is trained to predict the next item in a user session based on historical interaction sequences.

The model learns representations from multi-modal structured signals, including:

- Item ID
- Brand
- Category
- Event type (view, cart, and purchase)
- Normalized price
- Positional order within the session

It is designed as a foundation model for downstream recommendation and behavioral modeling tasks.

---

## Model Details

### Model Description

Commerce Intent models user behavior within a session as an autoregressive sequence modeling problem. Given a sequence of past interactions, the model predicts the next likely item.

The architecture consists of:

- Multi-embedding token fusion (item, brand, category, event)
- Continuous price projection
- Positional encoding
- Transformer encoder with causal masking
- Linear head for next-item prediction

This model is pretrained and can be fine-tuned for recommendation, ranking, or conversion modeling tasks.
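The components listed above can be sketched roughly as follows. This is an illustrative reimplementation, not the actual `i6model_ecomm` code; all vocabulary sizes, dimensions, and layer counts are assumptions.

```python
import torch
import torch.nn as nn

class CommerceIntentSketch(nn.Module):
    """Illustrative sketch of the described architecture (NOT the real
    i6model_ecomm implementation). Sizes below are assumptions."""

    def __init__(self, num_items=1000, num_brands=100, num_cats=50,
                 num_events=4, d_model=64, n_heads=4, n_layers=2, max_len=50):
        super().__init__()
        # Multi-embedding token fusion: one embedding per categorical field
        self.item_emb = nn.Embedding(num_items, d_model)
        self.brand_emb = nn.Embedding(num_brands, d_model)
        self.cat_emb = nn.Embedding(num_cats, d_model)
        self.event_emb = nn.Embedding(num_events, d_model)
        # Continuous price projection
        self.price_proj = nn.Linear(1, d_model)
        # Learned positional encoding
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # Linear head for next-item prediction
        self.head = nn.Linear(d_model, num_items)

    def forward(self, itms, brds, cats, prcs, evts):
        B, L = itms.shape
        pos = torch.arange(L, device=itms.device).unsqueeze(0)
        # Sum the fused field embeddings, price projection, and positions
        x = (self.item_emb(itms) + self.brand_emb(brds) + self.cat_emb(cats)
             + self.event_emb(evts) + self.price_proj(prcs.unsqueeze(-1))
             + self.pos_emb(pos))
        # Causal mask so position t attends only to positions <= t
        causal = nn.Transformer.generate_square_subsequent_mask(L).to(itms.device)
        h = self.encoder(x, mask=causal)
        return self.head(h)  # (B, L, num_items)
```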

- **Developed by:** infinity6
- **Model type:** Sequential autoregressive transformer
- **Language(s):** Structured e-commerce interaction data (non-NLP)
- **License:** Apache 2.0
- **Finetuned from model:** None (trained from scratch)

---

## Dependencies

This model depends on the external package:

- https://github.com/infinity6-ai/i6model_ecomm

The package contains the custom architecture required to correctly load and run the model. You must install it before using Commerce Intent.

### Installation

Clone the repository:

```bash
git clone https://github.com/infinity6-ai/i6model_ecomm.git
cd i6model_ecomm
pip install .
```

---

## Intended Use

### Direct Use

The model can be used directly for:

- Next-item prediction
- Session-based recommendation
- Behavioral embedding extraction
- Purchase intent modeling
- Real-time ranking systems

Example:

```python
import torch

from i6modelecomm.model import i6modelecomm

model = i6modelecomm.CommerceIntent.from_pretrained(
    "infinity6/ecomm_shop_intent_pretrained"
)

# TODO: map items and remap categories.

# TODO: freeze layers and train with your data.

model.eval()

D = 'cpu'  # target device

# batch_size = 1, seq_len = 3
itms = torch.tensor([[12, 45, 78]], dtype=torch.long).to(D)
brds = torch.tensor([[3, 7, 2]], dtype=torch.long).to(D)
cats = torch.tensor([[8, 8, 15]], dtype=torch.long).to(D)
prcs = torch.tensor([[29.9, 35.0, 15.5]], dtype=torch.float).to(D)
evts = torch.tensor([[1, 1, 2]], dtype=torch.long).to(D)

# attention mask (1 = real token, 0 = padding)
mask = torch.tensor([[1, 1, 1]], dtype=torch.bool).to(D)

with torch.no_grad():
    outputs = model(
        itms=itms,      # items
        brds=brds,      # brands
        cats=cats,      # categories
        prcs=prcs,      # prices
        evts=evts,      # events
        attention_mask=mask,
        labels=None     # inference only -- no loss computation
    )

# logits have shape (B, L-1, num_itm)
logits = outputs.logits

print("Logits shape:", logits.shape)
```

Inputs must include:

- `itms`
- `brds`
- `cats`
- `evts`
- `prcs`
- `attention_mask`

---

### Downstream Use

The model can be fine-tuned for:

- Conversion prediction
- Cart abandonment modeling
- Customer lifetime value modeling
- Cross-sell / upsell recommendation
- Personalized search ranking
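A common fine-tuning pattern for such tasks is to freeze the pretrained backbone and train only a small task head. The following is a minimal sketch for binary conversion prediction; the `backbone` here is a stand-in module, not the real Commerce Intent model, and the pooling choice is an assumption.

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained sequential backbone (NOT the real model)
backbone = nn.Sequential(nn.Embedding(1000, 64), nn.Linear(64, 64))

for p in backbone.parameters():
    p.requires_grad = False  # freeze pretrained weights

conversion_head = nn.Linear(64, 1)  # small trainable task head

items = torch.tensor([[12, 45, 78]])       # one session of 3 items
with torch.no_grad():
    h = backbone(items)                    # (B, L, 64) frozen representations
logits = conversion_head(h.mean(dim=1))    # mean-pool the session -> (B, 1)
prob = torch.sigmoid(logits)               # conversion probability
```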

---

### Out-of-Scope Use

This model is not suitable for:

- Natural language tasks
- Image tasks
- Generative text modeling
- Multi-user graph modeling without adaptation
- Cold-start scenarios without item mappings and category remapping

---

## Bias, Risks, and Limitations

- The model reflects behavioral biases present in historical e-commerce data.
- Popularity bias may emerge due to item frequency distribution.
- Model performance depends on session length and interaction quality.
- Cold-start performance for unseen items is limited.
- It does not encode demographic or identity-aware fairness constraints.

### Recommendations

- Monitor recommendation fairness and popularity skew.
- Retrain periodically to reflect new item distributions.
- Apply business constraints in production systems.
- Use A/B testing before large-scale deployment.

---

## Training Details

### Training Data

The model was trained on large-scale anonymized e-commerce interaction logs containing:

- Session-based user interactions
- Item identifiers
- Brand identifiers
- Category identifiers
- Event types
- Timestamped behavioral sequences
- Price values (log-normalized and standardized)

Sessions shorter than a minimum threshold were filtered.

---

### Data Sources and Preparation

The model was trained on a unified, large-scale corpus of e-commerce interaction data, aggregating and normalizing multiple public datasets to create a robust foundation for sequential behavior modeling.

The training data combines the following sources:

| Dataset                                                       | Description                                                      | Key Statistics    |
|---------------------------------------------------------------|------------------------------------------------------------------|-------------------|
| **E-commerce behavior data from multi category store**        | Real event logs from a multi-category e-commerce platform        | ~285M records     |
| **E-commerce Clickstream and Transaction Dataset (Kaggle)**   | Sequential event data including views and clicks                 | ~500K+ events     |
| **E-Commerce Behavior Dataset – Agents for Data**             | Product interactions from ~18k users across multiple event types | ~2M interactions  |
| **Retail Rocket clickstream dataset**                         | Industry-standard dataset with views, carts, and purchases       | ~2.7M events      |
| **SIGIR 2021 / Coveo Session data challenge**                 | Navigation sessions with clicks, add-to-carts, and purchases, plus metadata | ~30M events       |
| **JDsearch dataset**                                          | Real interactions with search queries from the JD.com platform   | ~26M interactions |

### Data Unification and Normalization

All datasets underwent a rigorous unification and normalization process:

- **Schema Alignment**: Standardized field names and types across all sources (item_id, brand_id, category_id, event_type, timestamp, price)
- **Event Type Normalization**: Mapped varied event nomenclature to a standardized taxonomy (view, cart, purchase)
- **ID Harmonization**: Created consistent ID spaces for items, brands, and categories through cross-dataset mapping
- **Temporal Alignment**: Unified timestamp formats and established consistent session windows
- **Price Normalization**: Applied log-normalization (log1p) followed by standardization using global statistics
- **Session Construction**: Reconstructed user sessions based on temporal proximity and interaction patterns
- **Quality Filtering**: Removed sessions below minimum length threshold and filtered anomalous interactions

This diverse and comprehensive training corpus enables the model to learn robust representations of e-commerce behavior patterns across different platforms, markets, and interaction types, serving as a strong foundation for downstream fine-tuning tasks.
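The event-type normalization step might look like the sketch below. The synonym table and the fallback for unknown event names are illustrative assumptions, not the actual mapping used during training.

```python
# Hypothetical mapping from raw event names (which vary across the source
# datasets) onto the unified taxonomy: view, cart, purchase.
EVENT_TAXONOMY = {
    "view": "view", "pageview": "view", "click": "view", "detail": "view",
    "cart": "cart", "add_to_cart": "cart", "addtocart": "cart",
    "purchase": "purchase", "buy": "purchase", "transaction": "purchase",
}

def normalize_event(raw: str) -> str:
    """Map a raw event name to the unified taxonomy.

    Unknown events fall back to 'view' here -- an assumption for
    illustration, not the documented behavior.
    """
    return EVENT_TAXONOMY.get(raw.strip().lower(), "view")
```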

---

### Preprocessing

- Missing categorical values replaced with `UNK`
- Price values transformed via `log1p`
- Standardization using global mean and standard deviation
- Session truncation to fixed-length sequences
- Right padding with attention masking
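The steps above can be sketched as follows. The `PAD` id, maximum sequence length, and global price statistics are illustrative assumptions.

```python
import math

PAD_ID, MAX_LEN = 0, 5                 # assumed padding id and fixed length
PRICE_MEAN, PRICE_STD = 3.2, 1.1       # assumed global stats over log1p prices

def preprocess(items, prices):
    """Truncate, log1p-transform and standardize prices, then right-pad
    with an attention mask (1 = real token, 0 = padding)."""
    items = items[:MAX_LEN]
    prices = [(math.log1p(p) - PRICE_MEAN) / PRICE_STD
              for p in prices[:MAX_LEN]]
    mask = [1] * len(items)
    pad = MAX_LEN - len(items)
    return (items + [PAD_ID] * pad,
            prices + [0.0] * pad,
            mask + [0] * pad)
```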

---

### Training Objective

Next-item autoregressive prediction using cross-entropy loss with padding ignored.
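A minimal sketch of this objective in PyTorch, assuming that logits at position t score the item at position t+1 and that padding uses id 0 (both assumptions for illustration):

```python
import torch
import torch.nn.functional as F

B, L, V, PAD_ID = 2, 4, 10, 0            # assumed shapes and padding id
logits = torch.randn(B, L, V)            # model outputs, one per position
items = torch.tensor([[3, 5, 2, 0],      # second half of each row is padded
                      [7, 1, 0, 0]])

shift_logits = logits[:, :-1, :]         # positions 0..L-2 predict ...
shift_targets = items[:, 1:].clone()     # ... items at positions 1..L-1
shift_targets[shift_targets == PAD_ID] = -100  # exclude padding from the loss

loss = F.cross_entropy(
    shift_logits.reshape(-1, V), shift_targets.reshape(-1), ignore_index=-100
)
```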

---

### Training Regime

- **Precision:** FP32
- **Optimizer:** AdamW
- **Learning Rate:** 1e-3 with warmup
- **Gradient Clipping:** 5.0
- **Causal masking applied**
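A sketch of this regime in PyTorch. The warmup shape and length are assumptions, since the card only states "1e-3 with warmup"; the tiny model is a placeholder.

```python
import torch

model = torch.nn.Linear(8, 8)            # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Assumed linear warmup over the first 100 steps
warmup_steps = 100
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lambda step: min(1.0, (step + 1) / warmup_steps)
)

# One illustrative training step
loss = model(torch.randn(4, 8)).pow(2).mean()
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)  # clip at 5.0
optimizer.step()
scheduler.step()
optimizer.zero_grad()
```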

---

## Evaluation

### Metrics

- Cross-Entropy Loss
- Perplexity
- Recall@20
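Recall@20 is the fraction of evaluation positions whose true next item appears among the model's top-20 scored items. A minimal sketch of the computation (the random scores are for illustration only):

```python
import torch

def recall_at_k(logits, targets, k=20):
    """Fraction of rows where the true item is in the top-k scores."""
    topk = logits.topk(k, dim=-1).indices           # (N, k) candidate items
    hits = (topk == targets.unsqueeze(-1)).any(-1)  # is the true item there?
    return hits.float().mean().item()

logits = torch.randn(100, 500)            # 100 positions, 500-item vocabulary
targets = torch.randint(0, 500, (100,))
print(recall_at_k(logits, targets))
```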

### Results

On the evaluation split, the model achieved:

- **Perplexity:** 24.04
- **Recall@20:** 0.6823

These results indicate strong next-item prediction performance in session-based e-commerce interaction modeling.

### Summary

The model demonstrates:

- Low predictive uncertainty (Perplexity 24.04)
- High ranking quality for next-item recommendation (Recall@20 of 68.23%)

Performance may vary depending on dataset distribution, session length, and preprocessing configuration.

---

## Environmental Impact

- **Hardware:** NVIDIA H100 NVL GPU (94 GB, PCIe 5.0)
- **Precision:** FP32
- **Training Duration:** Several hours (varies by configuration)
- **Carbon Impact:** ≈45 kg CO₂e (estimated from roughly 30 hours of energy consumption on an H100 GPU)

---

## Limitations

- No long-term user modeling beyond session scope
- Does not include user-level embeddings
- Requires predefined categorical vocabularies
- Limited generalization to unseen item IDs