Large Behavioral Model (LBM) for E-commerce
A transformer-based Large Behavioral Model (LBM) designed for e-commerce behavioral prediction. Given a user's sequential behavior, the model predicts the next action, recommends the next products, and estimates purchase timing.
🚀 Quick Start
Installation
pip install transformers torch huggingface-hub
Basic Usage
from src.models.hf_pipeline import LBMPipeline
# Load model from Hugging Face Hub
pipeline = LBMPipeline(model="souvik16011991roy/LBM-ecom")
# Prepare inputs
inputs = {
'product_ids': ['1000978', '1001588', '1001606'],
'event_types': ['view', 'view', 'cart'],
'category_ids': ['cat1', 'cat1', 'cat2'],
'hours': [10, 14, 18],
'days': [0, 0, 1], # 0=Monday
'prices': [29.99, 49.99, 79.99],
'segment_id': 0
}
# Make predictions
results = pipeline(inputs)
print(results)
📋 Table of Contents
- Model Description
- Architecture
- Features
- API Documentation
- Usage Examples
- Input/Output Formats
- Performance
- Limitations
- Citation
🎯 Model Description
The Large Behavioral Model (LBM) is a transformer-based neural network designed for e-commerce behavioral prediction. It analyzes sequential user behavior patterns to predict:
- Next Action: What action will the user take next? (view/cart/purchase)
- Next Product: Which products are most likely to be viewed/purchased next?
- Purchase Timing: When will the user make their next purchase?
Model Specifications
- Model Type: Transformer Encoder
- Parameters: ~139M
- Architecture: Multi-modal embeddings + Transformer encoder + Multi-task heads
- Max Sequence Length: 100 events
- Vocabulary Sizes:
  - Products: 164,577
  - Events: 3 (view, cart, purchase)
  - Categories: 624
🏗️ Architecture
Input Embeddings:
├── Product Embedding (164,577 × 256)
├── Event Embedding (3 × 16)
├── Category Embedding (624 × 32)
├── Segment Embedding (8 × 32)
├── Hour Embedding (24 × 16)
├── Day Embedding (7 × 16)
└── Price Projection (1 × 16)
Transformer Encoder:
├── 4 Transformer Layers
│   ├── Multi-head Attention (8 heads)
│   └── Feed-forward Network (2048 dim)
└── Layer Normalization
Output Heads:
├── Action Head (512 × 3)
├── Product Head (512 × 164,577)
└── Timing Head (512 × 1)
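The ~139M parameter figure can be roughly reconstructed from the dimensions above. The sketch below tallies the major weight matrices in plain Python; the 512-dim hidden size is inferred from the 512 × N output heads, and the per-layer breakdown assumes a standard encoder layout with bias and LayerNorm terms, so the exact internals may differ slightly.

```python
# Rough parameter tally from the architecture diagram above.
# Assumes hidden dim 512 (implied by the 512 x N output heads) and
# standard bias/LayerNorm terms; exact internals may differ.

embeddings = (
    164_577 * 256   # product
    + 3 * 16        # event
    + 624 * 32      # category
    + 8 * 32        # segment
    + 24 * 16       # hour
    + 7 * 16        # day
    + 1 * 16        # price projection
)

d, ffn = 512, 2048
attention = 4 * (d * d + d)                     # Q, K, V, output projections
feed_forward = (d * ffn + ffn) + (ffn * d + d)  # two linear layers + biases
layer_norms = 2 * 2 * d                         # two LayerNorms per layer
per_layer = attention + feed_forward + layer_norms
encoder = 4 * per_layer                         # 4 transformer layers

heads = (d * 3 + 3) + (d * 164_577 + 164_577) + (d * 1 + 1)

total = embeddings + encoder + heads
print(f"~{total / 1e6:.0f}M parameters")  # ~139M parameters
```

Note that the product embedding (~42M) and the product head (~84M) together account for roughly 90% of the parameters; the transformer encoder itself is only ~13M.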
✨ Features
- Multi-task Learning: Predicts action, product, and timing simultaneously
- Temporal Awareness: Incorporates hour and day of week patterns
- User Segmentation: Personalizes predictions based on user segments
- Product Recommendations: Provides top-K product recommendations with scores
- Purchase Timing: Predicts days until next purchase
📡 API Documentation
Hugging Face Inference API
The model is available via Hugging Face Inference API. Here are the endpoints and example responses:
Base URL
https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Authentication
Include your Hugging Face token in the Authorization header:
Authorization: Bearer YOUR_HF_TOKEN
Endpoint: Predict All (Default)
Request:
POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN
{
"inputs": {
"product_ids": ["1000978", "1001588", "1001606"],
"event_types": ["view", "view", "cart"],
"category_ids": ["cat1", "cat1", "cat2"],
"hours": [10, 14, 18],
"days": [0, 0, 1],
"prices": [29.99, 49.99, 79.99],
"segment_id": 0,
"task": "all"
}
}
Mock Response:
{
"next_action": {
"predicted": "purchase",
"probabilities": {
"view": 0.15,
"cart": 0.25,
"purchase": 0.60
}
},
"next_products": [
{
"product_id": "1002042",
"score": 0.234
},
{
"product_id": "1002062",
"score": 0.189
},
{
"product_id": "1002098",
"score": 0.156
},
{
"product_id": "1002099",
"score": 0.134
},
{
"product_id": "1002100",
"score": 0.112
},
{
"product_id": "1002101",
"score": 0.098
},
{
"product_id": "1002102",
"score": 0.087
},
{
"product_id": "1002103",
"score": 0.076
},
{
"product_id": "1002225",
"score": 0.065
},
{
"product_id": "1002266",
"score": 0.049
}
],
"next_purchase_days": 3.5
}
Endpoint: Predict Next Action Only
Request:
POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN
{
"inputs": {
"product_ids": ["1000978", "1001588"],
"event_types": ["view", "cart"],
"task": "action"
}
}
Mock Response:
{
"next_action": {
"predicted": "purchase",
"probabilities": {
"view": 0.20,
"cart": 0.30,
"purchase": 0.50
}
}
}
Endpoint: Predict Next Products Only
Request:
POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN
{
"inputs": {
"product_ids": ["1000978", "1001588", "1001606"],
"event_types": ["view", "view", "cart"],
"category_ids": ["cat1", "cat1", "cat2"],
"top_k": 5,
"task": "product"
}
}
Mock Response:
{
"next_products": [
{
"product_id": "1002042",
"score": 0.234
},
{
"product_id": "1002062",
"score": 0.189
},
{
"product_id": "1002098",
"score": 0.156
},
{
"product_id": "1002099",
"score": 0.134
},
{
"product_id": "1002100",
"score": 0.112
}
]
}
Endpoint: Predict Purchase Timing Only
Request:
POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN
{
"inputs": {
"product_ids": ["1000978", "1001588", "1001606"],
"event_types": ["view", "cart", "purchase"],
"task": "timing"
}
}
Mock Response:
{
"next_purchase_days": 7.2
}
Batch Processing
Request:
POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN
{
"inputs": [
{
"product_ids": ["1000978", "1001588"],
"event_types": ["view", "cart"],
"task": "all"
},
{
"product_ids": ["1001606", "1002042"],
"event_types": ["view", "view"],
"task": "all"
}
]
}
Mock Response:
{
"results": [
{
"next_action": {
"predicted": "purchase",
"probabilities": {
"view": 0.20,
"cart": 0.30,
"purchase": 0.50
}
},
"next_products": [
{"product_id": "1002042", "score": 0.234},
{"product_id": "1002062", "score": 0.189}
],
"next_purchase_days": 3.5
},
{
"next_action": {
"predicted": "view",
"probabilities": {
"view": 0.65,
"cart": 0.25,
"purchase": 0.10
}
},
"next_products": [
{"product_id": "1002098", "score": 0.198},
{"product_id": "1002099", "score": 0.167}
],
"next_purchase_days": 12.3
}
]
}
Error Responses
Invalid Input Format:
{
"error": "product_ids and event_types are required",
"status_code": 400
}
Length Mismatch:
{
"error": "product_ids and event_types must have the same length",
"status_code": 400
}
Server Error:
{
"error": "Model inference failed: ...",
"status_code": 500
}
💻 Usage Examples
Python - Using Pipeline
from src.models.hf_pipeline import LBMPipeline
# Initialize pipeline
pipeline = LBMPipeline(model="souvik16011991roy/LBM-ecom")
# Example 1: Basic prediction
inputs = {
'product_ids': ['1000978', '1001588'],
'event_types': ['view', 'cart'],
'segment_id': 0
}
results = pipeline(inputs)
print(f"Next action: {results['next_action']['predicted']}")
print(f"Top product: {results['next_products'][0]['product_id']}")
print(f"Days until purchase: {results['next_purchase_days']}")
# Example 2: Predict next action only
next_action = pipeline.predict_next_action(
product_ids=['1000978', '1001588'],
event_types=['view', 'cart']
)
print(next_action)
# Example 3: Get top 10 product recommendations
top_products = pipeline.predict_next_product(
product_ids=['1000978', '1001588', '1001606'],
event_types=['view', 'view', 'cart'],
top_k=10
)
for product in top_products:
print(f"Product {product['product_id']}: {product['score']:.4f}")
# Example 4: Predict purchase timing
days_until = pipeline.predict_next_purchase(
product_ids=['1000978', '1001588'],
event_types=['view', 'cart']
)
print(f"Days until next purchase: {days_until:.2f}")
Python - Using Requests (Inference API)
import requests
API_URL = "https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}
def query(payload):
response = requests.post(API_URL, headers=headers, json=payload)
return response.json()
# Predict all
output = query({
"inputs": {
"product_ids": ["1000978", "1001588", "1001606"],
"event_types": ["view", "view", "cart"],
"category_ids": ["cat1", "cat1", "cat2"],
"hours": [10, 14, 18],
"days": [0, 0, 1],
"prices": [29.99, 49.99, 79.99],
"segment_id": 0,
"task": "all"
}
})
print(output)
JavaScript/Node.js
const fetch = require('node-fetch');
const API_URL = 'https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom';
const headers = {
'Authorization': 'Bearer YOUR_HF_TOKEN',
'Content-Type': 'application/json'
};
async function predict(inputs) {
const response = await fetch(API_URL, {
method: 'POST',
headers: headers,
body: JSON.stringify({ inputs })
});
return await response.json();
}
// Usage
predict({
product_ids: ['1000978', '1001588'],
event_types: ['view', 'cart'],
task: 'all'
}).then(result => {
console.log(result);
});
cURL
curl -X POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom \
-H "Authorization: Bearer YOUR_HF_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"inputs": {
"product_ids": ["1000978", "1001588"],
"event_types": ["view", "cart"],
"task": "all"
}
}'
📥 Input/Output Formats
Input Format
All inputs should be provided as a dictionary with the following fields:
| Field | Type | Required | Description | Example |
|---|---|---|---|---|
| `product_ids` | List[str/int] | ✅ Yes | List of product IDs in sequence | `["1000978", "1001588"]` |
| `event_types` | List[str] | ✅ Yes | List of event types (view/cart/purchase) | `["view", "cart"]` |
| `category_ids` | List[str/int] | ❌ No | List of category IDs | `["cat1", "cat2"]` |
| `hours` | List[int] | ❌ No | Hour of day (0-23) | `[10, 14, 18]` |
| `days` | List[int] | ❌ No | Day of week (0=Monday, 6=Sunday) | `[0, 0, 1]` |
| `prices` | List[float] | ❌ No | Product prices | `[29.99, 49.99]` |
| `segment_id` | int | ❌ No | User segment ID (0-7) | `0` |
| `task` | str | ❌ No | Task type: `"all"`, `"action"`, `"product"`, `"timing"` | `"all"` |
| `top_k` | int | ❌ No | Number of top products to return (default: 10) | `10` |
Constraints:
- `product_ids` and `event_types` must have the same length
- Maximum sequence length: 100 events
- If the sequence exceeds the maximum length, it is truncated from the beginning
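Truncating from the beginning keeps the most recent events, which carry the strongest signal for next-action prediction. The behavior can be sketched as follows (the constant name and slicing are illustrative assumptions, not the pipeline's actual code):

```python
MAX_SEQ_LEN = 100  # the model's maximum sequence length

def truncate_sequence(events: list, max_len: int = MAX_SEQ_LEN) -> list:
    """Keep only the most recent max_len events (oldest are dropped first)."""
    return events[-max_len:]

# A 150-event history keeps events 50..149
events = [f"event_{i}" for i in range(150)]
recent = truncate_sequence(events)
print(len(recent), recent[0])  # 100 event_50
```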
Output Format
Task: "all" (Default)
{
"next_action": {
"predicted": "purchase",
"probabilities": {
"view": 0.15,
"cart": 0.25,
"purchase": 0.60
}
},
"next_products": [
{
"product_id": "1002042",
"score": 0.234
},
...
],
"next_purchase_days": 3.5
}
Task: "action"
{
"next_action": {
"predicted": "purchase",
"probabilities": {
"view": 0.20,
"cart": 0.30,
"purchase": 0.50
}
}
}
Task: "product"
{
"next_products": [
{
"product_id": "1002042",
"score": 0.234
},
...
]
}
Task: "timing"
{
"next_purchase_days": 7.2
}
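Since all four task responses share this shape, a small helper can flatten whichever fields happen to be present. This is a hypothetical convenience function for illustration, not part of the package:

```python
def summarize_response(resp: dict) -> dict:
    """Pull the headline values out of any task's response."""
    summary = {}
    if "next_action" in resp:
        summary["action"] = resp["next_action"]["predicted"]
    if "next_products" in resp:
        summary["top_product"] = resp["next_products"][0]["product_id"]
    if "next_purchase_days" in resp:
        summary["days_to_purchase"] = resp["next_purchase_days"]
    return summary

# Works on the full "all" response...
resp = {
    "next_action": {"predicted": "purchase",
                    "probabilities": {"view": 0.15, "cart": 0.25, "purchase": 0.60}},
    "next_products": [{"product_id": "1002042", "score": 0.234}],
    "next_purchase_days": 3.5,
}
print(summarize_response(resp))
# → {'action': 'purchase', 'top_product': '1002042', 'days_to_purchase': 3.5}

# ...and on a single-task response such as "timing"
print(summarize_response({"next_purchase_days": 7.2}))
# → {'days_to_purchase': 7.2}
```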
📊 Performance
Model Metrics
- Model Size: ~139M parameters
- Model File Size: ~558 MB (model.safetensors)
- Vocabulary Size: 164,577 products
- Inference Speed: ~100-300ms per prediction (CPU)
- Memory Usage: ~2GB RAM for inference
Target Performance Metrics
- Next Action Accuracy: > 65%
- Product Recommendation NDCG@10: > 0.45
- Purchase Timing RMSE: < 20% of mean LTV
- API Response Time: < 300ms
⚠️ Limitations
Sequence Length: Maximum 100 events per sequence. Longer sequences are truncated.
Vocabulary: Fixed product/category vocabularies from training data. New products not in vocabulary will be mapped to default values.
Temporal Patterns: Predictions are based on historical patterns and may not capture sudden trends or seasonal changes.
Cold Start: Limited performance for:
- Users with minimal history (< 5 events)
- Products with no historical data
- New categories
Data Requirements: Requires structured sequential data with product IDs, event types, and optional metadata.
Segment Dependency: Best performance when user segment information is available.
🔧 Advanced Usage
Loading Model Directly
from src.models.hf_lbm import LBMModel
from src.models.hf_tokenizer import LBMTokenizer
import torch
# Load model and tokenizer
model = LBMModel.from_pretrained("souvik16011991roy/LBM-ecom")
tokenizer = LBMTokenizer.from_pretrained("souvik16011991roy/LBM-ecom")
# Encode inputs
encoded = tokenizer.encode_sequence(
product_ids=['1000978', '1001588'],
event_types=['view', 'cart'],
return_tensors='pt'
)
# Forward pass
model.eval()
with torch.no_grad():
outputs = model(
product_ids=encoded['product_ids'],
event_types=encoded['event_types'],
category_ids=encoded['category_ids'],
segment_ids=encoded['segment_id'],
hours=encoded['hours'],
days=encoded['days'],
prices=encoded['prices'],
attention_mask=encoded['attention_mask']
)
# Process outputs
import torch.nn.functional as F
action_probs = F.softmax(outputs.action_logits[:, -1, :], dim=-1)
product_probs = F.softmax(outputs.product_logits[:, -1, :], dim=-1)
timing = outputs.timing_pred[:, -1, 0]
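The last-position probabilities can then be decoded into the response format shown earlier. The arithmetic is just a softmax plus a top-k ranking; the sketch below uses plain Python so it runs without torch (the actual pipeline presumably does the equivalent with `torch.topk`, and the logits and product IDs here are toy values):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(scores, ids, k=10):
    """Rank product IDs by score and return the top k as response entries."""
    ranked = sorted(zip(ids, scores), key=lambda p: p[1], reverse=True)
    return [{"product_id": pid, "score": round(s, 3)} for pid, s in ranked[:k]]

# Toy action logits for the last sequence position: view / cart / purchase
actions = ["view", "cart", "purchase"]
action_probs = softmax([0.1, 0.6, 1.5])
predicted = actions[action_probs.index(max(action_probs))]
print(predicted)  # purchase

# Toy product logits over a tiny 4-product vocabulary
product_probs = softmax([2.0, 1.2, 0.4, -0.5])
print(top_k(product_probs, ["1002042", "1002062", "1002098", "1002099"], k=2))
```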
Custom Configuration
from src.models.hf_lbm import LBMConfig, LBMModel
# Create custom config
config = LBMConfig(
product_vocab_size=200000,
hidden_dim=1024,
num_layers=6,
num_heads=16
)
# Initialize model
model = LBMModel(config)
📚 Citation
If you use this model in your research, please cite:
@misc{lbm-ecom-model,
title={Large Behavioral Model for E-commerce},
author={Souvik Roy},
year={2024},
howpublished={\url{https://huggingface.co/souvik16011991roy/LBM-ecom}}
}
📄 License
This model is licensed under the MIT License. See LICENSE file for details.
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
📧 Contact
For questions or issues:
- Model Repository: https://huggingface.co/souvik16011991roy/LBM-ecom
- Issues: Open an issue on the Hugging Face model repository
🙏 Acknowledgments
- Built with Transformers by Hugging Face
- Trained on e-commerce transaction data
- Architecture inspired by transformer-based sequential models
Made with ❤️ for e-commerce behavioral prediction