Large Behavioral Model (LBM) for E-commerce


A transformer-based Large Behavioral Model (LBM) for e-commerce behavioral prediction. Given a user's sequential event history, it predicts the next action, recommends the most likely next products, and estimates the timing of the next purchase.

🚀 Quick Start

Installation

pip install transformers torch huggingface-hub

Basic Usage

# Requires the model repository's src/ package on your Python path
from src.models.hf_pipeline import LBMPipeline

# Load model from Hugging Face Hub
pipeline = LBMPipeline(model="souvik16011991roy/LBM-ecom")

# Prepare inputs
inputs = {
    'product_ids': ['1000978', '1001588', '1001606'],
    'event_types': ['view', 'view', 'cart'],
    'category_ids': ['cat1', 'cat1', 'cat2'],
    'hours': [10, 14, 18],
    'days': [0, 0, 1],  # 0=Monday
    'prices': [29.99, 49.99, 79.99],
    'segment_id': 0
}

# Make predictions
results = pipeline(inputs)
print(results)


🎯 Model Description

The Large Behavioral Model (LBM) is a transformer-based neural network designed for e-commerce behavioral prediction. It analyzes sequential user behavior patterns to predict:

  • Next Action: What action will the user take next? (view/cart/purchase)
  • Next Product: Which products are most likely to be viewed/purchased next?
  • Purchase Timing: When will the user make their next purchase?

Model Specifications

  • Model Type: Transformer Encoder
  • Parameters: ~139M
  • Architecture: Multi-modal embeddings + Transformer encoder + Multi-task heads
  • Max Sequence Length: 100 events
  • Vocabulary Sizes:
    • Products: 164,577
    • Events: 3 (view, cart, purchase)
    • Categories: 624

πŸ—οΈ Architecture

Input Embeddings:
├── Product Embedding (164,577 × 256)
├── Event Embedding (3 × 16)
├── Category Embedding (624 × 32)
├── Segment Embedding (8 × 32)
├── Hour Embedding (24 × 16)
├── Day Embedding (7 × 16)
└── Price Projection (1 × 16)

Transformer Encoder:
├── 4 Transformer Layers
│   ├── Multi-head Attention (8 heads)
│   └── Feed-forward Network (2048 dim)
└── Layer Normalization

Output Heads:
├── Action Head (512 × 3)
├── Product Head (512 × 164,577)
└── Timing Head (512 × 1)
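The layout above can be sketched in PyTorch. This is an illustrative reconstruction from the listed dimensions, not the released implementation; in particular, projecting the concatenated 384-d embedding to the 512-d hidden size and broadcasting the segment embedding across the sequence are assumptions.

```python
import torch
import torch.nn as nn

class LBMSketch(nn.Module):
    """Illustrative sketch of the architecture described above.
    Dimensions follow the README; the 384 -> 512 input projection
    and segment broadcasting are assumptions, not the released code."""
    def __init__(self, n_products=164_577, n_events=3, n_categories=624,
                 n_segments=8, hidden=512, layers=4, heads=8, ffn=2048):
        super().__init__()
        self.product_emb = nn.Embedding(n_products, 256)
        self.event_emb = nn.Embedding(n_events, 16)
        self.category_emb = nn.Embedding(n_categories, 32)
        self.segment_emb = nn.Embedding(n_segments, 32)
        self.hour_emb = nn.Embedding(24, 16)
        self.day_emb = nn.Embedding(7, 16)
        self.price_proj = nn.Linear(1, 16)
        # 256 + 16 + 32 + 32 + 16 + 16 + 16 = 384 concatenated features
        self.input_proj = nn.Linear(384, hidden)
        enc_layer = nn.TransformerEncoderLayer(hidden, heads, ffn, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, layers)
        self.action_head = nn.Linear(hidden, n_events)
        self.product_head = nn.Linear(hidden, n_products)
        self.timing_head = nn.Linear(hidden, 1)

    def forward(self, products, events, categories, segments, hours, days, prices):
        seq_len = products.size(1)
        x = torch.cat([
            self.product_emb(products), self.event_emb(events),
            self.category_emb(categories),
            # One segment per user, broadcast along the sequence dimension
            self.segment_emb(segments).unsqueeze(1).expand(-1, seq_len, -1),
            self.hour_emb(hours), self.day_emb(days),
            self.price_proj(prices.unsqueeze(-1)),
        ], dim=-1)
        h = self.encoder(self.input_proj(x))
        return self.action_head(h), self.product_head(h), self.timing_head(h)
```

With the full 164,577-product vocabulary, the product embedding and product head alone account for most of the ~139M parameters.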

✨ Features

  • Multi-task Learning: Predicts action, product, and timing simultaneously
  • Temporal Awareness: Incorporates hour and day of week patterns
  • User Segmentation: Personalizes predictions based on user segments
  • Product Recommendations: Provides top-K product recommendations with scores
  • Purchase Timing: Predicts days until next purchase

📡 API Documentation

Hugging Face Inference API

The model is available via Hugging Face Inference API. Here are the endpoints and example responses:

Base URL

https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom

Authentication

Include your Hugging Face token in the Authorization header:

Authorization: Bearer YOUR_HF_TOKEN

Endpoint: Predict All (Default)

Request:

POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN

{
  "inputs": {
    "product_ids": ["1000978", "1001588", "1001606"],
    "event_types": ["view", "view", "cart"],
    "category_ids": ["cat1", "cat1", "cat2"],
    "hours": [10, 14, 18],
    "days": [0, 0, 1],
    "prices": [29.99, 49.99, 79.99],
    "segment_id": 0,
    "task": "all"
  }
}

Mock Response:

{
  "next_action": {
    "predicted": "purchase",
    "probabilities": {
      "view": 0.15,
      "cart": 0.25,
      "purchase": 0.60
    }
  },
  "next_products": [
    {
      "product_id": "1002042",
      "score": 0.234
    },
    {
      "product_id": "1002062",
      "score": 0.189
    },
    {
      "product_id": "1002098",
      "score": 0.156
    },
    {
      "product_id": "1002099",
      "score": 0.134
    },
    {
      "product_id": "1002100",
      "score": 0.112
    },
    {
      "product_id": "1002101",
      "score": 0.098
    },
    {
      "product_id": "1002102",
      "score": 0.087
    },
    {
      "product_id": "1002103",
      "score": 0.076
    },
    {
      "product_id": "1002225",
      "score": 0.065
    },
    {
      "product_id": "1002266",
      "score": 0.049
    }
  ],
  "next_purchase_days": 3.5
}

Endpoint: Predict Next Action Only

Request:

POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN

{
  "inputs": {
    "product_ids": ["1000978", "1001588"],
    "event_types": ["view", "cart"],
    "task": "action"
  }
}

Mock Response:

{
  "next_action": {
    "predicted": "purchase",
    "probabilities": {
      "view": 0.20,
      "cart": 0.30,
      "purchase": 0.50
    }
  }
}

Endpoint: Predict Next Products Only

Request:

POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN

{
  "inputs": {
    "product_ids": ["1000978", "1001588", "1001606"],
    "event_types": ["view", "view", "cart"],
    "category_ids": ["cat1", "cat1", "cat2"],
    "top_k": 5,
    "task": "product"
  }
}

Mock Response:

{
  "next_products": [
    {
      "product_id": "1002042",
      "score": 0.234
    },
    {
      "product_id": "1002062",
      "score": 0.189
    },
    {
      "product_id": "1002098",
      "score": 0.156
    },
    {
      "product_id": "1002099",
      "score": 0.134
    },
    {
      "product_id": "1002100",
      "score": 0.112
    }
  ]
}

Endpoint: Predict Purchase Timing Only

Request:

POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN

{
  "inputs": {
    "product_ids": ["1000978", "1001588", "1001606"],
    "event_types": ["view", "cart", "purchase"],
    "task": "timing"
  }
}

Mock Response:

{
  "next_purchase_days": 7.2
}

Batch Processing

Request:

POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN

{
  "inputs": [
    {
      "product_ids": ["1000978", "1001588"],
      "event_types": ["view", "cart"],
      "task": "all"
    },
    {
      "product_ids": ["1001606", "1002042"],
      "event_types": ["view", "view"],
      "task": "all"
    }
  ]
}

Mock Response:

{
  "results": [
    {
      "next_action": {
        "predicted": "purchase",
        "probabilities": {
          "view": 0.20,
          "cart": 0.30,
          "purchase": 0.50
        }
      },
      "next_products": [
        {"product_id": "1002042", "score": 0.234},
        {"product_id": "1002062", "score": 0.189}
      ],
      "next_purchase_days": 3.5
    },
    {
      "next_action": {
        "predicted": "view",
        "probabilities": {
          "view": 0.65,
          "cart": 0.25,
          "purchase": 0.10
        }
      },
      "next_products": [
        {"product_id": "1002098", "score": 0.198},
        {"product_id": "1002099", "score": 0.167}
      ],
      "next_purchase_days": 12.3
    }
  ]
}

Error Responses

Invalid Input Format:

{
  "error": "product_ids and event_types are required",
  "status_code": 400
}

Length Mismatch:

{
  "error": "product_ids and event_types must have the same length",
  "status_code": 400
}

Server Error:

{
  "error": "Model inference failed: ...",
  "status_code": 500
}

💻 Usage Examples

Python - Using Pipeline

from src.models.hf_pipeline import LBMPipeline

# Initialize pipeline
pipeline = LBMPipeline(model="souvik16011991roy/LBM-ecom")

# Example 1: Basic prediction
inputs = {
    'product_ids': ['1000978', '1001588'],
    'event_types': ['view', 'cart'],
    'segment_id': 0
}

results = pipeline(inputs)
print(f"Next action: {results['next_action']['predicted']}")
print(f"Top product: {results['next_products'][0]['product_id']}")
print(f"Days until purchase: {results['next_purchase_days']}")

# Example 2: Predict next action only
next_action = pipeline.predict_next_action(
    product_ids=['1000978', '1001588'],
    event_types=['view', 'cart']
)
print(next_action)

# Example 3: Get top 10 product recommendations
top_products = pipeline.predict_next_product(
    product_ids=['1000978', '1001588', '1001606'],
    event_types=['view', 'view', 'cart'],
    top_k=10
)
for product in top_products:
    print(f"Product {product['product_id']}: {product['score']:.4f}")

# Example 4: Predict purchase timing
days_until = pipeline.predict_next_purchase(
    product_ids=['1000978', '1001588'],
    event_types=['view', 'cart']
)
print(f"Days until next purchase: {days_until:.2f}")

Python - Using Requests (Inference API)

import requests

API_URL = "https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

# Predict all
output = query({
    "inputs": {
        "product_ids": ["1000978", "1001588", "1001606"],
        "event_types": ["view", "view", "cart"],
        "category_ids": ["cat1", "cat1", "cat2"],
        "hours": [10, 14, 18],
        "days": [0, 0, 1],
        "prices": [29.99, 49.99, 79.99],
        "segment_id": 0,
        "task": "all"
    }
})

print(output)
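The hosted Inference API typically returns HTTP 503 while a model is still loading, so a retry loop makes the client more robust. A sketch (the retry count and wait time are arbitrary choices, not part of the API contract):

```python
import time
import requests

API_URL = "https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

def query_with_retry(payload, retries=3, wait=10.0):
    """Retry while the hosted model is still loading (HTTP 503)."""
    for _ in range(retries):
        response = requests.post(API_URL, headers=headers, json=payload)
        if response.status_code == 503:
            # Model is warming up; wait and try again
            time.sleep(wait)
            continue
        response.raise_for_status()
        return response.json()
    raise RuntimeError("Model did not become available in time")
```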

JavaScript/Node.js

const fetch = require('node-fetch');

const API_URL = 'https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom';
const headers = {
    'Authorization': 'Bearer YOUR_HF_TOKEN',
    'Content-Type': 'application/json'
};

async function predict(inputs) {
    const response = await fetch(API_URL, {
        method: 'POST',
        headers: headers,
        body: JSON.stringify({ inputs })
    });
    
    return await response.json();
}

// Usage
predict({
    product_ids: ['1000978', '1001588'],
    event_types: ['view', 'cart'],
    task: 'all'
}).then(result => {
    console.log(result);
});

cURL

curl -X POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "product_ids": ["1000978", "1001588"],
      "event_types": ["view", "cart"],
      "task": "all"
    }
  }'

📥 Input/Output Formats

Input Format

All inputs should be provided as a dictionary with the following fields:

| Field | Type | Required | Description | Example |
|-------|------|----------|-------------|---------|
| product_ids | List[str/int] | ✅ Yes | List of product IDs in sequence | ["1000978", "1001588"] |
| event_types | List[str] | ✅ Yes | List of event types (view/cart/purchase) | ["view", "cart"] |
| category_ids | List[str/int] | ❌ No | List of category IDs | ["cat1", "cat2"] |
| hours | List[int] | ❌ No | Hour of day (0-23) | [10, 14, 18] |
| days | List[int] | ❌ No | Day of week (0=Monday, 6=Sunday) | [0, 0, 1] |
| prices | List[float] | ❌ No | Product prices | [29.99, 49.99] |
| segment_id | int | ❌ No | User segment ID (0-7) | 0 |
| task | str | ❌ No | Task type: "all", "action", "product", "timing" | "all" |
| top_k | int | ❌ No | Number of top products to return (default: 10) | 10 |

Constraints:

  • product_ids and event_types must have the same length
  • Maximum sequence length: 100 events
  • If sequence exceeds max length, it will be truncated from the beginning
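These constraints can be enforced client-side before sending a request. A minimal sketch (the helper name is our own; the truncation mirrors the server-side behavior described above):

```python
MAX_SEQ_LEN = 100

def validate_inputs(inputs: dict) -> dict:
    """Validate and truncate a request payload to match the constraints above."""
    products = inputs.get("product_ids")
    events = inputs.get("event_types")
    if not products or not events:
        raise ValueError("product_ids and event_types are required")
    if len(products) != len(events):
        raise ValueError("product_ids and event_types must have the same length")
    if len(products) > MAX_SEQ_LEN:
        # Truncate from the beginning, keeping the most recent events
        for key in ("product_ids", "event_types", "category_ids",
                    "hours", "days", "prices"):
            if isinstance(inputs.get(key), list):
                inputs[key] = inputs[key][-MAX_SEQ_LEN:]
    return inputs
```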

Output Format

Task: "all" (Default)

{
  "next_action": {
    "predicted": "purchase",
    "probabilities": {
      "view": 0.15,
      "cart": 0.25,
      "purchase": 0.60
    }
  },
  "next_products": [
    {
      "product_id": "1002042",
      "score": 0.234
    },
    ...
  ],
  "next_purchase_days": 3.5
}

Task: "action"

{
  "next_action": {
    "predicted": "purchase",
    "probabilities": {
      "view": 0.20,
      "cart": 0.30,
      "purchase": 0.50
    }
  }
}

Task: "product"

{
  "next_products": [
    {
      "product_id": "1002042",
      "score": 0.234
    },
    ...
  ]
}

Task: "timing"

{
  "next_purchase_days": 7.2
}

📊 Performance

Model Metrics

  • Model Size: ~139M parameters
  • Model File Size: ~558 MB (model.safetensors)
  • Vocabulary Size: 164,577 products
  • Inference Speed: ~100-300ms per prediction (CPU)
  • Memory Usage: ~2GB RAM for inference

Target Performance Metrics

  • Next Action Accuracy: > 65%
  • Product Recommendation NDCG@10: > 0.45
  • Purchase Timing RMSE: < 20% of mean LTV
  • API Response Time: < 300ms
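For reference, NDCG@10 with binary relevance, as used in the target metrics above, can be computed as follows. This is the standard formulation, not the project's own evaluation code:

```python
import math

def ndcg_at_k(recommended, relevant, k=10):
    """NDCG@k with binary relevance: DCG of the ranked list
    divided by the DCG of an ideal ranking."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, pid in enumerate(recommended[:k]) if pid in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0
```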

⚠️ Limitations

  1. Sequence Length: Maximum 100 events per sequence. Longer sequences are truncated.

  2. Vocabulary: Fixed product/category vocabularies from training data. New products not in vocabulary will be mapped to default values.

  3. Temporal Patterns: Predictions are based on historical patterns and may not capture sudden trends or seasonal changes.

  4. Cold Start: Limited performance for:

    • Users with minimal history (< 5 events)
    • Products with no historical data
    • New categories

  5. Data Requirements: Requires structured sequential data with product IDs, event types, and optional metadata.

  6. Segment Dependency: Best performance when user segment information is available.

🔧 Advanced Usage

Loading Model Directly

from src.models.hf_lbm import LBMModel
from src.models.hf_tokenizer import LBMTokenizer
import torch

# Load model and tokenizer
model = LBMModel.from_pretrained("souvik16011991roy/LBM-ecom")
tokenizer = LBMTokenizer.from_pretrained("souvik16011991roy/LBM-ecom")

# Encode inputs
encoded = tokenizer.encode_sequence(
    product_ids=['1000978', '1001588'],
    event_types=['view', 'cart'],
    return_tensors='pt'
)

# Forward pass
model.eval()
with torch.no_grad():
    outputs = model(
        product_ids=encoded['product_ids'],
        event_types=encoded['event_types'],
        category_ids=encoded['category_ids'],
        segment_ids=encoded['segment_id'],
        hours=encoded['hours'],
        days=encoded['days'],
        prices=encoded['prices'],
        attention_mask=encoded['attention_mask']
    )

# Process outputs
import torch.nn.functional as F
action_probs = F.softmax(outputs.action_logits[:, -1, :], dim=-1)
product_probs = F.softmax(outputs.product_logits[:, -1, :], dim=-1)
timing = outputs.timing_pred[:, -1, 0]
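From the softmaxed outputs above, top-K recommendations can be extracted with torch.topk. Mapping indices back to product IDs goes through the tokenizer's vocabulary; the exact LBMTokenizer method for that is not documented here, so this sketch returns raw indices:

```python
import torch
import torch.nn.functional as F

def top_k_products(product_logits: torch.Tensor, k: int = 10):
    """Extract top-k (index, score) pairs from last-position product logits.
    Index-to-product-ID mapping is left to the tokenizer's vocabulary."""
    probs = F.softmax(product_logits[:, -1, :], dim=-1)   # (batch, vocab)
    scores, indices = torch.topk(probs, k, dim=-1)
    return [
        [(int(i), float(s)) for i, s in zip(idx_row, score_row)]
        for idx_row, score_row in zip(indices, scores)
    ]
```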

Custom Configuration

from src.models.hf_lbm import LBMConfig, LBMModel

# Create custom config
config = LBMConfig(
    product_vocab_size=200000,
    hidden_dim=1024,
    num_layers=6,
    num_heads=16
)

# Initialize model
model = LBMModel(config)

📚 Citation

If you use this model in your research, please cite:

@misc{lbm-ecom-model,
  title={Large Behavioral Model for E-commerce},
  author={Souvik Roy},
  year={2024},
  howpublished={\url{https://huggingface.co/souvik16011991roy/LBM-ecom}}
}

πŸ“ License

This model is licensed under the MIT License. See LICENSE file for details.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📧 Contact

For questions or issues, please open a discussion on the model's Hugging Face Hub page.

πŸ™ Acknowledgments

  • Built with Transformers by Hugging Face
  • Trained on e-commerce transaction data
  • Architecture inspired by transformer-based sequential models

Made with ❤️ for e-commerce behavioral prediction
