Large Behavioral Model (LBM) for E-commerce


A transformer-based Large Behavioral Model (LBM) for e-commerce behavioral prediction. Given a user's sequential event history, it predicts the next action, recommends the most likely next products, and estimates the timing of the next purchase.

🚀 Quick Start

Installation

pip install transformers torch huggingface-hub

Basic Usage

# Requires the model repository's src/ package on your Python path
from src.models.hf_pipeline import LBMPipeline

# Load model from Hugging Face Hub
pipeline = LBMPipeline(model="souvik16011991roy/LBM-ecom")

# Prepare inputs
inputs = {
    'product_ids': ['1000978', '1001588', '1001606'],
    'event_types': ['view', 'view', 'cart'],
    'category_ids': ['cat1', 'cat1', 'cat2'],
    'hours': [10, 14, 18],
    'days': [0, 0, 1],  # 0=Monday
    'prices': [29.99, 49.99, 79.99],
    'segment_id': 0
}

# Make predictions
results = pipeline(inputs)
print(results)


🎯 Model Description

The Large Behavioral Model (LBM) is a transformer-based neural network designed for e-commerce behavioral prediction. It analyzes sequential user behavior patterns to predict:

  • Next Action: What action will the user take next? (view/cart/purchase)
  • Next Product: Which products are most likely to be viewed/purchased next?
  • Purchase Timing: When will the user make their next purchase?

Model Specifications

  • Model Type: Transformer Encoder
  • Parameters: ~139M
  • Architecture: Multi-modal embeddings + Transformer encoder + Multi-task heads
  • Max Sequence Length: 100 events
  • Vocabulary Sizes:
    • Products: 164,577
    • Events: 3 (view, cart, purchase)
    • Categories: 624

πŸ—οΈ Architecture

Input Embeddings:
├── Product Embedding (164,577 × 256)
├── Event Embedding (3 × 16)
├── Category Embedding (624 × 32)
├── Segment Embedding (8 × 32)
├── Hour Embedding (24 × 16)
├── Day Embedding (7 × 16)
└── Price Projection (1 × 16)

Transformer Encoder:
├── 4 Transformer Layers
│   ├── Multi-head Attention (8 heads)
│   └── Feed-forward Network (2048 dim)
└── Layer Normalization

Output Heads:
├── Action Head (512 × 3)
├── Product Head (512 × 164,577)
└── Timing Head (512 × 1)
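The layout above can be sketched in PyTorch. This is an illustrative reconstruction from the listed dimensions, not the released implementation; in particular, projecting the concatenated 384-d embedding to the 512-d hidden size and broadcasting the segment embedding across the sequence are assumptions.

```python
import torch
import torch.nn as nn

class LBMSketch(nn.Module):
    """Illustrative sketch of the architecture described above.
    Dimensions follow the README; the 384 -> 512 input projection
    and segment broadcasting are assumptions, not the released code."""
    def __init__(self, n_products=164_577, n_events=3, n_categories=624,
                 n_segments=8, hidden=512, layers=4, heads=8, ffn=2048):
        super().__init__()
        self.product_emb = nn.Embedding(n_products, 256)
        self.event_emb = nn.Embedding(n_events, 16)
        self.category_emb = nn.Embedding(n_categories, 32)
        self.segment_emb = nn.Embedding(n_segments, 32)
        self.hour_emb = nn.Embedding(24, 16)
        self.day_emb = nn.Embedding(7, 16)
        self.price_proj = nn.Linear(1, 16)
        # 256 + 16 + 32 + 32 + 16 + 16 + 16 = 384 concatenated features
        self.input_proj = nn.Linear(384, hidden)
        enc_layer = nn.TransformerEncoderLayer(hidden, heads, ffn, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, layers)
        self.action_head = nn.Linear(hidden, n_events)
        self.product_head = nn.Linear(hidden, n_products)
        self.timing_head = nn.Linear(hidden, 1)

    def forward(self, products, events, categories, segments, hours, days, prices):
        seq_len = products.size(1)
        x = torch.cat([
            self.product_emb(products), self.event_emb(events),
            self.category_emb(categories),
            # One segment per user, broadcast along the sequence dimension
            self.segment_emb(segments).unsqueeze(1).expand(-1, seq_len, -1),
            self.hour_emb(hours), self.day_emb(days),
            self.price_proj(prices.unsqueeze(-1)),
        ], dim=-1)
        h = self.encoder(self.input_proj(x))
        return self.action_head(h), self.product_head(h), self.timing_head(h)
```

With the full 164,577-product vocabulary, the product embedding and product head alone account for most of the ~139M parameters.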

✨ Features

  • Multi-task Learning: Predicts action, product, and timing simultaneously
  • Temporal Awareness: Incorporates hour and day of week patterns
  • User Segmentation: Personalizes predictions based on user segments
  • Product Recommendations: Provides top-K product recommendations with scores
  • Purchase Timing: Predicts days until next purchase

📡 API Documentation

Hugging Face Inference API

The model is available via Hugging Face Inference API. Here are the endpoints and example responses:

Base URL

https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom

Authentication

Include your Hugging Face token in the Authorization header:

Authorization: Bearer YOUR_HF_TOKEN

Endpoint: Predict All (Default)

Request:

POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN

{
  "inputs": {
    "product_ids": ["1000978", "1001588", "1001606"],
    "event_types": ["view", "view", "cart"],
    "category_ids": ["cat1", "cat1", "cat2"],
    "hours": [10, 14, 18],
    "days": [0, 0, 1],
    "prices": [29.99, 49.99, 79.99],
    "segment_id": 0,
    "task": "all"
  }
}

Mock Response:

{
  "next_action": {
    "predicted": "purchase",
    "probabilities": {
      "view": 0.15,
      "cart": 0.25,
      "purchase": 0.60
    }
  },
  "next_products": [
    {
      "product_id": "1002042",
      "score": 0.234
    },
    {
      "product_id": "1002062",
      "score": 0.189
    },
    {
      "product_id": "1002098",
      "score": 0.156
    },
    {
      "product_id": "1002099",
      "score": 0.134
    },
    {
      "product_id": "1002100",
      "score": 0.112
    },
    {
      "product_id": "1002101",
      "score": 0.098
    },
    {
      "product_id": "1002102",
      "score": 0.087
    },
    {
      "product_id": "1002103",
      "score": 0.076
    },
    {
      "product_id": "1002225",
      "score": 0.065
    },
    {
      "product_id": "1002266",
      "score": 0.049
    }
  ],
  "next_purchase_days": 3.5
}

Endpoint: Predict Next Action Only

Request:

POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN

{
  "inputs": {
    "product_ids": ["1000978", "1001588"],
    "event_types": ["view", "cart"],
    "task": "action"
  }
}

Mock Response:

{
  "next_action": {
    "predicted": "purchase",
    "probabilities": {
      "view": 0.20,
      "cart": 0.30,
      "purchase": 0.50
    }
  }
}

Endpoint: Predict Next Products Only

Request:

POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN

{
  "inputs": {
    "product_ids": ["1000978", "1001588", "1001606"],
    "event_types": ["view", "view", "cart"],
    "category_ids": ["cat1", "cat1", "cat2"],
    "top_k": 5,
    "task": "product"
  }
}

Mock Response:

{
  "next_products": [
    {
      "product_id": "1002042",
      "score": 0.234
    },
    {
      "product_id": "1002062",
      "score": 0.189
    },
    {
      "product_id": "1002098",
      "score": 0.156
    },
    {
      "product_id": "1002099",
      "score": 0.134
    },
    {
      "product_id": "1002100",
      "score": 0.112
    }
  ]
}

Endpoint: Predict Purchase Timing Only

Request:

POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN

{
  "inputs": {
    "product_ids": ["1000978", "1001588", "1001606"],
    "event_types": ["view", "cart", "purchase"],
    "task": "timing"
  }
}

Mock Response:

{
  "next_purchase_days": 7.2
}

Batch Processing

Request:

POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom
Content-Type: application/json
Authorization: Bearer YOUR_HF_TOKEN

{
  "inputs": [
    {
      "product_ids": ["1000978", "1001588"],
      "event_types": ["view", "cart"],
      "task": "all"
    },
    {
      "product_ids": ["1001606", "1002042"],
      "event_types": ["view", "view"],
      "task": "all"
    }
  ]
}

Mock Response:

{
  "results": [
    {
      "next_action": {
        "predicted": "purchase",
        "probabilities": {
          "view": 0.20,
          "cart": 0.30,
          "purchase": 0.50
        }
      },
      "next_products": [
        {"product_id": "1002042", "score": 0.234},
        {"product_id": "1002062", "score": 0.189}
      ],
      "next_purchase_days": 3.5
    },
    {
      "next_action": {
        "predicted": "view",
        "probabilities": {
          "view": 0.65,
          "cart": 0.25,
          "purchase": 0.10
        }
      },
      "next_products": [
        {"product_id": "1002098", "score": 0.198},
        {"product_id": "1002099", "score": 0.167}
      ],
      "next_purchase_days": 12.3
    }
  ]
}

Error Responses

Invalid Input Format:

{
  "error": "product_ids and event_types are required",
  "status_code": 400
}

Length Mismatch:

{
  "error": "product_ids and event_types must have the same length",
  "status_code": 400
}

Server Error:

{
  "error": "Model inference failed: ...",
  "status_code": 500
}

💻 Usage Examples

Python - Using Pipeline

from src.models.hf_pipeline import LBMPipeline

# Initialize pipeline
pipeline = LBMPipeline(model="souvik16011991roy/LBM-ecom")

# Example 1: Basic prediction
inputs = {
    'product_ids': ['1000978', '1001588'],
    'event_types': ['view', 'cart'],
    'segment_id': 0
}

results = pipeline(inputs)
print(f"Next action: {results['next_action']['predicted']}")
print(f"Top product: {results['next_products'][0]['product_id']}")
print(f"Days until purchase: {results['next_purchase_days']}")

# Example 2: Predict next action only
next_action = pipeline.predict_next_action(
    product_ids=['1000978', '1001588'],
    event_types=['view', 'cart']
)
print(next_action)

# Example 3: Get top 10 product recommendations
top_products = pipeline.predict_next_product(
    product_ids=['1000978', '1001588', '1001606'],
    event_types=['view', 'view', 'cart'],
    top_k=10
)
for product in top_products:
    print(f"Product {product['product_id']}: {product['score']:.4f}")

# Example 4: Predict purchase timing
days_until = pipeline.predict_next_purchase(
    product_ids=['1000978', '1001588'],
    event_types=['view', 'cart']
)
print(f"Days until next purchase: {days_until:.2f}")

Python - Using Requests (Inference API)

import requests

API_URL = "https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

# Predict all
output = query({
    "inputs": {
        "product_ids": ["1000978", "1001588", "1001606"],
        "event_types": ["view", "view", "cart"],
        "category_ids": ["cat1", "cat1", "cat2"],
        "hours": [10, 14, 18],
        "days": [0, 0, 1],
        "prices": [29.99, 49.99, 79.99],
        "segment_id": 0,
        "task": "all"
    }
})

print(output)
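The hosted Inference API typically returns HTTP 503 while a model is still loading, so a retry loop makes the client more robust. A sketch (the retry count and wait time are arbitrary choices, not part of the API contract):

```python
import time
import requests

API_URL = "https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

def query_with_retry(payload, retries=3, wait=10.0):
    """Retry while the hosted model is still loading (HTTP 503)."""
    for _ in range(retries):
        response = requests.post(API_URL, headers=headers, json=payload)
        if response.status_code == 503:
            # Model is warming up; wait and try again
            time.sleep(wait)
            continue
        response.raise_for_status()
        return response.json()
    raise RuntimeError("Model did not become available in time")
```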

JavaScript/Node.js

const fetch = require('node-fetch');

const API_URL = 'https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom';
const headers = {
    'Authorization': 'Bearer YOUR_HF_TOKEN',
    'Content-Type': 'application/json'
};

async function predict(inputs) {
    const response = await fetch(API_URL, {
        method: 'POST',
        headers: headers,
        body: JSON.stringify({ inputs })
    });
    
    return await response.json();
}

// Usage
predict({
    product_ids: ['1000978', '1001588'],
    event_types: ['view', 'cart'],
    task: 'all'
}).then(result => {
    console.log(result);
});

cURL

curl -X POST https://api-inference.huggingface.co/models/souvik16011991roy/LBM-ecom \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "product_ids": ["1000978", "1001588"],
      "event_types": ["view", "cart"],
      "task": "all"
    }
  }'

📥 Input/Output Formats

Input Format

All inputs should be provided as a dictionary with the following fields:

| Field | Type | Required | Description | Example |
|-------|------|----------|-------------|---------|
| product_ids | List[str/int] | ✅ Yes | List of product IDs in sequence | ["1000978", "1001588"] |
| event_types | List[str] | ✅ Yes | List of event types (view/cart/purchase) | ["view", "cart"] |
| category_ids | List[str/int] | ❌ No | List of category IDs | ["cat1", "cat2"] |
| hours | List[int] | ❌ No | Hour of day (0-23) | [10, 14, 18] |
| days | List[int] | ❌ No | Day of week (0=Monday, 6=Sunday) | [0, 0, 1] |
| prices | List[float] | ❌ No | Product prices | [29.99, 49.99] |
| segment_id | int | ❌ No | User segment ID (0-7) | 0 |
| task | str | ❌ No | Task type: "all", "action", "product", "timing" | "all" |
| top_k | int | ❌ No | Number of top products to return (default: 10) | 10 |

Constraints:

  • product_ids and event_types must have the same length
  • Maximum sequence length: 100 events
  • If sequence exceeds max length, it will be truncated from the beginning
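These constraints can be enforced client-side before sending a request. A minimal sketch (the helper name is our own; the truncation mirrors the server-side behavior described above):

```python
MAX_SEQ_LEN = 100

def validate_inputs(inputs: dict) -> dict:
    """Validate and truncate a request payload to match the constraints above."""
    products = inputs.get("product_ids")
    events = inputs.get("event_types")
    if not products or not events:
        raise ValueError("product_ids and event_types are required")
    if len(products) != len(events):
        raise ValueError("product_ids and event_types must have the same length")
    if len(products) > MAX_SEQ_LEN:
        # Truncate from the beginning, keeping the most recent events
        for key in ("product_ids", "event_types", "category_ids",
                    "hours", "days", "prices"):
            if isinstance(inputs.get(key), list):
                inputs[key] = inputs[key][-MAX_SEQ_LEN:]
    return inputs
```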

Output Format

Task: "all" (Default)

{
  "next_action": {
    "predicted": "purchase",
    "probabilities": {
      "view": 0.15,
      "cart": 0.25,
      "purchase": 0.60
    }
  },
  "next_products": [
    {
      "product_id": "1002042",
      "score": 0.234
    },
    ...
  ],
  "next_purchase_days": 3.5
}

Task: "action"

{
  "next_action": {
    "predicted": "purchase",
    "probabilities": {
      "view": 0.20,
      "cart": 0.30,
      "purchase": 0.50
    }
  }
}

Task: "product"

{
  "next_products": [
    {
      "product_id": "1002042",
      "score": 0.234
    },
    ...
  ]
}

Task: "timing"

{
  "next_purchase_days": 7.2
}

📊 Performance

Model Metrics

  • Model Size: ~139M parameters
  • Model File Size: ~558 MB (model.safetensors)
  • Vocabulary Size: 164,577 products
  • Inference Speed: ~100-300ms per prediction (CPU)
  • Memory Usage: ~2GB RAM for inference

Target Performance Metrics

  • Next Action Accuracy: > 65%
  • Product Recommendation NDCG@10: > 0.45
  • Purchase Timing RMSE: < 20% of mean LTV
  • API Response Time: < 300ms
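For reference, NDCG@10 with binary relevance, as used in the target metrics above, can be computed as follows. This is the standard formulation, not the project's own evaluation code:

```python
import math

def ndcg_at_k(recommended, relevant, k=10):
    """NDCG@k with binary relevance: DCG of the ranked list
    divided by the DCG of an ideal ranking."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, pid in enumerate(recommended[:k]) if pid in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0
```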

⚠️ Limitations

  1. Sequence Length: Maximum 100 events per sequence. Longer sequences are truncated.

  2. Vocabulary: Fixed product/category vocabularies from training data. New products not in vocabulary will be mapped to default values.

  3. Temporal Patterns: Predictions are based on historical patterns and may not capture sudden trends or seasonal changes.

  4. Cold Start: Limited performance for:

    • Users with minimal history (< 5 events)
    • Products with no historical data
    • New categories

  5. Data Requirements: Requires structured sequential data with product IDs, event types, and optional metadata.

  6. Segment Dependency: Best performance when user segment information is available.

🔧 Advanced Usage

Loading Model Directly

from src.models.hf_lbm import LBMModel
from src.models.hf_tokenizer import LBMTokenizer
import torch

# Load model and tokenizer
model = LBMModel.from_pretrained("souvik16011991roy/LBM-ecom")
tokenizer = LBMTokenizer.from_pretrained("souvik16011991roy/LBM-ecom")

# Encode inputs
encoded = tokenizer.encode_sequence(
    product_ids=['1000978', '1001588'],
    event_types=['view', 'cart'],
    return_tensors='pt'
)

# Forward pass
model.eval()
with torch.no_grad():
    outputs = model(
        product_ids=encoded['product_ids'],
        event_types=encoded['event_types'],
        category_ids=encoded['category_ids'],
        segment_ids=encoded['segment_id'],
        hours=encoded['hours'],
        days=encoded['days'],
        prices=encoded['prices'],
        attention_mask=encoded['attention_mask']
    )

# Process outputs
import torch.nn.functional as F
action_probs = F.softmax(outputs.action_logits[:, -1, :], dim=-1)
product_probs = F.softmax(outputs.product_logits[:, -1, :], dim=-1)
timing = outputs.timing_pred[:, -1, 0]
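From the softmaxed outputs above, top-K recommendations can be extracted with torch.topk. Mapping indices back to product IDs goes through the tokenizer's vocabulary; the exact LBMTokenizer method for that is not documented here, so this sketch returns raw indices:

```python
import torch
import torch.nn.functional as F

def top_k_products(product_logits: torch.Tensor, k: int = 10):
    """Extract top-k (index, score) pairs from last-position product logits.
    Index-to-product-ID mapping is left to the tokenizer's vocabulary."""
    probs = F.softmax(product_logits[:, -1, :], dim=-1)   # (batch, vocab)
    scores, indices = torch.topk(probs, k, dim=-1)
    return [
        [(int(i), float(s)) for i, s in zip(idx_row, score_row)]
        for idx_row, score_row in zip(indices, scores)
    ]
```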

Custom Configuration

from src.models.hf_lbm import LBMConfig, LBMModel

# Create custom config
config = LBMConfig(
    product_vocab_size=200000,
    hidden_dim=1024,
    num_layers=6,
    num_heads=16
)

# Initialize model
model = LBMModel(config)

📚 Citation

If you use this model in your research, please cite:

@misc{lbm-ecom-model,
  title={Large Behavioral Model for E-commerce},
  author={Souvik Roy},
  year={2024},
  howpublished={\url{https://huggingface.co/souvik16011991roy/LBM-ecom}}
}

πŸ“ License

This model is licensed under the MIT License. See LICENSE file for details.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📧 Contact

For questions or issues, please open a discussion on the model's Hugging Face Hub page.

πŸ™ Acknowledgments

  • Built with Transformers by Hugging Face
  • Trained on e-commerce transaction data
  • Architecture inspired by transformer-based sequential models

Made with ❤️ for e-commerce behavioral prediction
