---
license: mit
datasets:
- Recompense/amazon-appliances-lite-data
language:
- en
tags:
- finance
---

# Amazon Appliances Product Price Predictor

A Bi-LSTM model trained to predict e-commerce product prices from textual descriptions.

---

## Model Details

- **Model type:** Bi-directional LSTM (Keras)
- **Task:** Regression (price prediction)
- **Input:** Product description (text)
- **Output:** Predicted price (USD)

---

## Intended Use

This model is designed to provide quick, approximate pricing for small-to-medium-sized e-commerce catalogs where descriptions follow a consistent style (e.g., electronics or appliances). It **should not** be used:

- For precise financial appraisal or high-stakes bidding.
- On descriptions with highly technical jargon the model wasn’t trained on.

---

## Limitations

- **Domain sensitivity:** Trained on the `Recompense/amazon-appliances-lite-data` dataset; performance may degrade on other product categories.
- **Description length and structure:** Very long or unstructured text may reduce accuracy.
- **Price range:** Only learns the range present in the training data (~\$10–\$500).
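
One way to act on the price-range limitation is to flag predictions that fall outside the range seen in training. The sketch below assumes the approximate \$10–\$500 bounds quoted above; `TRAIN_MIN_USD`, `TRAIN_MAX_USD`, and `check_price_range` are hypothetical names used for illustration and are not part of the released model.

```python
# Approximate training-data price bounds quoted in this card (assumed, not exact).
TRAIN_MIN_USD = 10.0
TRAIN_MAX_USD = 500.0


def check_price_range(predicted_price: float) -> tuple[float, bool]:
    """Clamp a prediction to the training range and report whether it was in range."""
    in_range = TRAIN_MIN_USD <= predicted_price <= TRAIN_MAX_USD
    clamped = min(max(predicted_price, TRAIN_MIN_USD), TRAIN_MAX_USD)
    return clamped, in_range


price, trusted = check_price_range(742.0)
print(f"clamped price: ${price:.2f}, inside training range: {trusted}")
```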

---

## Training

- **Dataset:** `Recompense/amazon-appliances-lite-data`
- **Preprocessing:**
  - Text vectorization (max 10 000 tokens)
  - Embedding dimension: 128
- **Architecture:**
  1. Embedding → Bi-LSTM(64) → Bi-LSTM(64) → Dense(1)
- **Optimizer:** Adam, learning rate 1e-3
- **Epochs:** 20, batch size 32
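
To get a feel for the size of the architecture listed above, its trainable parameters can be counted by hand. This is a rough sketch under several assumptions that the card does not confirm: "max 10 000 tokens" is read as the vocabulary size, `Bi-LSTM(64)` is read as 64 units per direction with concatenated outputs, and each LSTM uses the standard four-gate layout with a single bias vector per gate.

```python
VOCAB_SIZE = 10_000   # assumed: "max 10 000 tokens" read as the vocabulary size
EMBED_DIM = 128       # embedding dimension from the list above
LSTM_UNITS = 64       # assumed: units per direction


def lstm_params(input_dim: int, units: int) -> int:
    """Weights of one LSTM: 4 gates, each with input, recurrent, and bias terms."""
    return 4 * (units * (input_dim + units) + units)


def bilstm_params(input_dim: int, units: int) -> int:
    """A bidirectional wrapper holds two independent LSTMs."""
    return 2 * lstm_params(input_dim, units)


embedding = VOCAB_SIZE * EMBED_DIM
bilstm_1 = bilstm_params(EMBED_DIM, LSTM_UNITS)
# The second layer receives the concatenated forward and backward outputs.
bilstm_2 = bilstm_params(2 * LSTM_UNITS, LSTM_UNITS)
dense = 2 * LSTM_UNITS + 1  # one weight per input feature plus one bias
total = embedding + bilstm_1 + bilstm_2 + dense

print(f"embedding={embedding:,} bilstm_1={bilstm_1:,} "
      f"bilstm_2={bilstm_2:,} dense={dense} total={total:,}")
```

Under these assumptions the embedding table accounts for most of the parameters, so the vocabulary size is the main lever on model size.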

---

## Evaluation

- **Metric:** Root Mean Squared Logarithmic Error (RMSLE)
- **Formula:**

$$
RMSLE = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \bigl(\log(1 + \hat{y}_i) - \log(1 + y_i)\bigr)^2 }
$$

- **Validation RMSLE:** 0.51 on the held-out validation set; 90.4% of validation-set predictions fall within a \$40 margin of the true price.
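
Both evaluation figures can be computed with a few lines of plain Python. The sketch below follows the RMSLE formula above term by term and assumes that the "margin" figure means the share of predictions whose absolute error is at most \$40. The function names and sample prices are made up for illustration and are not taken from the dataset.

```python
import math


def rmsle(y_true: list[float], y_pred: list[float]) -> float:
    """Root Mean Squared Logarithmic Error, following the formula above."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


def hit_rate(y_true: list[float], y_pred: list[float], margin: float = 40.0) -> float:
    """Fraction of predictions whose absolute error is at most `margin` dollars."""
    hits = sum(abs(p - t) <= margin for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Made-up prices purely for illustration.
true_prices = [25.0, 120.0, 300.0, 450.0]
predicted_prices = [30.0, 95.0, 360.0, 440.0]

print(f"RMSLE: {rmsle(true_prices, predicted_prices):.3f}")
print(f"Hit rate within $40: {hit_rate(true_prices, predicted_prices):.0%}")
```

Because RMSLE compares logarithms, it penalizes relative rather than absolute error, which is why it is paired here with a dollar-margin figure.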

---

## Files

- **`model_weights.h5`** – Trained Keras weights
- **`model_config.json`** – Model architecture config
- **`vectorizer_config.json`** – Text vectorization config

---

## Usage

Below is an end-to-end example showing how to set your preferred Keras backend, load the model from the Hugging Face Hub, and run inference using a helper function:

```python
# 1) Install dependencies (if needed)
# pip install tensorflow jax keras huggingface_hub

# 2) Choose your backend: "jax", "torch", or "tensorflow"
#    (set this before Keras is imported)
import os
os.environ["KERAS_BACKEND"] = "jax"  # or "torch", or "tensorflow"

# 3) Load Keras and the model from the Hub
from keras.saving import load_model

model = load_model("hf://Recompense/product-pricer-bilstm")

# 4) Define your inference function
import tensorflow as tf

def bilstm_pricer(item_text: str) -> int:
    """
    Predict the price of a product given its description.

    Args:
        item_text (str): The full prompt text, including any prefix.
            Only the description (after the first blank line) is used.

    Returns:
        int: The rounded, non-negative predicted price in USD.
    """
    # Extract just the product description (assuming a prefix question)
    try:
        description = item_text.split('\n\n', 1)[1]
    except IndexError:
        description = item_text

    # Wrap the raw text in a batch of one; vectorization is assumed
    # to happen inside the loaded model
    text_tensor = tf.convert_to_tensor([description])

    # Model prediction, converted to a plain Python float
    pred = float(model.predict(text_tensor, verbose=0)[0][0])

    # Post-process: clamp and round
    pred = max(0.0, pred)
    return round(pred)

# 5) Example inference
prompt = (
    "What is a fair price for the following appliance?\n\n"
    "Stainless steel 12-cup programmable coffee maker with auto-shutoff"
)

predicted_price = bilstm_pricer(prompt)
print(f"Predicted price: ${predicted_price}")
```
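
The prefix-stripping step in `bilstm_pricer` can be checked on its own, without loading the model. The sketch below isolates it in a hypothetical helper, `extract_description`, and uses `str.partition` as an alternative that should behave the same as the `split`/`try` pattern above. The sample descriptions are made up.

```python
def extract_description(item_text: str) -> str:
    """Return the text after the first blank line, or the whole text if there is none."""
    _prefix, separator, description = item_text.partition("\n\n")
    return description if separator else item_text


with_prefix = (
    "What is a fair price for the following appliance?\n\n"
    "Compact countertop microwave"
)
without_prefix = "Compact countertop microwave"

# Both calls should yield the bare description.
print(extract_description(with_prefix))
print(extract_description(without_prefix))
```

Only the first blank line acts as the separator, so blank lines inside the description itself are preserved.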