---
license: mit
datasets:
- Recompense/amazon-appliances-lite-data
language:
- en
tags:
- finance
---
# Amazon Appliances Product Price Predictor
A Bi-LSTM model trained to predict e-commerce product prices from textual descriptions.
---
## Model Details
- **Model type:** Bi-directional LSTM (Keras)
- **Task:** Regression (price prediction)
- **Input:** Product description (text)
- **Output:** Predicted price (USD)
---
## Intended Use
This model is designed to provide quick, approximate pricing for small-to-medium-sized e-commerce catalogs where descriptions follow a consistent style (e.g., electronics or appliances). It **should not** be used:
- For precise financial appraisal or high-stakes bidding.
- On descriptions with highly technical jargon the model wasn’t trained on.
---
## Limitations
- **Domain sensitivity:** Trained on the `Recompense/amazon-appliances-lite-data` dataset—performance may degrade on other product categories.
- **Description length:** Very long or unstructured text may reduce accuracy.
- **Price range:** Only learns the range present in the training data (~\$10–\$500).
---
## Training
- **Dataset:** `Recompense/amazon-appliances-lite-data`
- **Preprocessing:**
  - Text vectorization (vocabulary capped at 10,000 tokens)
- Embedding dimension: 128
- **Architecture:**
1. Embedding → Bi-LSTM(64) → Bi-LSTM(64) → Dense(1)
- **Optimizer:** Adam, learning rate 1e-3
- **Epochs:** 20, batch size 32
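The preprocessing and architecture described above can be sketched in Keras as follows. This is a minimal reconstruction for illustration only: the exact layer arguments and the training loss are not stated in this card, so an MSE loss and integer token-id inputs are assumed.

```python
import keras
from keras import layers

VOCAB_SIZE = 10_000  # max tokens, per the preprocessing note above
EMBED_DIM = 128

# Embedding → Bi-LSTM(64) → Bi-LSTM(64) → Dense(1)
model = keras.Sequential([
    layers.Input(shape=(None,), dtype="int32"),                    # token ids, variable length
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    layers.Bidirectional(layers.LSTM(64, return_sequences=True)),  # pass full sequence onward
    layers.Bidirectional(layers.LSTM(64)),                         # collapse to a single vector
    layers.Dense(1),                                               # scalar price
])

# Loss is an assumption; the card only specifies the optimizer and learning rate.
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3), loss="mse")
```

The first Bi-LSTM must return the full sequence (`return_sequences=True`) so the second Bi-LSTM has a sequence to consume; the second returns only its final state, which feeds the regression head.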
---
## Evaluation
- **Metric:** Root Mean Squared Logarithmic Error (RMSLE)
- **Formula:**
$$
RMSLE = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \bigl(\log(1 + \hat{y}_i) - \log(1 + y_i)\bigr)^2 }
$$
- **Test RMSLE:** 0.51 on the held-out validation set; 90.4% of predictions fell within a \$40 margin of the true price.
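The formula above can be computed directly with NumPy. This is a small illustrative helper, not part of the released files:

```python
import numpy as np

def rmsle(y_true, y_pred):
    """Root Mean Squared Logarithmic Error, as defined above."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # log1p(x) = log(1 + x), matching the log(1 + y) terms in the formula
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))

print(rmsle([10.0, 100.0], [12.0, 90.0]))  # ≈ 0.139
```

Because the error is taken on log-prices, RMSLE penalizes relative (percentage) deviations rather than absolute dollar errors, which suits a price range spanning roughly \$10–\$500.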
---
## Files
- **`model_weights.h5`** – Trained Keras weights
- **`model_config.json`** – Model architecture config
- **`vectorizer_config.json`** – Text vectorization config
---
## Usage
Below is an end-to-end example showing how to load the model from the Hugging Face Hub, set your preferred Keras backend, and run inference using the helper function:
```python
# 1) Install dependencies (if needed)
# pip install tensorflow jax keras huggingface_hub

# 2) Choose your backend: "jax", "torch", or "tensorflow"
import os
os.environ["KERAS_BACKEND"] = "jax"  # or "torch", or "tensorflow"

# 3) Load Keras and the model from the Hub
from keras.saving import load_model

model = load_model("hf://Recompense/product-pricer-bilstm")

# 4) Define your inference function
import tensorflow as tf

def bilstm_pricer(item_text: str) -> int:
    """
    Predict the price of a product given its description.

    Args:
        item_text (str): The full prompt text, including any prefix.
            Only the description (after the first blank line) is used.

    Returns:
        int: The rounded, non-negative predicted price in USD.
    """
    # Extract just the product description (assuming a prefix question)
    try:
        description = item_text.split("\n\n", 1)[1]
    except IndexError:
        description = item_text

    # Vectorize and batch the text
    text_tensor = tf.convert_to_tensor([description])

    # Model prediction
    pred = float(model.predict(text_tensor, verbose=0)[0][0])

    # Post-process: clamp to non-negative and round to whole dollars
    return round(max(0.0, pred))

# 5) Example inference
prompt = (
    "What is a fair price for the following appliance?\n\n"
    "Stainless steel 12-cup programmable coffee maker with auto-shutoff"
)
predicted_price = bilstm_pricer(prompt)
print(f"Predicted price: ${predicted_price}")
```