---
license: mit
datasets:
- Recompense/amazon-appliances-lite-data
language:
- en
tags:
- finance


---
# Amazon Appliances Product Price Predictor

A Bi-LSTM model trained to predict e-commerce product prices from textual descriptions.

---

## Model Details

- **Model type:** Bi-directional LSTM (Keras)  
- **Task:** Regression (price prediction)  
- **Input:** Product description (text)  
- **Output:** Predicted price (USD)  

---

## Intended Use

This model provides quick, approximate pricing for small-to-medium-sized e-commerce catalogs whose descriptions follow a consistent style (e.g., electronics or appliances). It **should not** be used:

- For precise financial appraisal or high-stakes bidding.  
- On descriptions with highly technical jargon the model wasn’t trained on.  

---

## Limitations

- **Domain sensitivity:** Trained on the `Recompense/amazon-appliances-lite-data` dataset—performance may degrade on other product categories.  
- **Description length:** Very long or unstructured text may reduce accuracy.  
- **Price range:** Only learns the range present in the training data (~\$10–\$500).  

---

## Training

- **Dataset:** `Recompense/amazon-appliances-lite-data`  
- **Preprocessing:**  
  - Text vectorization (max 10,000 tokens)  
  - Embedding dimension: 128  
- **Architecture:** Embedding → Bi-LSTM(64) → Bi-LSTM(64) → Dense(1)  
- **Optimizer:** Adam, learning rate 1e-3  
- **Epochs:** 20, batch size 32  
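
The training bullets above can be sketched as a Keras 3 model. This is a reconstruction for illustration only, not the released architecture (`model_config.json` is authoritative); the `msle` loss is an assumption inferred from the RMSLE metric, and the sketch takes token ids rather than raw text:

```python
# Hypothetical reconstruction of the architecture described above.
import keras
from keras import layers

VOCAB_SIZE = 10_000  # "max 10,000 tokens" from Preprocessing
EMBED_DIM = 128      # embedding dimension from Preprocessing

model = keras.Sequential([
    keras.Input(shape=(None,), dtype="int32"),                      # token ids
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),                        # Embedding
    layers.Bidirectional(layers.LSTM(64, return_sequences=True)),   # Bi-LSTM(64)
    layers.Bidirectional(layers.LSTM(64)),                          # Bi-LSTM(64)
    layers.Dense(1),                                                # price head
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss="msle",  # assumption; the card only states the RMSLE metric
)
```

With these settings, `model.fit(...)` for 20 epochs at batch size 32 would reproduce the schedule listed above.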

---

## Evaluation

- **Metric:** Root Mean Squared Logarithmic Error (RMSLE)  
- **Formula (display mode):**

$$
RMSLE = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \bigl(\log(1 + \hat{y}_i) - \log(1 + y_i)\bigr)^2 }
$$

- **Test RMSLE:** 0.51 on the held-out validation set; 90.4% of predictions fell within a \$40 margin of the true price.  
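
The formula above translates directly into a small pure-Python helper (a sketch; the training code may use a vectorized implementation instead):

```python
import math

def rmsle(y_true, y_pred):
    """Root Mean Squared Logarithmic Error over paired price lists."""
    n = len(y_true)
    total = sum(
        (math.log1p(p) - math.log1p(t)) ** 2  # log(1 + y_hat) - log(1 + y)
        for t, p in zip(y_true, y_pred)
    )
    return math.sqrt(total / n)
```

Because the error is taken on log-transformed prices, RMSLE penalizes relative (ratio) mistakes rather than absolute dollar differences, which suits a \$10–\$500 price range.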

---

## Files

- **`model_weights.h5`** – Trained Keras weights  
- **`model_config.json`** – Model architecture config  
- **`vectorizer_config.json`** – Text vectorization config  

---

## Usage

Below is an end-to-end example showing how to load the model from the Hugging Face Hub, set your preferred Keras backend, and run inference using the helper function:

```python
# 1) Install dependencies (if needed)
#    pip install tensorflow jax keras huggingface_hub

# 2) Choose your backend: "jax", "torch", or "tensorflow"
import os
os.environ["KERAS_BACKEND"] = "jax"  # or "torch", or "tensorflow"

# 3) Load Keras and the model from the Hub
from keras.saving import load_model

model = load_model("hf://Recompense/product-pricer-bilstm")

# 4) Define the inference function
import tensorflow as tf  # used here only to build the string input tensor

def bilstm_pricer(item_text: str) -> int:
    """
    Predict the price of a product given its description.
    
    Args:
        item_text (str): The full prompt text, including any prefix. 
                         Only the description (after the first blank line) is used.
    
    Returns:
        int: The rounded, non-negative predicted price in USD.
    """
    # Extract just the product description (assuming a prefix question)
    try:
        description = item_text.split('\n\n', 1)[1]
    except IndexError:
        description = item_text
    
    # Vectorize and batch the text
    text_tensor = tf.convert_to_tensor([description])
    
    # Model prediction
    pred = model.predict(text_tensor, verbose=0)[0][0]
    
    # Post-process: clamp to non-negative, then round to the nearest dollar
    pred = max(0.0, float(pred))
    return round(pred)

# 5) Example inference
prompt = (
    "What is a fair price for the following appliance?\n\n"
    "Stainless steel 12-cup programmable coffee maker with auto-shutoff"
)

predicted_price = bilstm_pricer(prompt)
print(f"Predicted price: ${predicted_price}")
```