dejanseo/ecommerce-taxonomy-classifier

@@ -1,43 +1,92 @@
 ---
 license: other
 license_name: link-attribution
 license_link: https://dejanmarketing.com/link-attribution/
 ---
-# Model Card: dejanseo/ecommerce-taxonomy-classifier
-## Model Description
-**dejanseo/ecommerce-taxonomy-classifier** is a multi-level text classification model designed to categorize ecommerce product descriptions (or similar text) into a hierarchical taxonomy. It uses a pretrained ALBERT (albert-base-v2) backbone and a custom classification head that leverages parent-level one-hot encodings for deeper levels in the taxonomy.
-### Model Architecture
-- **Base Model**: [ALBERT (albert-base-v2)](https://huggingface.co/albert-base-v2)
-- **Classification Head**: A linear layer (or multi-layer head) on top of the ALBERT pooled output, concatenated with a parent-level one-hot vector representing the higher-level class.
-### Intended Use
-- **Primary Application**: Categorizing product descriptions in online retail or marketplace scenarios.
-- **Potential Use Cases**:
-  - E-commerce product listing
-  - Product categorization for inventory management
-  - Enriching product feeds for better search/discovery
-### How to Use
-1. **Installation**: Install `transformers`, `torch`, etc.
-2. **Pipeline Example**:
-   ```python
-   from transformers import AlbertTokenizer, AlbertForSequenceClassification
-   import torch
-   # Load tokenizer and model from the Hugging Face Hub
-   tokenizer = AlbertTokenizer.from_pretrained("dejanseo/ecommerce-taxonomy-classifier")
-   model = AlbertForSequenceClassification.from_pretrained("dejanseo/ecommerce-taxonomy-classifier")
-   model.eval()
-   text = "Experience the magic of music with the Clavinova CLP-800 series digital pianos."
-   inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
-   with torch.no_grad():
-       outputs = model(**inputs)
-       logits = outputs.logits
-       predicted_class = torch.argmax(logits, dim=-1).item()
-   print("Predicted Class:", predicted_class)

 ---
+language: en
 license: other
 license_name: link-attribution
 license_link: https://dejanmarketing.com/link-attribution/
+model_name: Taxonomy Classifier
+pipeline_tag: text-classification
 ---
+# Taxonomy Classifier
+This model is a hierarchical text classifier that categorizes text into a 7-level taxonomy. It uses a chain of models, one per level, in which the prediction at each level informs the prediction at the next. Conditioning on the parent prediction reduces the classification space at each step.
+## Model Details
+- **Model Developers:** dejanseo
+- **Model Type:** Hierarchical Text Classification
+- **Base Model:** [`albert/albert-base-v2`](https://huggingface.co/albert/albert-base-v2)
+- **Model Architecture:**
+    - **Level 1:** Standard sequence classification using `AlbertForSequenceClassification`.
+    - **Levels 2-7:** Custom architecture (`TaxonomyClassifier`) where the ALBERT pooled output is concatenated with a one-hot encoded representation of the predicted ID from the previous level before being fed into a linear classification layer.
+- **Language(s):** English
+- **Library:** [Transformers](https://huggingface.co/docs/transformers/index)
+- **License:** [link-attribution](https://dejanmarketing.com/link-attribution/)
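The input construction described for Levels 2-7 can be illustrated with a minimal, dependency-free sketch. It is not the actual `TaxonomyClassifier` code: the helper names (`one_hot`, `build_head_input`, `linear`), the toy vector sizes, and the weights are all assumptions made for illustration. In the real model the pooled vector would come from `albert-base-v2`, whose hidden size is 768.

```python
from typing import List


def one_hot(index: int, size: int) -> List[float]:
    """Return a vector of length `size` with 1.0 at `index` and 0.0 elsewhere."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} out of range for size {size}")
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def build_head_input(pooled: List[float], parent_id: int,
                     num_parent_classes: int) -> List[float]:
    """Concatenate the pooled text vector with the one-hot parent ID."""
    return pooled + one_hot(parent_id, num_parent_classes)


def linear(features: List[float], weights: List[List[float]],
           bias: List[float]) -> List[float]:
    """Plain linear layer: one logit per output class."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


# Toy stand-in for the pooled output (hypothetical 4 dimensions, not 768).
pooled = [0.5, -0.25, 0.125, 1.0]
# The previous level predicted class 2 out of 3 possible parent classes.
features = build_head_input(pooled, parent_id=2, num_parent_classes=3)

# Hypothetical weights for a 2-class head over the 4 + 3 = 7 input features.
weights = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0],
]
bias = [0.0, 0.0]
logits = linear(features, weights, bias)
predicted_id = max(range(len(logits)), key=logits.__getitem__)
```

The head's input width is the pooled size plus the number of classes at the parent level, so it grows as the parent level gains more classes.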
+## Uses
+### Direct Use
+The model is intended for categorizing text into a predefined 7-level taxonomy.
+### Downstream Uses
+Potential applications include:
+- Automated content tagging
+- Product categorization
+- Information organization
+### Out-of-Scope Use
+Performance is not guaranteed on text outside the domain of the training data, or when classifying into taxonomies with a different structure.
+## Limitations
+- Performance is dependent on the quality and coverage of the training data.
+- Errors in earlier levels of the hierarchy can propagate to subsequent levels.
+- The model can only predict categories that appear in the training taxonomy; it does not generalize to unseen categories.
+- The model may exhibit biases present in the training data.
+- One-hot encoding of parent IDs produces high-dimensional input features at deeper levels, which may reduce training efficiency and performance (observed most clearly at Level 4).
+## Training Data
+The model was trained on a dataset of 374,521 samples. Each row in the training data represents a full taxonomy path from the root level to a leaf node.
+## Training Procedure
+- **Levels:** Seven separate models were trained, one for each level of the taxonomy.
+- **Level 1 Training:** Trained as a standard sequence classification task.
+- **Levels 2-7 Training:** Trained with a custom architecture incorporating the predicted parent ID.
+- **Input Format:**
+    - **Level 1:** Text response.
+    - **Levels 2-7:** Text response, whose ALBERT pooled output is concatenated with a one-hot encoded vector of the predicted ID from the previous level.
+- **Objective Function:** CrossEntropyLoss
+- **Optimizer:** AdamW
+- **Learning Rate:** Initially 5e-5, adjusted to 1e-5 for Level 4.
+- **Training Hyperparameters:**
+    - **Epochs:** 10
+    - **Validation Split:** 0.1
+    - **Validation Frequency:** Every 1000 steps
+    - **Batch Size:** 38
+    - **Max Sequence Length:** 512
+    - **Early Stopping Patience:** 3
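The interaction between validation frequency and early stopping patience can be sketched as follows. This is an assumption-laden illustration, not the training script: the card does not say whether patience counts validation runs or epochs, so this sketch assumes it counts consecutive validation runs without improvement. The function name `early_stopping_step` and the loss values are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def early_stopping_step(val_losses: Iterable[float],
                        patience: int = 3,
                        eval_every: int = 1000) -> Tuple[Optional[int], float]:
    """Return (stop_step, best_loss).

    `val_losses` holds one validation loss per evaluation, taken every
    `eval_every` training steps. Training stops once `patience` consecutive
    evaluations fail to improve on the best loss seen so far. `stop_step`
    is None if that never happens.
    """
    best = float("inf")
    evals_without_improvement = 0
    for eval_index, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            evals_without_improvement = 0
        else:
            evals_without_improvement += 1
            if evals_without_improvement >= patience:
                return eval_index * eval_every, best
    return None, best


# Hypothetical validation losses, one per 1000 steps.
stop_step, best_loss = early_stopping_step([0.9, 0.7, 0.72, 0.71, 0.70])
```

With these toy values the best loss is reached at the second evaluation, and the next three evaluations do not beat it, so training would stop at step 5000 under the stated assumption.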
+## Evaluation
+Validation loss was used as the primary evaluation metric during training. The following validation loss trends were observed:
+- **Levels 1, 2, and 3:** Showed a relatively rapid decrease in validation loss during training.
+- **Level 4:** Exhibited a slower decrease in validation loss, potentially due to the significant increase in the dimensionality of the parent ID one-hot encoding.
+Further evaluation on downstream tasks is recommended to assess the model's practical performance.
+## How to Use
+Inference can be performed using the provided Streamlit application.
+1. **Input Text:** Enter the text you want to classify.
+2. **Select Checkpoints:** Choose the desired checkpoint for each level's model. Checkpoints are saved in the respective `level{n}` directories (e.g., `level1/model` or `level4/level4_step31000`).
+3. **Run Inference:** Click the "Run Inference" button.
+The application will output the predicted ID and the corresponding text description for each level of the taxonomy, based on the provided `mapping.csv` file.
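The chained inference that the application performs can be sketched in a few lines. Everything here is a hedged stand-in: the `LevelModel` callable signature, the `classify_path` and `load_mapping` helpers, the toy level models, and the `level,id,name` column layout are assumptions, since the actual format of `mapping.csv` and the application code are not shown in this card. Only two levels are used for brevity; the real chain has seven.

```python
import csv
import io
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical interface: a level model takes the text and the parent's
# predicted ID (None at Level 1) and returns a class ID for its own level.
LevelModel = Callable[[str, Optional[int]], int]


def classify_path(text: str, level_models: List[LevelModel]) -> List[int]:
    """Run the level models in order, feeding each prediction to the next."""
    path: List[int] = []
    parent_id: Optional[int] = None
    for model in level_models:
        parent_id = model(text, parent_id)
        path.append(parent_id)
    return path


def load_mapping(csv_text: str) -> Dict[Tuple[int, int], str]:
    """Parse an assumed `level,id,name` CSV into a (level, id) -> name dict."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {(int(row["level"]), int(row["id"])): row["name"] for row in reader}


# Toy stand-ins for trained checkpoints (hypothetical behavior).
def level1(text: str, parent_id: Optional[int]) -> int:
    return 3


def level2(text: str, parent_id: Optional[int]) -> int:
    return (parent_id or 0) + 4


# Hypothetical mapping content; the real mapping.csv may use other columns.
MAPPING_CSV = "level,id,name\n1,3,Category A\n2,7,Subcategory B\n"

mapping = load_mapping(MAPPING_CSV)
path = classify_path("digital piano with weighted keys", [level1, level2])
labels = [mapping[(level, class_id)] for level, class_id in enumerate(path, start=1)]
```

Because each level consumes the previous prediction, a wrong ID early in `path` is passed down unchanged, which is the error propagation noted under Limitations.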