---
title: Book Genre Predictor
colorFrom: indigo
colorTo: red
sdk: gradio
sdk_version: 5.47.1
app_file: app.py
pinned: false
license: mit
---
# Book Genre Predictor
This Hugging Face Space hosts a **Gradio app** that predicts the **genre of a book** based on its **physical dimensions and page count**.
It uses an **AutoGluon Tabular model** trained during the previous session.

---
## Dataset & Model Card
- **Dataset:** Book metadata dataset (features: `Height`, `Width`, `Depth`, `Page Count`; label: `Genre`).
- **Dataset Information:** This app uses the [Books-tabular-dataset (its-zion-18)](https://huggingface.co/datasets/its-zion-18/Books-tabular-dataset). The dataset is licensed under **MIT** and consists of **~330 records** in Parquet format (split into `original` and `augmented`).
- **Model Repo:** [FaiyazAzam/24679-tabular-autolguon-predictor](https://huggingface.co/FaiyazAzam/24679-tabular-autolguon-predictor)
- **Framework:** [AutoGluon Tabular](https://auto.gluon.ai/stable/index.html)
- **Task:** Multi-class classification: predict `Genre` (numeric code).
### Input Features
| Feature | Type | Unit / Description |
|--------------|---------|-------------------------------------|
| Height | float | cm – height of the book |
| Width | float | cm – width of the book |
| Depth | float | cm – spine thickness |
| Page Count | integer | number of pages |
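
The feature table above can be sketched as a small typed record that produces one row keyed by the dataset's column names. This is a minimal illustration, not the app's actual code: the `BookFeatures` class, its field names, and `to_row` are hypothetical, and only the four column names come from the table.

```python
from dataclasses import dataclass

# Column names come from the feature table; everything else here is a
# hypothetical helper written for illustration.
FEATURE_COLUMNS = ["Height", "Width", "Depth", "Page Count"]


@dataclass
class BookFeatures:
    height_cm: float
    width_cm: float
    depth_cm: float
    page_count: int

    def to_row(self) -> dict:
        """Return a single record keyed by the dataset's column names."""
        return {
            "Height": float(self.height_cm),
            "Width": float(self.width_cm),
            "Depth": float(self.depth_cm),
            "Page Count": int(self.page_count),
        }


row = BookFeatures(20.1, 13.5, 1.8, 250).to_row()
print(row)
```

A row built this way could be wrapped in a one-row table before being passed to a tabular model, assuming the model was trained on these same column names.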
### Label
- `Genre` → encoded as **numeric codes** (e.g. 0, 1, 2, …).
- Mapping to actual names was not provided in the original dataset.
---
## App Interface
- **Widgets:** Numeric input boxes for each feature.
- **Output:** Numeric code prediction (e.g. `"Predicted Genre: 1"`).
- **Examples:** 3 preloaded examples for quick testing.
- **Validation:** Ensures all inputs are positive.
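
The validation and output behaviour listed above could look roughly like the sketch below. The function names `validate_inputs` and `format_prediction` and the error messages are assumptions made for illustration; the actual `app.py` may be structured differently. The `None` check assumes that an empty numeric box arrives as `None`.

```python
def validate_inputs(height, width, depth, page_count):
    """Return an error message for invalid input, or None if all values are positive."""
    values = {
        "Height": height,
        "Width": width,
        "Depth": depth,
        "Page Count": page_count,
    }
    for name, value in values.items():
        if value is None:
            # Assumption: an empty input box is passed in as None.
            return f"{name} is required."
        if value <= 0:
            return f"{name} must be positive."
    return None


def format_prediction(code):
    """Format a numeric genre code in the style shown in the Output bullet."""
    return f"Predicted Genre: {code}"


print(validate_inputs(20.1, 13.5, 1.8, 250))
print(validate_inputs(20.1, -1, 1.8, 250))
print(format_prediction(1))
```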
---
## 🔍 Example Usage
| Height (cm) | Width (cm) | Depth (cm) | Page Count |
|-------------|------------|------------|------------|
| 20.1 | 13.5 | 1.8 | 250 |
| 24.0 | 15.0 | 2.2 | 320 |
| 18.5 | 12.0 | 1.5 | 180 |

Note: The model often predicts the same genre code (e.g. `0`) regardless of the input.
This reflects dataset and model limitations, not a fault in the app itself.

---
## Technical Details
- **Backend:** AutoGluon `TabularPredictor` loaded from a zipped artifact.
- **Interface:** [Gradio](https://www.gradio.app/).
- **Deployment:** Hugging Face Spaces (`sdk: gradio`).
- **Environment:** Python 3.10, pinned requirements.
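
Loading a predictor from a zipped artifact, as described above, might be done along the following lines. This is a sketch under stated assumptions: the helper names `extract_artifact` and `load_predictor` and the paths passed to them are hypothetical, and the layout of the real archive is not documented here. `TabularPredictor.load` is AutoGluon's loader for a saved predictor directory.

```python
import zipfile
from pathlib import Path


def extract_artifact(zip_path, dest_dir):
    """Unpack the zipped predictor into dest_dir and return that directory."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return dest


def load_predictor(predictor_dir):
    """Load an AutoGluon predictor from an already extracted directory."""
    # Imported lazily so the extraction helper works without AutoGluon installed.
    from autogluon.tabular import TabularPredictor

    return TabularPredictor.load(str(predictor_dir))
```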
---
## Limitations
- **Numeric labels only:** The original training dataset did not include human-readable genre names.
- **Collapsed predictions:** Model tends to overpredict the majority class (`0`).
- **Generalization:** Accuracy on unseen books is uncertain due to limited feature set.
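
One way to quantify the collapsed-prediction issue is to measure what share of predictions falls on the most common code. The sketch below uses a hypothetical helper, `majority_share`, and made-up example values; it does not report results from the actual model.

```python
from collections import Counter


def majority_share(labels):
    """Return (most common label, fraction of items carrying that label)."""
    if not labels:
        raise ValueError("labels must not be empty")
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


# Illustrative values only, not output from the deployed model.
predictions = [0, 0, 0, 1, 0, 0, 2, 0]
label, share = majority_share(predictions)
print(f"Most common prediction: {label} ({share:.0%} of cases)")
```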
---
## Future Improvements
- Map numeric codes to the actual genre categories from the dataset.
- Retrain the model with balanced classes.
- Provide confidence scores along with predictions.
- Explore richer book features (author, publisher, language).
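
The first and third items above (readable genre names and confidence scores) could be combined roughly as follows. This is a sketch, not the planned implementation: `describe_prediction` is a hypothetical helper, the probability dictionary stands in for one row of per-class probabilities (such as AutoGluon's `predict_proba` would provide), and the genre names are placeholders because the real mapping is not known.

```python
def describe_prediction(probabilities, genre_names=None):
    """Format the most likely class and its probability.

    probabilities: dict mapping numeric genre code -> probability.
    genre_names: optional dict mapping code -> readable name; codes without
    an entry fall back to the numeric code itself.
    """
    code = max(probabilities, key=probabilities.get)
    name = (genre_names or {}).get(code, str(code))
    return f"Predicted Genre: {name} (confidence {probabilities[code]:.0%})"


# Placeholder names: the real code-to-genre mapping is not known.
placeholder_names = {0: "genre-A", 1: "genre-B"}
print(describe_prediction({0: 0.7, 1: 0.2, 2: 0.1}, placeholder_names))
print(describe_prediction({0: 0.1, 1: 0.2, 2: 0.7}, placeholder_names))
```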
---
## AI Disclosure
Parts of this project were completed with the help of AI tools (GPT-5), mainly for:
- Debugging deployment issues on Hugging Face Spaces
- Improving the stability of the Gradio interface
- Polishing documentation

The dataset, model training, and integration choices are based on classmate-provided artifacts and my own implementation work.

---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference