| Receipt-Classifier | |
| ================================ | |
| Project: Makerspace Inventory Management Project | |
| Author: Ethan Kessler (Carnegie Mellon University) | |
| License: MIT | |
| Date: 2025 | |
| Description: | |
| ------------- | |
| This model predicts whether a line from a McMaster-Carr receipt contains | |
| information relevant to an inventory field such as item name, quantity, vendor, | |
| manufacturer, or another inventory listing. It was trained using code generated by | |
| Copilot on a curated dataset of lines extracted with Tesseract OCR from 20 | |
| receipts retrieved from https://www.mcmaster.com/. | |
| Framework: | |
| ----------- | |
| - Model: WeightedDistilbert | |
| - Cross-entropy loss class weights: 'Other' = 1.0, 'Important' = 10.0 | |
| - Epochs: 6 | |
| - Batch size per device: 16 | |
| - Learning rate: 5e-5 | |
| - Weight decay: 0.01 | |
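The class weights listed above can be illustrated with a small pure-Python sketch of weighted cross-entropy. This is not the training code used for the model; the label order (0 = Other, 1 = Important), the helper names, and the weighted-mean normalization (the convention PyTorch's `CrossEntropyLoss` applies when given a `weight` tensor with mean reduction) are all assumptions made for illustration.

```python
import math

# Class weights from the card. The index order (0 = Other, 1 = Important)
# is an assumption; the card does not state the label mapping.
CLASS_WEIGHTS = [1.0, 10.0]


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def example_loss(logits, label, weights=CLASS_WEIGHTS):
    """Weighted negative log-likelihood for a single example."""
    return -weights[label] * log_softmax(logits)[label]


def weighted_cross_entropy(batch_logits, labels, weights=CLASS_WEIGHTS):
    """Batch loss: sum of weighted losses divided by the sum of the weights."""
    total_loss = sum(
        example_loss(logits, label, weights)
        for logits, label in zip(batch_logits, labels)
    )
    total_weight = sum(weights[label] for label in labels)
    return total_loss / total_weight


# Two equally confident mistakes, one on each class (hypothetical logits).
other_miss = example_loss([0.0, 2.0], 0)      # an Other line scored as Important
important_miss = example_loss([2.0, 0.0], 1)  # an Important line scored as Other
print(important_miss / other_miss)
```

With these weights, an equally confident mistake on an 'Important' line contributes ten times the loss of a mistake on an 'Other' line, which pushes the model toward recall on the rarer class.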
| Performance: | |
| ------------- | |
| - eval_loss: 0.06072889268398285 | |
| - eval_accuracy: 0.9861111111111112 | |
| - eval_f1: 0.9411764705882353 | |
| - eval_runtime: 2.1135 | |
| - eval_samples_per_second: 68.132 | |
| - eval_steps_per_second: 8.517 | |
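As a sanity check on the figures above, the sketch below shows how accuracy and F1 follow from confusion-matrix counts. The counts are hypothetical: they are chosen only because they reproduce the reported values and match the roughly 144 evaluation lines implied by runtime times samples per second (2.1135 × 68.132). The card does not report the actual confusion matrix, and the split between false positives and false negatives is an assumption that does not affect either metric.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all lines classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts (not reported in the card): 16 'Important' lines found,
# 2 errors in total, 126 'Other' lines correctly rejected, 144 lines overall.
tp, fp, fn, tn = 16, 1, 1, 126

acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)
print(f"accuracy={acc:.4f} f1={f1:.4f}")
```

Under these assumed counts, only about 12% of evaluation lines belong to the positive class, which is why F1 sits noticeably below accuracy even though only two lines are misclassified.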
| Notes: | |
| ------ | |
| - Intended for assistive highlighting of inventory-relevant receipt lines. | |
| - Trained on a curated set of OCR-extracted lines from 20 receipts. | |
| Limitations: | |
| ------------- | |
| - Trained only on receipts from McMaster-Carr. | |
| - May not identify prices. | |
| Ethical Use: | |
| ------------- | |
| For research, education, and makerspace management only. | |
| All data sourced from McMaster-Carr. | |
| Citation: | |
| ---------- | |
| Kessler, E. (2025). "receipt-classifier" | |
| Hugging Face: https://huggingface.co/emkessle/receipt-classifier |