# 🧠 TextSummarizerForInventoryReport-T5
A T5-based text summarization model fine-tuned on inventory report data. This model generates concise summaries of detailed inventory-related texts, making it useful for warehouse management, stock reporting, and supply chain documentation.
## ✨ Model Highlights
- 🚀 Based on t5-small from Hugging Face 🤗
- 📊 Fine-tuned on structured inventory report data (report_text → summary_text)
- 📝 Generates meaningful and human-readable summaries
- ⚡ Supports a maximum input length of 512 tokens and output length of 128 tokens
- 🧠 Built using Hugging Face Transformers and PyTorch
---
## 🧠 Intended Uses
- ✅ Inventory report summarization
- ✅ Warehouse/logistics management automation
- ✅ Business analytics and reporting dashboards
## 🚫 Limitations
- ❌ Not optimized for very long reports (>512 tokens); see the chunking sketch after this list
- 🌍 Trained primarily on English-language technical/business reports
- 🧾 Performance may degrade on unstructured or noisy input text
- 🤖 Not designed for creative or narrative summarization
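
A common workaround for the 512-token limit, not part of this model card's own code: split the report into chunks that fit the encoder, summarize each chunk, and join the partial summaries. A minimal sketch, assuming the `model` and `tokenizer` objects loaded in the Usage section below (the chunk size and the simple concatenation strategy are illustrative choices):

```python
def summarize_long_report(text, model, tokenizer, chunk_tokens=480, max_output_length=128):
    """Chunked summarization workaround for reports beyond the 512-token limit."""
    # Tokenize once, then split into chunks that fit under the encoder limit
    # (480 leaves headroom for the "summarize: " prefix and special tokens).
    token_ids = tokenizer(text, truncation=False)["input_ids"]
    chunks = [token_ids[i:i + chunk_tokens] for i in range(0, len(token_ids), chunk_tokens)]
    partial_summaries = []
    for chunk in chunks:
        chunk_text = tokenizer.decode(chunk, skip_special_tokens=True)
        inputs = tokenizer("summarize: " + chunk_text, return_tensors="pt",
                           truncation=True, max_length=512)
        output_ids = model.generate(**inputs, max_length=max_output_length)
        partial_summaries.append(tokenizer.decode(output_ids[0], skip_special_tokens=True))
    return " ".join(partial_summaries)
```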
## 🏋️‍♂️ Training Details
| Attribute         | Value                                       |
|-------------------|---------------------------------------------|
| Base Model        | t5-small                                    |
| Dataset           | Custom inventory reports                    |
| Max Input Tokens  | 512                                         |
| Max Output Tokens | 128                                         |
| Epochs            | 3                                           |
| Batch Size        | 2                                           |
| Optimizer         | AdamW                                       |
| Loss Function     | CrossEntropyLoss (with -100 padding mask)   |
| Framework         | PyTorch + Hugging Face Transformers         |
| Hardware          | CUDA-enabled GPU                            |
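
The -100 values in the loss mask tell CrossEntropyLoss to ignore padded label positions. The exact training script is not shipped with the model; below is a minimal preprocessing sketch of how such labels are typically built, assuming dataset columns named `report_text` and `summary_text` as described in the highlights above:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")

def preprocess(example):
    # T5 expects a task prefix; inputs capped at 512 tokens, targets at 128
    model_inputs = tokenizer("summarize: " + example["report_text"],
                             truncation=True, padding="max_length", max_length=512)
    targets = tokenizer(example["summary_text"],
                        truncation=True, padding="max_length", max_length=128)
    # Replace pad token ids with -100 so CrossEntropyLoss skips padded positions
    model_inputs["labels"] = [
        tok if tok != tokenizer.pad_token_id else -100
        for tok in targets["input_ids"]
    ]
    return model_inputs
```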
---
## 🚀 Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "AventIQ-AI/Text_Summarization_For_inventory_Report"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
model.eval()

def summarize(text, model, tokenizer, max_input_length=512, max_output_length=128):
    # T5 models expect a task prefix before the input text
    inputs = tokenizer("summarize: " + text, return_tensors="pt",
                       truncation=True, max_length=max_input_length)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_length=max_output_length,
                                    num_beams=4, early_stopping=True)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Generate summary
long_text = "Warehouse B stock report: 1,240 units of SKU-1101 received on 2024-03-01 ..."  # replace with your report
summary = summarize(long_text, model, tokenizer)
print("Summary:", summary)
```
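
Note that `num_beams=4` and `early_stopping=True` above are illustrative decoding choices, not settings shipped with the model; the `generate` defaults (greedy decoding) also work for short reports.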
## Repository Structure
```
.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Fine-tuned model weights
└── README.md            # Model documentation
```
## 🤝 Contributing
Contributions are welcome!
Feel free to open an issue or submit a pull request if you have suggestions, improvements, or want to adapt the model to new domains.