---
license: apache-2.0
datasets:
  - msamogh/indirect-requests
language:
  - en
metrics:
  - accuracy
base_model:
  - google-t5/t5-base
pipeline_tag: text2text-generation
library_name: transformers
tags:
  - prompt_restructuring
  - prompt_refining
  - indirect_requests
  - pragmatics
---

# PragmaticLM - T5 for Prompt Restructuring

![Model](assets/dp.png)

## 📌 Overview

**PragmaticLM** is a fine-tuned T5 model designed to **restructure and reframe user prompts** so that downstream LLMs can interpret them more reliably. It improves prompt clarity through **contextual restructuring**, rewriting indirect or under-specified requests into clearer, more explicit prompts.

## 🚀 Model Details

- **Base Model**: [T5-Base](https://huggingface.co/t5-base)
- **Training Data**: [Indirect Requests](https://huggingface.co/datasets/msamogh/indirect-requests)
- **Task Type**: Text-to-text transformation
- **Library**: [Hugging Face Transformers](https://github.com/huggingface/transformers)

## 📊 Training Configuration

- **Epochs**: 10
- **Batch Size**: 8
- **Learning Rate**: encoder `1e-5`, decoder `3e-5`
- **Optimizer**: AdamW
- **Loss Function**: Cross-entropy
- **Hardware**: NVIDIA T4 GPU

## ⚡ Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("aliMohammad16/pragmaticLM")
model = AutoModelForSeq2SeqLM.from_pretrained("aliMohammad16/pragmaticLM")

def restructure_prompt(input_prompt):
    # Prepend the task prefix used during fine-tuning.
    input_text = f"Restructure Prompt: {input_prompt}"
    inputs = tokenizer(input_text, return_tensors="pt", padding=True)
    output = model.generate(
        inputs.input_ids,
        max_length=64,
        num_beams=4,
        early_stopping=True,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example usage
test_prompt = "I am not feeling well. I need to consult a doctor nearby."
print(restructure_prompt(test_prompt))
```

## ⏳ Improvements

- **Work in progress**: This model is under active development.
- **Next step**: A multi-stage pipeline that integrates TinyLlama 1.1B and LlamaIndex RAG with the prompt-restructuring model to improve output generation; a rough sketch of the chaining step is shown below.
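As a minimal sketch of the planned pipeline, the snippet below chains the `restructure_prompt` function from the Usage section into a downstream causal LM. The TinyLlama checkpoint name is an assumption (any instruction-tuned model could be substituted), and the LlamaIndex retrieval stage is omitted here; this only illustrates the restructure-then-generate flow, not the final implementation.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical downstream model; swap in whichever checkpoint the pipeline ends up using.
downstream_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
downstream_tokenizer = AutoTokenizer.from_pretrained(downstream_name)
downstream_model = AutoModelForCausalLM.from_pretrained(downstream_name)

def answer_with_restructuring(user_prompt):
    # Stage 1: clarify the (possibly indirect) request with PragmaticLM.
    clear_prompt = restructure_prompt(user_prompt)

    # Stage 2: generate a response from the restructured prompt.
    inputs = downstream_tokenizer(clear_prompt, return_tensors="pt")
    output = downstream_model.generate(**inputs, max_new_tokens=128)
    return downstream_tokenizer.decode(output[0], skip_special_tokens=True)

print(answer_with_restructuring("I am not feeling well. I need to consult a doctor nearby."))
```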