pragmaticLM / README.md

aliMohammad16

Update README.md

b331aaa verified 10 months ago

metadata

license: apache-2.0
datasets:
  - msamogh/indirect-requests
language:
  - en
metrics:
  - accuracy
base_model:
  - google-t5/t5-base
pipeline_tag: text2text-generation
library_name: transformers
tags:
  - prompt_restructuring
  - prompt_refining
  - indirect_requests
  - pragmatics

PragmaticLM - T5 for Prompt Restructuring

📌 Overview

PragmaticLM is a fine-tuned T5 model that restructures and reframes user prompts so that downstream LLMs can interpret them more reliably. It is trained on indirect requests, where the user's intent is implied rather than stated, and rewrites such prompts into a clearer, more explicit form.

🚀 Model Details

Base Model: T5-Base
Training Data: [Indirect Requests](https://huggingface.co/datasets/msamogh/indirect-requests)
Task Type: Text-to-text transformation
Library: Hugging Face Transformers

📊 Training Configuration

Epochs: 10
Batch Size: 8
Learning Rate: Encoder: 1e-5, Decoder: 3e-5
Optimizer: AdamW
Loss Function: Cross-entropy loss
Hardware: GPU (T4)
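
The separate encoder and decoder learning rates above can be expressed as optimizer parameter groups. The following is a minimal sketch, not the actual training script: the helper name `build_param_groups` is hypothetical, and placing the shared embeddings in the encoder group and `lm_head` in the decoder group is an assumption.

```python
def build_param_groups(named_params, encoder_lr=1e-5, decoder_lr=3e-5):
    """Split (name, parameter) pairs into encoder and decoder groups."""
    encoder_params, decoder_params = [], []
    for name, param in named_params:
        # T5 decoder weights are named "decoder.*". Assumption of this
        # sketch: "lm_head.*" is treated as decoder-side, and everything
        # else (encoder blocks, shared embeddings) as encoder-side.
        if name.startswith(("decoder.", "lm_head.")):
            decoder_params.append(param)
        else:
            encoder_params.append(param)
    # Each dict is one parameter group with its own learning rate.
    return [
        {"params": encoder_params, "lr": encoder_lr},
        {"params": decoder_params, "lr": decoder_lr},
    ]

# Hypothetical usage with a loaded model:
#   groups = build_param_groups(model.named_parameters())
#   optimizer = torch.optim.AdamW(groups)
```

`torch.optim.AdamW` accepts a list of such dictionaries, so each group is updated with its own learning rate.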

⚡ Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("aliMohammad16/pragmaticLM")
model = AutoModelForSeq2SeqLM.from_pretrained("aliMohammad16/pragmaticLM")
model.eval()

def restructure_prompt(input_prompt):
    # Prepend the task prefix before tokenizing
    input_text = f"Restructure Prompt: {input_prompt}"
    inputs = tokenizer(input_text, return_tensors="pt", padding=True)

    # Inference only, so skip gradient tracking
    with torch.no_grad():
        output = model.generate(
            inputs.input_ids,
            attention_mask=inputs.attention_mask,
            max_length=64,
            num_beams=4,
            early_stopping=True
        )

    # Decode the best beam back into text
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example Usage
test_prompt = "I am not feeling well. I need to consult a doctor nearby."
print(restructure_prompt(test_prompt))
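
To restructure several prompts in one call, the same steps can be batched. This is a minimal sketch, assuming the tokenizer and model loaded above are passed in explicitly; the function name `restructure_prompts` is hypothetical and not part of the released model.

```python
def restructure_prompts(prompts, tokenizer, model, max_length=64, num_beams=4):
    """Restructure several prompts with a single generate() call."""
    # Same task prefix as the single-prompt example
    texts = [f"Restructure Prompt: {p}" for p in prompts]
    # padding=True pads to the longest prompt in the batch; the attention
    # mask marks which positions are padding.
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
        num_beams=num_beams,
        early_stopping=True,
    )
    # One decoded string per input prompt, in the same order
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Passing the attention mask matters more here than in the single-prompt case, because shorter prompts in the batch are padded.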

⏳ Improvements

Work in progress: I am actively developing this model.
Update: Next, I am implementing a multi-module pipeline that integrates TinyLlama 1.1B and LlamaIndex RAG with the prompt-restructuring model to improve output generation.