jokugeorgin

metadata

language:
  - en
license: apache-2.0
tags:
  - t5
  - text2text-generation
  - microaggression
  - reframing
  - paraphrase
pipeline_tag: text2text-generation
widget:
  - text: 'rephrase: You speak good English for someone from there.'
  - text: 'rephrase: Where are you really from?'
  - text: 'rephrase: You''re so articulate for your background.'
datasets:
  - custom
metrics:
  - bleu
  - rouge
base_model: t5-base
model-index:
  - name: CI_MA_Reframe
    results:
      - task:
          type: text2text-generation
          name: Microaggression Reframing
        metrics:
          - type: bleu
            value: 0.75
            name: BLEU

CI_MA_Reframe - Microaggression Reframing Model

This model rewrites microaggressions and other potentially problematic statements in more inclusive language. It is a fine-tuned version of t5-base.

Model Description

Model type: T5 for text-to-text generation
Task: Text reframing/paraphrasing
Base model: t5-base

Usage

Important: Always prefix your input with "rephrase: " for proper generation.

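from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
tokenizer = T5Tokenizer.from_pretrained("jokugeorgin/CI_MA_Reframe")
model = T5ForConditionalGeneration.from_pretrained("jokugeorgin/CI_MA_Reframe")

# The input must start with the "rephrase: " task prefix.
text = "rephrase: You speak good English for someone from there."
inputs = tokenizer(text, return_tensors="pt", max_length=256, truncation=True)

# Beam search combined with sampling; returns three candidate reframings.
# num_return_sequences must not exceed num_beams.
outputs = model.generate(
    **inputs,
    max_length=256,
    num_beams=5,
    num_return_sequences=3,
    temperature=0.8,
    do_sample=True,
    no_repeat_ngram_size=2
)

# Decode each candidate back to plain text.
for output in outputs:
    print(tokenizer.decode(output, skip_special_tokens=True))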
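
Because every input needs the "rephrase: " prefix, it can help to add it in one place instead of by hand. The following is a minimal sketch, not part of the model's own code: `with_prefix` and `reframe` are hypothetical helper names, and `reframe` simply reuses the generation settings from the example above with an already loaded `tokenizer` and `model`.

```python
PREFIX = "rephrase: "  # task prefix the model card says every input needs


def with_prefix(text: str) -> str:
    """Return text with the task prefix, without adding it twice."""
    stripped = text.strip()
    marker = PREFIX.strip()  # "rephrase:"
    if stripped.lower().startswith(marker):
        # Already prefixed: normalize case and spacing after the colon.
        return PREFIX + stripped[len(marker):].strip()
    return PREFIX + stripped


def reframe(text, tokenizer, model, num_candidates=3):
    """Hypothetical wrapper: prefix the input, generate, decode the candidates."""
    inputs = tokenizer(
        with_prefix(text), return_tensors="pt", max_length=256, truncation=True
    )
    outputs = model.generate(
        **inputs,
        max_length=256,
        num_beams=5,  # must be >= num_candidates
        num_return_sequences=num_candidates,
        temperature=0.8,
        do_sample=True,
        no_repeat_ngram_size=2,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]
```

With the tokenizer and model loaded as shown earlier, `reframe("Where are you really from?", tokenizer, model)` would return a list of three candidate strings.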

API Usage

curl https://api-inference.huggingface.co/models/jokugeorgin/CI_MA_Reframe \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": "rephrase: You speak good English for someone from there.",
    "parameters": {
      "max_new_tokens": 96,
      "num_return_sequences": 3,
      "temperature": 0.8
    }
  }'
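
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the URL, headers, and parameters of the curl call; `build_request` and `query` are hypothetical helper names, and the shape of the JSON response is not specified in this card, so `query` returns the parsed body unchanged.

```python
import json
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/jokugeorgin/CI_MA_Reframe"


def build_request(text: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    payload = {
        "inputs": text,  # must already start with "rephrase: "
        "parameters": {
            "max_new_tokens": 96,
            "num_return_sequences": 3,
            "temperature": 0.8,
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query(text: str, token: str, timeout: float = 30.0):
    """Send the request and return the parsed JSON body as-is."""
    request = build_request(text, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `query("rephrase: Where are you really from?", token)` with a valid token performs the network request. Reading the token from an environment variable rather than hard-coding it keeps it out of source control.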

Training Data

Custom dataset of microaggression examples and their reframed alternatives.

Limitations

Requires the "rephrase: " prefix on every input for optimal results
Works best with English text
May occasionally produce generic reframings
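
When several candidates are requested, some light post-processing can make the output easier to review. The sketch below is an assumption, not part of the model: the hypothetical `filter_candidates` helper drops empty strings, repeated candidates, and candidates that merely echo the input. It does not detect generic reframings, which still need human review.

```python
from typing import List


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical strings compare equal."""
    return " ".join(text.lower().split())


def filter_candidates(source: str, candidates: List[str]) -> List[str]:
    """Keep the first occurrence of each candidate that differs from the input."""
    prefix = "rephrase: "
    if source.startswith(prefix):
        source = source[len(prefix):]  # compare against the text without the prefix
    seen = {_normalize(source)}  # treat an echo of the input as already seen
    kept = []
    for candidate in candidates:
        key = _normalize(candidate)
        if key and key not in seen:
            seen.add(key)
            kept.append(candidate)
    return kept
```

The helper preserves the order in which the model returned the candidates, so the first surviving entry is still the highest-ranked one.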