ar-seq2seq-gender (decoder)
This is the decoder half of a seq2seq model that "flips" gender in first-person Arabic sentences. The model can augment your existing Arabic data, or generate counterfactuals to test a model's decisions (would changing the gender of the subject or speaker change the output?).
Intended Examples:
- 'أنا سعيد' <=> 'انا سعيدة'
- 'ركض إلى المتجر' <=> 'ركضت إلى المتجر'
People's names, gender pronouns, gendered words (father, mother), and many other gendered terms are currently left unchanged by this model. Future versions may be trained on more data.
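The counterfactual use case can be sketched as a small harness. This is a minimal illustration, not part of the model's API: `flip_gender` and `classify` are hypothetical callables you would supply (for instance, a wrapper around the sample code below, and the model you want to audit). Sentences that come back unchanged are skipped, since the model leaves names, pronouns, and many gendered nouns alone.

```python
from typing import Callable, Dict, List


def counterfactual_report(
    sentences: List[str],
    flip_gender: Callable[[str], str],  # hypothetical: wraps the seq2seq model
    classify: Callable[[str], str],     # hypothetical: the model being audited
) -> Dict[str, list]:
    """Compare a classifier's output on each sentence and its gender-flipped twin."""
    report: Dict[str, list] = {"changed": [], "stable": [], "skipped": []}
    for original in sentences:
        flipped = flip_gender(original)
        # Nothing was reinflected, so there is no counterfactual to compare.
        if flipped.strip() == original.strip():
            report["skipped"].append(original)
            continue
        before, after = classify(original), classify(flipped)
        bucket = "changed" if before != after else "stable"
        report[bucket].append((original, flipped, before, after))
    return report


if __name__ == "__main__":
    # Stand-ins for demonstration only; replace with real models.
    toy_flips = {"أنا سعيد": "انا سعيدة"}
    demo = counterfactual_report(
        ["أنا سعيد", "مرحبا"],
        flip_gender=lambda s: toy_flips.get(s, s),
        classify=lambda s: "positive",
    )
    print(demo)
```

Entries under `changed` are the cases worth inspecting: the only difference between the two inputs is the reinflected gender, yet the audited model's output moved.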
Sample Code
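import torch
from transformers import AutoTokenizer, EncoderDecoderModel

# Combine the separately published encoder and decoder halves into one seq2seq model
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "monsoon-nlp/ar-seq2seq-gender-encoder",
    "monsoon-nlp/ar-seq2seq-gender-decoder",
    min_length=40
)
tokenizer = AutoTokenizer.from_pretrained("monsoon-nlp/ar-seq2seq-gender-decoder")  # same tokenizer as the original MARBERT

# Encode the input sentence and add a batch dimension
input_ids = torch.tensor(tokenizer.encode("أنا سعيدة")).unsqueeze(0)
# The decoder's pad token is used as its start token
generated = model.generate(input_ids, decoder_start_token_id=model.config.decoder.pad_token_id)
# Skip the start token and keep as many tokens as the input has between its special tokens
tokenizer.decode(generated.tolist()[0][1 : len(input_ids[0]) - 1])
> 'انا سعيد'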
Colab notebook: https://colab.research.google.com/drive/1S0kE_2WiV82JkqKik_sBW-0TUtzUVmrV?usp=sharing
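The slice in the last line of the sample is easy to misread, so here is one reading of it, sketched on plain Python lists. The assumptions: the encoded input is wrapped in one start and one end special token (as BERT-style tokenizers do), the generated sequence begins with the decoder start token, and the flipped sentence has the same number of tokens as the input. `trim_generated` is a hypothetical helper name, not part of Transformers, and the token ids are made up.

```python
from typing import List


def trim_generated(generated_ids: List[int], input_len: int) -> List[int]:
    """Drop the decoder start token and keep as many ids as the input's content.

    input_len counts the input's two special tokens, so the content length is
    input_len - 2 and the slice [1 : input_len - 1] keeps exactly that many ids.
    """
    return generated_ids[1 : input_len - 1]


# Made-up ids: 0 = pad / decoder start, 2 = [CLS], 3 = [SEP]
input_ids = [2, 101, 102, 3]            # [CLS] tok tok [SEP]
generated = [0, 201, 202, 3, 0, 0, 0]   # start, two content tokens, then filler
print(trim_generated(generated, len(input_ids)))  # [201, 202]
```

If the flipped sentence tokenizes to a different length than the input, this slice would cut it short or keep extra tokens. Decoding with `skip_special_tokens=True` and stopping at the first end token is a possible alternative, though it is untested against this model.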
Training
I originally developed a gender-flip Python script for Spanish sentences, using BETO and spaCy. More about this project: https://medium.com/ai-in-plain-english/gender-bias-in-spanish-bert-1f4d76780617
The Arabic model's encoder and decoder started with weights and vocabulary from MARBERT by UBC-NLP, and were trained on the Arabic Parallel Gender Corpus from NYU Abu Dhabi. The text is first-person sentences from OpenSubtitles, with parallel gender-reinflected sentences generated by Arabic speakers.
Training notebook: https://colab.research.google.com/drive/1TuDfnV2gQ-WsDtHkF52jbn699bk6vJZV
Non-binary gender
This model is useful for generating male and female text samples, but falls short of capturing gender diversity in the world and in the Arabic language. This subject is discussed in the bias statement of the Gender Reinflection paper.