metadata
language: fa
pipeline_tag: token-classification
library_name: transformers

QomSSLab/etghan_tagging_v1

This repository hosts an XLM-RoBERTa model with a token-classification head, trained to tag Persian (fa) text with the labels listed below.

Usage

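from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the tokenizer and the fine-tuned model from the Hub
model_id = "QomSSLab/etghan_tagging_v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# aggregation_strategy="simple" merges adjacent tokens that share a label into one span
tagger = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

# Sample input; in English: "An example of a Persian input"
text = "مثال از یک ورودی فارسی"
for entity in tagger(text):
    # Each entity is a dict with the keys entity_group, score, word, start, end
    print(entity)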
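
The pipeline returns a flat list of span dictionaries, so a small post-processing step is often convenient. The following is a minimal sketch, not part of this repository: `group_by_label` and its `min_score` threshold are hypothetical names chosen for illustration, and `sample` is a hand-written stand-in for `tagger(text)` output rather than real model predictions.

```python
from collections import defaultdict


def group_by_label(entities, min_score=0.5):
    """Collect aggregated pipeline spans into {label: [text, ...]}.

    `entities` is expected to look like the output of a token-classification
    pipeline with aggregation_strategy="simple". Spans scoring below
    `min_score` (an arbitrary illustrative threshold) are dropped.
    """
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"].strip())
    return dict(grouped)


# Hand-written stand-in for tagger(text); NOT real model output.
sample = [
    {"entity_group": "COURT_NAME", "score": 0.98, "word": "court A", "start": 0, "end": 7},
    {"entity_group": "JUDGE_NAME", "score": 0.31, "word": "judge B", "start": 8, "end": 15},
    {"entity_group": "COURT_NAME", "score": 0.91, "word": "court C", "start": 16, "end": 23},
]

print(group_by_label(sample))
```

With a loaded pipeline, the same helper could be called as `group_by_label(tagger(text))`.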

Labels

  • A
  • ABSENCE
  • APPEALABILITY
  • APPEAL_AUTHORITY
  • APPEAL_DEADLINE
  • B
  • BSEMHE
  • COURT_NAME
  • DEFENDANT_ADDRESS
  • DEFENDANT_FATHER
  • DEFENDANT_NAME
  • FINALITY
  • JUDGE_NAME
  • JUDGE_POSITION
  • JUDGE_SIGNATURE
  • JUDGMENT_DATE
  • LEGAL_CHARGE
  • LEGAL_REPRESENTATIVE
  • O
  • PLACE_OF_OFFENSE
  • PLACE_OF_OFFICE
  • PLAINTIFF_ADDRESS
  • PLAINTIFF_CLAIM
  • PLAINTIFF_FATHER
  • PLAINTIFF_NAME
  • PRESENCE
  • SUBJECT_OF_CLAIM
  • TIME_OF_OFFENSE
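
The list above can be cross-checked against the label mapping stored in a loaded model's configuration. The sketch below is illustrative only: `LABELS` is copied from this section, `check_labels` is a hypothetical helper, and the comparison ignores label ids because their order is not stated here.

```python
# Labels copied from the list above.
LABELS = [
    "A", "ABSENCE", "APPEALABILITY", "APPEAL_AUTHORITY", "APPEAL_DEADLINE",
    "B", "BSEMHE", "COURT_NAME", "DEFENDANT_ADDRESS", "DEFENDANT_FATHER",
    "DEFENDANT_NAME", "FINALITY", "JUDGE_NAME", "JUDGE_POSITION",
    "JUDGE_SIGNATURE", "JUDGMENT_DATE", "LEGAL_CHARGE", "LEGAL_REPRESENTATIVE",
    "O", "PLACE_OF_OFFENSE", "PLACE_OF_OFFICE", "PLAINTIFF_ADDRESS",
    "PLAINTIFF_CLAIM", "PLAINTIFF_FATHER", "PLAINTIFF_NAME", "PRESENCE",
    "SUBJECT_OF_CLAIM", "TIME_OF_OFFENSE",
]


def check_labels(id2label):
    """Compare an id -> label mapping with LABELS, ignoring the ids.

    Returns (missing, unexpected): labels listed here but absent from the
    mapping, and labels present in the mapping but not listed here.
    """
    found = set(id2label.values())
    expected = set(LABELS)
    return sorted(expected - found), sorted(found - expected)


print(len(LABELS))
```

After loading the model as in the Usage section, `check_labels(model.config.id2label)` should return two empty lists if the configuration matches this list.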

Training

  • Base model: xlm-roberta-large
  • Optimizer/args: Hugging Face Trainer with its default AdamW optimizer, lr=3e-5, batch size 8; the number of epochs is configurable
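
For readers who want to reproduce a similar setup, the settings above could be expressed roughly as follows. This is a sketch under assumptions, not the authors' training script: `build_training_args`, the `output_dir` value, and the default of 3 epochs are placeholders, and treating "batch size 8" as a per-device batch size is an assumption.

```python
# Values stated in the Training section above.
HPARAMS = {
    "learning_rate": 3e-5,
    # Assumption: "batch size 8" is interpreted as the per-device batch size.
    "per_device_train_batch_size": 8,
}


def build_training_args(output_dir="etghan_out", num_train_epochs=3):
    """Build TrainingArguments from HPARAMS.

    `output_dir` and `num_train_epochs` are hypothetical placeholders; the
    model card only says the number of epochs is configurable.
    """
    # Imported lazily so HPARAMS can be inspected without transformers installed.
    from transformers import TrainingArguments

    # No optimizer is set explicitly: Trainer uses AdamW by default.
    return TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=num_train_epochs,
        **HPARAMS,
    )
```

The returned object would be passed to `Trainer(model=..., args=..., train_dataset=...)` together with a model initialized from `xlm-roberta-large` and a tokenized dataset, neither of which is provided here.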