metadata
language: en
license: apache-2.0
datasets:
  - conll2003
model-index:
  - name: elastic/distilbert-base-uncased-finetuned-conll03-english
    results:
      - task:
          type: token-classification
          name: Token Classification
        dataset:
          name: conll2003
          type: conll2003
          config: conll2003
          split: validation
        metrics:
          - type: accuracy
            value: 0.9854480753649896
            name: Accuracy
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNjcxZmJmZjE1Y2EzYWYzZDJjMTFhMGVlYWU1OThhYTY2NTZiMmRlYzY4MTcyZTkzYmNlNmI2YWE0Y2ZiYTcyZCIsInZlcnNpb24iOjF9.IDIpE6cedd9eW82XzNgl5buN9BdQPwQ7FfDJXBhuVkbZVRP31x0Q2wyKqw4PMXUtfVwdSD9ryc0HKx6sslcPDQ
          - type: precision
            value: 0.9880928983228512
            name: Precision
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNTRiYTYwNDAxNGMwOWQ2MjExZGU2OThiYTM2N2UyMjFhNDEyMzI3ZWFjYWJmNDNhMjRhMDIwNzFkNDc1YWYwMyIsInZlcnNpb24iOjF9.iYDlmVGs9cU6qZ-vEdqdwBMe3O1vv7zEg7IYshL-NDctCbD2QKvVHBkBqhcD4ydjevK3VDOigQclwZEJw1shCQ
          - type: recall
            value: 0.9895677847945542
            name: Recall
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNmVlYzliMDFiYmQyMGMwMDRhMDc5NTliNjRiNDczMGI4YjdjMWRiNDhmMWQ2ZmRlYTg4ZTU3NDAyOTdjYTA0NCIsInZlcnNpb24iOjF9.c4EwFSv8IO4E2PcSOaiK7UNjTM-wU3PE9AgOfsgfE3IeZ31jJqVxPbKBBW6YgaVXKgNp_O3U58zcC_EnE9-PAg
          - type: f1
            value: 0.9888297915932504
            name: F1
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiY2ViZmE3NjMwZjJmMTU4MDMyOWZjNDY5YjQzYzMyNWExZTA1NDM2ZjViNzdmNWJmM2IzMGEwYmM3MzRiNmM4NiIsInZlcnNpb24iOjF9.HD0BOw-IRrvIFswHKU8uNrNizvmfaaXYYo81KML0NFBS1-4BGWrRgo3gcxTYOCzvDyC5rTy6jKOV8KPRwJEfDg
          - type: loss
            value: 0.06707527488470078
            name: loss
            verified: true
            verifyToken: >-
              eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmE1NjExNTdkOWMyZTNkNWIyZTA5OTU3NzBmZmI0NWVhMjY4ZmVmZjdmM2NhNmY0YWNjMGE0OGJiMmQwZTQ2NiIsInZlcnNpb24iOjF9.E28oWv-H4AJvAHcAuFLomho0-_CSkzsCgYesEhIkjzUvP0YSjwDUHN_qAkndBIzw6bMEKplT8a3FCWePN1ScAA

DistilBERT base uncased, fine-tuned for named entity recognition (NER) on the CoNLL-2003 English dataset. Note that this model is case-insensitive: "english" is treated the same as "English". For the case-sensitive version, please use elastic/distilbert-base-cased-finetuned-conll03-english.
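
As a minimal usage sketch (not part of the original card), the model can be loaded through the Transformers `pipeline` API under the ID listed in the metadata. The helper names `build_ner` and `compare_casings` are hypothetical, and the first call to `build_ner` downloads the weights from the Hub, so it needs network access.

```python
MODEL_ID = "elastic/distilbert-base-uncased-finetuned-conll03-english"


def build_ner(model_id=MODEL_ID):
    """Load a token-classification pipeline for the given model ID."""
    # Imported lazily so that defining this helper does not by itself
    # require transformers to be installed.
    from transformers import pipeline

    return pipeline("ner", model=model_id)


def compare_casings(ner, text):
    """Return per-token (word, entity) pairs for the text as given and lowercased."""
    original = [(r["word"], r["entity"]) for r in ner(text)]
    lowered = [(r["word"], r["entity"]) for r in ner(text.lower())]
    return original, lowered
```

Calling `compare_casings(build_ner(), "My name is Wolfgang and I live in Berlin")` (a made-up sentence) should return two identical lists if the tokenizer lowercases its input, as the uncased checkpoint is expected to do.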

Versions

  • Transformers version: 4.3.1
  • Datasets version: 1.3.0

Training

$ run_ner.py \
  --model_name_or_path distilbert-base-uncased \
  --label_all_tokens True \
  --return_entity_level_metrics True \
  --dataset_name conll2003 \
  --output_dir /tmp/distilbert-base-uncased-finetuned-conll03-english \
  --do_train \
  --do_eval
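
To illustrate what `--label_all_tokens True` is understood to control, here is a simplified sketch of aligning word-level labels with sub-word tokens. It is not the script's own code: the function name `align_labels`, the hand-written `word_ids` list, and the example sentence are all assumptions made for illustration. With the flag on, every sub-word piece inherits its word's label; with it off, only the first piece is labeled and the rest are ignored by the loss.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss


def align_labels(word_ids, word_labels, label_all_tokens):
    """Expand word-level label IDs to token level.

    word_ids maps each token to the index of the word it came from,
    with None for special tokens such as [CLS] and [SEP].
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            # Special tokens never contribute to the loss.
            aligned.append(IGNORE_INDEX)
        elif word_id != previous:
            # First sub-word piece of a word always gets the word's label.
            aligned.append(word_labels[word_id])
        else:
            # Later pieces get the label only when label_all_tokens is set.
            aligned.append(word_labels[word_id] if label_all_tokens else IGNORE_INDEX)
        previous = word_id
    return aligned


# Hypothetical input: four words, the last one split into two pieces.
# Label IDs follow the conll2003 tag order (3 = B-ORG, 0 = O, 5 = B-LOC).
word_ids = [None, 0, 1, 2, 3, 3, None]
word_labels = [3, 0, 0, 5]

print(align_labels(word_ids, word_labels, label_all_tokens=True))
# [-100, 3, 0, 0, 5, 5, -100]
print(align_labels(word_ids, word_labels, label_all_tokens=False))
# [-100, 3, 0, 0, 5, -100, -100]
```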

After training, we update the labels to match the NER-specific labels from the conll2003 dataset.
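
The card does not say how this update was performed. One plausible approach, assuming the saved model config carried generic placeholder names such as `LABEL_0` through `LABEL_8`, is to rewrite its `id2label` and `label2id` maps with the dataset's tag names. The helper `with_ner_labels` and the `generic` config fragment below are hypothetical.

```python
import json

# Tag names in the order used by the conll2003 dataset's ner_tags feature.
CONLL2003_NER_TAGS = [
    "O",
    "B-PER", "I-PER",
    "B-ORG", "I-ORG",
    "B-LOC", "I-LOC",
    "B-MISC", "I-MISC",
]


def with_ner_labels(config, tags=CONLL2003_NER_TAGS):
    """Return a copy of a model config dict with NER-specific label maps."""
    updated = dict(config)
    # JSON object keys are strings, so id2label is keyed by "0", "1", ...
    updated["id2label"] = {str(i): tag for i, tag in enumerate(tags)}
    updated["label2id"] = {tag: i for i, tag in enumerate(tags)}
    return updated


# Hypothetical config fragment with generic placeholder label names.
generic = {
    "id2label": {str(i): f"LABEL_{i}" for i in range(9)},
    "label2id": {f"LABEL_{i}": i for i in range(9)},
}

print(json.dumps(with_ner_labels(generic)["id2label"], indent=2))
```

With readable names in the config, inference output reports tags such as `B-PER` or `I-LOC` instead of opaque `LABEL_n` identifiers.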