Upload multi-domain zero-shot GLiREL model

Files changed (3)

README.md +108 -0
glirel_config.json +110 -0
pytorch_model.bin +3 -0

README.md ADDED

	@@ -0,0 +1,108 @@

+---
+language: en
+license: mit
+library_name: glirel
+tags:
+- relation-extraction
+- zero-shot
+- multi-domain
+- glirel
+- named-entity-recognition
+datasets:
+- custom-multi-domain
+metrics:
+- f1
+- precision
+- recall
+pipeline_tag: token-classification
+---
+# GLiREL Multi-Domain Zero-Shot Relation Extraction
+This model is a fine-tuned version of [jackboyla/glirel-large-v0](https://huggingface.co/jackboyla/glirel-large-v0) for multi-domain zero-shot relation extraction.
+## Model Description
+GLiREL (Generalist and Lightweight model for Relation Extraction) is a state-of-the-art model for zero-shot relation extraction. This version has been specifically fine-tuned on multi-domain data to improve performance across diverse domains in zero-shot scenarios.
+## Training Data
+The model was trained on a multi-domain dataset with domain-based splits to ensure true zero-shot evaluation:
+- **Training Examples**: N/A
+- **Training Domains**: N/A
+- **Relation Types**: N/A
+- **Entity Types**: N/A
+## Key Features
+- **Zero-shot relation extraction**: Can extract relations for unseen relation types
+- **Multi-domain capability**: Trained on diverse domains for better generalization
+- **Domain-based splitting**: Training and evaluation use different domains for true zero-shot evaluation
+- **Lightweight**: Efficient inference while maintaining high performance
+## Usage
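+```python
+from glirel import GLiREL
+# Load the model
+model = GLiREL.from_pretrained("skv03/ner-span-glirel")
+# predict_relations in the upstream glirel package takes a token list plus entity spans, not a raw string
+tokens = ["John", "works", "at", "OpenAI", "in", "San", "Francisco", "."]
+labels = ["works_at", "located_in", "founded_by"]
+# Entity spans as [start_token, end_token, type, text]; the end index is assumed to be inclusive
+ner = [[0, 0, "PERSON", "John"], [3, 3, "ORG", "OpenAI"], [5, 6, "LOC", "San Francisco"]]
+# Extract relations between the provided entities
+relations = model.predict_relations(tokens, labels, threshold=0.0, ner=ner, top_k=1)
+print(relations)
+```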
+## Training Configuration
+- **Base Model**: jackboyla/glirel-large-v0
+- **Training Steps**: 15,000
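+- **Batch Size**: 8 (`train_batch_size` in `glirel_config.json`), with 8 gradient accumulation steps
+- **Learning Rate (Encoder)**: 1e-5
+- **Learning Rate (Others)**: 1e-4 (`lr_others` in `glirel_config.json`)
+- **Max Length**: 512
+- **Evaluation Strategy**: Every 15,000 steps (`eval_every` in `glirel_config.json`)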
+- **Zero-shot Setup**: Domain-based splits (no domain overlap between train/test)
+## Model Architecture
+- **Label Embedding Strategy**: both (label + entity token)
+- **Loss Function**: Binary Cross Entropy
+- **Scheduler**: Cosine with Warmup
+- **Dropout**: 0.4 (`dropout` in `glirel_config.json`)
+- **Max Types per Batch**: 50
+## Performance
+This model is designed for zero-shot relation extraction across multiple domains. Performance metrics will vary depending on the specific domains and relation types in your use case.
+## Limitations
+- Performance may vary significantly across different domains
+- Best suited for English text
+- Requires entity spans to be provided for relation extraction
+## Citation
+If you use this model, please cite the original GLiREL paper:
+```bibtex
+@misc{boylan2025glirelgeneralistmodel,
+      title={GLiREL -- Generalist Model for Zero-Shot Relation Extraction},
+      author={Jack Boylan and Chris Hokamp and Demian Gholipour Ghalandari},
+      year={2025},
+      eprint={2501.03172},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2501.03172},
+}
+```
+## Model Card Authors
+Created by the GLiREL fine-tuning team.
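
Note on the README's Limitations section: it says entity spans must be provided, and entity annotations usually arrive as character offsets rather than token indices. The sketch below shows one way to convert them. It is not part of the commit. The helper name `char_spans_to_token_spans`, the regex tokenizer, and the `[start_token, end_token, type, text]` layout with an inclusive end index are all assumptions for illustration; the tokenizer used with the real model may split text differently.

```python
import re

def char_spans_to_token_spans(text, entities):
    """Convert (start_char, end_char, label) entities into token-index spans.

    Hypothetical helper: returns the token list and a list of
    [start_token, end_token, label, surface_text] entries, with an
    inclusive end_token (an assumed convention).
    """
    # Simple tokenizer: runs of word characters, or single punctuation marks.
    matches = list(re.finditer(r"\w+|[^\w\s]", text))
    tokens = [m.group() for m in matches]
    ner = []
    for start_char, end_char, label in entities:
        # Indices of tokens that lie fully inside the character span.
        covered = [i for i, m in enumerate(matches)
                   if m.start() >= start_char and m.end() <= end_char]
        if not covered:
            continue  # span does not align with any token; skip it
        ner.append([covered[0], covered[-1], label, text[start_char:end_char]])
    return tokens, ner

text = "John works at OpenAI in San Francisco."
entities = [(0, 4, "PERSON"), (14, 20, "ORG"), (24, 37, "LOC")]
tokens, ner = char_spans_to_token_spans(text, entities)
print(tokens)
print(ner)
```

Spans that do not align with token boundaries are dropped silently here; a stricter version could raise an error instead.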

glirel_config.json ADDED

	@@ -0,0 +1,110 @@

+{
+  "lr_encoder": "1e-5",
+  "lr_others": "1e-4",
+  "weight_decay_encoder": 0.01,
+  "weight_decay_other": 0.01,
+  "num_steps": 500000,
+  "warmup_ratio": 0.1,
+  "train_batch_size": 8,
+  "eval_every": 15000,
+  "gradient_accumulation": 8,
+  "eval_batch_size": 32,
+  "num_layers_freeze": null,
+  "early_stopping_patience": null,
+  "early_stopping_delta": 0.0,
+  "save_at": [
+    15000,
+    30000,
+    45000,
+    60000,
+    75000,
+    90000,
+    105000,
+    120000,
+    135000,
+    150000,
+    165000,
+    180000,
+    195000,
+    210000,
+    225000,
+    240000,
+    255000,
+    270000,
+    285000,
+    300000,
+    315000,
+    330000,
+    345000,
+    360000,
+    375000,
+    390000,
+    405000,
+    420000,
+    435000,
+    450000,
+    465000,
+    480000,
+    495000,
+    500000
+  ],
+  "max_saves": 8,
+  "max_width": 6,
+  "model_name": "microsoft/deberta-v3-large",
+  "fine_tune": true,
+  "subtoken_pooling": "first",
+  "hidden_size": 768,
+  "scorer": "dot",
+  "rel_mode": "marker",
+  "span_marker_mode": "markerv1",
+  "refine_prompt": false,
+  "refine_relation": false,
+  "ffn_mul": 4,
+  "dropout": 0.4,
+  "scheduler": "cosine_with_warmup",
+  "loss_func": "binary_cross_entropy_loss",
+  "alpha": 0.6,
+  "gamma": 3,
+  "label_embed_strategy": "both",
+  "use_typed_relations": true,
+  "consistency_loss_weight": 0.1,
+  "enable_ner_module": true,
+  "ner_threshold": 0.5,
+  "ner_fn_loss_weight": 1.5,
+  "ner_loss_weight": 100.0,
+  "rel_loss_weight": 1.0,
+  "ner_threshold_offset": -0.02,
+  "training_phase": "ner_only",
+  "span_f1_target": 0.7,
+  "relation_f1_target": 0.7,
+  "coref_classifier": false,
+  "coref_loss_weight": 10.0,
+  "coreference_label": null,
+  "dataset_name": "custom",
+  "root_dir": "multi_domain",
+  "train_data": [
+    "data/multi_domain_train_processed.jsonl"
+  ],
+  "eval_data": [
+    "data/multi_domain_test_processed.jsonl"
+  ],
+  "prev_path": "./ner-glirel-log/saved_at/model_60000",
+  "size_sup": -1,
+  "num_train_rel_types": 40,
+  "num_unseen_rel_types": 15,
+  "top_k": 1,
+  "random_drop": false,
+  "max_len": 512,
+  "eval_threshold": [
+    0.1,
+    0.2,
+    0.3,
+    0.5,
+    0.6,
+    0.7
+  ],
+  "max_entity_pair_distance": null,
+  "fixed_relation_types": false,
+  "name": "large",
+  "log_dir": "ner-glirel-log-2/"
+}
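
Note on `glirel_config.json`: the learning rates are stored as strings, and several values are easier to read once combined. The sketch below inlines a subset of the config and derives a few quantities from it. It is not part of the commit. Treating `gradient_accumulation` as a multiplier on `train_batch_size` and `warmup_ratio` as a fraction of `num_steps` is an assumption about how the training script uses these keys, not something the diff confirms.

```python
import json

# Subset of glirel_config.json from this commit, inlined so the sketch runs on its own.
config = json.loads("""
{
  "lr_encoder": "1e-5",
  "lr_others": "1e-4",
  "num_steps": 500000,
  "warmup_ratio": 0.1,
  "train_batch_size": 8,
  "gradient_accumulation": 8,
  "eval_every": 15000
}
""")

# Learning rates are JSON strings, so they need an explicit float() conversion.
lr_encoder = float(config["lr_encoder"])
lr_others = float(config["lr_others"])

# Assumed interpretation: examples seen per optimizer update.
effective_batch = config["train_batch_size"] * config["gradient_accumulation"]

# Assumed interpretation: warmup length as a fraction of the total step budget.
warmup_steps = round(config["num_steps"] * config["warmup_ratio"])

# The "save_at" list in the config matches this pattern:
# every eval_every steps, plus the final step.
save_at = list(range(config["eval_every"], config["num_steps"], config["eval_every"]))
save_at.append(config["num_steps"])

print(lr_encoder, lr_others, effective_batch, warmup_steps, len(save_at))
```

With `max_saves` set to 8, presumably only a subset of these 34 checkpoints is kept on disk at any time.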

pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7109e72d05ee4908506984e08c0cbb5972a4c0b417eb561ded1e85916a031d97
+size 1951515495
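
Note on `pytorch_model.bin`: the three lines above are a Git LFS pointer, not the weights themselves. The pointer records the SHA-256 digest and byte size (about 1.95 GB) of the real file. The sketch below parses the pointer and checks a downloaded file against it. It is not part of the commit, and the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:7109e72d05ee4908506984e08c0cbb5972a4c0b417eb561ded1e85916a031d97
size 1951515495
"""

def parse_lfs_pointer(text):
    """Split 'key value' lines into a dict with the hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}

def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest named in the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap size check before hashing a multi-gigabyte file
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in chunks so the whole file is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

After downloading the weights, `matches_pointer("pytorch_model.bin", pointer)` would confirm that the local copy is complete and unmodified; the local path here is illustrative.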