skv03 committed on
Commit da85fb0 · verified · 1 Parent(s): 7ae6ae8

Upload multi-domain zero-shot GLiREL model

Files changed (3)
  1. README.md +108 -0
  2. glirel_config.json +110 -0
  3. pytorch_model.bin +3 -0
README.md ADDED
@@ -0,0 +1,108 @@
+ ---
+ language: en
+ license: mit
+ library_name: glirel
+ tags:
+ - relation-extraction
+ - zero-shot
+ - multi-domain
+ - glirel
+ - named-entity-recognition
+ datasets:
+ - custom-multi-domain
+ metrics:
+ - f1
+ - precision
+ - recall
+ pipeline_tag: token-classification
+ ---
+
+ # GLiREL Multi-Domain Zero-Shot Relation Extraction
+
+ This model is a fine-tuned version of [jackboyla/glirel-large-v0](https://huggingface.co/jackboyla/glirel-large-v0) for multi-domain zero-shot relation extraction.
+
+ ## Model Description
+
+ GLiREL (Generalist and Lightweight model for Relation Extraction) is a state-of-the-art model for zero-shot relation extraction. This version has been specifically fine-tuned on multi-domain data to improve performance across diverse domains in zero-shot scenarios.
+
+ ## Training Data
+
+ The model was trained on a multi-domain dataset with domain-based splits to ensure true zero-shot evaluation:
+
+ - **Training Examples**: N/A
+ - **Training Domains**: N/A
+ - **Relation Types**: N/A
+ - **Entity Types**: N/A
+
+ ## Key Features
+
+ - **Zero-shot relation extraction**: Can extract relations for unseen relation types
+ - **Multi-domain capability**: Trained on diverse domains for better generalization
+ - **Domain-based splitting**: Training and evaluation use different domains for true zero-shot evaluation
+ - **Lightweight**: Efficient inference while maintaining high performance
+
+ ## Usage
+
+ ```python
+ from glirel import GLiREL
+
+ # Load the model
+ model = GLiREL.from_pretrained("skv03/ner-span-glirel")
+
+ # Example usage
+ text = "John works at OpenAI in San Francisco."
+ labels = ["works_at", "located_in", "founded_by"]
+
+ # Extract relations
+ relations = model.predict_relations(text, labels)
+ print(relations)
+ ```
+
+ ## Training Configuration
+
+ - **Base Model**: jackboyla/glirel-large-v0
+ - **Training Steps**: 15,000
+ - **Batch Size**: 6
+ - **Learning Rate (Encoder)**: 1e-5
+ - **Learning Rate (Others)**: 5e-5
+ - **Max Length**: 512
+ - **Evaluation Strategy**: Every 4,000 steps
+ - **Zero-shot Setup**: Domain-based splits (no domain overlap between train/test)
+
+ ## Model Architecture
+
+ - **Label Embedding Strategy**: both (label + entity token)
+ - **Loss Function**: Binary Cross Entropy
+ - **Scheduler**: Cosine with Warmup
+ - **Dropout**: 0.1
+ - **Max Types per Batch**: 50
+
+ ## Performance
+
+ This model is designed for zero-shot relation extraction across multiple domains. Performance metrics will vary depending on the specific domains and relation types in your use case.
+
+ ## Limitations
+
+ - Performance may vary significantly across different domains
+ - Best suited for English text
+ - Requires entity spans to be provided for relation extraction
+
+ ## Citation
+
+ If you use this model, please cite the original GLiREL paper:
+
+ ```bibtex
+ @misc{boylan2025glirelgeneralistmodel,
+ title={GLiREL -- Generalist Model for Zero-Shot Relation Extraction},
+ author={Jack Boylan and Chris Hokamp and Demian Gholipour Ghalandari},
+ year={2025},
+ eprint={2501.03172},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2501.03172},
+ }
+ ```
+
+ ## Model Card Authors
+
+ Created by the GLiREL fine-tuning team.
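
The README's Limitations list says entity spans must be provided, while its Usage snippet passes only a raw string and labels. The sketch below shows one way such spans could be built. It assumes the upstream GLiREL convention of `[start, end, type, text]` entries with inclusive token indices; this checkpoint enables an NER module in its config, so its interface may differ. The regex tokenizer, the `find_span` helper, and the entity type strings are hypothetical illustrations, not part of the glirel package.

```python
import re


def tokenize(text):
    """Split into word and punctuation tokens (a simplistic stand-in for a real tokenizer)."""
    return re.findall(r"\w+|[^\w\s]", text)


def find_span(tokens, phrase, label):
    """Return [start, end, label, phrase] with inclusive token indices, or None if absent."""
    target = tokenize(phrase)
    for start in range(len(tokens) - len(target) + 1):
        if tokens[start:start + len(target)] == target:
            return [start, start + len(target) - 1, label, phrase]
    return None


text = "John works at OpenAI in San Francisco."
labels = ["works_at", "located_in", "founded_by"]

tokens = tokenize(text)
ner = [
    find_span(tokens, "John", "PERSON"),        # type strings are illustrative only
    find_span(tokens, "OpenAI", "ORG"),
    find_span(tokens, "San Francisco", "LOC"),
]
print(tokens)
print(ner)

# Assumed call shape, taken from the upstream GLiREL README and not verified
# against this checkpoint:
# relations = model.predict_relations(tokens, labels, threshold=0.0, ner=ner, top_k=1)
```

Keeping span construction separate from the model call makes it easy to swap the regex tokenizer for whatever tokenizer the entity annotations were produced with, since the indices are only meaningful relative to that tokenization.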
glirel_config.json ADDED
@@ -0,0 +1,110 @@
+ {
+ "lr_encoder": "1e-5",
+ "lr_others": "1e-4",
+ "weight_decay_encoder": 0.01,
+ "weight_decay_other": 0.01,
+ "num_steps": 500000,
+ "warmup_ratio": 0.1,
+ "train_batch_size": 8,
+ "eval_every": 15000,
+ "gradient_accumulation": 8,
+ "eval_batch_size": 32,
+ "num_layers_freeze": null,
+ "early_stopping_patience": null,
+ "early_stopping_delta": 0.0,
+ "save_at": [
+ 15000,
+ 30000,
+ 45000,
+ 60000,
+ 75000,
+ 90000,
+ 105000,
+ 120000,
+ 135000,
+ 150000,
+ 165000,
+ 180000,
+ 195000,
+ 210000,
+ 225000,
+ 240000,
+ 255000,
+ 270000,
+ 285000,
+ 300000,
+ 315000,
+ 330000,
+ 345000,
+ 360000,
+ 375000,
+ 390000,
+ 405000,
+ 420000,
+ 435000,
+ 450000,
+ 465000,
+ 480000,
+ 495000,
+ 500000
+ ],
+ "max_saves": 8,
+ "max_width": 6,
+ "model_name": "microsoft/deberta-v3-large",
+ "fine_tune": true,
+ "subtoken_pooling": "first",
+ "hidden_size": 768,
+ "scorer": "dot",
+ "rel_mode": "marker",
+ "span_marker_mode": "markerv1",
+ "refine_prompt": false,
+ "refine_relation": false,
+ "ffn_mul": 4,
+ "dropout": 0.4,
+ "scheduler": "cosine_with_warmup",
+ "loss_func": "binary_cross_entropy_loss",
+ "alpha": 0.6,
+ "gamma": 3,
+ "label_embed_strategy": "both",
+ "use_typed_relations": true,
+ "consistency_loss_weight": 0.1,
+ "enable_ner_module": true,
+ "ner_threshold": 0.5,
+ "ner_fn_loss_weight": 1.5,
+ "ner_loss_weight": 100.0,
+ "rel_loss_weight": 1.0,
+ "ner_threshold_offset": -0.02,
+ "training_phase": "ner_only",
+ "span_f1_target": 0.7,
+ "relation_f1_target": 0.7,
+ "coref_classifier": false,
+ "coref_loss_weight": 10.0,
+ "coreference_label": null,
+ "dataset_name": "custom",
+ "root_dir": "multi_domain",
+ "train_data": [
+ "data/multi_domain_train_processed.jsonl"
+ ],
+ "eval_data": [
+ "data/multi_domain_test_processed.jsonl"
+ ],
+ "prev_path": "./ner-glirel-log/saved_at/model_60000",
+ "size_sup": -1,
+ "num_train_rel_types": 40,
+ "num_unseen_rel_types": 15,
+ "top_k": 1,
+ "random_drop": false,
+ "max_len": 512,
+ "eval_threshold": [
+ 0.1,
+ 0.2,
+ 0.3,
+ 0.5,
+ 0.6,
+ 0.7
+ ],
+ "max_entity_pair_distance": null,
+ "fixed_relation_types": false,
+ "name": "large",
+ "log_dir": "ner-glirel-log-2/"
+ }
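
Several hyperparameters listed in README.md do not match the values in glirel_config.json. The sketch below compares the two, with the values copied by hand from this commit. The mapping from README labels to config keys (for example "Training Steps" to `num_steps`) is an assumption, and the README may describe a different training stage than the one this config records (the config resumes from `prev_path` and sets `training_phase` to `ner_only`).

```python
# Values copied from the README's "Training Configuration" and "Model Architecture" lists.
readme = {
    "num_steps": 15_000,
    "train_batch_size": 6,
    "lr_encoder": 1e-5,
    "lr_others": 5e-5,
    "max_len": 512,
    "eval_every": 4_000,
    "dropout": 0.1,
    "label_embed_strategy": "both",
}

# The same keys as they appear in glirel_config.json (learning rates are strings there).
config = {
    "num_steps": 500000,
    "train_batch_size": 8,
    "lr_encoder": "1e-5",
    "lr_others": "1e-4",
    "max_len": 512,
    "eval_every": 15000,
    "dropout": 0.4,
    "label_embed_strategy": "both",
}


def normalize(value):
    """Parse numeric strings such as "1e-5" so they compare equal to floats."""
    if isinstance(value, str):
        try:
            return float(value)
        except ValueError:
            return value
    return value


mismatches = {
    key: (readme[key], config[key])
    for key in readme
    if normalize(readme[key]) != normalize(config[key])
}

for key, (doc_value, cfg_value) in mismatches.items():
    print(f"{key}: README says {doc_value}, config says {cfg_value}")
```

With the values above, the keys that differ are `num_steps`, `train_batch_size`, `lr_others`, `eval_every`, and `dropout`; `lr_encoder`, `max_len`, and `label_embed_strategy` agree.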
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7109e72d05ee4908506984e08c0cbb5972a4c0b417eb561ded1e85916a031d97
+ size 1951515495
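
The three lines added for pytorch_model.bin are a Git LFS pointer, not the weights themselves. A minimal sketch of reading the pointer fields, with the text copied from this diff; `parse_lfs_pointer` is a hypothetical helper written for illustration, not a Git LFS or Hugging Face API.

```python
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:7109e72d05ee4908506984e08c0cbb5972a4c0b417eb561ded1e85916a031d97
size 1951515495
"""


def parse_lfs_pointer(text):
    """Parse the "key value" lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


info = parse_lfs_pointer(pointer)
print(info["algo"], info["digest"])
print(f"{info['size'] / 1e9:.2f} GB")  # prints "1.95 GB"
```

After downloading the actual file, its SHA-256 digest (for example via `hashlib.sha256`) can be compared against `info["digest"]` to confirm the download is intact.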