park990/hihi_model

	---
	license: apache-2.0
	datasets:
	- thunlp/docred
	---
	## Evaluation Results

	Evaluation was conducted on the DocRED dev set.

	### Dev Set Performance
	- Best Epoch: 29 / 30
	- Training Loss: 0.0023

	### Main Metrics
	- Micro F1: 59.25%
	- Precision: 63.11%
	- Recall: 55.83%
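
As a quick consistency check, the reported Micro F1 can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The sketch below is illustrative only; the helper name `f1_from_pr` is hypothetical and is not part of this repository.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (same scale in, same scale out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported above, in percent.
f1 = f1_from_pr(63.11, 55.83)
print(f"{f1:.2f}")  # prints 59.25
```

The recomputed value matches the reported Micro F1 of 59.25%.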

	### Interpretation
	This V2 checkpoint achieves a Micro F1 of 59.25% on the DocRED dev set, with a Precision of 63.11% and a Recall of 55.83%.
	Compared to V1 (Micro F1 60.71%, Precision 65.34%, Recall 56.70%), V2 is 1.46 points lower in F1, 2.23 points lower in precision, and 0.87 points lower in recall.
	In V2, precision exceeds recall by 7.28 points, which suggests the model predicts conservatively: it produces fewer false positives at the cost of missing some gold relations.

	### V1 vs V2 Comparison

	| Metric    | V1 (best_model_f1_56_64) | V2 (best_model_V2) |
	|-----------|--------------------------|--------------------|
	| Micro F1  | 60.71%                   | 59.25%             |
	| Precision | 65.34%                   | 63.11%             |
	| Recall    | 56.70%                   | 55.83%             |
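
The gaps between the two checkpoints can be read off the table directly. The following minimal sketch computes the V2 minus V1 difference for each metric in percentage points; the dictionary layout is an assumption made for illustration, and the numbers are copied from the table.

```python
# (V1, V2) scores in percent, copied from the comparison table.
metrics = {
    "Micro F1": (60.71, 59.25),
    "Precision": (65.34, 63.11),
    "Recall": (56.70, 55.83),
}

# Difference V2 - V1 in percentage points, rounded to two decimals.
deltas = {name: round(v2 - v1, 2) for name, (v1, v2) in metrics.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f} points")
# Micro F1: -1.46 points
# Precision: -2.23 points
# Recall: -0.87 points
```

Precision accounts for the larger share of the drop, while recall is nearly unchanged.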

	### Notes
	- This model is designed for document-level relation extraction on the DocRED benchmark.
	- V2 was trained as an ablation/comparison run against V1 to verify reproducibility and threshold sensitivity.
	- Performance may vary depending on preprocessing details, threshold settings, and evaluation configuration.
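
To illustrate what threshold sensitivity means here, the sketch below sweeps a single global score threshold over a handful of scored relation facts and recomputes micro precision, recall, and F1 at each setting. Everything in it is an assumption for illustration: the toy scores, the fact tuples, the `micro_prf` helper, and the use of one global threshold, which is not necessarily how this checkpoint selects relations.

```python
def micro_prf(scored, gold, threshold):
    """Micro precision, recall, and F1 for facts whose score is >= threshold.

    scored: dict mapping a fact tuple to a confidence score (toy values).
    gold:   set of fact tuples considered correct.
    """
    predicted = {fact for fact, score in scored.items() if score >= threshold}
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical facts: (document id, head entity, tail entity, relation label).
scored = {
    ("doc1", "e0", "e1", "P17"): 0.95,
    ("doc1", "e0", "e2", "P131"): 0.80,
    ("doc1", "e1", "e2", "P27"): 0.55,
    ("doc1", "e2", "e3", "P17"): 0.60,   # not in gold: a false positive when kept
    ("doc2", "e0", "e1", "P17"): 0.30,
    ("doc2", "e1", "e0", "P131"): 0.40,  # not in gold: a false positive when kept
}
gold = {
    ("doc1", "e0", "e1", "P17"),
    ("doc1", "e0", "e2", "P131"),
    ("doc1", "e1", "e2", "P27"),
    ("doc2", "e0", "e1", "P17"),
}

results = {t: micro_prf(scored, gold, t) for t in (0.25, 0.5, 0.75)}
for t, (p, r, f1) in results.items():
    print(f"threshold={t:.2f}  P={p:.2f}  R={r:.2f}  F1={f1:.2f}")
```

On this toy data, raising the threshold from 0.25 to 0.75 lifts precision from 0.67 to 1.00 while recall falls from 1.00 to 0.50, which is the kind of trade-off a threshold setting can introduce between otherwise identical runs.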

	## License and Dataset Notice

	### Code / Model License
	This project is built upon several open-source works:

	- HuggingFace Transformers / BERT — Apache License 2.0
	- ATLOP — MIT License
	- GAIN — MIT License
	- DREEAM — based on the original paper and implementation references

	### Dataset Notice
	This model is trained and evaluated on the DocRED dataset.
	DocRED is intended for research use. Users should separately review the dataset's original terms and conditions before any redistribution or commercial use.

	### Intended Use
	This repository is intended for:
	- academic research
	- experimentation on document-level relation extraction
	- knowledge graph construction pipelines
	- benchmark comparison and ablation studies

	It should not be used in production without additional validation.
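
For the knowledge graph use case, extracted relations are typically consumed as (head, relation, tail) triples. The sketch below shows one simple way such triples could be grouped into an adjacency map. The `build_graph` helper, the example entities, and the Wikidata-style property IDs used as relation labels are all hypothetical; this repository does not define an output format, so the triple layout is an assumption.

```python
from collections import defaultdict


def build_graph(triples):
    """Group (head, relation, tail) triples into a head -> [(relation, tail)] map."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return dict(graph)


# Hypothetical model output after entity linking and relation extraction.
triples = [
    ("Paris", "P17", "France"),
    ("Paris", "P1376", "France"),
    ("France", "P30", "Europe"),
]

graph = build_graph(triples)
for head, edges in graph.items():
    for relation, tail in edges:
        print(f"{head} --{relation}--> {tail}")
```

Given the recall reported above, a graph built this way will miss a substantial share of true relations, so downstream consumers may want to treat absent edges as unknown rather than false.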