tempgraphrag-artifacts / logs /eval-multitq.log
[mtq-eval] loading KG from ~/temporal-aware-graphrag/data/MultiTQ/MultiTQ/kg
[mtq-eval] 461,329 triples
[mtq-eval] building retriever (k=15, hops=2)
[mtq-eval] loading questions from ~/temporal-aware-graphrag/data/MultiTQ/MultiTQ/questions/test.json
[mtq-eval] stratified subset: 1,496 of 54,584
[mtq-eval] pre-retrieving evidence per question
[mtq-eval] 1,496/1,496 got >=1 triple
[mtq-eval] loading policy from ../checkpoints/sft-multitq/final
Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
Loading weights: 100%|██████████| 399/399 [00:00<00:00, 5259.26it/s]
[mtq-eval] generating predictions (bs=8)
[transformers] The following generation flags are not valid and may be ignored: ['temperature', 'top_p', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
[mtq-eval] progress: 8/1496
[mtq-eval] progress: 208/1496
[mtq-eval] progress: 408/1496
[mtq-eval] progress: 608/1496
[mtq-eval] progress: 808/1496
[mtq-eval] progress: 1008/1496
[mtq-eval] progress: 1208/1496
[mtq-eval] progress: 1408/1496
[mtq-eval] wrote ../outputs/eval/multitq-v3-sft.json
[mtq-eval] OVERALL: n=1496 EM=0.280 F1=0.315
[mtq-eval] by qtype:
after_first: n=187 EM=0.091
before_after: n=187 EM=0.481
before_last: n=187 EM=0.118
equal: n=374 EM=0.447
equal_multi: n=187 EM=0.246
first_last: n=374 EM=0.206
[mtq-eval] by answer_type:
entity: n=1122 EM=0.289
time: n=374 EM=0.254