NLP2025_HW3 / dataset

Commit History

Add validation_pairs.json (223 samples, stratified split)
0e740e5
verified

Francesco77 committed on

Add train_pairs.json (257 samples, stratified split)
3608b89
verified

Francesco77 committed on

Add validation_pairs.json (96 samples, stratified split)
92147d5
verified

Francesco77 committed on

Add train_pairs.json (384 samples, stratified split)
1f9cfbd
verified

Francesco77 committed on

Add sentences_stats.json (lengths per sentence & subsentences)
f361d1b
verified

Francesco77 committed on

Add dataset_full_aligned_tagged.json (with sentence/subsentence ids)
06d6826
verified

Francesco77 committed on

Add sentences_stats.json (lengths per sentence & subsentences)
a769331
verified

Francesco77 committed on

Add dataset_full_aligned_tagged.json (with sentence/subsentence ids)
4c49baa
verified

Francesco77 committed on

Add dataset_full_aligned.json (gold↔ocr pairs with metadata)
8ed276f
verified

Francesco77 committed on

Delete dataset/eng/aligned_pairs_first4.json
b685994
verified

Francesco77 committed on

Delete dataset/eng/aligned_pairs_first5.json
4522849
verified

Francesco77 committed on

Add aligned_pairs_first4.json (gold↔ocr pairs with metadata)
592e1c3
verified

Francesco77 committed on

Add aligned_pairs_first4.json (gold↔ocr pairs with metadata)
141a59b
verified

Francesco77 committed on

Add aligned_pairs_first4.json (gold↔ocr pairs with metadata)
54cad20
verified

Francesco77 committed on

Upload initial dataset
4fc64e6
verified

Francesco77 committed on