HCAI-Lab/dolma3-6t-sample-1000-docs / sample_contract.json
481 Bytes
{
  "WORKING_SAMPLE_TOKEN_FLOOR_PER_BIN": 0,
  "WORKING_SAMPLE_DOCS_PER_BIN": 1000,
  "WORKING_SAMPLE_GLOBAL_TOKEN_BUDGET": null,
  "WORKING_SAMPLE_MIN_TOKEN_COUNT": 512,
  "WORKING_SAMPLE_MAX_TOKEN_COUNT": null,
  "WORKING_SAMPLE_REALIZED_TOKEN_TOTAL": 1409023370,
  "WORKING_SAMPLE_REALIZED_DOC_COUNT": 574578,
  "WORKING_SAMPLE_UNDERFILLED_BIN_COUNT": 4,
  "WORKING_SAMPLE_COVERED_BIN_COUNT": 576,
  "WORKING_SAMPLE_TOTAL_BIN_COUNT": 576,
  "WORKING_SAMPLE_SAMPLING_SEED": 42
}
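
The fields above can be cross-checked against each other. The following is a minimal Python sketch, not part of the dataset's tooling. The function name `check_contract` and the field interpretations are assumptions inferred from the key names only: `DOCS_PER_BIN` is read as a per-bin document cap, and `MIN_TOKEN_COUNT` as a per-document minimum length.

```python
import json

# The contract as published in sample_contract.json.
CONTRACT_JSON = """
{
  "WORKING_SAMPLE_TOKEN_FLOOR_PER_BIN": 0,
  "WORKING_SAMPLE_DOCS_PER_BIN": 1000,
  "WORKING_SAMPLE_GLOBAL_TOKEN_BUDGET": null,
  "WORKING_SAMPLE_MIN_TOKEN_COUNT": 512,
  "WORKING_SAMPLE_MAX_TOKEN_COUNT": null,
  "WORKING_SAMPLE_REALIZED_TOKEN_TOTAL": 1409023370,
  "WORKING_SAMPLE_REALIZED_DOC_COUNT": 574578,
  "WORKING_SAMPLE_UNDERFILLED_BIN_COUNT": 4,
  "WORKING_SAMPLE_COVERED_BIN_COUNT": 576,
  "WORKING_SAMPLE_TOTAL_BIN_COUNT": 576,
  "WORKING_SAMPLE_SAMPLING_SEED": 42
}
"""


def check_contract(contract: dict) -> dict:
    """Hypothetical consistency check; field meanings are assumed from names."""
    p = "WORKING_SAMPLE_"
    total_bins = contract[p + "TOTAL_BIN_COUNT"]
    covered_bins = contract[p + "COVERED_BIN_COUNT"]
    underfilled_bins = contract[p + "UNDERFILLED_BIN_COUNT"]
    docs_per_bin = contract[p + "DOCS_PER_BIN"]
    realized_docs = contract[p + "REALIZED_DOC_COUNT"]
    realized_tokens = contract[p + "REALIZED_TOKEN_TOTAL"]
    min_tokens = contract[p + "MIN_TOKEN_COUNT"]  # may be null (None)

    # Bin counts should nest: underfilled <= covered <= total.
    assert 0 <= underfilled_bins <= covered_bins <= total_bins

    # Assumption: DOCS_PER_BIN caps each bin, so the realized document
    # count cannot exceed docs_per_bin * total_bins.
    max_docs = docs_per_bin * total_bins
    assert realized_docs <= max_docs

    # Assumption: every sampled document has at least MIN_TOKEN_COUNT
    # tokens, which gives a lower bound on the realized token total.
    if min_tokens is not None:
        assert realized_tokens >= realized_docs * min_tokens

    return {
        "doc_shortfall": max_docs - realized_docs,
        "mean_tokens_per_doc": realized_tokens / realized_docs,
    }


summary = check_contract(json.loads(CONTRACT_JSON))
print(summary)
```

Under these assumptions, `doc_shortfall` is the number of documents missing from a full sample of 1000 per bin, which would be spread across the 4 underfilled bins.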