๋ฐ์ดํ„ฐ ์ฝœ๋ ˆ์ดํ„ฐ(Data Collator)[[data-collator]]

๋ฐ์ดํ„ฐ ์ฝœ๋ ˆ์ดํ„ฐ๋Š” ๋ฐ์ดํ„ฐ์…‹ ์š”์†Œ๋“ค์˜ ๋ฆฌ์ŠคํŠธ๋ฅผ ์ž…๋ ฅ์œผ๋กœ ์‚ฌ์šฉํ•˜์—ฌ ๋ฐฐ์น˜๋ฅผ ํ˜•์„ฑํ•˜๋Š” ๊ฐ์ฒด์ž…๋‹ˆ๋‹ค. ์ด๋Ÿฌํ•œ ์š”์†Œ๋“ค์€ train_dataset ๋˜๋Š” eval_dataset์˜ ์š”์†Œ๋“ค๊ณผ ๋™์ผํ•œ ํƒ€์ž… ์ž…๋‹ˆ๋‹ค. ๋ฐฐ์น˜๋ฅผ ๊ตฌ์„ฑํ•˜๊ธฐ ์œ„ํ•ด, ๋ฐ์ดํ„ฐ ์ฝœ๋ ˆ์ดํ„ฐ๋Š” (ํŒจ๋”ฉ๊ณผ ๊ฐ™์€) ์ผ๋ถ€ ์ฒ˜๋ฆฌ๋ฅผ ์ ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. [DataCollatorForLanguageModeling]๊ณผ ๊ฐ™์€ ์ผ๋ถ€ ์ฝœ๋ ˆ์ดํ„ฐ๋Š” ํ˜•์„ฑ๋œ ๋ฐฐ์น˜์— (๋ฌด์ž‘์œ„ ๋งˆ์Šคํ‚น๊ณผ ๊ฐ™์€) ์ผ๋ถ€ ๋ฌด์ž‘์œ„ ๋ฐ์ดํ„ฐ ์ฆ๊ฐ•๋„ ์ ์šฉํ•ฉ๋‹ˆ๋‹ค. ์‚ฌ์šฉ ์˜ˆ์‹œ๋Š” ์˜ˆ์ œ ์Šคํฌ๋ฆฝํŠธ๋‚˜ ์˜ˆ์ œ ๋…ธํŠธ๋ถ์—์„œ ์ฐพ์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Default data collator[[transformers.default_data_collator]]

[[autodoc]] data.data_collator.default_data_collator

DefaultDataCollator[[transformers.DefaultDataCollator]]

[[autodoc]] data.data_collator.DefaultDataCollator

DataCollatorWithPadding[[transformers.DataCollatorWithPadding]]

[[autodoc]] data.data_collator.DataCollatorWithPadding

DataCollatorForTokenClassification[[transformers.DataCollatorForTokenClassification]]

[[autodoc]] data.data_collator.DataCollatorForTokenClassification

DataCollatorForSeq2Seq[[transformers.DataCollatorForSeq2Seq]]

[[autodoc]] data.data_collator.DataCollatorForSeq2Seq

DataCollatorForLanguageModeling[[transformers.DataCollatorForLanguageModeling]]

[[autodoc]] data.data_collator.DataCollatorForLanguageModeling
    - numpy_mask_tokens
    - tf_mask_tokens
    - torch_mask_tokens
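The random masking mentioned in the introduction can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the code behind the `*_mask_tokens` methods: the function name `mask_tokens`, its parameters, and the sample ids are hypothetical, and every selected token is replaced by the mask token, whereas the library's recipe also leaves some selected tokens unchanged or swaps in random tokens.

```python
import random
from typing import List, Optional, Set, Tuple

IGNORE_INDEX = -100  # label value that loss functions are commonly configured to ignore


def mask_tokens(
    input_ids: List[int],
    mask_token_id: int,
    special_ids: Set[int],
    mlm_probability: float = 0.15,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: randomly mask non-special tokens and build labels."""
    rng = rng or random.Random()
    masked: List[int] = []
    labels: List[int] = []
    for tok in input_ids:
        if tok not in special_ids and rng.random() < mlm_probability:
            masked.append(mask_token_id)  # hide the token from the model
            labels.append(tok)            # the model must predict the original
        else:
            masked.append(tok)            # keep the token as it is
            labels.append(IGNORE_INDEX)   # no loss at this position
    return masked, labels


# With probability 1.0 every non-special token is masked, so the result is deterministic.
ids = [101, 7592, 2088, 999, 102]
masked, labels = mask_tokens(
    ids, mask_token_id=103, special_ids={101, 102}, mlm_probability=1.0
)
# masked == [101, 103, 103, 103, 102]
# labels == [-100, 7592, 2088, 999, -100]
```

Because the masking is drawn anew each time the collator runs, the same example can be masked differently from one epoch to the next, which is why it counts as data augmentation.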

DataCollatorForWholeWordMask[[transformers.DataCollatorForWholeWordMask]]

[[autodoc]] data.data_collator.DataCollatorForWholeWordMask
    - numpy_mask_tokens
    - tf_mask_tokens
    - torch_mask_tokens

DataCollatorForPermutationLanguageModeling[[transformers.DataCollatorForPermutationLanguageModeling]]

[[autodoc]] data.data_collator.DataCollatorForPermutationLanguageModeling
    - numpy_mask_tokens
    - tf_mask_tokens
    - torch_mask_tokens

DataCollatorWithFlattening[[transformers.DataCollatorWithFlattening]]

[[autodoc]] data.data_collator.DataCollatorWithFlattening