IMDB Trainer

This folder implements the Hugging Face Trainer workflow as step-by-step code and also contains the run script for IMDB text classification.

Page implementation code

  • page_01_basic_trainer.py: The basic flow: load IMDB, apply the tokenizer, then TrainingArguments, Trainer, train, evaluate, predict, and save_model.
  • page_02_resume_training.py: A resumed-training flow that uses save_strategy="steps" and resume_from_checkpoint=True.
  • page_03_plot_curve.py: Reads learning_curve.jsonl, saves loss, accuracy, precision, and f1 plots, and marks the position of the best checkpoint.
  • page_04_stage_finetuning.py: Stage 1 freezes the backbone and trains only the head; stage 2 loads only the weights of the best checkpoint and trains again with a new optimizer and scheduler.
  • curve_logger.py: A Trainer callback that saves per-step logs and evaluation metrics as JSONL.
  • trainer_utils.py: Functions for data preprocessing, metric computation, Trainer/TrainingArguments version compatibility, and best-checkpoint lookup.
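
The JSONL logging and best-checkpoint lookup described above can be sketched with the standard library alone. This is a minimal sketch, not the code in curve_logger.py or trainer_utils.py: the record layout (one JSON object per line with a `step` key plus whatever was logged), the metric name `eval_f1`, and the helper names `append_record`, `read_records`, and `best_step` are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def append_record(path, step, logs):
    """Append one log dict as a JSON line, tagged with the training step."""
    record = {"step": step, **logs}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_records(path):
    """Read every non-empty JSON line back into a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def best_step(records, metric="eval_f1", greater_is_better=True):
    """Return the step whose record has the best value for `metric`, or None."""
    scored = [r for r in records if metric in r]
    if not scored:
        return None
    pick = max if greater_is_better else min
    return pick(scored, key=lambda r: r[metric])["step"]


# Hypothetical log values, used only to exercise the helpers.
with tempfile.TemporaryDirectory() as tmp:
    curve_path = Path(tmp) / "learning_curve.jsonl"
    append_record(curve_path, 50, {"loss": 0.69})
    append_record(curve_path, 100, {"eval_loss": 0.55, "eval_f1": 0.71})
    append_record(curve_path, 200, {"eval_loss": 0.48, "eval_f1": 0.78})
    append_record(curve_path, 300, {"eval_loss": 0.51, "eval_f1": 0.76})
    records = read_records(curve_path)

print(best_step(records))
print(best_step(records, metric="eval_loss", greater_is_better=False))
```

In a Trainer callback, the `on_log` hook receives the `logs` dict and `state.global_step`, so a helper like `append_record` would be called from there.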

Custom model

  • custom_text_config.py: The configuration file for the custom model.
  • custom_text_classifier.py: A text classification model based on PreTrainedModel.

Setup

uv sync

Running the page code

For a quick check, you can run the scripts with a reduced number of samples.

uv run python page_01_basic_trainer.py --max-train-samples 128 --max-eval-samples 64 --epochs 1
uv run python page_02_resume_training.py --max-train-samples 128 --max-eval-samples 64 --epochs 1
uv run python page_03_plot_curve.py --run-dir results/page_02_resume
uv run python page_04_stage_finetuning.py --max-train-samples 128 --max-eval-samples 64
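
The sample-limit flags in these commands can be sketched as follows. This is a minimal sketch, assuming argparse and simple truncation; the page scripts may select samples differently (for example, shuffling first), the default of 3 epochs is an assumption, and `build_parser` and `take_subset` are hypothetical names.

```python
import argparse


def build_parser():
    """Define the quick-run flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Quick-run options (illustrative)")
    parser.add_argument("--max-train-samples", type=int, default=None)
    parser.add_argument("--max-eval-samples", type=int, default=None)
    # Assumed default; the real scripts may use a different value.
    parser.add_argument("--epochs", type=float, default=3.0)
    return parser


def take_subset(items, limit):
    """Keep the first `limit` items; None means keep everything."""
    if limit is None:
        return items
    return items[: min(limit, len(items))]


args = build_parser().parse_args(
    ["--max-train-samples", "128", "--max-eval-samples", "64", "--epochs", "1"]
)
# Stand-ins for the 25,000-example IMDB train and test splits.
train_items = take_subset(list(range(25000)), args.max_train_samples)
eval_items = take_subset(list(range(25000)), args.max_eval_samples)
print(len(train_items), len(eval_items), args.epochs)
```

With a `datasets.Dataset`, the equivalent of the slice is `dataset.select(range(limit))`.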

Running IMDB training

Default model:

uv run python homework_0528_imdb.py --model-type auto --epochs 3

Custom model:

uv run python homework_0528_imdb.py --model-type custom --epochs 3

Resuming training:

uv run python homework_0528_imdb.py --model-type custom --output-dir outputs/20260610_210000_custom --resume
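
Resuming from an output directory means locating the most recent checkpoint inside it. The following is a minimal sketch, assuming checkpoints use the Trainer's `checkpoint-<step>` folder naming; `latest_checkpoint` is a hypothetical helper and not necessarily what homework_0528_imdb.py does. The comparison is numeric on purpose: a plain string sort would rank checkpoint-500 above checkpoint-1500.

```python
import re
import tempfile
from pathlib import Path

CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def latest_checkpoint(output_dir):
    """Return the checkpoint-<step> subfolder with the highest step, or None."""
    candidates = []
    for child in Path(output_dir).iterdir():
        match = CHECKPOINT_RE.match(child.name)
        if child.is_dir() and match:
            candidates.append((int(match.group(1)), child))
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[0])[1]


# Build a throwaway run directory to exercise the helper.
with tempfile.TemporaryDirectory() as tmp:
    for step in (500, 1000, 1500):
        (Path(tmp) / f"checkpoint-{step}").mkdir()
    (Path(tmp) / "learning_curve.jsonl").touch()  # non-checkpoint files are ignored
    latest_name = latest_checkpoint(tmp).name

print(latest_name)
```

The transformers library ships a comparable helper, `transformers.trainer_utils.get_last_checkpoint`, and `trainer.train(resume_from_checkpoint=True)` resumes from the last checkpoint in the Trainer's output directory.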

You can also resume training by specifying a checkpoint directly.

uv run python homework_0528_imdb.py --model-type custom --resume-checkpoint outputs/20260610_210000_custom/checkpoint-1000

Run results are saved in a folder named outputs/<date>_<time>_<model-type>.
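
A minimal sketch of this naming scheme, assuming the timestamp format `%Y%m%d_%H%M%S` inferred from the example folder 20260610_210000_custom; `make_run_dir` is a hypothetical helper, and the script's actual implementation may differ.

```python
from datetime import datetime
from pathlib import Path


def make_run_dir(output_root, model_type, now=None):
    """Build <output_root>/<YYYYMMDD>_<HHMMSS>_<model_type> without creating it."""
    now = now or datetime.now()
    return Path(output_root) / f"{now:%Y%m%d_%H%M%S}_{model_type}"


# A fixed timestamp makes the result reproducible for this example.
run_dir = make_run_dir("outputs", "custom", now=datetime(2026, 6, 10, 21, 0, 0))
print(run_dir.as_posix())
```

Passing a different `output_root`, such as a mounted Drive path, changes only the parent folder.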

Colab Google Drive path

After mounting Drive, set --output-root to a Drive path to save checkpoints to Drive.

uv run python homework_0528_imdb.py --model-type custom --output-root /content/drive/MyDrive/imdb_trainer_outputs