diff --git "a/run_20251218_140435/README.md" "b/run_20251218_140435/README.md"
--- "a/run_20251218_140435/README.md"
+++ "b/run_20251218_140435/README.md"
@@ -69,4 +69,466 @@ Cite TRL as:
 publisher = {GitHub},
 howpublished = {\url{https://github.com/huggingface/trl}}
 }
-```
\ No newline at end of file
+```
+
+Analyze this:
+unsloth@e150a2d13ef8:/workspace$ git clone https://github.com/gitpullpull/Introspective_Temperature
+Cloning into 'Introspective_Temperature'...
+remote: Enumerating objects: 24, done.
+remote: Counting objects: 100% (24/24), done.
+remote: Compressing objects: 100% (17/17), done.
+remote: Total 24 (delta 6), reused 21 (delta 3), pack-reused 0 (from 0)
+Receiving objects: 100% (24/24), 9.12 MiB | 20.25 MiB/s, done.
+Resolving deltas: 100% (6/6), done.
+unsloth@e150a2d13ef8:/workspace$ cd Introspective_Temperature/
+unsloth@e150a2d13ef8:/workspace/Introspective_Temperature$ bash run_job.sh
+=== Job Started at 20251218_140407 ===
+Logs will be saved to: training_log_20251218_140407.txt
+[1/2] Installing dependencies...
+Requirement already satisfied: pandas in /opt/conda/lib/python3.11/site-packages (2.3.3)
+Requirement already satisfied: matplotlib in /opt/conda/lib/python3.11/site-packages (3.10.8)
+Requirement already satisfied: huggingface_hub in /opt/conda/lib/python3.11/site-packages (0.36.0)
+Requirement already satisfied: numpy>=1.23.2 in /opt/conda/lib/python3.11/site-packages (from pandas) (2.2.6)
+Requirement already satisfied: python-dateutil>=2.8.2 in /opt/conda/lib/python3.11/site-packages (from pandas) (2.9.0.post0)
+Requirement already satisfied: pytz>=2020.1 in /opt/conda/lib/python3.11/site-packages (from pandas) (2025.2)
+Requirement already satisfied: tzdata>=2022.7 in /opt/conda/lib/python3.11/site-packages (from pandas) (2025.2)
+Requirement already satisfied: contourpy>=1.0.1 in /opt/conda/lib/python3.11/site-packages (from matplotlib) (1.3.3)
+Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.11/site-packages (from matplotlib) (0.12.1)
+Requirement already satisfied: fonttools>=4.22.0 in /opt/conda/lib/python3.11/site-packages (from matplotlib) (4.61.0)
+Requirement already satisfied: kiwisolver>=1.3.1 in /opt/conda/lib/python3.11/site-packages (from matplotlib) (1.4.9)
+Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.11/site-packages (from matplotlib) (25.0)
+Requirement already satisfied: pillow>=8 in /opt/conda/lib/python3.11/site-packages (from matplotlib) (11.3.0)
+Requirement already satisfied: pyparsing>=3 in /opt/conda/lib/python3.11/site-packages (from matplotlib) (3.2.5)
+Requirement already satisfied: filelock in /opt/conda/lib/python3.11/site-packages (from huggingface_hub) (3.20.0)
+Requirement already satisfied: fsspec>=2023.5.0 in /opt/conda/lib/python3.11/site-packages (from huggingface_hub) (2025.3.0)
+Requirement already satisfied: pyyaml>=5.1 in /opt/conda/lib/python3.11/site-packages (from huggingface_hub) (6.0.3)
+Requirement already satisfied: requests in /opt/conda/lib/python3.11/site-packages (from huggingface_hub) (2.32.5)
+Requirement already satisfied: tqdm>=4.42.1 in /opt/conda/lib/python3.11/site-packages (from huggingface_hub) (4.67.1)
+Requirement already satisfied: typing-extensions>=3.7.4.3 in /opt/conda/lib/python3.11/site-packages (from huggingface_hub) (4.15.0)
+Requirement already satisfied: hf-xet<2.0.0,>=1.1.3 in /opt/conda/lib/python3.11/site-packages (from huggingface_hub) (1.2.0)
+Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.11/site-packages (from python-dateutil>=2.8.2->pandas) (1.17.0)
+Requirement already satisfied: charset_normalizer<4,>=2 in /opt/conda/lib/python3.11/site-packages (from requests->huggingface_hub) (3.4.4)
+Requirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.11/site-packages (from requests->huggingface_hub) (3.11)
+Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/conda/lib/python3.11/site-packages (from requests->huggingface_hub) (2.6.2)
+Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.11/site-packages (from requests->huggingface_hub) (2025.11.12)
+[2/2] Starting Train-ORPO.py...
+🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
+TMA benchmarks will be running without grid constant TMA descriptor.
+🦥 Unsloth Zoo will now patch everything to make training faster!
+Unsloth: FBGEMM on the current GPU cannot load - will switch to Triton kernels
+[unsloth_zoo.log|WARNING]Unsloth: Failed to import trl openenv: No module named 'trl.experimental.openenv'
+Run Timestamp: 20251218_140435
+Hugging Face logged in. Target Repo: gitpullpull/Introspective_Temperature_test
+Loading model: Unsloth/qwen3-8b...
+==((====))==  Unsloth 2025.12.4: Fast Qwen3 patching. Transformers: 4.57.1. vLLM: 0.11.2.
+   \\   /|    NVIDIA GeForce RTX 5090. Num GPUs = 1. Max memory: 31.367 GB. Platform: Linux.
+O^O/ \_/ \    Torch: 2.9.0+cu128. CUDA: 12.0. CUDA Toolkit: 12.8. Triton: 3.5.0
+\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.33.post1. FA2 = False]
+ "-____-"     Free license: http://github.com/unslothai/unsloth
+Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
+The following TP rules were not applied on any of the layers: {'layers.*.self_attn.q_proj': 'colwise', 'layers.*.self_attn.k_proj': 'colwise', 'layers.*.self_attn.v_proj': 'colwise', 'layers.*.self_attn.o_proj': 'rowwise', 'layers.*.mlp.gate_proj': 'colwise', 'layers.*.mlp.up_proj': 'colwise', 'layers.*.mlp.down_proj': 'rowwise'}
+The following layers were not sharded: lm_head.weight, model.embed_tokens.weight, model.layers.*.self_attn.k_norm.weight, model.layers.*.post_attention_layernorm.weight, model.layers.*.self_attn.q_norm.weight, model.norm.weight, model.layers.*.input_layernorm.weight
+Loading checkpoint shards: 100%|████████████████████| 4/4 [00:01<00:00,  2.46it/s]
+Unsloth: Dropout = 0 is supported for fast patching. You are using dropout = 0.05.
+Unsloth will patch all other layers, except LoRA matrices, causing a performance hit.
+Unsloth 2025.12.4 patched 36 layers with 0 QKV layers, 0 O layers and 0 MLP layers.
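As a sanity check on the trainable-parameter count reported later in this log, the 87,293,952 LoRA parameters are exactly consistent with rank-32 adapters on all seven attention/MLP projections of the 36 patched layers. The layer dimensions below are the publicly documented Qwen3-8B shapes, and r=32 is an inference from the arithmetic, not something the log states:

```python
# Back-of-the-envelope check (assumed Qwen3-8B dimensions, assumed LoRA rank).
hidden, inter, kv = 4096, 12288, 1024  # hidden size, MLP width, KV projection width
layers, r = 36, 32                     # 36 layers per the log; r = 32 is an assumption

shapes = [
    (hidden, hidden),  # q_proj
    (hidden, kv),      # k_proj
    (hidden, kv),      # v_proj
    (hidden, hidden),  # o_proj
    (hidden, inter),   # gate_proj
    (hidden, inter),   # up_proj
    (inter, hidden),   # down_proj
]

# Each LoRA adapter on an (in_dim -> out_dim) linear adds r*(in_dim + out_dim) weights.
trainable = layers * sum(r * (i + o) for i, o in shapes)
print(trainable)  # 87293952 — matches "Trainable parameters" in the banner below
```

If a different rank were used, this total would not land on the logged figure, which is why r=32 is the natural inference here.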
+Generating train split: 2897 examples [00:00, 48596.01 examples/s]
+Filter: 100%|████████████████████| 2897/2897 [00:00<00:00, 45470.16 examples/s]
+Map: 100%|████████████████████| 2897/2897 [00:00<00:00, 5254.68 examples/s]
+Map (num_proc=64): 100%|████████████████████| 2897/2897 [00:07<00:00, 373.47 examples/s]
+Map (num_proc=64): 100%|████████████████████| 2897/2897 [00:16<00:00, 170.57 examples/s]
+Map (num_proc=64): 100%|████████████████████| 2897/2897 [00:16<00:00, 174.46 examples/s]
+Starting Lion 8bit ORPO training (LR: 1e-06)...
+The model is already on multiple devices. Skipping the move to device specified in `args`.
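The training started here is ORPO (Odds Ratio Preference Optimization): a standard NLL loss on the chosen response plus an odds-ratio penalty that pushes chosen above rejected. A minimal sketch of that penalty term, not the repo's actual Train-ORPO.py code; the weight `beta` and the scalar-probability framing are simplifying assumptions:

```python
import math

def orpo_penalty(logp_chosen: float, logp_rejected: float, beta: float = 0.1) -> float:
    """Odds-ratio term of the ORPO loss (illustrative sketch).

    log_odds(p) = log(p / (1 - p)); the penalty is
    -beta * log(sigmoid(log_odds_chosen - log_odds_rejected)),
    so it shrinks as the model prefers the chosen response.
    """
    def log_odds(logp: float) -> float:
        p = math.exp(logp)
        return math.log(p) - math.log(1.0 - p)

    ratio = log_odds(logp_chosen) - log_odds(logp_rejected)
    return -beta * math.log(1.0 / (1.0 + math.exp(-ratio)))

# When both responses are equally likely the penalty is beta * log(2);
# a strongly preferred chosen response drives it toward zero.
print(orpo_penalty(math.log(0.5), math.log(0.5)))
print(orpo_penalty(math.log(0.9), math.log(0.1)))
```

In the real trainer the log-probabilities are length-averaged sequence log-likelihoods under the model, and the penalty is added to the chosen response's cross-entropy loss.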
+==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
+   \\   /|    Num examples = 2,897 | Num Epochs = 1 | Total steps = 363
+O^O/ \_/ \    Batch size per device = 1 | Gradient accumulation steps = 8
+\        /    Data Parallel GPUs = 1 | Total batch size (1 x 8 x 1) = 8
+ "-____-"     Trainable parameters = 87,293,952 of 8,278,029,312 (1.05% trained)
+  0%|          | 0/363 [00:00
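The numbers in the banner above are internally consistent and can be reproduced directly from the logged settings (this is plain arithmetic, not code from the repository):

```python
import math

# Values printed in the Unsloth banner above.
num_examples = 2897
per_device_batch = 1
grad_accum = 8
num_gpus = 1

# Effective batch size: 1 x 8 x 1 = 8, as printed.
total_batch = per_device_batch * grad_accum * num_gpus

# One epoch of 2,897 examples at effective batch 8 -> ceil(2897 / 8) = 363 steps.
steps = math.ceil(num_examples / total_batch)

# Trainable fraction: 87,293,952 of 8,278,029,312 parameters, about 1.05%.
pct = 100 * 87_293_952 / 8_278_029_312

print(total_batch, steps, round(pct, 2))
```

The 363 total steps and "1.05% trained" figures fall straight out of these values, confirming the trainer ran a single epoch with gradient accumulation rather than a larger true batch.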