============================================================
TD CLEAN v2 Research-backed. Every line proven.
============================================================
System optimization:
  GPU persistent mode: ON (no cold-start delay)
  GPU clocks: MAX (no throttling)
  CPU governor: performance (max frequency)
  OpenMP/MKL threads: 28
  NUMA affinity: pinned to first 8 cores
Pre-caching datasets from HuggingFace Hub...
Attempting flash-attn install (optional, ~3-4x faster attention)...
error: subprocess-exited-with-error

× Building wheel for flash-attn (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [209 lines of output]
torch.__version__ = 2.10.0+cu128
/venv/main/lib/python3.10/site-packages/setuptools/dist.py:759: SetuptoolsDeprecationWarning: License classifiers are deprecated.
!!

********************************************************************************
Please consider removing the following classifiers in favor of a SPDX license expression:

License :: OSI Approved :: BSD License

See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license for details.
********************************************************************************

!!
self._finalize_license_expression()
running bdist_wheel
Guessing wheel URL: https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu12torch2.10cxx11abiTRUE-cp310-cp310-linux_x86_64.whl
Precompiled wheel not found. Building from source...
running build running build_py creating build/lib.linux-x86_64-cpython-310/flash_attn copying flash_attn/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn copying flash_attn/bert_padding.py -> build/lib.linux-x86_64-cpython-310/flash_attn copying flash_attn/flash_attn_interface.py -> build/lib.linux-x86_64-cpython-310/flash_attn copying flash_attn/flash_attn_triton.py -> build/lib.linux-x86_64-cpython-310/flash_attn copying flash_attn/flash_attn_triton_og.py -> build/lib.linux-x86_64-cpython-310/flash_attn copying flash_attn/flash_blocksparse_attention.py -> build/lib.linux-x86_64-cpython-310/flash_attn copying flash_attn/flash_blocksparse_attn_interface.py -> build/lib.linux-x86_64-cpython-310/flash_attn creating build/lib.linux-x86_64-cpython-310/hopper copying hopper/__init__.py -> build/lib.linux-x86_64-cpython-310/hopper copying hopper/benchmark_attn.py -> build/lib.linux-x86_64-cpython-310/hopper copying hopper/benchmark_flash_attention_fp8.py -> build/lib.linux-x86_64-cpython-310/hopper copying hopper/benchmark_mla_decode.py -> build/lib.linux-x86_64-cpython-310/hopper copying hopper/benchmark_split_kv.py -> build/lib.linux-x86_64-cpython-310/hopper copying hopper/flash_attn_interface.py -> build/lib.linux-x86_64-cpython-310/hopper copying hopper/generate_kernels.py -> build/lib.linux-x86_64-cpython-310/hopper copying hopper/padding.py -> build/lib.linux-x86_64-cpython-310/hopper copying hopper/setup.py -> build/lib.linux-x86_64-cpython-310/hopper copying hopper/test_attn_kvcache.py -> build/lib.linux-x86_64-cpython-310/hopper copying hopper/test_flash_attn.py -> build/lib.linux-x86_64-cpython-310/hopper copying hopper/test_kvcache.py -> build/lib.linux-x86_64-cpython-310/hopper copying hopper/test_util.py -> build/lib.linux-x86_64-cpython-310/hopper creating build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/ampere_helpers.py -> 
build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/blackwell_helpers.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/block_info.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/fast_math.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/flash_bwd.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/flash_bwd_postprocess.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/flash_bwd_preprocess.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/flash_fwd.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/flash_fwd_sm100.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/hopper_helpers.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/interface.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/mask.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/mma_sm100_desc.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/named_barrier.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/pack_gqa.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/pipeline.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/seqlen_info.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/softmax.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/tile_scheduler.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute copying flash_attn/cute/utils.py -> build/lib.linux-x86_64-cpython-310/flash_attn/cute creating build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/__init__.py -> 
build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/bench.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/bwd_prefill.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/bwd_prefill_fused.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/bwd_prefill_onekernel.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/bwd_prefill_split.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/bwd_ref.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/fp8.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/fwd_decode.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/fwd_prefill.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/fwd_ref.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/interface_fa.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/test.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/train.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd copying flash_attn/flash_attn_triton_amd/utils.py -> build/lib.linux-x86_64-cpython-310/flash_attn/flash_attn_triton_amd creating build/lib.linux-x86_64-cpython-310/flash_attn/layers copying flash_attn/layers/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/layers copying 
flash_attn/layers/patch_embed.py -> build/lib.linux-x86_64-cpython-310/flash_attn/layers copying flash_attn/layers/rotary.py -> build/lib.linux-x86_64-cpython-310/flash_attn/layers creating build/lib.linux-x86_64-cpython-310/flash_attn/losses copying flash_attn/losses/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/losses copying flash_attn/losses/cross_entropy.py -> build/lib.linux-x86_64-cpython-310/flash_attn/losses creating build/lib.linux-x86_64-cpython-310/flash_attn/models copying flash_attn/models/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models copying flash_attn/models/baichuan.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models copying flash_attn/models/bert.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models copying flash_attn/models/bigcode.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models copying flash_attn/models/btlm.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models copying flash_attn/models/falcon.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models copying flash_attn/models/gpt.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models copying flash_attn/models/gpt_neox.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models copying flash_attn/models/gptj.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models copying flash_attn/models/llama.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models copying flash_attn/models/opt.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models copying flash_attn/models/vit.py -> build/lib.linux-x86_64-cpython-310/flash_attn/models creating build/lib.linux-x86_64-cpython-310/flash_attn/modules copying flash_attn/modules/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/modules copying flash_attn/modules/block.py -> build/lib.linux-x86_64-cpython-310/flash_attn/modules copying flash_attn/modules/embedding.py -> build/lib.linux-x86_64-cpython-310/flash_attn/modules copying flash_attn/modules/mha.py -> 
build/lib.linux-x86_64-cpython-310/flash_attn/modules copying flash_attn/modules/mlp.py -> build/lib.linux-x86_64-cpython-310/flash_attn/modules creating build/lib.linux-x86_64-cpython-310/flash_attn/ops copying flash_attn/ops/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops copying flash_attn/ops/activations.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops copying flash_attn/ops/fused_dense.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops copying flash_attn/ops/layer_norm.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops copying flash_attn/ops/rms_norm.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops creating build/lib.linux-x86_64-cpython-310/flash_attn/utils copying flash_attn/utils/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils copying flash_attn/utils/benchmark.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils copying flash_attn/utils/distributed.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils copying flash_attn/utils/generation.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils copying flash_attn/utils/library.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils copying flash_attn/utils/pretrained.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils copying flash_attn/utils/testing.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils copying flash_attn/utils/torch.py -> build/lib.linux-x86_64-cpython-310/flash_attn/utils creating build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton copying flash_attn/ops/triton/__init__.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton copying flash_attn/ops/triton/cross_entropy.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton copying flash_attn/ops/triton/k_activations.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton copying flash_attn/ops/triton/layer_norm.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton copying flash_attn/ops/triton/linear.py -> 
build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton
copying flash_attn/ops/triton/mlp.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton
copying flash_attn/ops/triton/rotary.py -> build/lib.linux-x86_64-cpython-310/flash_attn/ops/triton
running build_ext
Traceback (most recent call last):
  File "<string>", line 486, in run
  File "/venv/main/lib/python3.10/urllib/request.py", line 241, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/venv/main/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/venv/main/lib/python3.10/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/venv/main/lib/python3.10/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/venv/main/lib/python3.10/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/venv/main/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/venv/main/lib/python3.10/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/venv/main/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 389, in <module>
    main()
  File "/venv/main/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 373, in main
    json_out["return_val"] = hook(**hook_input["kwargs"])
  File "/venv/main/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 280, in build_wheel
    return _build_backend().build_wheel(
  File "/venv/main/lib/python3.10/site-packages/setuptools/build_meta.py", line 435, in build_wheel
    return _build(['bdist_wheel', '--dist-info-dir', str(metadata_directory)])
  File "/venv/main/lib/python3.10/site-packages/setuptools/build_meta.py", line 423, in _build
    return self._build_with_temp_dir(
  File "/venv/main/lib/python3.10/site-packages/setuptools/build_meta.py", line 404, in _build_with_temp_dir
    self.run_setup()
  File "/venv/main/lib/python3.10/site-packages/setuptools/build_meta.py", line 512, in run_setup
    super().run_setup(setup_script=setup_script)
  File "/venv/main/lib/python3.10/site-packages/setuptools/build_meta.py", line 317, in run_setup
    exec(code, locals())
  File "<string>", line 526, in <module>
  File "/venv/main/lib/python3.10/site-packages/setuptools/__init__.py", line 115, in setup
    return distutils.core.setup(**attrs)
  File "/venv/main/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 186, in setup
    return run_commands(dist)
  File "/venv/main/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 202, in run_commands
    dist.run_commands()
  File "/venv/main/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 1002, in run_commands
    self.run_command(cmd)
  File "/venv/main/lib/python3.10/site-packages/setuptools/dist.py", line 1102, in run_command
    super().run_command(command)
  File "/venv/main/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 1021, in run_command
    cmd_obj.run()
  File "<string>", line 503, in run
  File "/venv/main/lib/python3.10/site-packages/setuptools/command/bdist_wheel.py", line 370, in run
    self.run_command("build")
  File "/venv/main/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 357, in run_command
    self.distribution.run_command(command)
  File "/venv/main/lib/python3.10/site-packages/setuptools/dist.py", line 1102, in run_command
    super().run_command(command)
  File "/venv/main/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 1021, in run_command
    cmd_obj.run()
  File "/venv/main/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/venv/main/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 357, in run_command
    self.distribution.run_command(command)
  File "/venv/main/lib/python3.10/site-packages/setuptools/dist.py", line 1102, in run_command
    super().run_command(command)
  File "/venv/main/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 1021, in run_command
    cmd_obj.run()
  File "/venv/main/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 96, in run
    _build_ext.run(self)
  File "/venv/main/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 368, in run
    self.build_extensions()
  File "/venv/main/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 695, in build_extensions
    _check_cuda_version(compiler_name, compiler_version)
  File "/venv/main/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 524, in _check_cuda_version
    raise RuntimeError(CUDA_MISMATCH_MESSAGE, cuda_str_version, torch.version.cuda)
RuntimeError: ('The detected CUDA version (%s) mismatches the version that was used to compilePyTorch (%s). Please make sure to use the same CUDA versions.', '11.8', '12.8')
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for flash-attn
error: failed-wheel-build-for-install

× Failed to build installable wheels for some pyproject.toml based projects
╰─> flash-attn
flash-attn not available (will use SDPA fallback — still fast)
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
Running self-tests...
  ✓ format_training_pair
  ✓ format_error_correction
  ✓ FrontierFactory math
  ✓ ProblemComposer
  ✓ ThompsonBandit
  ✓ StrategySelector
Tests: 6 passed, 0 failed
GPU: NVIDIA RTX A6000 (48GB)
RAM: 283GB
CPU cores: 28
Loading model...
==((====))==  Unsloth 2026.3.4: Fast Qwen3_5 patching. Transformers: 5.2.0.
   \\   /|    NVIDIA RTX A6000. Num GPUs = 1. Max memory: 47.536 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.10.0+cu128. CUDA: 8.6. CUDA Toolkit: 12.8. Triton: 3.6.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.35. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
The fast path is not available because one of the required library is not installed. Falling back to torch implementation. To install follow https://github.com/fla-org/flash-linear-attention#installation and https://github.com/Dao-AILab/causal-conv1d
Loading weights:   0%|          | 0/760 [00:00<?, ?it/s]
Exception ignored in: <function WeakSet.__init__.<locals>._remove at 0x7f5281f64550>
Traceback (most recent call last):
  File "/venv/main/lib/python3.10/_weakrefset.py", line 39, in _remove
    def _remove(item, selfref=ref(self)):
  File "/workspace/my-ai/td_clean_v2.py", line 3657, in _handle_shutdown
    sys.exit(1)
SystemExit: 1