md896 committed on
Commit 8b3c03a · 1 Parent(s): d061422

Prevent torchvision import crashes in HF Jobs

HF base images can include torchvision, and pip uninstall may be blocked by PEP-668 unless break-system-packages is enabled. When Unsloth's dependency resolution downgrades torch, any remaining torchvision can crash transformers/trl imports (torchvision::nms).

Set TRANSFORMERS_NO_TORCHVISION early, opt into PIP_BREAK_SYSTEM_PACKAGES for all pip ops, and uninstall torchvision/torchaudio both before and after Unsloth install so text-only training stays stable.

Constraint: HF Jobs run on externally-managed system Python (PEP-668)
Rejected: Pin torch/torchvision versions | pip resolver still drags heavy CUDA wheels and increases cost
Confidence: high
Scope-risk: narrow
Directive: Keep torchvision out of text-only training images unless you also pin torch+torchvision compat
Tested: python -m py_compile ultimate_sota_training.py
Not-tested: HF Jobs end-to-end run
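
The message says both environment variables are set "early" and as defaults. The following is a minimal sketch of that pattern, not the script's actual code: `apply_job_env_defaults` is a hypothetical helper name, and it takes a plain mapping so the behavior can be checked without touching the real environment. It only shows the `setdefault` semantics; it does not check whether transformers honors `TRANSFORMERS_NO_TORCHVISION`.

```python
import os


def apply_job_env_defaults(env):
    """Set the two job-level defaults without overriding caller-provided values.

    `env` is any mutable mapping, e.g. os.environ or a plain dict.
    The variable names come from the commit; the helper itself is hypothetical.
    """
    # Only takes effect if the variable is not already set.
    env.setdefault("TRANSFORMERS_NO_TORCHVISION", "1")
    # pip reads PIP_<OPTION> variables as defaults for its command-line options.
    env.setdefault("PIP_BREAK_SYSTEM_PACKAGES", "1")
    return env


if __name__ == "__main__":
    # Call this before any pip subprocess or transformers/trl import.
    apply_job_env_defaults(os.environ)
```

Because `setdefault` is used, a value exported by the job launcher wins over the script's default, so the opt-out can still be overridden from outside the script.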

Files changed (1)
  1. ultimate_sota_training.py +13 -1
ultimate_sota_training.py CHANGED
@@ -45,11 +45,19 @@ def bootstrap_deps() -> None:
     if os.environ.get("SKIP_BOOTSTRAP") == "1":
         return
 
+    # Ensure text-only transformers runs never hard-import torchvision even if it
+    # is present in the base image.
+    os.environ.setdefault("TRANSFORMERS_NO_TORCHVISION", "1")
+
+    # Ubuntu 24.04+ images may mark system Python as "externally managed"
+    # (PEP-668). Prefer an explicit opt-out for all pip ops in ephemeral jobs.
+    os.environ.setdefault("PIP_BREAK_SYSTEM_PACKAGES", "1")
+
     print("📦 Bootstrapping dependencies...")
 
     # Text-only run: torchvision/torchaudio are not required and are a common source
     # of crashes when torch versions shift in container images.
-    _pip(["uninstall", "-y", "torchvision", "torchaudio"], check=False)
+    _pip(["uninstall", "--break-system-packages", "-y", "torchvision", "torchaudio"], check=False)
 
     # Keep these scoped; avoid blanket -U to reduce resolver churn.
     _pip(
@@ -75,6 +83,10 @@ def bootstrap_deps() -> None:
         ]
     )
 
+    # Some dependency resolution paths can reintroduce torchvision. Remove it
+    # again right before importing transformers/trl.
+    _pip(["uninstall", "--break-system-packages", "-y", "torchvision", "torchaudio"], check=False)
+
 
 bootstrap_deps()
 
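
The diff calls `_pip(...)` but does not show its definition, and the commit lists the end-to-end HF Jobs run as not tested. The sketch below assumes `_pip` wraps `python -m pip` through `subprocess.run`. The real helper may differ, and `pip_command` and `leftover_packages` are hypothetical names. It also shows one way to confirm, after the second uninstall, that neither package is still importable.

```python
import importlib
import importlib.util
import subprocess
import sys


def pip_command(args):
    """Build a pip invocation bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


def _pip(args, check=True):
    """Assumed shape of the script's helper; the real one is not in the diff."""
    return subprocess.run(pip_command(args), check=check)


def leftover_packages(names=("torchvision", "torchaudio")):
    """Return the names that can still be located on the import path."""
    # Refresh finder caches so packages removed during this process are noticed.
    importlib.invalidate_caches()
    return [name for name in names if importlib.util.find_spec(name) is not None]
```

Calling `leftover_packages()` right after `bootstrap_deps()` and logging a non-empty result would surface a reintroduced torchvision before the transformers/trl imports run, rather than failing inside them.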