TD Quick Start — Rent a GPU and Go

What You Need (One-Time Setup)

vast.ai account — sign up at vast.ai, add credit ($10-20 to start)
HuggingFace account — sign up at huggingface.co (use any username, doesn't have to be your real name)
HuggingFace token — Settings → Access Tokens → New Token → Write access
ntfy.sh app on your phone (you already have this)

One-Time: Upload Your Code to Private HuggingFace

Do this once from your computer. After this, your code lives in a private repo that only you can see.

# Install the tool
pip install huggingface_hub

# Log in (paste your token when asked)
huggingface-cli login

# Upload everything
HF_USER=your_hf_username bash upload_to_hf.sh

Now your td_lang, td_fuse, .td files, and deploy script are all in a private HuggingFace repo. Nobody can see them except you.

When you update your code, just run upload_to_hf.sh again — it overwrites with the latest version.
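
The contents of upload_to_hf.sh are not shown in this guide, so the following is only a sketch of what an equivalent upload could look like in Python, assuming the repo is named td-toolkit (the name used in the download step) and that you have already run huggingface-cli login. The helper names repo_id_for and upload_project are made up for illustration.

```python
from pathlib import Path

REPO_NAME = "td-toolkit"  # assumption: same repo name the download step uses


def repo_id_for(user: str) -> str:
    """Build the 'username/td-toolkit' repo id from a HuggingFace username."""
    user = user.strip()
    if not user:
        raise ValueError("HF_USER is empty; set it to your HuggingFace username")
    return f"{user}/{REPO_NAME}"


def upload_project(folder: str, user: str) -> str:
    """Create the private repo if it is missing, then upload the folder."""
    # Imported lazily so repo_id_for works even without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()  # picks up the token saved by `huggingface-cli login`
    repo_id = repo_id_for(user)
    api.create_repo(repo_id, private=True, exist_ok=True)
    # Files with the same path are overwritten by the new upload.
    api.upload_folder(folder_path=str(Path(folder).resolve()), repo_id=repo_id)
    return repo_id
```

Calling upload_project(".", "your_hf_username") from the project folder would push the current directory. Passing private=True on every call is harmless once the repo already exists, because exist_ok=True makes the create step a no-op.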

Every Time: Rent GPU → 3 Commands → Done

1. Rent a GPU on vast.ai

Go to vast.ai → Console → Search for:

GPU: RTX 4090 (24GB) or A100 (40GB+)
Image: Pick one with PyTorch pre-installed (like pytorch/pytorch)
Storage: At least 100GB disk
Cost: ~$0.40-0.80/hr for a 4090

Click RENT and wait for it to start (~1-2 minutes).

2. Connect to the GPU

vast.ai gives you an SSH command. Copy and paste it into your terminal. It looks like this (your port and host will differ):

ssh -p 12345 root@ssh1.vast.ai

3. Run these 3 commands

# Set your token
export HF_TOKEN=hf_your_token_here

# Download your code from HuggingFace (takes ~10 seconds)
pip install huggingface_hub -q && python -c "
from huggingface_hub import snapshot_download
snapshot_download('YOUR_USERNAME/td-toolkit', local_dir='/workspace/td')
"

# Go!
cd /workspace/td && bash deploy.sh demo_autopilot.td

That's it. Put your phone down. ntfy.sh sends you updates as it runs.
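
If the download step fails, a missing or mistyped token is a likely cause. The sketch below checks HF_TOKEN before anything is downloaded. It assumes that user access tokens start with hf_, as the placeholder in this guide does; check_hf_token is a hypothetical helper, not part of td_lang or huggingface_hub.

```python
import os


def check_hf_token(env=None) -> str:
    """Return the token from HF_TOKEN, or raise with a hint about what is wrong."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; run: export HF_TOKEN=hf_...")
    if token == "hf_your_token_here":
        raise RuntimeError("HF_TOKEN still holds the placeholder from the guide")
    if not token.startswith("hf_"):
        # Assumption: HuggingFace user access tokens begin with 'hf_'.
        raise RuntimeError("HF_TOKEN does not look like a HuggingFace token")
    return token
```

Running python -c "from check import check_hf_token; check_hf_token()" (with the function saved in a file named check.py, also hypothetical) right after the export line would surface the problem before the download starts.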

4. When it's done

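Your model gets saved to Google Drive automatically (if the .td file saves to an rclone remote you have configured). Otherwise it stays on the rented machine's disk at final_model/. Copy it off before you destroy the instance, because the disk is deleted along with it.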

Setting Up Google Drive (Optional, One-Time per GPU)

On the GPU machine after SSHing in:

rclone config

Type n for new remote
Name it gdrive
Pick Google Drive from the list
Follow the prompts (it gives you a URL to visit in your browser)
Done. A line like save base to "gdrive:TD/models/final" now works in your .td files
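
To confirm the remote was created under the expected name, one option is to ask rclone for its list of remotes. This is a minimal sketch that relies on the rclone listremotes command, which prints one name per line followed by a colon; parse_remotes and has_remote are illustrative names.

```python
import subprocess


def parse_remotes(listing: str) -> list[str]:
    """Turn `rclone listremotes` output ('gdrive:' per line) into bare names."""
    return [line.strip().rstrip(":") for line in listing.splitlines() if line.strip()]


def has_remote(name: str = "gdrive") -> bool:
    """Return True if rclone has a remote with this name configured."""
    result = subprocess.run(
        ["rclone", "listremotes"], capture_output=True, text=True, check=True
    )
    return name in parse_remotes(result.stdout)
```

If has_remote() returns False, the remote was probably saved under a different name than gdrive, and the gdrive: path in the .td file will not resolve.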

Tip: You can save the rclone config to your HuggingFace repo too, so you don't have to set it up every time.

Quick Reference

Command	What it does
`bash deploy.sh my_file.td`	Full setup + run
`python -m td_lang check my_file.td`	Check syntax only
`python -m td_lang info my_file.td`	Show plan without running
`python -m td_lang run my_file.td`	Run (skip deploy setup)
`python -m td_lang run my_file.td --dry`	Compile but don't execute
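
The check and run commands above can be chained so that a syntax error is caught before any GPU time is spent. The sketch below assumes that td_lang exits with a non-zero status when the check fails, which this guide does not state; td_command and check_then_run are hypothetical helpers.

```python
import subprocess
import sys


def td_command(action: str, td_file: str, dry: bool = False) -> list[str]:
    """Build the argument list for one of the td_lang commands in the table."""
    if action not in {"check", "info", "run"}:
        raise ValueError(f"unknown td_lang action: {action}")
    cmd = [sys.executable, "-m", "td_lang", action, td_file]
    if dry:
        cmd.append("--dry")  # the guide only shows --dry together with `run`
    return cmd


def check_then_run(td_file: str) -> int:
    """Run the syntax check first and start the real run only if it passes."""
    check = subprocess.run(td_command("check", td_file))
    if check.returncode != 0:  # assumption: failure is signalled by exit code
        return check.returncode
    return subprocess.run(td_command("run", td_file)).returncode
```

For example, check_then_run("my_file.td") would stop after the check step if the file has a syntax error.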

If Something Goes Wrong

OOM (out of memory): Your .td file's on_error block handles this — it retries with smaller batches
Model download fails: Check that HF_TOKEN is set correctly (echo $HF_TOKEN should print your token, not a blank line)
ntfy not working: Check your phone has the ntfy app and you're subscribed to the right topic
GPU disconnects: SSH back in; your files are still there. Run deploy.sh again and td_lang picks up from the last snapshot
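
For the ntfy case, sending a test message from the GPU machine shows whether the phone is subscribed to the right topic. ntfy.sh accepts a plain HTTP POST to https://ntfy.sh/ followed by the topic name, with the message as the request body. The topic name is up to you; this guide does not say where the .td file sets it, so it is passed in explicitly here.

```python
import urllib.request

NTFY_SERVER = "https://ntfy.sh"


def build_ntfy_request(topic: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a POST that publishes `message` to `topic`."""
    if not topic or "/" in topic:
        raise ValueError("topic must be a non-empty name without slashes")
    return urllib.request.Request(
        f"{NTFY_SERVER}/{topic}",
        data=message.encode("utf-8"),
        method="POST",
    )


def send_test_message(topic: str) -> int:
    """Publish a test notification and return the HTTP status code."""
    request = build_ntfy_request(topic, "TD test message from the GPU machine")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

If send_test_message("your-topic-name") returns 200 but nothing arrives on the phone, the app is most likely subscribed to a different topic.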

Cost Estimate

For the full demo_autopilot.td pipeline (merge 4 models + 5 training loops):

RTX 4090: ~$0.50/hr × ~30-40 hrs = ~$15-20
A100 40GB: ~$1.00/hr × ~20-30 hrs = ~$20-30
Budget cap in .td file: Set max_cost = 160.00 to prevent runaway costs (this is a safety ceiling well above the estimates, not the expected spend)
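
The estimates above are just hourly rate times run time, so they are easy to redo for a different GPU price. A small sketch, with cost_range and hours_until_cap as illustrative names:

```python
def cost_range(rate_per_hour: float, min_hours: float, max_hours: float) -> tuple[float, float]:
    """Return the (low, high) cost in dollars for a run of min_hours to max_hours."""
    return (rate_per_hour * min_hours, rate_per_hour * max_hours)


def hours_until_cap(rate_per_hour: float, max_cost: float) -> float:
    """Return how many hours a run can last before it reaches max_cost."""
    if rate_per_hour <= 0:
        raise ValueError("rate_per_hour must be positive")
    return max_cost / rate_per_hour
```

cost_range(0.50, 30, 40) reproduces the RTX 4090 figure of $15 to $20, and hours_until_cap(0.50, 160.00) gives 320 hours, which shows how much headroom the 160.00 cap leaves at that rate.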