
Deploying Neuro-Flyt 3D to Hugging Face Spaces

This guide explains how to use your organization's GPUs on Hugging Face to train the Neuro-Flyt 3D model.

Prerequisites

A Hugging Face Account.
An Organization with GPU billing enabled (or a personal account with GPU access).
A Write Access Token (Settings -> Access Tokens).

Steps

1. Create a New Space

Go to huggingface.co/new-space.
Owner: Select your Organization.
Space Name: neuro-flyt-training (or similar).
SDK: Select Docker.
Space Hardware: Select a GPU instance (e.g., T4 small or A10G).

2. Configure Secrets

In your Space, go to Settings -> Variables and secrets and add the following Secret:

HF_TOKEN: Your Write Access Token (starts with hf_...).
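Space secrets are exposed to the running container as environment variables. A minimal sketch of how a training script might read and sanity-check the token follows; the helper name `read_hf_token` is hypothetical, and the real train_hf.py may handle this differently.

```python
import os


def read_hf_token(env=None):
    """Return the HF_TOKEN secret, or raise if it is missing or malformed.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    This helper is illustrative and is not taken from train_hf.py.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; add it under Settings -> Variables and secrets."
        )
    if not token.startswith("hf_"):
        raise RuntimeError("HF_TOKEN does not start with the expected 'hf_' prefix.")
    return token
```

Failing early with a clear message is cheaper than discovering a missing token only when the final model upload is attempted.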

3. Deploy Code

You can deploy by pushing the code to the Space's Git repository.

# 1. Enable Git LFS hooks (the git-lfs package itself must already be installed)
git lfs install

# 2. Clone your Space (replace with your actual repo URL)
git clone https://huggingface.co/spaces/YOUR_ORG/neuro-flyt-training
cd neuro-flyt-training

# 3. Copy project files
cp -r /path/to/Drone-go-brrrrr/* .

# 4. Push to Space
git add .
git commit -m "Deploy training job"
git push
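If you would rather not use Git, the same upload can be sketched with the `huggingface_hub` Python client. This is an alternative to the commands above, not the method the project prescribes. The function name `push_project_to_space` is hypothetical, and the paths and Space ID are the same placeholders used above.

```python
from pathlib import Path


def push_project_to_space(project_dir, space_id, token=None):
    """Upload a local project folder to a Space repository in one commit.

    project_dir: local folder, e.g. "/path/to/Drone-go-brrrrr" (placeholder)
    space_id:    Space repo ID, e.g. "YOUR_ORG/neuro-flyt-training" (placeholder)
    token:       Write Access Token; if None, the locally saved login is used
    """
    folder = Path(project_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"Project folder not found: {folder}")

    # Imported lazily so the path check above runs even without the package.
    from huggingface_hub import HfApi

    api = HfApi(token=token)
    return api.upload_folder(
        folder_path=str(folder),
        repo_id=space_id,
        repo_type="space",
        commit_message="Deploy training job",
    )
```

As with `git push`, a new commit on the Space triggers a rebuild of the Docker image.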

4. Monitor Training

Go to the App tab in your Space.
The training logs appear there in real time.
By default, training runs for 500,000 steps.

5. Access Trained Model

Once training finishes, the script automatically pushes the trained model (liquid_ppo_drone_final.zip) to your Model Repository (set in train_hf.py or via arguments).
You can then download this model and use it locally with demo_3d.py.
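One way to fetch the model for local use is `hf_hub_download` from the `huggingface_hub` package. The sketch below assumes the file sits at the root of the Model Repository under the name given above. The function name `download_trained_model` and the repo ID are placeholders.

```python
MODEL_FILENAME = "liquid_ppo_drone_final.zip"  # file name stated in this guide


def download_trained_model(repo_id, token=None):
    """Download the trained model and return its local cache path.

    repo_id: your Model Repository ID, e.g. "YOUR_ORG/YOUR_MODEL_REPO" (placeholder)
    token:   only needed if the repository is private
    """
    # Imported lazily so this module loads even without the package installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=MODEL_FILENAME, token=token)
```

The returned path points into the local Hugging Face cache; how demo_3d.py expects to receive that path is not specified here, so check the script before wiring it in.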

Customization

Repo ID: Edit Dockerfile or train_hf.py to change the target Model Repository ID (--repo_id).
Steps: Change --steps in Dockerfile to adjust training duration.
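For orientation, the two flags could be parsed roughly as follows. This is a hypothetical sketch of an argument parser, not the actual contents of train_hf.py; only the flag names `--repo_id` and `--steps` and the 500,000-step default come from this guide.

```python
import argparse


def build_parser():
    """Build a parser for the two options named in this guide (illustrative only)."""
    parser = argparse.ArgumentParser(description="Hypothetical train_hf.py options")
    parser.add_argument(
        "--repo_id",
        type=str,
        required=True,
        help="Target Model Repository ID, e.g. YOUR_ORG/YOUR_MODEL_REPO",
    )
    parser.add_argument(
        "--steps",
        type=int,
        default=500_000,
        help="Total number of training steps",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"Training for {args.steps} steps, pushing to {args.repo_id}")
```

With a parser like this, changing the training duration is a matter of editing the `--steps` value wherever the Dockerfile invokes the script.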

Hardware & Training Recommendations

Which GPU?

A100 Large (80GB): The Ultimate Choice. Pick this if you want to train for 5M+ steps in the shortest time possible. The code is tuned to use 16 parallel environments and a large batch size (4096) to keep the A100 saturated.
A10G Large (24GB): Excellent Value. Very fast and capable. It will handle the parallel training easily and is much cheaper than the A100.
T4 (16GB): Budget Option. It will work, but the speedup from parallelization will be less pronounced than on the Ampere cards (A10G/A100).

Efficiency Optimization (Implemented)

To keep the GPU from sitting idle, train_hf.py now uses:

Parallel Physics: 16 drones are simulated simultaneously on the CPU.
Large Batches: 4096 samples are processed at once on the GPU.
Result: Training is ~10-15x faster than the standard script.
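The two numbers above can be related with a little arithmetic. The sketch below assumes each 4096-sample batch is filled evenly by the 16 environments; this guide does not state how train_hf.py actually splits its rollouts, so treat the helper names and the resulting figures as illustrative.

```python
import math

N_ENVS = 16        # parallel environments, from this guide
BATCH_SIZE = 4096  # samples per GPU batch, from this guide


def steps_per_env(batch_size=BATCH_SIZE, n_envs=N_ENVS):
    """Steps each environment contributes to one batch (assumes an even split)."""
    if batch_size % n_envs != 0:
        raise ValueError("batch_size must be divisible by n_envs")
    return batch_size // n_envs


def batches_needed(total_steps, batch_size=BATCH_SIZE):
    """Number of full batches required to cover at least total_steps samples."""
    return math.ceil(total_steps / batch_size)


print(steps_per_env())          # 256 steps per environment per batch
print(batches_needed(500_000))  # 123 batches for the default 500,000 steps
```

Under that assumption, each drone simulation advances 256 steps between GPU updates, which is why CPU throughput matters as much as GPU size.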

How Many Episodes?

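The environment max_steps is 1000, so an episode lasts at most 1000 steps. The episode counts below assume full-length episodes; if episodes end early, the same step budget covers more episodes.

Minimum (Proof of Concept): 500,000 steps (about 500 episodes). The drone will learn to hover and roughly follow the target.
Recommended (Robust): 1,000,000 - 2,000,000 steps (about 1000 - 2000 episodes). This gives the Liquid Network enough experience to adapt to the random wind turbulence and master the physics.
High Performance: 5,000,000+ steps (about 5000+ episodes). For "perfect" flight control.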
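The step-to-episode conversion used above is a simple division by max_steps. A small sketch, with the hypothetical helper name `full_length_episodes`:

```python
MAX_STEPS = 1000  # environment max_steps, from this guide


def full_length_episodes(total_steps, max_steps=MAX_STEPS):
    """Number of full-length episodes that fit in a step budget.

    Episodes that terminate early use fewer steps, so the real episode
    count can only be higher than this figure.
    """
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    return total_steps // max_steps


for budget in (500_000, 1_000_000, 2_000_000, 5_000_000):
    print(f"{budget:>9,} steps -> {full_length_episodes(budget):,} episodes")
```

Since `--steps` is what the training script takes, plan in steps and treat the episode figures as a rough guide.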

Efficiency Tip

Reinforcement learning is often CPU-bound, because the physics simulation runs on the CPU. To train efficiently:

Use a Space with many vCPUs (8+) to run environments in parallel.
Use an A10G GPU to handle the heavy math of the Liquid Time-Constant (LTC) cells.
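A common heuristic is to run roughly one environment per available CPU core, capped at the 16 environments this guide mentions. The sketch below illustrates that heuristic only; the helper name `pick_num_envs` is hypothetical, and train_hf.py may choose its environment count differently.

```python
import os


def pick_num_envs(cpu_count=None, max_envs=16):
    """Pick a parallel environment count: one per CPU core, capped at max_envs.

    cpu_count defaults to what the OS reports; inside a container this may
    differ from the vCPUs actually allotted to the Space.
    """
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, min(max_envs, cpus))


print(pick_num_envs(8))   # 8 environments on an 8-vCPU Space
print(pick_num_envs(32))  # capped at 16
```

Running more environments than cores still works, but the simulations then compete for CPU time and the GPU is more likely to wait.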