| # Frequently Asked Questions | |
| ## Are my data and models secure? | |
| Yes, your data and models are secure. AutoTrain uses the Hugging Face Hub to store your data and models. | |
| All your data and models are uploaded to your Hugging Face account as private repositories and are only accessible by you. | |
| Read more about security [here](https://huggingface.co/docs/hub/en/security). | |
| ## Do you upload my data to the Hugging Face Hub? | |
| AutoTrain will not upload your dataset to the Hub if you are using the local backend or training in the same space. | |
| AutoTrain will push your dataset to the Hub if you use features such as DGX Cloud, | |
| or if you use the local CLI to train on Hugging Face's infrastructure. | |
| You can safely remove the dataset from the Hub after training is complete. | |
| If uploaded, the dataset will be stored in your Hugging Face account as a private repository and will only be accessible by you | |
| and the training process. It is not used once the training is complete. | |
| ## My training space paused for no reason mid-training | |
| AutoTrain training Spaces pause themselves after training finishes or fails. This saves resources and costs. | |
| If your training failed, you can still view the Space logs to find out what went wrong. Note: you won't be able to retrieve the logs if you restart the Space. | |
| Another reason for the Space to pause is the Space's sleep time kicking in. If you have a long-running training job, you must set the sleep time to a much higher value. | |
| The Space pauses itself once training is done in any case, which saves you costs. | |
| ## I get error `Your installed package nvidia-ml-py is corrupted. Skip patch functions` | |
| This error can be safely ignored. It is a warning from the `nvitop` library and does not affect the functionality of AutoTrain. | |
| ## I get 409 conflict error when using the UI | |
| This error occurs when you try to create a project with the same name as an existing project. | |
| To resolve this error, you can either delete the existing project or create a new project | |
| with a different name. | |
| This error can also occur when you are trying to train a model while a model is already training in the same space or locally. | |
| ## The model I want to use doesn't show up in the model selection dropdown. | |
| If the model you want to use is not available in the model selection dropdown, | |
| you can add it to the `AUTOTRAIN_CUSTOM_MODELS` environment variable in the Space settings. | |
| For example, to add the `xxx/yyy` model, go to the Space settings, create a variable named `AUTOTRAIN_CUSTOM_MODELS`, | |
| and set its value to `xxx/yyy`. | |
| You can also pass the model name as a query parameter in the URL. For example, to use the `xxx/yyy` model, | |
| you can use the URL `https://huggingface.co/spaces/your_autotrain_space?custom_models=xxx/yyy`. | |
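
As a minimal sketch of the query-parameter form described above, the URL can be assembled with Python's standard library. The helper name `custom_model_url` is hypothetical and not part of AutoTrain; the Space URL and model ID are the placeholders used in this FAQ.

```python
from urllib.parse import urlencode


def custom_model_url(space_url: str, model_id: str) -> str:
    """Append the custom_models query parameter to a Space URL (hypothetical helper)."""
    # safe="/" keeps the slash in "owner/model" unescaped, matching the URL form shown above.
    query = urlencode({"custom_models": model_id}, safe="/")
    return f"{space_url}?{query}"


print(custom_model_url("https://huggingface.co/spaces/your_autotrain_space", "xxx/yyy"))
```
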
| ## How do I use AutoTrain locally? | |
| AutoTrain can be used locally by installing the AutoTrain Advanced package (`autotrain-advanced`) from PyPI. | |
| You can read more in the *Use AutoTrain Locally* section. | |
| ## Can I run AutoTrain on Colab? | |
| To start the UI on Colab, you can simply click on the following link: | |
| [Open in Colab](https://colab.research.google.com/github/huggingface/autotrain-advanced/blob/main/colabs/AutoTrain.ipynb) | |
| Please note, to run the app on Colab, you will need an ngrok token. You can get one by signing up for free on [ngrok](https://ngrok.com/). | |
| This is because Colab does not allow exposing ports to the internet directly. | |
| To use the CLI on Colab instead, follow the same instructions as for using AutoTrain locally. | |
| ## Does AutoTrain have a Docker image? | |
| Yes, AutoTrain has a Docker image. | |
| You can find the Docker image on Docker Hub [here](https://hub.docker.com/r/huggingface/autotrain-advanced). | |
| ## Is Windows supported? | |
| Unfortunately, AutoTrain does not officially support Windows at the moment. | |
| To run AutoTrain on Windows, you can try using WSL (Windows Subsystem for Linux) or the Docker image. | |
| ## The `--project-name` argument cannot be set to a directory | |
| The `--project-name` argument should not be a path. The project directory is created wherever the `autotrain` command is run. | |
| The value must be alphanumeric and may contain hyphens. | |
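
The naming rule above can be checked before launching a run. The following is a minimal sketch, assuming the rule is exactly "letters, digits, and hyphens only"; `is_valid_project_name` is a hypothetical helper and not AutoTrain's own validation code.

```python
import re

# Assumption: "alphanumeric plus hyphens" means ASCII letters, digits, and "-".
_PROJECT_NAME_RE = re.compile(r"[A-Za-z0-9-]+")


def is_valid_project_name(name: str) -> bool:
    """Return True if the name has no path separators or other disallowed characters."""
    return _PROJECT_NAME_RE.fullmatch(name) is not None


for candidate in ["my-project-1", "./outputs/my-project", "my_project"]:
    print(candidate, is_valid_project_name(candidate))
```

A path such as `./outputs/my-project` fails this check because it contains `.` and `/`.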
| ## I am getting `config.json` not found error | |
| This means you have trained an adapter model (`peft=true`), which doesn't generate a `config.json`. | |
| This is not a problem: the model can still be loaded with `AutoModelForCausalLM` or with Inference Endpoints. | |
| If you want to merge the adapter weights with the base model, you can use `autotrain tools`. Please read about it in the miscellaneous section. | |
| ## Does AutoTrain support multi-GPU training? | |
| Yes, AutoTrain supports multi-GPU training. | |
| AutoTrain detects on its own whether the command is running on a multi-GPU setup. It uses | |
| multi-GPU DDP if the number of GPUs is greater than 1 and less than 4, and DeepSpeed if the number of GPUs is greater than or equal to 4. | |
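
The selection rule above can be written as a small function. This is only an illustrative sketch of the thresholds stated in this answer, not AutoTrain's actual implementation; the function name `choose_strategy` and the `"single"` label for zero or one GPU are assumptions.

```python
def choose_strategy(num_gpus: int) -> str:
    """Map a GPU count to the distributed strategy described in the FAQ (hypothetical helper)."""
    if num_gpus >= 4:
        return "deepspeed"  # 4 or more GPUs
    if num_gpus > 1:
        return "ddp"  # 2 or 3 GPUs
    # Assumption: with 0 or 1 GPU no distributed strategy is used.
    return "single"


for n in (1, 2, 3, 4, 8):
    print(n, choose_strategy(n))
```
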
| ## How can I use a Hub dataset with multiple configs? | |
| If your Hub dataset has multiple configs, you can use the `train_split` parameter to specify both the config and the split. | |
| For example, [this dataset](https://huggingface.co/datasets/sentence-transformers/all-nli) | |
| has multiple configs: `pair`, `pair-class`, `pair-score` and `triplet`. | |
| To use the `train` split of the `pair-class` config, write `pair-class:train` as `train_split` in the UI or the CLI / config. | |
| An example config is shown below: | |
| ```yaml | |
| data: | |
|   path: sentence-transformers/all-nli | |
|   train_split: pair-class:train | |
|   valid_split: pair-class:test | |
|   column_mapping: | |
|     sentence1_column: premise | |
|     sentence2_column: hypothesis | |
|     target_column: label | |
| ``` |