# Text Classification using AutoTrain Advanced

In this notebook, we will train a text classification model using AutoTrain Advanced.
You can replace the model with any model compatible with Hugging Face Transformers, and the dataset with any other dataset in the proper format.
For details on dataset formatting, see the [docs](https://huggingface.co/docs/autotrain/index).

In [1]:
from autotrain.params import TextClassificationParams
from autotrain.project import AutoTrainProject

In [2]:
HF_USERNAME = "your_huggingface_username"
HF_TOKEN = "your_huggingface_write_token" # get it from https://huggingface.co/settings/token
# It is recommended to use secrets or environment variables to store your HF_TOKEN
# your token is required if push_to_hub is set to True or if you are accessing a gated model/dataset
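
The comment above recommends keeping the token out of the notebook. One possible way to do that is sketched below, assuming the token was exported as an environment variable named `HF_TOKEN` before the notebook was started. The helper `read_hf_token` is a hypothetical name used for illustration and is not part of AutoTrain.

```python
import os


def read_hf_token(var_name="HF_TOKEN"):
    """Return the token stored in the given environment variable, or None if unset or empty."""
    token = os.environ.get(var_name, "").strip()
    return token or None


# Assumption: the variable was set beforehand, e.g. `export HF_TOKEN=...` in the shell.
HF_TOKEN = read_hf_token()
if HF_TOKEN is None:
    print("HF_TOKEN is not set; pushing to the Hub or loading gated models/datasets will fail.")
```

With this in place, the hard-coded `HF_TOKEN = "..."` assignment can be dropped from the cell.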

In [5]:
params = TextClassificationParams(
    model="google-bert/bert-base-uncased",
    data_path="stanfordnlp/imdb", # path to the dataset on huggingface hub
    text_column="text", # the column in the dataset that contains the text
    target_column="label", # the column in the dataset that contains the labels
    train_split="train",
    valid_split="test",
    epochs=3,
    batch_size=8,
    max_seq_length=512,
    lr=1e-5,
    optimizer="adamw_torch",
    scheduler="linear",
    gradient_accumulation=1,
    mixed_precision="fp16",
    project_name="autotrain-model",
    log="tensorboard",
    push_to_hub=True,
    username=HF_USERNAME,
    token=HF_TOKEN,
)
# tip: you can use `?TextClassificationParams` to see the full list of allowed parameters
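
To get a feel for how `batch_size`, `gradient_accumulation`, and `epochs` interact, the rough arithmetic can be sketched as follows. This is an approximation that assumes a single device and the 25,000-example IMDB train split; `optimizer_steps` is a hypothetical helper, and the trainer's own step count may differ slightly in edge cases.

```python
import math


def optimizer_steps(num_examples, batch_size, gradient_accumulation, epochs, num_devices=1):
    """Approximate the effective batch size and the total number of optimizer steps."""
    effective_batch = batch_size * gradient_accumulation * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch * epochs


# Values taken from the params above; 25,000 is the size of the IMDB train split.
effective_batch, total_steps = optimizer_steps(
    num_examples=25_000, batch_size=8, gradient_accumulation=1, epochs=3
)
print(effective_batch, total_steps)  # 8 9375
```

Raising `gradient_accumulation` increases the effective batch size without increasing per-step memory use, at the cost of fewer optimizer updates per epoch.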

If your dataset is stored locally in CSV or JSONL format (JSONL is preferred), make the following changes to `params`:

```python
params = TextClassificationParams(
    data_path="data/",  # path to the folder that contains train.jsonl / train.csv
    text_column="text",  # name of the column in the CSV/JSONL file that contains the text
    train_split="train",  # training file name without extension
    valid_split="valid",  # validation file name without extension
    # ... keep the remaining parameters as in the params cell above
)
```
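
For reference, here is a minimal sketch of how such local files could be produced. It assumes one JSON object per line with a `text` field and a `label` field, matching `text_column` and `target_column` in the params cell above. The records are made-up toy examples, `write_jsonl` is a hypothetical helper, and the docs remain the authority on the exact format.

```python
import json
from pathlib import Path


def write_jsonl(path, records):
    """Write one JSON object per line, creating the parent folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


# Toy examples for illustration only; replace them with your own data.
train_records = [
    {"text": "A wonderful film.", "label": "positive"},
    {"text": "Dull and far too long.", "label": "negative"},
]
valid_records = [
    {"text": "I would watch it again.", "label": "positive"},
]

# File names match train_split="train" and valid_split="valid".
write_jsonl("data/train.jsonl", train_records)
write_jsonl("data/valid.jsonl", valid_records)
```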

In [None]:
# this will train the model locally
project = AutoTrainProject(params=params, backend="local", process=True)
project.create()