Update sections/03_create_dataset.md
sections/03_create_dataset.md +10 -10
@@ -71,19 +71,19 @@ If your dataset is more complex
 
 To load the dataset remotely,
 
-dataset
+dataset = datasets.load_dataset(path = repo_id)
 
 optionally select specific split and/or columns to download a subset
 
-
-dataset
-path
-name
-
-
-
+dataset_tag = "<dataset_tag>"
+dataset = datasets.load_dataset(
+path = repo_id,
+name = dataset_tag,
+data_dir = dataset_tag,
+cache_dir = cache_dir,
+keep_in_memory = True)
 
 If needed, convert data to pandas
 
-import pandas as pd
-df
+import pandas as pd
+df = dataset.data['train'].to_pandas()
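
For readers trying the changed snippet, here is a minimal sketch of how the load and pandas steps might be combined. It is not part of the commit. The helper names `build_load_kwargs` and `load_as_dataframe` and the repository id `user/repo` are hypothetical. The sketch assumes the Hugging Face `datasets` and `pandas` packages are installed. It passes `split` to `load_dataset` and selects columns only after conversion, so column selection here does not reduce what is downloaded. It also leaves out `data_dir` and `keep_in_memory` from the diff for brevity.

```python
from typing import Any, Dict, List, Optional


def build_load_kwargs(
    repo_id: str,
    dataset_tag: Optional[str] = None,
    split: Optional[str] = None,
    cache_dir: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for datasets.load_dataset, skipping unset ones.

    Hypothetical helper: the parameter names mirror the variables in the diff.
    """
    kwargs: Dict[str, Any] = {"path": repo_id}
    if dataset_tag is not None:
        kwargs["name"] = dataset_tag      # configuration name
    if split is not None:
        kwargs["split"] = split           # e.g. "train"
    if cache_dir is not None:
        kwargs["cache_dir"] = cache_dir   # where downloaded files are cached
    return kwargs


def load_as_dataframe(
    repo_id: str,
    dataset_tag: Optional[str] = None,
    split: str = "train",
    columns: Optional[List[str]] = None,
    cache_dir: Optional[str] = None,
):
    """Load one split from the Hub and return it as a pandas DataFrame."""
    import datasets  # imported lazily so the helper above works without it

    # With an explicit split, load_dataset returns a single Dataset.
    ds = datasets.load_dataset(
        **build_load_kwargs(repo_id, dataset_tag, split, cache_dir)
    )
    df = ds.to_pandas()
    if columns is not None:
        df = df[list(columns)]  # column subset applied after loading
    return df


if __name__ == "__main__":
    # "user/repo" is a placeholder, not a real repository id.
    print(build_load_kwargs("user/repo", dataset_tag="<dataset_tag>", split="train"))
```

Building the arguments separately makes it easy to inspect what will be passed to `load_dataset` before any network access happens.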