RosettaCommons/MolecularDatasetCurationGuide

maom committed 18 days ago

Commit 10ac16e (verified) · 1 parent: 42b2403

Update sections/03_create_dataset.md

Files changed (1)

sections/03_create_dataset.md CHANGED

@@ -71,19 +71,19 @@ If your dataset is more  complex
 To load the dataset remotely,
-dataset \= datasets.load\_dataset(path \= repo\_id)
 optionally select specific split and/or columns to download a subset
-dataset\_tag \= "\<dataset\_tag\>"
-dataset \= datasets.load\_dataset(
-        path \= repo\_id,
-        name \= dataset\_tag,
-        data\_dir \= dataset\_tag,
-        cache\_dir \= cache\_dir,
-        keep\_in\_memory \= True)
 If needed, convert data to pandas
-import pandas as pd
-df \= dataset.data\['train'\].to\_pandas()

 To load the dataset remotely,
+    dataset = datasets.load_dataset(path = repo_id)
 optionally select specific split and/or columns to download a subset
+    dataset_tag = "<dataset_tag>"
+    dataset = datasets.load_dataset(
+        path = repo_id,
+        name = dataset_tag,
+        data_dir = dataset_tag,
+        cache_dir = cache_dir,
+        keep_in_memory = True)
 If needed, convert data to pandas
+    import pandas as pd
+    df = dataset.data['train'].to_pandas()
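
The loading steps in the changed section can be sketched as one small wrapper. This is a minimal sketch, not part of the commit: `build_load_kwargs` and `load_split_as_dataframe` are hypothetical helper names, and passing `split` to `datasets.load_dataset` is an assumption about how the "select specific split" step would be done, since the section's own snippet does not pass it.

```python
from typing import Any, Dict, Optional


def build_load_kwargs(
    repo_id: str,
    dataset_tag: Optional[str] = None,
    split: Optional[str] = None,
    cache_dir: Optional[str] = None,
    keep_in_memory: bool = False,
) -> Dict[str, Any]:
    """Assemble keyword arguments for datasets.load_dataset (hypothetical helper)."""
    kwargs: Dict[str, Any] = {"path": repo_id}
    if dataset_tag is not None:
        # The changed section passes the tag as both the config name
        # and the data directory, so the same value is used for both.
        kwargs["name"] = dataset_tag
        kwargs["data_dir"] = dataset_tag
    if split is not None:
        # Assumption: restrict the download result to a single split.
        kwargs["split"] = split
    if cache_dir is not None:
        kwargs["cache_dir"] = cache_dir
    if keep_in_memory:
        kwargs["keep_in_memory"] = True
    return kwargs


def load_split_as_dataframe(
    repo_id: str,
    dataset_tag: Optional[str] = None,
    split: str = "train",
    cache_dir: Optional[str] = None,
):
    """Load one split from the Hub and return it as a pandas DataFrame."""
    # Imported lazily so the kwargs helper can be used without the
    # `datasets` package installed (requires `pip install datasets pandas`).
    import datasets

    dataset = datasets.load_dataset(
        **build_load_kwargs(repo_id, dataset_tag, split, cache_dir)
    )
    # With an explicit split, load_dataset returns a single Dataset,
    # which can be converted directly.
    return dataset.to_pandas()
```

Calling `load_split_as_dataframe(repo_id, dataset_tag="<dataset_tag>")` would need network access and a real repository id; both values are placeholders here, as in the section itself.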