maom committed on
Commit cf03a4c · verified · 1 Parent(s): cd44827

Create 04_create_dataset_card

Files changed (1)
  1. sections/04_create_dataset_card +6 -0
sections/04_create_dataset_card ADDED
@@ -0,0 +1,6 @@
+ ## **4 Add an informative README.md**
+
+ The `README.md` is a Markdown file that is displayed on the dataset's front page. It should give appropriate context for the dataset and guidance on how to use it. As a template, consider including the sections below, filling in the parts in angle brackets. See the [MIP](https://huggingface.co/datasets/RosettaCommons/MIP/blob/main/README.md) dataset for an example.
+
+ | \# \<DATASET TITLE\><br>\<short descriptive abstract of the dataset\><br><br>\#\# Quickstart Usage<br><br>\#\#\# Install HuggingFace Datasets package<br>Each subset can be loaded into Python using the HuggingFace \[datasets\](https://huggingface.co/docs/datasets/index) library. First, from the command line, install the \`datasets\` library<br><br>$ pip install datasets<br><br>Optionally set the cache directory, e.g.<br><br>$ HF\_HOME=${HOME}/.cache/huggingface/<br>$ export HF\_HOME<br><br>then, from within Python, load the datasets library<br><br>\>\>\> import datasets<br><br>\#\#\# Load model datasets<br>To load one of the \<DATASET ID\> model datasets, use \`datasets.load\_dataset(...)\`:<br><br>\>\>\> dataset\_tag \= "\<DATASET TAG\>"<br>\>\>\> dataset \= datasets.load\_dataset( path \= "\<HF PATH TO DATASET\>", name \= f"{dataset\_tag}", data\_dir \= f"{dataset\_tag}")\['train'\]<br><br>and the dataset is loaded as a \`datasets.arrow\_dataset.Dataset\`<br><br>\>\>\> dataset<br>\<RESULT OF LOADING DATASET MODEL\><br><br>which is a column-oriented format that can be accessed directly, converted into a \`pandas.DataFrame\`, or written to \`parquet\` format, e.g.<br><br>\>\>\> dataset.data.column('\<COLUMN NAME IN DATASET\>')<br>\>\>\> dataset.to\_pandas()<br>\>\>\> dataset.to\_parquet("dataset.parquet")<br><br>\#\#\# \<BRIEF EXAMPLE OF HOW TO USE DIFFERENT PARTS OF THE DATASET\><br><br>\#\# Dataset Details<br><br>\#\#\# Dataset Description<br>\<DETAILED DESCRIPTION OF DATASET\><br><br>\- \*\*Acknowledgements:\*\* \<ACKNOWLEDGEMENTS\><br>\- \*\*License:\*\* \<LICENSE\><br><br>\#\#\# Dataset Sources<br>\- \*\*Repository:\*\* \<URL FOR SOURCE OF DATA\><br>\- \*\*Paper:\*\* \<APA CITATION REFERENCE FOR SOURCE DATA\><br>\- \*\*Zenodo Repository:\*\* \<ZENODO LINK IF RELEVANT\><br><br>\#\# Uses<br>\<DESCRIPTION OF INTENDED USE OF DATASET\><br><br>\#\#\# Out-of-Scope Use<br>\<DESCRIPTION OF OUT OF SCOPE USES OF DATASET\><br><br>\#\#\# Source Data<br>\<DESCRIPTION OF SOURCE DATA\><br><br>\#\# Citation<br>\<BIBTEX REFERENCE FOR DATASET\><br><br>\#\# Dataset Card Authors<br>\<NAME/INFO OF DATASET AUTHORS\> |
+ | :---- |
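
Because the template relies on every bracketed part being filled in, a small check run before uploading can catch leftovers. The following is a minimal sketch using only the Python standard library. The helper names `find_unfilled_placeholders` and `find_missing_headings` are hypothetical, and the rule that a placeholder is any single-line `<...>` span that is not a URL is an assumption for illustration, not something the template itself prescribes.

```python
import re

# Top-level section headings taken from the template above.
REQUIRED_HEADINGS = [
    "## Quickstart Usage",
    "## Dataset Details",
    "## Uses",
    "## Citation",
    "## Dataset Card Authors",
]

# Assumption: a placeholder is any <...> span on a single line that starts
# with a letter. Note that literal HTML tags such as <br> also match.
PLACEHOLDER = re.compile(r"<([A-Za-z][^<>\n]*)>")


def find_unfilled_placeholders(readme_text):
    """Return bracketed placeholders still present, in order of appearance."""
    found = []
    for match in PLACEHOLDER.finditer(readme_text):
        if "://" in match.group(1):
            continue  # skip autolinks such as <https://example.org>
        found.append(match.group(0))
    return found


def find_missing_headings(readme_text):
    """Return the template's top-level headings that do not appear as a line."""
    lines = {line.strip() for line in readme_text.splitlines()}
    return [heading for heading in REQUIRED_HEADINGS if heading not in lines]


if __name__ == "__main__":
    # A deliberately incomplete draft, used only to demonstrate the checks.
    draft = "\n".join([
        "# My Dataset",
        "<short descriptive abstract of the dataset>",
        "## Quickstart Usage",
        "## Uses",
        "See <https://example.org> for background.",
        "- **License:** <LICENSE>",
    ])
    print("Unfilled placeholders:", find_unfilled_placeholders(draft))
    print("Missing headings:", find_missing_headings(draft))
```

To check a real card, read the file with `pathlib.Path("README.md").read_text()` and pass the result to both functions. An empty list from each suggests the template has been fully filled in.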