# Questions and Answers

#### My dataset is a metadata curation of multiple datasets from different places. How should I license it?

If you are building on others' work, it is important to respect their licenses. Each source dataset will fall into one of three buckets:

* The data cannot be used, e.g. because of a proprietary license restriction.
* The data can be used, with or without some restrictions. For example, if the source dataset is licensed under the Creative Commons license [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/), any adapted material you redistribute must carry the same "ShareAlike" terms. Alternatively, the authors may require agreeing to a specific usage license; for example, the Rocklin lab requires registering the use of the MegaScale dataset so they can demonstrate impact to maintain grant support.
* The license of the source data is not clear. In this case, it is best to reach out to the original authors and either request that they adopt a license (open source or otherwise), or get explicit permission to re-share the data.
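
One way to keep track of this while curating is to record each source's license and sort it into one of the three buckets above. The following is a minimal sketch, not legal advice: the `triage` function, the `Bucket` names, the `KNOWN_CONDITIONS` table, and the `"proprietary"` marker are all hypothetical and only illustrate the bookkeeping. The license keys follow SPDX-style identifiers.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Bucket(Enum):
    CANNOT_USE = "cannot be used"
    USE_WITH_CONDITIONS = "can be used, possibly with restrictions"
    UNCLEAR = "license unclear, contact the original authors"


# Hypothetical, non-exhaustive table: license id -> conditions to carry forward.
KNOWN_CONDITIONS = {
    "CC0-1.0": [],
    "CC-BY-4.0": ["attribution"],
    "CC-BY-SA-4.0": ["attribution", "share-alike"],
}

# Hypothetical marker for sources whose terms forbid reuse or redistribution.
BLOCKED = {"proprietary"}


def triage(license_id: Optional[str]) -> Tuple[Bucket, List[str]]:
    """Return (bucket, conditions) for one source dataset."""
    if not license_id:
        # No license recorded: treat as unclear rather than assuming permission.
        return Bucket.UNCLEAR, []
    if license_id in BLOCKED:
        return Bucket.CANNOT_USE, []
    if license_id in KNOWN_CONDITIONS:
        return Bucket.USE_WITH_CONDITIONS, list(KNOWN_CONDITIONS[license_id])
    # Unrecognized license text also needs a human to read it.
    return Bucket.UNCLEAR, []


# Placeholder source names, for illustration only.
sources = {"source_a": "CC-BY-SA-4.0", "source_b": None, "source_c": "proprietary"}
report = {name: triage(license_id) for name, license_id in sources.items()}

for name, (bucket, conditions) in report.items():
    print(f"{name}: {bucket.value} {conditions}")
```

Anything that lands in the "unclear" bucket still needs a person to contact the original authors; the table only makes it harder to overlook a source.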

#### My dataset is made up of a bunch of small tabular datasets. Should I make them each a sub-dataset or different "splits"?