Ten simple rules for using public biological data for your research
Journal: PLoS Computational Biology
Authors: Vishal H. Oza, Jordan H. Whitlock, Elizabeth J. Wilk, Angelina Uno-Antonison, Brandon Wilk, Manavalan Gajapathy, Timothy C. Howton, Austyn Trull, Lara Ianov, Elizabeth A. Worthey, and Brittany N. Lasseigne
With an increasing amount of biological data available publicly, there is a need for a guide on how to successfully download and use this data. The 10 simple rules for using public biological data are: (1) use public data purposefully in your research; (2) evaluate data for your use case; (3) check data reuse requirements and embargoes; (4) be aware of ethics for data reuse; (5) plan for data storage and compute requirements; (6) know what you are downloading; (7) download programmatically and verify integrity; (8) properly cite data; (9) make reprocessed data and models Findable, Accessible, Interoperable, and Reusable (FAIR) and share; and (10) make pipelines and code FAIR and share. These rules are intended as a guide for researchers wanting to make use of available data and to increase data reuse and reproducibility.
Nathaniel DeVoss – February 16th, 2023