Promoting Reproducible Research Practices at the Research Data Alliance
Update From the 14th Plenary of the Research Data Alliance
Consistent with the National Academies of Sciences, Engineering, and Medicine’s recent report on Reproducibility and Replicability in Science, we define “reproducibility” as the ability to recreate computational results from the data and code used by the original researcher. At the ISPS Data Archive, it is our experience that reproducible data and code require curatorial activities to ensure that statistical and analytic claims can be computationally reproduced over time. These activities include, for example, a review of the computer code used to produce the analysis to verify that the code is executable and that it generates results identical to those presented in associated publications.
At the 14th Plenary of the Research Data Alliance (RDA) in Espoo, Finland, over 100 people gathered to explore the issues concerning curating for FAIR (Findable, Accessible, Interoperable, and Reusable) and reproducible data and code. The room was packed and the energy high, reflecting the need to confront the challenges of supporting the reproducibility of scientific research, and attesting to the strength of RDA as a forum for bringing together stakeholders with diverse viewpoints and expertise. Participants identified as researchers, data support professionals, repository managers, publishers, IT support, and software developers. The meeting, a Birds-of-a-Feather (BoF) on Curating for FAIR and Reproducible Data and Code, was organized by ISPS’ Limor Peer, Florio Arguillas of the Cornell Institute for Social and Economic Research, and Thu-Mai Christian of the Odum Institute Data Archive.
Our three institutions have been implementing practices and developing workflows and tools that support curating for reproducibility. At ISPS, we have been applying a curation workflow to the research outputs published on the ISPS Data Archive and have been developing YARD, software to manage that workflow. The Odum Institute Data Archive is working with the American Journal of Political Science (AJPS) and other journals to provide a data review service that performs data curation and verification of replication datasets. The Cornell Institute for Social and Economic Research (CISER) offers a Results Reproduction (R2) Service that computationally reproduces the results of research to ensure reproducibility and transparency. In 2016, we launched CURE, a consortium that advocates for and supports efforts to curate for reproducibility.
The BoF is intended to expand the community of practice around curating for reproducibility. At the Espoo meeting, the discussion raised questions we think are critical: What are the challenges of supporting the reproducibility of scientific research? What tools, services, and expertise are necessary to support the latest norms and rigorous standards in research practice? By what standards should we consider research to be reproducible? Are standards different for data and for code? Is there a difference between curating for FAIR and for reproducibility?
In the coming weeks, we will be working toward establishing an RDA working group that will allow the RDA community to continue to address critical issues affecting scientific reproducibility. We welcome your thoughts and your input!