Reproducible Research: A view from the social sciences

Ben Marwick, UW Anthropology
April 2014

Overview

  • Definitions, motives, history, spectrum
  • Current practices
  • A selection of tools to improve reproducibility
  • Challenges, standards & our role in the future of reproducible research

Definitions

Stodden, V., et al. 2013. “Setting the default to reproducible in computational science research.” SIAM News 46: 4-6.

“The goal of reproducible research is to tie specific instructions to data analysis and experimental data so that scholarship can be recreated, better understood and verified.” - Max Kuhn, CRAN Task View: Reproducible Research

History of reproducible research

  • Mathematics (400 BC?)
  • Write scientific paper, Galileo, Pasteur, etc. (1660s?)
  • Publish a pidgin algorithm and describe simulation datasets (1950s?)
  • Sell magtape of code and data (1970s?)
  • Place idiosyncratic dataset & software at website (1990s?)
  • Publish datasets and scripts at website, e.g. biology, political science, genetics, statistics (2000s?)
  • Hosted integrated code and data (2020s?)

Gavish & Donoho AAAS 2011, Oxberry 2013

Motivations: Claerbout's principle

“An article about a computational result is advertising, not scholarship. The actual scholarship is the full software environment, code and data, that produced the result.” - Claerbout and Karrenbach, Proceedings of the 62nd Annual International Meeting of the Society of Exploration Geophysicists. 1992

“When we publish articles containing figures which were generated by computer, we also publish the complete software environment which generates the figures” - Buckheit & Donoho, Wavelab and Reproducible Research, 1995.

Benefits are straightforward

  • Verification & Reliability: Easier to find and fix bugs. The results you produce today will be the same results you will produce tomorrow.
  • Transparency: Leads to increased citation counts, broader impact, improved institutional memory
  • Efficiency: Reuse allows for de-duplication of effort. Payoff in the (not so) long run
  • Flexibility: When you don’t 'point-and-click' you gain many new analytic options.

But the limitations are substantial

Technical

  • Classified/sensitive/big data
  • Software licensing issues
  • Competition
  • Neither necessary nor sufficient for correctness (but essential for dispute resolution)

Cultural & personal

  • Very few researchers follow even minimal reproducibility standards.
  • No-one expects or requires reproducibility
  • No uniform standards of reproducibility, so no established user base
  • Inertia & embarrassment

Our work exists on a spectrum of reproducibility

Peng 2011, Science 334(6060) pp. 1226-1227

Goal is to expose the reader to more of the research workflow

Current practices, or informal ethnographic observations of social science research workers

Qualitative observations

  • Enter data in Excel
  • Use Excel for data cleaning & descriptive statistics
  • Import data into SPSS/SAS/Stata for further analysis
  • Use point-and-click options to run statistical analyses
  • Copy & paste output to Word document, repeatedly

Qualitative observations

  • Version control is ad hoc
  • Excel handles missing data inconsistently and sometimes incorrectly
  • Many common functions are poor or missing in Excel
  • Scripting is possible but rare
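
As a point of comparison for the missing-data problem, here is a minimal sketch of how a scripting language makes the handling of missing values explicit. The `age` vector is hypothetical toy data, not taken from any study.

```r
# Hypothetical survey responses; NA marks a missing answer
age <- c(23, 35, NA, 41, 29)

# By default R propagates the missing value instead of silently skipping it
mean_default <- mean(age)                # NA

# Dropping missing values must be requested explicitly, so it is visible in the script
mean_complete <- mean(age, na.rm = TRUE) # 32

# Count the missing values so the report can state how many were dropped
n_missing <- sum(is.na(age))             # 1
```

The decision to drop missing values is written down in the script, so a reader can see it and change it.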

Click trails are ephemeral & dangerous

  • Lots of human effort for tedious & time-wasting tasks
  • Error-prone due to manual & ad hoc data handling (column and row offsets are common)
  • Difficult to record - hard to reconstruct a 'click history'
  • Tiny changes in data or method require extensive reworking efforts
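
The row-offset problem above can be illustrated with a small sketch. The data frame and its values are hypothetical, invented only to show how a hand-selected range differs from a scripted calculation over the whole column.

```r
# Hypothetical toy data: growth figures for six made-up countries
growth <- data.frame(
  country = c("A", "B", "C", "D", "E", "F"),
  gdp_growth = c(2.0, 3.0, 1.0, 4.0, 2.0, 6.0)
)

# A hand-selected range that stops two rows short, as can happen with a dragged selection
partial_mean <- mean(growth$gdp_growth[1:4])  # 2.5

# Scripted version: use the whole column and assert how many rows went in
stopifnot(nrow(growth) == 6)
full_mean <- mean(growth$gdp_growth)          # 3
```

Because the calculation and the row-count check live in the script, rerunning it after a data change takes seconds, and an unexpected number of rows stops the analysis rather than passing unnoticed.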

Case study: Reinhart and Rogoff controversy

  • Claimed that higher debt-to-GDP ratios are associated with lower GDP growth
  • Identified a threshold for negative growth at a debt-to-GDP ratio above 90%
  • Substantial popular impact on austerity politics

Scripted analyses are superior

  • Plain text files will be readable for a long time
  • Improved transparency, automation, maintainability, accessibility, standardisation, modularity, portability, efficiency, communicability of process (what more could we want?)
  • But there's a steep learning curve

Literate statistical programming

The alternative to point-and-click analyses

“Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to humans what we want the computer to do.” - Donald E. Knuth, Literate Programming, 1984

For example… Let's calculate the current time in R.

time <- format(Sys.time(), "%a %d %b %X %Y")

The text and R code are interwoven in the output:

The time is `r time`

The time is Wed 09 Apr 3:08:04 PM 2014

Literate programming: for and against

For

  • Text and code all in one place, in logical order
  • Tables and figures automatically updated to reflect data and method changes
  • Automatic test when building the document: the build fails if the code no longer runs

Against

  • Text and code all in one place; can be hard to read sometimes, especially if there is a lot of code (externalising can help)
  • Can substantially slow down the processing of documents (caching can help)
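
One way caching can look in practice, as a sketch: knitr can compile a plain R script with `knitr::spin()`, where lines starting with `#'` are narrative and lines starting with `#+` set chunk options. The chunk labels `slow-step` and `report` and the bootstrap itself are hypothetical choices for illustration.

```r
#' ## Bootstrap interval for mean fuel economy
#' Narrative lines start with #' when this script is compiled with knitr::spin().

#+ slow-step, cache=TRUE
# With cache=TRUE, knitr stores this chunk's results and reuses them
# until the chunk's code or options change
set.seed(1)
boot_means <- replicate(2000, mean(sample(mtcars$mpg, replace = TRUE)))

#+ report
# Cheap chunk: recomputed on every build from the cached object
interval <- round(quantile(boot_means, c(0.025, 0.975)), 1)
interval
```

The same script also runs as ordinary R, because the narrative and chunk headers are comments to the interpreter.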

A selection of my favourite tools for reproducible research (which also seem to be widely used by others in the social sciences)

Which programming language?

The machine-readable part

R: Free, open source, cross-platform, highly interactive, huge user community in academia and private sector

R packages: an ideal 'Compendium'?

“both a container for the different elements that make up the document and its computations (i.e. text, code, data, etc.), and as a means for distributing, managing and updating the collection… allow us to move from an era of advertisement to one where our scholarship itself is published” - Gentleman and Temple Lang 2004

Documentation of code simplified with roxygen2
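
A minimal sketch of what roxygen2-style documentation looks like: structured comments placed directly above a function, from which roxygen2 generates the help page when the package is built. The function `standard_error` is a hypothetical example, not part of any package mentioned here.

```r
#' Standard error of the mean
#'
#' @param x A numeric vector.
#' @param na.rm Logical; drop missing values before computing?
#' @return A single numeric value.
#' @examples
#' standard_error(c(2, 4, 4, 6))
#' @export
standard_error <- function(x, na.rm = FALSE) {
  # Remove missing values only when the caller asks for it
  if (na.rm) x <- x[!is.na(x)]
  sd(x) / sqrt(length(x))
}
```

Keeping the documentation next to the code makes it more likely that the two are updated together.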

Interactive data exploration with the rCharts package

IPython-style notebooks for R (RCloud, IRKernel)

Which document formatting language?

Markdown: lightweight document formatting syntax based on email text formatting. Easy to write, read and publish as-is.

The human-readable part

rmarkdown:

  • minor extensions to allow R code display and execution
  • embed images in html files (convenient for sharing)
  • equations

Dynamic documents in R

knitr - descendant of Sweave

Engine for dynamic report generation in R

  • Narrative and code in the same file or explicitly linked
  • When data or narrative are updated, the document is automatically updated
  • Data treated as 'read only'
  • Output treated as disposable
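
The last two points can be sketched as a project layout in which raw data are never edited and every output can be deleted and regenerated. The directory names, file names and values below are hypothetical, and everything is written to a temporary directory.

```r
# Hypothetical project layout inside a temporary directory
project <- file.path(tempdir(), "demo-project")
dir.create(file.path(project, "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project, "output"), showWarnings = FALSE)

# Stand-in for the raw data; written once, then marked read-only
raw_path <- file.path(project, "data", "raw.csv")
if (!file.exists(raw_path)) {
  write.csv(data.frame(id = 1:3, score = c(10, 20, 30)), raw_path, row.names = FALSE)
  Sys.chmod(raw_path, mode = "0444")  # file permissions have limited effect on Windows
}

# All cleaning and analysis happen in memory; the raw file is only ever read
raw <- read.csv(raw_path)
summary_table <- data.frame(n = nrow(raw), mean_score = mean(raw$score))

# Outputs are disposable: overwritten on every run, never edited by hand
out_path <- file.path(project, "output", "summary.csv")
write.csv(summary_table, out_path, row.names = FALSE)
```

If the output folder is deleted, rerunning the script restores it, which is a quick check that nothing in the results depends on manual edits.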

Pandoc converts output from rmarkdown into many popular formats

A universal document converter, open source, cross-platform

  • Write code and narrative in rmarkdown
  • use knitr to compute figures and tables
  • use pandoc to get HTML/PDF/DOCX

…with a single R function, render
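
The three steps above can be sketched end to end. This example writes a small R Markdown file and renders it with `rmarkdown::render()` if the rmarkdown package and pandoc are installed. The file name `time-report.Rmd` and the document title are hypothetical.

```r
# Hypothetical file name; written to a temporary directory so nothing is overwritten
rmd_path <- file.path(tempdir(), "time-report.Rmd")
fence <- strrep("`", 3)  # three backticks, built here to avoid nesting code fences

rmd_lines <- c(
  "---",
  "title: \"Time report\"",
  "output: html_document",
  "---",
  "",
  paste0(fence, "{r}"),
  "time <- format(Sys.time(), \"%a %d %b %X %Y\")",
  fence,
  "",
  "The time is `r time`"
)
writeLines(rmd_lines, rmd_path)

# Render only if rmarkdown and pandoc are available on this machine
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_path, quiet = TRUE)
  message("Rendered: ", out_file)
}
```

Changing `output: html_document` to `pdf_document` or `word_document` in the header asks for a different format from the same source file; PDF output additionally needs a LaTeX installation.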

Tracking changes with version control

Payoffs

  • Eases collaboration
  • Can track changes in any file type (ideally plain text), and who made them
  • Can revert file to any point in its tracked history

Costs

  • Unfamiliar to most social scientists
  • Takes time to master

An environment for reproducible research

RStudio is a free, open source, cross-platform IDE for R

It has an integrated R console, deep support for markdown and git, a text editor, a workspace browser, a data viewer, package development tools, etc.

RStudio 'projects' make version control & literate programming simple

Making data public

Payoffs

  • Free space for hosting (and paid options)
  • Assignment of persistent DOIs
  • Tracking citation metrics

Costs

  • Sometimes license restrictions (CC-BY & CC0)
  • Limited or no private storage space

Challenges, standards & our role in the future

Stodden (IASSIST 2010) sampled American academics registered at the machine learning conference NIPS (134 responses from 593 requests, 23%). Red = communitarian norms, Blue = private incentives

Culture change is the biggest challenge

  • Promote culture change through positive attribution
  • Implement mechanisms to indicate & encourage degrees of compliance (i.e. clear definitions for different levels of reproducibility), cf. Stodden's:
    • 'Reproducible': compendium of text-code-data online
    • 'Reproduced': compendium available and independently reproduced
    • 'Semi-Reproducible': when the full compendium is not released
    • 'Semi-Reproduced': independent reproduction with other data
    • 'Perpetually Reproducible': streaming data

Standards to normalise reproducible research

  • Schwab et al.: ER (Easily reproducible), CR (Conditionally reproducible), NR (Not reproducible)
  • Biostatistics kite-marking of articles (Peng 2009): D (data), C (code), R (both)
  • Reproducible Research Standard (Stodden 2009): we should release
    • The full compendium on the internet
    • Media such as text, figures, tables with Creative Commons Attribution license (CC-BY)
    • Code with one of Apache 2.0, MIT, LGPL, BSD, etc.
    • Original “selection and arrangement” of data with CC0 or CC-BY

Center for Open Science's badges

An incentive to share data and code by acknowledging open practices with badges in publications. Currently used by Psychological Science.

A hierarchy of reproducibility for social scientists

  • Good: Use code. Minimize pointing and clicking (RStudio). Mention availability of code.
  • Better: Use version control. Help yourself keep track of changes, fix bugs and improve project management (RStudio & Git & GitHub or BitBucket)
  • Best: Use embedded narrative and code to explicitly link code, text and data, save yourself time, save reviewers time, improve your code. (RStudio & Git & GitHub or BitBucket & rmarkdown & knitr & data repository)

Our role in the future of reproducible research (LeVeque et al. 2012)

  • Train students by putting homework, assignments & dissertations on the reproducible research spectrum
  • Publish examples of reproducible research in our field
  • Request code & data when reviewing
  • Submit to & review for journals that support reproducible research
  • Critically review & audit data management plans in grant proposals
  • Consider reproducibility wherever possible in hiring, promotion & reference letters.

Thanks.

“Abandoning the habit of secrecy in favor of process transparency and peer review was the crucial step by which alchemy became chemistry.”

-Raymond, E. S., 2004, The art of UNIX programming: Addison-Wesley.

Colophon

Presentation written in Markdown (R Presentation)

Compiled into HTML5 using RStudio

Source code hosting: https://github.com/benmarwick/UW-eScience-reproducibility-social-sciences

ORCID: http://orcid.org/0000-0001-7879-4531

Licensing:

References

See the Rpres file on GitHub for full references and sources.