Over the past few decades, economics research has gone through a kind of credibility revolution. In the empirical micro-economics literature on agriculture and development, increasing attention is being paid to the empirical methodology and identifying assumptions underlying causal claims. Most influential studies are now designed according to methods that are considered the gold standard in science and allow one to rigorously test what works and why. Another way to “make economics research more scientific” is to insist on reproducibility. Indeed, the ability of an independent researcher to carry out the same research and come to the same conclusions forms the basis of the scientific method. Why should this not hold for social science?
Recently, the issue of reproducibility in development economics research has come up in the context of the #wormwars. In short, an influential 2004 study demonstrating the efficiency and effectiveness of large-scale treatment against intestinal worms in developing countries was “debunked” in a series of papers funded by the International Initiative for Impact Evaluation (3ie), which pays researchers to replicate studies. The longer story is more nuanced, and involves poor reporting and cultural differences between epidemiologists and economists. One lesson from this episode is that the term “reproducible research” is rather vague. Some people would argue that replications should collect new data in different areas to answer the same question. Others would use the same data but redo the entire analysis under different assumptions. Still others would use the original data and simply try to replicate the statistical analysis.
Exactly replicating research findings, even when the data are disclosed, is harder than one might think. Anyone who has run an actual analysis knows that data are messy and that there are often many defensible ways to proceed. The researcher therefore constantly has to make pragmatic decisions: drop the 34-year-old individual who weighs 2.5 kg, take the log of income to bring its distribution closer to normal, remove all household heads above 45 from the analysis, and so on. In addition, running a regression on a clean dataset is one thing, but a real research project takes years to complete, has different outputs at different points in time, and involves files sent around as attachments to collaborators around the world. This results in folders cluttered with many near-identical versions of the same code.
Which file do I run again to get that coefficient of 0.562 in table 5? And, oh, people make errors, too.
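One way to keep such pragmatic decisions traceable is to write each of them as a small named step and to record how many observations each step removes. The following is only a minimal sketch in Python: the records, the field names, the 20 kg plausibility threshold, and the helper functions are all hypothetical, chosen to mirror the three example decisions mentioned above.

```python
import math

# Hypothetical toy records; field names and values are illustrative only.
households = [
    {"id": 1, "age": 34, "weight_kg": 2.5, "income": 1200.0},
    {"id": 2, "age": 41, "weight_kg": 68.0, "income": 800.0},
    {"id": 3, "age": 52, "weight_kg": 71.0, "income": 1500.0},
    {"id": 4, "age": 29, "weight_kg": 59.0, "income": 400.0},
]


def drop_implausible_weight(rows, min_adult_weight_kg=20.0):
    """Drop adults whose recorded weight is below an assumed plausibility threshold."""
    return [
        r for r in rows
        if not (r["age"] >= 18 and r["weight_kg"] < min_adult_weight_kg)
    ]


def add_log_income(rows):
    """Add the natural log of income as a new field; no rows are removed."""
    return [{**r, "log_income": math.log(r["income"])} for r in rows]


def keep_heads_up_to_age(rows, max_age=45):
    """Keep only household heads aged max_age or younger."""
    return [r for r in rows if r["age"] <= max_age]


def run_pipeline(rows, steps):
    """Apply each step in order and log the row count before and after it."""
    log = []
    for step in steps:
        before = len(rows)
        rows = step(rows)
        log.append((step.__name__, before, len(rows)))
    return rows, log


cleaned, decision_log = run_pipeline(
    households,
    [drop_implausible_weight, add_log_income, keep_heads_up_to_age],
)

for name, before, after in decision_log:
    print(f"{name}: {before} -> {after} rows")
```

Because every decision has a name and leaves a trace in the log, a coauthor or replicator can see which step removed which observations, and can rerun the analysis with a step switched off.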
This points to the need for a tool to manage and track all of these datasets, code, and files. In fact, the workflow of a research project (a complex, ever-growing ecosystem of Stata and R code that uses files as inputs to create new files, and inserts results into constantly changing reports and papers written by multiple authors) is similar to the workflow in large open source software development projects. As in software development, research often needs to produce beta releases as well, in the form of discussion papers, conference presentations, progress reports, and the like. Researchers also often want to try out different ideas on their own before suggesting to coauthors that they be included in the analysis (e.g. scaling welfare by adult equivalents instead of household size). In software development, this is typically accomplished through a (decentralized) revision control system such as Git, which is a bit like “track changes” in Word, but for your entire project, and which can track many people’s changes at the same time. For effective cooperation and sharing, it can be used together with an online repository hosting service, such as GitHub or Bitbucket.
Git does more than just keep track of the entire history of a project. It is a tool for effective cooperation between people who are in different parts of the world. It also acts as a backup and encourages attribution, as the system logs who has worked on which lines of code. Git also encourages you to document your code. However, there are some drawbacks as well. Git does not work that well with binary files (it needs to be able to look inside the files for the “track changes” feature to work), yet most people in our profession prefer to keep their datasets in closed-source binary formats, such as Stata .dta or Excel .xls files (luckily, the Stata .do files that contain the code are plain text files). The command line interface can also be intimidating for people who grew up using graphical operating systems such as Windows.
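The difference between plain text and binary files can be illustrated without Git itself. The sketch below uses Python's standard `difflib` module as a stand-in for a line-based comparison; it is not what Git runs internally, and the .do file contents and file names are made up for the example.

```python
import difflib

# Two versions of a hypothetical Stata .do file, held as lists of text lines.
old_do = [
    "use survey.dta\n",
    "drop if age > 45\n",
    "regress income treatment\n",
]
new_do = [
    "use survey.dta\n",
    "drop if age > 65\n",
    "regress income treatment\n",
]

# A line-based diff can point at the exact line that changed.
diff = list(
    difflib.unified_diff(
        old_do,
        new_do,
        fromfile="analysis.do (before)",
        tofile="analysis.do (after)",
    )
)
print("".join(diff))

# Keep only the changed lines, skipping the two file header lines.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

# For an opaque binary blob there are no meaningful lines to compare,
# so all a line-based tool can usefully report is that the file differs.
old_bytes = bytes([0, 17, 255, 3])
new_bytes = bytes([0, 17, 254, 3])
binary_changed = old_bytes != new_bytes
print("binary file changed:", binary_changed)
```

For the text file, the comparison shows which decision was altered (the age cutoff). For the binary blob, it only shows that something changed, which is why keeping data in a plain text format such as CSV, where feasible, makes the history of a project easier to read.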
It is a welcome development that more and more journals now ask authors to turn over their datasets and code. However, it would be even better if we could encourage researchers to hand over the entire project history, and not just the end result: a cleaned dataset and the five lines of code needed to produce the tables in the published article. This way, other researchers can trace back decisions made by the authors (such as that odd decision to remove all household heads above the age of 45 just before publication of the working paper) and check the sensitivity of the results to these decisions. Git is a tool that can help here. But it is a tool made for software developers, which can be intimidating to non-geeks, and it requires some reading and practice to realize its full potential. This is probably why the early adopters seem to come from the area of big data. And there is no doubt that big data is becoming important in agricultural development as well.
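Checking the sensitivity of a result to one such decision can be as simple as rerunning the estimate with and without it. Here is a minimal sketch in Python, assuming a toy sample and a deliberately crude estimate (the difference in mean income between treated and control households); the data, the field names, and the `treatment_gap` helper are hypothetical.

```python
from statistics import mean

# Hypothetical toy sample; values are illustrative only.
sample = [
    {"age": 30, "treated": True, "income": 110.0},
    {"age": 38, "treated": False, "income": 100.0},
    {"age": 44, "treated": True, "income": 130.0},
    {"age": 50, "treated": False, "income": 140.0},
    {"age": 58, "treated": True, "income": 120.0},
    {"age": 61, "treated": False, "income": 120.0},
]


def treatment_gap(rows, max_head_age=None):
    """Difference in mean income, treated minus control.

    If max_head_age is given, household heads older than that are dropped first.
    """
    if max_head_age is not None:
        rows = [r for r in rows if r["age"] <= max_head_age]
    treated = [r["income"] for r in rows if r["treated"]]
    control = [r["income"] for r in rows if not r["treated"]]
    return mean(treated) - mean(control)


full_sample_gap = treatment_gap(sample)
restricted_gap = treatment_gap(sample, max_head_age=45)

print(f"all household heads:     {full_sample_gap:.1f}")
print(f"heads aged 45 or younger: {restricted_gap:.1f}")
```

In this made-up sample the estimated gap appears only once the older household heads are removed, which is exactly the kind of dependence a reader can discover when the full history of decisions is available and cannot discover from the final cleaned dataset alone.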