Code Is the White Whale of Reproducibility In Science

And more than a little quixotic.

March 8, 2017, 1:02pm

From English to biology, many fields of academic research use computers and complex algorithms at some stage of an experiment or study, whether to crunch a large amount of data or to run a simulation. When so much research relies on code, the most basic part of ensuring that conclusions can be replicated is making sure that the code works. This is more important today than ever, since science as a whole is in something of a crisis over the reproducibility of results across a wide range of disciplines.

But it's no easy task. In fact, designing a system for scientists to share everything that other scientists need to check their code is something of a quixotic quest. Not only would researchers need to share their code with peers, but they'd also need to communicate other variables, such as their computing environment, workflow, and databases, to the research community. It's a lot to do.
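To give a sense of what "communicating the computing environment" can mean in practice, here is a minimal sketch in Python that records the interpreter version, operating system, and installed packages, along with a fingerprint of a data file. It is an illustration only, not how Everware or any other tool works; the function names `environment_snapshot` and `file_sha256` are made up for this example.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata


def environment_snapshot():
    """Collect basic facts about the machine and the Python installation.

    Hypothetical helper for illustration; real tools capture far more
    (system libraries, hardware, random seeds, and so on).
    """
    packages = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
    )
    return {
        "python": sys.version.split()[0],
        "os": platform.platform(),
        "machine": platform.machine(),
        "packages": packages,
    }


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Print the snapshot as JSON so it can be saved next to a paper's code.
    print(json.dumps(environment_snapshot(), indent=2))
```

A colleague could compare a saved snapshot and a data file's digest against their own before trying to rerun an analysis. Even then, matching versions is only a first step toward getting the same result.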

A new tool, called Everware, aims to solve part of this problem. Developed by researchers at the National Research University Higher School of Economics in Moscow and at the University of Manchester, the open source program would allow researchers to share details of the computational aspects of their experiments with other scientists.

The idea is that with Everware, scientists would be able to run each other's code with the press of a button. In addition, "using Everware, participants could start from an existing solution instead of starting from scratch," the researchers wrote in a paper published to the arXiv preprint server this week.

"We're talking about reproducing here as the simplest thing we can do: Can I run your code on your data and get your same result?" said Victoria Stodden, an associate professor at the University of Illinois' School of Information Sciences, who focuses on computational reproducibility in science.

Although Stodden isn't associated with Everware, she developed a different web tool in 2011 for researchers to share some aspects of their computation along with their papers. The fact that solutions like Everware are still being proposed, six years later, shows how much work there is left to be done in figuring out a unified system.

But the problem isn't just having a tool, Stodden said.

"[As a researcher], I have to invest time in learning and understanding Everware and finding out if it's suitable for my problems," she said. "That's not something that scientists are rewarded for directly right now."

"If you're publishing just a couple papers a year, that may make a difference in your promotion," Stodden continued. "There's a catch-22 where you can't take the time to do these things, and you need to stick to what you're rewarded for."

The solution must be twofold, Stodden said. First, you need a tool that can support sharing the details of computation in a given study, and then you need to foster a culture that rewards researchers for actually using it. Given how deeply entrenched the pressure to publish is today, that second task may be the larger challenge.

