Comment by dguest

6 years ago

My field is doing something similar.

Reproducible science is definitely a good goal, but reproducible doesn't mean maintainable. Really, scientists should be getting in the habit of versioning their code and datasets. Of course a docker container is better than nothing, but I would much rather have a tagged repository and a pointer to an operating system where it compiles.

It's true that many scientists tend to build their results on an ill-defined dumpster fire of a software stack, but the fact that docker lets us preserve these workflows doesn't solve the underlying problem.

FYI, and for anyone else still learning how to version and cite code: Zenodo + GitHub is the most feature-rich and user-friendly combination I've found.

https://guides.github.com/activities/citable-code/

  • Thank you for mentioning Zenodo. I really liked how EU funding agencies push for reproducibility/citability of data and code when you submit proposals to them.

    I haven’t filed any NSF proposals (yet), but so far I haven't come across any comparable hard requirement there to commit to something like Zenodo (or similar) for archiving the results of your research so they can be cited.

    • I <3 Zenodo. My societies don't require open data, but that's a generational shift.

      Also, if you do bio-type research, you can use Data Dryad too!

  • Zenodo is great! In theory you could also upload a docker image to Zenodo and give it a DOI, but it doesn't seem to have an especially elegant way to pull this image after the fact.
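
For what it's worth, pulling such an image back can be scripted, though it is not elegant. Below is a rough sketch, assuming the image was uploaded as a `docker save` tarball and that the Zenodo record JSON lists files with `key`, `checksum` ("md5:...") and `links.self` fields. Those field names and the API URL layout are my assumptions about the REST API, so check them against a real record. The function names, record ID, and filename are hypothetical.

```python
import hashlib
import json
import subprocess
import urllib.request

# Assumed layout of the Zenodo records endpoint; verify before relying on it.
ZENODO_API = "https://zenodo.org/api/records/{record_id}"


def pick_file(record, filename):
    """Return (download_url, md5_hex) for `filename` from record metadata.

    Assumes each entry in record["files"] has "key", "checksum" and
    "links"/"self" fields (an assumption about the API response).
    """
    for entry in record.get("files", []):
        if entry.get("key") == filename:
            algo, _, digest = entry["checksum"].partition(":")
            if algo != "md5":
                raise ValueError("unexpected checksum algorithm: " + algo)
            return entry["links"]["self"], digest
    raise KeyError(filename)


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_and_load(record_id, filename, dest):
    """Download a saved image tarball, verify it, and hand it to docker."""
    url = ZENODO_API.format(record_id=record_id)
    with urllib.request.urlopen(url) as response:
        record = json.load(response)
    download_url, expected = pick_file(record, filename)
    urllib.request.urlretrieve(download_url, dest)
    if md5_of(dest) != expected:
        raise RuntimeError("checksum mismatch for " + filename)
    # Requires a local docker installation; the tarball must come from `docker save`.
    subprocess.run(["docker", "load", "--input", dest], check=True)
```

Usage would be something like `fetch_and_load("<record id>", "image.tar", "image.tar")`. The checksum step is the only real advantage over downloading by hand, and it still doesn't address the maintainability point raised above.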