
You seem to be assuming that scientists interpret their own data correctly in the first place. Reproducibility isn't just a check on outright fraud; it's also a way to catch mistakes in the original findings. The data isn't for laypeople and journalists, it's for other researchers.

As somebody who has had to compile (economic) datasets for public release, I understand that there is cleaning of raw data that needs to happen. However, this needs to happen in a transparent manner, and any transformations should be clearly called out or explained.



Or better yet, release the raw data and the transformation scripts.


It is important to automate the downstream processing so that you can perform it on both the raw data and the cleaned data. This lets you investigate how sensitive your conclusions are to your various assumptions.

For example, if the conclusions are supported by the cleaned data but not the raw data, you need to worry about whether you have accidentally written your conclusions in by hand during the cleaning process. One way to investigate this is to have two cleaning scripts: a lax one that only does no-brainer corrections, and a strict one that makes those delicate judgment calls. Then you can reprocess. If lax cleaning of the data supports your conclusions, then they are fairly secure. If this sensitivity analysis reveals that the difference between lax and strict cleaning matters to your conclusions, then you groan, because you have waded into muddy waters and things are much harder than you initially thought. Maybe you have to recruit an assistant to clean the data blind, without knowing what the "right" answer is, or maybe you need to go on field trips to do direct checks on instruments.
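To make the lax/strict idea concrete, here is a rough Python sketch. Everything in it is made up for illustration: the readings in `RAW`, the `999.0` sentinel, the `limit` threshold, and the function names `clean_lax`, `clean_strict`, `analyze`, and `sensitivity` are all assumptions, and the "analysis" is just a mean. The point is only that one automated downstream step runs unchanged over each version of the data.

```python
from statistics import mean

# Hypothetical raw readings (invented for this sketch): None marks a missing
# value, 999.0 is an assumed sentinel error code, and 41.0 is a suspicious
# but possible reading.
RAW = [10.2, 9.8, None, 10.5, 999.0, 41.0, 10.1, 9.9]


def clean_lax(rows):
    """No-brainer corrections only: drop missing values and the sentinel code."""
    return [x for x in rows if x is not None and x != 999.0]


def clean_strict(rows, limit=20.0):
    """Lax cleaning plus a judgment call: also drop readings above `limit`."""
    return [x for x in clean_lax(rows) if x <= limit]


def analyze(rows):
    """Downstream analysis, identical for every variant: the mean reading.

    Missing values are skipped so the same function also runs on raw data.
    """
    return mean(x for x in rows if x is not None)


def sensitivity(raw):
    """Run the same analysis on raw, lax-cleaned, and strict-cleaned data."""
    return {
        "raw": analyze(raw),
        "lax": analyze(clean_lax(raw)),
        "strict": analyze(clean_strict(raw)),
    }


if __name__ == "__main__":
    for name, result in sensitivity(RAW).items():
        print(f"{name:>6}: {result:.2f}")
```

With this toy data the lax and strict results differ noticeably, because the single judgment call about the 41.0 reading moves the mean. That is exactly the muddy-waters case: the conclusion depends on a cleaning decision rather than on the data alone.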

If your research has important and expensive implications for public policy, other people will want to do this kind of sensitivity analysis themselves and reach their own conclusions.



