Horror: Lack of documentation
Lack of documentation for an existing data set
Tell us your horror story, what happened?
When I started my PhD, I was told to work on unpublished data that had been collected three years before I started. This was supposed to give me insight into the data and into part of the topic I was working on. I received various folders full of data. Going through them, I found several datasheets with duplicate names but different contents, scripts whose purpose nobody could explain, column names whose meaning was very difficult to ascertain, and little record of the exact equipment and settings used (especially since the work had been performed several years earlier). In the end, it took me roughly 6 months to figure out what had been done and what the data meant, and several talks with the manufacturer of the equipment to conclude that the data were poor at best and should not be used for publication.
How long ago was it?
6 to 7 years ago
How was this solved?
Several meetings with the manufacturer of the equipment used, several meetings with the researchers who had performed the work several years before, and a tedious step-by-step replication of the data using the available, poorly documented scripts. In the end, it could not be fully resolved because of the poor description of methods, data, and scripts. It was a waste of time and resources, both for me and for the researchers who had done the work several years before.
How could this horror be avoided?
By planning and describing the data collection and analysis process. Although it takes quite some time to describe well what you are doing or have done, it takes even more time and frustration to figure out what was done several years ago. Even if you think a (poor) description of the data will still make sense to you for years to come, it is likely that you and/or others will be scratching your heads trying to figure out what you meant.
What lesson can we learn from this story?
Documentation takes time but is very valuable when you revisit the data.
Did you experience data horrors in your research life? Do you think that researchers could learn from your experience? Would you like to share your story with others?
We would love to help you turn a creepy data story into a positive lesson learned. Use the form below to share your story with the Dutch Data Horror Team.
Archiving panic? No backups? Bugs in your code? Licence confusion? Who ya gonna call? Just the thought of it!
And yet it happens every day.
During this Data Horror Week, researchers will share these horror stories, based on their own experience, to prevent you from making the same mistakes!