The size of the datasets collected for research projects has increased in the last couple of years, because collecting data is getting easier and cheaper. This makes careful management of your dataset necessary. Think about the management and stewardship of your data at every stage of your research project, starting with a draft Data Management Plan (DMP) during the design phase.
Why Data Stewardship?
Making your dataset available long-term not only makes your results reproducible, it can also make science cheaper, because the data can be reused.
When working with personally identifiable data (for example in healthcare), you are required to comply with privacy and security legislation.
Register your study with the Data Manager. Your research will be put on a list and the Data Manager can arrange certain documents and security measures if needed. Register your study here by answering a couple of basic questions.
When writing a DMP, you will be forced to think about all aspects of your research project regarding data. For example: have you thought about compute power, disk space during and after your project, and the security of your dataset? Find more information here.
Submit your DMP to your funder, if required. Your project plan needs to be reviewed at this point by the METC, together with the rest of your documentation. Also discuss your DMP with the Data Manager/Steward of your department. Contact us to set up a call or meeting.
Apply the measures that you've written up in your DMP. Keep your code under version control with software such as Git or SVN, so you can keep track of all changes. Make sure you can recover the code version from your output files. Also apply a logical file naming scheme, so that others can make sense of your data.
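One way to make the code version recoverable from an output file is to write it into the file itself. The sketch below is an illustration only, assuming the code lives in a Git repository and the output is a simple text file. The function names, the `# code_version:` header, and the `date_project_description_version` naming pattern are hypothetical choices, not a prescribed standard.

```python
import subprocess
from datetime import date


def current_commit(repo_dir="."):
    """Return the Git commit hash of repo_dir, or 'unknown' if it cannot be read."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        # Git is not installed, or repo_dir is not a usable repository.
        return "unknown"


def output_name(project, description, day, version, ext="csv"):
    """Build a sortable file name (hypothetical scheme: date_project_description_vNN)."""
    return f"{day.isoformat()}_{project}_{description}_v{version:02d}.{ext}"


def write_with_provenance(path, rows, commit):
    """Write rows as comma-separated lines, preceded by a code-version header."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(f"# code_version: {commit}\n")
        for row in rows:
            fh.write(",".join(str(value) for value in row) + "\n")
```

A call such as `write_with_provenance(output_name("studyx", "baseline", date.today(), 1), rows, current_commit())` would then produce a file whose first line states which commit generated it. Note that a commit hash only identifies the code exactly if there were no uncommitted changes when the script ran.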
Make sure you have the correct security measures in place. With personally identifiable data, make sure no data ever ends up on your personal devices. Are you certain only approved personnel can access the dataset? If you need to send or store the data outside of the hospital network, make sure to encrypt it.
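As a small sanity check on who can access a file, the sketch below inspects and tightens POSIX file permissions. It assumes a Linux or macOS file system, and the function names are hypothetical. Access on a hospital network is normally managed centrally, so this is a local check at most, and it does not replace encryption or the measures agreed with your Data Manager.

```python
import os
import stat


def is_private(path):
    """Return True if neither group nor others have any permission on path."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def restrict_to_owner(path):
    """Set permissions so that only the owning user can read and write path."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Running `is_private` over the files in a project directory gives a quick list of files that are readable by more accounts than intended.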
At the end of your project, publish your data as well as your manuscript. Generate metadata using a good metadata scheme. You can also think about FAIRifying your data (making it Findable, Accessible, Interoperable and Reusable) if you wish to generate more exposure.
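A metadata record can be as simple as a small structured file stored next to the dataset. The sketch below is a minimal illustration, assuming a JSON record with a handful of required fields. The field names are loosely borrowed from Dublin Core element names, and which fields are required is an assumption made here. The scheme you should actually use depends on your repository or funder.

```python
import json

# Assumed minimal set of fields; adjust to the metadata scheme you are required to use.
REQUIRED_FIELDS = ("title", "creator", "date", "description", "rights")


def build_metadata(**fields):
    """Return a metadata record, raising ValueError if a required field is missing or empty."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError(f"missing metadata fields: {', '.join(missing)}")
    return dict(fields)


def to_json(record):
    """Serialise the record as human-readable JSON with a stable key order."""
    return json.dumps(record, indent=2, sort_keys=True)
```

Checking the required fields when the record is built catches an incomplete description before the dataset is published, rather than after someone tries to reuse it.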
The test does not take into account all aspects of data stewardship and management, but it might get you thinking about why you need to apply them correctly.
Start the test!