Why You Should Manage and Save Your Data
Having a data management plan shows a commitment to data sharing as an added objective of your research or project. Federal agencies require data management plans, and almost all of them have a data sharing policy. Many agencies have subdivisions with additional requirements. Private funders are also beginning to implement data sharing requirements. It is important to assess the requirements of your funding agency and tailor your data management plan accordingly.
Data Management Plans
Your data management plan should address the following:
- Describe the hardware architecture you will use to store your data.
- What is your plan for backups and disaster recovery? Will there be off-site copies of the data, and how often will they be refreshed?
- What are the plans for handling sensitive data to ensure that regulations will be followed as appropriate for HIPAA, FERPA, Export Control Data, etc.?
- When will public access be provided?
- Describe your server platform and application/software for providing this access.
- Describe how the server will be secured.
- Are you prepared to monitor software updates, apply software patches, monitor logs for illegitimate/illegal access, and manage firewalls? If not, will someone else be responsible for hardware and software management? Who?
- Describe the resources you will commit to long-term sustainability and scalability after the project funding has expired.
Choosing a Data Repository
Before you choose a data repository, consider the following:
- The subject(s) the repository will accept in its system.
- Your funding agencies may have a specific repository for your datasets.
- The journals in which you will be publishing may have a specific repository for your datasets or require that they be deposited in an open access repository, e.g. PLoS.
- Your scholarly society and colleagues may already be depositing datasets in a repository. Ask them.
- If you were required to write a data management plan to include with your grant proposal, what did you say about sharing your research data?
- Check the cost of using the repository. Do you have the funding to cover it? The cost to deposit and/or the maintenance fees depend on the repository. Not all repositories charge to deposit your research data. If the repository requires membership, then either the researcher or the researcher's institution must belong.
- Check whether the repository is able to preserve (not just back up) your datasets. Does it have the technology and policies in place for preservation to ensure your datasets will be maintained for future use?
- Check the metadata and vocabulary requirements of the repository. The metadata should include enough information about how the project was conducted so that it can be replicated. Your discipline may have already developed a standard vocabulary.
- Check out what file formats are acceptable. The repository may have additional restrictions.
- Check to make sure the datasets receive persistent identifiers (PIDs). A DOI is the most commonly used PID for datasets and publications. PIDs are used to link the datasets with the publications.
- Check whether access to your datasets can be restricted to specific users if they contain sensitive data. Can the datasets be restricted for a specific time period?
- Does the repository provide information on how others should cite the data they reuse? If you are doing all the work, check that you will receive credit for it.
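
To illustrate the point above about persistent identifiers, here is a minimal Python sketch of how a DOI can be turned into a link that connects a dataset to a publication. It assumes the public resolver at https://doi.org/; the function name `doi_to_url` and the DOI shown are hypothetical and used only for illustration.

```python
from urllib.parse import quote

# Public DOI resolver; a DOI appended to this address redirects to the item.
DOI_RESOLVER = "https://doi.org/"

# Prefixes people commonly paste in front of a DOI.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return a resolver link for a DOI (hypothetical helper, not a repository API)."""
    cleaned = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # Every DOI starts with "10." and has a "/" between prefix and suffix.
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"Not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL, keeping "/" intact.
    return DOI_RESOLVER + quote(cleaned, safe="/")


if __name__ == "__main__":
    # Hypothetical dataset DOI, for illustration only.
    print(doi_to_url("doi:10.1234/example.dataset.v1"))
```

A link built this way can be included in the data availability statement of a publication, so that readers can reach the dataset even if the repository later changes its web address.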