How should I manage long term storage of research data and can it be used for anything else in the future?

Long Term Storage

The TCPS2 does not stipulate that data be destroyed by any particular date. However, for the protection of the participant, identifiers should not be kept longer than is necessary to fulfil publication or other requirements.

Research teams should consider destroying identifying materials (for example, audio recordings, video recordings, master lists, and lists of contact information for incentives or recontact) as soon as possible without reducing the integrity and usefulness of the data set.

Note that the long term storage of data is the responsibility of the Principal Investigator. 

Sharing Data

Anonymous Data

There are no research ethics restrictions on dissemination or use of anonymous data. You may release such data upon request. The TCPS2 definition of ‘anonymous’ is as follows:

  • Anonymous information – the information never had identifiers associated with it (e.g., anonymous surveys) and risk of identification of individuals is low or very low.

Anonymized or Coded Data

To share anonymized data, researchers must seek REB approval, but are not required to obtain consent from participants. The researcher must establish, in the REB Application, that the data are anonymized. The TCPS2 definitions of ‘anonymized’ and ‘coded’ are as follows:

  • Anonymized information – the information is irrevocably stripped of direct identifiers, a code is not kept to allow future re-linkage, and risk of re-identification of individuals from remaining indirect identifiers is low or very low.
  • Coded information – direct identifiers are removed from the information and replaced with a code. Depending on access to the code, it may be possible to re-identify specific participants (e.g., the principal investigator retains a list that links the participants’ code names with their actual name so data can be re-linked if necessary).
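As an illustration of the 'coded' definition above, the following is a minimal Python sketch, not a prescribed procedure. The function name `code_records`, the field names, and the sample records are all hypothetical. It removes direct identifiers from each record, replaces them with a random code, and returns the master list as a separate object so that it can be stored apart from the data set with restricted access.

```python
import secrets

def code_records(records, direct_identifiers):
    """Return (coded_records, master_list) for a list of dict records.

    direct_identifiers: names of fields to remove (hypothetical examples below).
    The master list maps each code back to the removed identifiers and
    should be stored separately from the coded data.
    """
    master_list = {}
    coded = []
    for record in records:
        # Draw a random code and retry in the unlikely event of a collision.
        code = secrets.token_hex(4)
        while code in master_list:
            code = secrets.token_hex(4)
        master_list[code] = {k: record[k] for k in direct_identifiers if k in record}
        row = {k: v for k, v in record.items() if k not in direct_identifiers}
        row["participant_code"] = code
        coded.append(row)
    return coded, master_list

# Made-up sample records for illustration only.
records = [
    {"name": "A. Example", "email": "a@example.org", "age_group": "30-39", "score": 12},
    {"name": "B. Example", "email": "b@example.org", "age_group": "40-49", "score": 17},
]
coded, master_list = code_records(records, ["name", "email"])
```

Removing direct identifiers in this way does not by itself make data anonymized in the TCPS2 sense: the code must not be retained, and the risk of re-identification from the remaining indirect identifiers must still be assessed.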

In cases where a master list exists, the research team receiving the data must show whether or not they have access to this master list. If they do, then consent from the participant must be sought. 

If a data sharing agreement is in place indicating that the receiving research team does NOT have access to the code AND that raw data which result from the proposed study will NOT be released back to the primary research team, then consent may not need to be sought.

It is critical that the research team fully understand how to de-identify data, including how to manage indirect identifiers (such as combinations of demographic data).
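One common way to examine indirect identifiers is to count how many records share each combination of demographic values; a combination held by only one or two participants may allow re-identification. The Python sketch below illustrates that check. The function names, field names, sample rows, and threshold are hypothetical, and the TCPS2 does not set a numeric threshold; this is an aid to judgment, not a test of whether data are anonymized.

```python
from collections import Counter

def combination_counts(rows, indirect_identifiers):
    """Count how many rows share each combination of indirect identifiers."""
    return Counter(tuple(row[k] for k in indirect_identifiers) for row in rows)

def rare_combinations(rows, indirect_identifiers, threshold):
    """Return the combinations shared by fewer than `threshold` rows."""
    counts = combination_counts(rows, indirect_identifiers)
    return {combo: n for combo, n in counts.items() if n < threshold}

# Made-up sample rows for illustration only.
rows = [
    {"age_group": "30-39", "gender": "F", "postal_prefix": "L8S"},
    {"age_group": "30-39", "gender": "F", "postal_prefix": "L8S"},
    {"age_group": "30-39", "gender": "M", "postal_prefix": "L8S"},
    {"age_group": "60-69", "gender": "F", "postal_prefix": "L9H"},
]
fields = ["age_group", "gender", "postal_prefix"]

# Threshold of 2 is an assumption chosen for this example.
flagged = rare_combinations(rows, fields, threshold=2)
smallest_group = min(combination_counts(rows, fields).values())
```

Flagged combinations can then be handled by, for example, using broader categories or suppressing values before the data set is shared.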

Open Access

The REB encourages researchers to make raw data sets available to Open Access databases in keeping with funding agency policies.

The only exceptions to these policies of full, free, and open access are:

  • where human subjects are involved, privacy and confidentiality must be protected, and access to data and/or physical samples will be determined by the owner of the data or samples, in accordance with agreements signed by the subjects;
  • where local and traditional knowledge is concerned, rights of the knowledge holders shall not be compromised;
  • where data release may cause harm, specific aspects of the data may need to be kept protected (for example, locations of nests of endangered birds or locations of sacred sites); and,
  • where pre-existing data are subject to access restrictions.

The rights of the human participant must take precedence. According to the TCPS2, researchers must state how they will protect all data (whether identified or anonymized) for the life cycle of the data, until such time as the data are destroyed. This includes data in archives.

The data protection plan must be outlined in the REB Application and explained in the consent form/information letter to participants. 

For previously approved protocols

  • if a researcher has included language in the consent form/information letter indicating that data will not be shared with anyone outside the research team, then they cannot share the raw data
  • if the researcher has not included any language in the consent form/information letter about data being shared (or not being shared), the REB will allow de-identification and sharing of the raw data

For new protocols

The preference for future use expressed at the time of collection must be considered, but there is no overt directive in the TCPS2 that this must be established during the consent process. Best practice, however, would be to ask participants about their preferences for future use of data:

  • raw data sets may be made available to journals
  • raw data sets may be made available to other researchers (e.g., for replication studies and/or re-analysis for different research questions)
  • raw data may be made available for educational purposes
  • raw data may be made available in Open Access databases

In these cases (and any other case where raw, non-aggregate data are shared) participants must be informed that their data will be shared in this manner in the Letter of Information and Consent (LOI/C), including a clear description of the type of information which will be shared.  

NOTE: Indicating that all information will be kept confidential to the researchers will restrict you from sharing this information (even if anonymized) outside of the research team in the future.

Estate Planning: Long Term Stewardship

Information TBA

Wording for Consent Forms

Clinical Data 

For reasons of transparency and education, many medical journals and other authorities strongly encourage researchers to publish the anonymized data from clinical studies for public use (anonymized means that no data which could identify you would ever be published). This data is visible to researchers or the general public after the study is over. Researchers may use this data to improve knowledge about [insert topic here].

We will publish the anonymized data from this study. [Consider including some examples of what anonymized data would look like.] You should note that there will be NO personal identifiers, such as your [insert as appropriate: name, address, date of birth, etc.] in this list. Nothing in the published dataset would ever identify you specifically. There are guidelines for publishing safe, anonymized data and the researchers will be following these.

Optional: [insert sample table of anonymized dataset for participants’ information] If you are interested in the background behind Open Data, we invite you to start at the British Medical Journal's Open Data website at: www.bmj.com/open-data

Quantitative and Qualitative Data

All identifiable information will be deleted from the dataset collected so that the individual participant's identity will be protected. The de-identified data will be accessible by the study investigators as well as the broader scientific community. More specifically, the data [will/may be posted on specific database OR made available to other researchers upon publication] so that data may be inspected and analyzed by other researchers. The data that will be shared on [insert database/publication] will not contain any information that can identify you.
