We need to share data to enable efficient and timely research
Data sharing maximises the value of collected data, minimises duplicative data collection, and promotes follow-up studies of secondary research questions using existing data.1 The importance of data sharing in advancing health is becoming increasingly recognised. The funders of health research around the world, including the National Health and Medical Research Council (NHMRC), have endorsed the call to increase the availability to the scientific community of public health research data from the research projects that they fund.2,3 Recently, cohort profiles and data source profiles have increasingly been published to facilitate data sharing.4,5 From the publications using the shared data, we have learned that data sharing increases the productivity of both original data collectors and subsequent data users.4
Why the urgent need?
Health data linkage, such as that done in Western Australia and the Northern Territory, has made it possible for researchers to use administrative data for Indigenous health research. However, there are no national guidelines for sharing de-identified data that are specifically collected from Indigenous communities for publicly funded individual research projects. With limited research funding, expertise and access to study participants for data collection for Indigenous health research, data sharing is urgently needed for three reasons.
Cost savings
First, the cost of collecting population-based data from Indigenous communities is generally much higher than that for the general Australian population because of a relatively small Indigenous population scattered throughout communities across a vast geographic area. With limited research funding, a relatively small group of Indigenous health researchers is trying to tackle a large number of health issues among heterogeneous communities. Data sharing could therefore avoid unnecessary duplication of data collection from Indigenous communities and so conserve limited research resources.
Ethical obligations
Second, in addition to our obligation to protect the privacy and dignity of Indigenous patients who have provided personal information, we have an ethical obligation to maximise public health benefits to Indigenous community members. Since a single research project is generally funded for up to 5 years, and most of those years are allocated to data collection, original data collectors often do not have sufficient time and capacity to analyse and disseminate all the data collected within such a time frame. Sharing the data allows other researchers to analyse what the original team cannot, so that the information provided by participants is used more fully.
Replicating findings
Third, replicability is one of the fundamental tenets of the scientific process. Public health researchers can only report fractional and selective findings of a research project. Due to the scarcity of comparable data in Indigenous health research, the chances for the findings from such data to be independently scrutinised are often lower than those from non-Indigenous health research. In addition, the complexity of the widely used multivariable statistical techniques for adjusting for potential confounders makes it even more difficult to reproduce published findings without access to the original data.
Barriers to data sharing
We face a challenging task due to barriers to data sharing in public health research.6 For original data collectors, possible reasons for not sharing include: ethics of data sharing; fear of being scooped; inadequate levels of recognition of the original data collectors; and lack of time, data-sharing standards, and financial, technical and infrastructure support.7-9
In terms of ethics, sharing health data that contain personal information in Australia must adhere to two sets of NHMRC guidelines approved under section 95 and section 95A of the Privacy Act 1988 (Cwlth). Although personal information is collected in most original projects, often only de-identified data are used at the data sharing stage, and de-identified information is not “personal information” protected under the Privacy Act. There is no legislation and there are no guidelines to specifically regulate and guide the sharing of such de-identified data.
According to the NHMRC funding rules 2015, the NHMRC “encourages researchers to share and deposit research data arising from NHMRC supported research projects through an open access database.”10 Nevertheless, original data collectors are still reluctant to share de-identified data, some because they fear being scooped by others who use their data before they can. This fear is perhaps unfounded because of the increased productivity that is enabled by data sharing.4 The lack of recognition of the original data collectors' contribution may also discourage them from sharing. It is a common perception that those who make their research data available to others receive inadequate levels of recognition, in terms of funding decisions, career advancement and assessment of research performance.11,12
The priorities
Obtaining valuable data from Indigenous communities, particularly remote communities, requires the ongoing commitment and hard work of original data collectors. Their contribution to research outputs should be adequately recognised by funding agencies, journals and research institutions. Appropriate resources, including technical and financial support, should also be allocated for sharing de-identified data. Importantly, legislation and guidelines are needed to make the sharing of de-identified data a routine practice.
Provenance: Not commissioned; externally peer reviewed.
- 1. Ross JS, Krumholz HM. Ushering in a new era of open science through data sharing: the wall must come down. JAMA 2013; 309: 1355-1356.
- 2. Walport M, Brest P. Sharing research data to improve public health. Lancet 2011; 377: 537-539.
- 3. Wellcome Trust. Sharing research data to improve public health: full joint statement by funders of health research. http://www.wellcome.ac.uk/About-us/Policy/Spotlight-issues/Data-sharing/Public-health-and-epidemiology/WTDV030690.htm (accessed Jun 2015).
- 4. Wang Z, Dong B, Adegbija O, et al. Data sharing: a decade since the publication of the first cohort profile. Int J Epidemiol 2014; 43: 1986-1987.
- 5. Kowal P, Chatterji S, Naidoo N, et al. Data resource profile: the World Health Organization Study on global AGEing and adult health (SAGE). Int J Epidemiol 2012; 41: 1639-1649.
- 6. van Panhuis WG, Paul P, Emerson C, et al. A systematic review of barriers to data sharing in public health. BMC Public Health 2014; 14: 1144.
- 7. Nelson B. Data sharing: empty archives. Nature 2009; 461: 160-163.
- 8. Kush R, Goldman M. Fostering responsible data sharing through standards. N Engl J Med 2014; 370: 2163-2165.
- 9. Goodhill GJ. Open access: practical costs of data sharing. Nature 2014; 509: 33.
- 10. National Health and Medical Research Council. Researcher responsibilities and considerations. In: NHMRC funding rules 2015. Canberra: NHMRC, 2015.
- 11. O'Dowd A. New incentives are needed to boost research data sharing, says expert group. BMJ 2014; 348: g3685.
- 12. Wellcome Trust. Expert Advisory Group on Data Access report. Establishing incentives and changing cultures to support data access, 2014. http://www.wellcome.ac.uk/stellent/groups/corporatesite/@msh_peda/documents/web_document/wtp056495.pdf (accessed May 2015).
I am supported by an NHMRC Senior Research Fellowship (APP1042343).
No relevant disclosures.