Prior to the COVID-19 pandemic, Ireland did not have an established, standard mechanism or structure to facilitate statistics-based health research, particularly where data was drawn from a variety of sources. From the outset of the pandemic, it quickly became apparent that a secure, data-protection-compliant environment for controlled statistical analysis by epidemiologists and other approved researchers would be required to support the Irish public health response.
Public health data falls into a specific category of personal information that requires exacting data protection, access controls and processing security, as provided for under Article 9 of the General Data Protection Regulation (GDPR), and as further set out in the Data Protection Act 2018 and the Health Research Regulations 2018 (SI 314 of 2018).
The CSO is the National Statistical Institute for Ireland. Given its legal status and the existing technical and statistical structures in place for the secure processing of large volumes of data, including special category data, the Office was identified as the appropriate organisation to lead the rapid development of such a mechanism.
A solution was developed in agreement between the key stakeholders (the HSE, the Department of Health and the CSO) and in consultation with the Office of the Data Protection Commissioner. In summary, defined HSE data flows (listed at Appendix 1) are transferred to the CSO using secure, encrypted transmission methods. This data is received in the CSO by a dedicated business unit called the Administrative Data Centre (ADC), a specialist team responsible for decrypting, processing, pseudonymising and storing the records in a format accessible for statistical analysis (hereafter termed the statistical datasets). Access to the HSE raw data is confined to a limited number of ADC staff, for processing purposes only. Specific detail on the technical aspects and methods used to process the data within the CSO is not outlined in this summary DPIA for operational and security reasons, but has been included in the internal-use operational DPIA and shared with the Office of the Data Protection Commissioner.
In the pseudonymisation process, all direct identifiers such as names and addresses are removed by the CSO. Additionally, once in receipt of HSE data, the CSO converts the identifier numbers that remain in each dataset to a Protected Identifier Key (PIK). A PIK is a unique, non-identifying number that is internal to the CSO. Using PIKs enables the CSO and approved researchers to link and analyse data for statistical purposes, while protecting the security and confidentiality of the individual data. All access requests for analysis purposes relate to pseudonymised data only.
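The CSO's actual pseudonymisation method is not described in this summary, for the security reasons already noted. Purely as an illustration of the general idea, the following is a minimal sketch that assumes a keyed hash (HMAC-SHA256) is used to derive the PIK. The field names, the secret key and the helper functions are all hypothetical and are not taken from the CSO's systems.

```python
import hashlib
import hmac

# Hypothetical field names, chosen only for illustration.
DIRECT_IDENTIFIERS = {"name", "address"}
ID_FIELD = "person_id"


def make_pik(identifier: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from an identifier with a keyed hash.

    Assumption: HMAC-SHA256. The same identifier and key always give the
    same PIK, so records can be linked without exposing the identifier.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymise(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and replace the identifier number with a PIK."""
    out = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != ID_FIELD
    }
    out["pik"] = make_pik(str(record[ID_FIELD]), secret_key)
    return out


def link_on_pik(left: list, right: list) -> list:
    """Join two pseudonymised datasets on their shared PIK."""
    index = {row["pik"]: row for row in right}
    return [{**row, **index[row["pik"]]} for row in left if row["pik"] in index]


# Example with made-up records from two hypothetical data flows.
key = b"illustrative-secret-held-by-the-processing-unit"
tests = [pseudonymise(
    {"person_id": "0000001A", "name": "A. Person", "address": "1 Main St", "result": "positive"},
    key,
)]
admissions = [pseudonymise(
    {"person_id": "0000001A", "name": "A. Person", "admitted": "2020-04-01"},
    key,
)]
linked = link_on_pik(tests, admissions)
```

In this sketch the linked record carries only the PIK and the analytical fields (`result`, `admitted`). The name, address and original identifier number never reach the analysis dataset, and the PIK cannot be reversed without the secret key.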
Descriptors of the data flows and datasets involved are registered on the internal ADC Data Portal, though the data itself is neither viewable nor accessible from this Portal. Approved CSO statisticians may apply for access for defined statistical purposes, as may HSE/Department of Health approved epidemiologists and researchers who have been appointed as Officers of Statistics under Section 20(b) of the Statistics Act and are bound by its confidentiality obligations. In the latter case, access to the statistical datasets, if approved, is controlled via a secure, read-only CSO access mechanism termed the Researcher Data Portal (RDP), which operates under the control of the Office’s Research Co-ordination Unit (RCU).
Researchers access the CSO RDP via a Citrix connection, which uses two-factor authentication as well as a unique username and a password that must be reset at first login. The microdata remains on a CSO server at all times. Copying and removal are prohibited; Access Control Lists are used and are subject to systematic oversight and review. Only final output records are available for further use, and these outputs are subject to detailed Statistical Disclosure Control (SDC) oversight by designated CSO statisticians. Such pseudonymised statistical datasets are termed Researcher Microdata Files (RMFs). More information about the RCU and the RMF mechanism is available here. DPIA 1204 deals with access by researchers in detail.
The CSO operates in compliance with Article 32 of the GDPR on security of processing. Having regard to the Office’s state-of-the-art technology, the costs of implementation, and the nature, scope, context and purposes of processing, it operates a stringent regime of technical and organisational measures to ensure a level of security and data protection appropriate to the sensitivity and personal nature of the records concerned.
All of the foregoing mechanisms are designed to support the twin purposes of the Project. These purposes are as follows:
Purpose 1: Informing professional analysis in support of the public health response to the COVID-19 pandemic, including by approved staff involved in the Irish Epidemiological Modelling Advisory Group (IEMAG) of the National Public Health Emergency Team (NPHET).
Purpose 2: Publication of public-facing insights to inform all interested parties as to the evolution of the COVID-19 pandemic nationally, including information published by the CSO on the COVID-19 Information Hub.