Data Issues and Recommendations

The audit has clarified four main issues. 

Data collection

Some areas and dimensions in the audit have less data available for analysis, and a few dimensions did not appear in the audit at all. Robust and representative analysis of any minority group, or of equality in any sector, requires good quality data collected with all the breakdowns needed to provide a view of all cohorts. Some of the dimensions of equality that might benefit from improved data collection are discussed below.

Sex and Gender Identity

As noted earlier, while the term gender is used in the equality legislation - and appears in many of the data sets - it is principally used as a synonym for sex and is generally collected in male/female (binary) format in the data sets covered in the audit.

The topic of Gender Identity is increasing in importance alongside a growing recognition that not all people identify with their current sex, or indeed with any sex, and that people’s identifications may change over time. Data that captures gender identity will become increasingly important for producing robust analysis.

The United Nations Economic Commission for Europe (UNECE) discussed the potential impact of a gender identity question in its publication ‘Recommendations for the 2020 Censuses of Population and Housing’. It advocated a rigorous testing programme before the topic is included in a census.

The CSO introduced new questions on both gender identity and sexual orientation in its General Household Survey (see footnote 2) in the first quarter of 2019. The first release from this iteration of the survey was published in July 2019. The CSO will monitor the collection and production of data on both gender identity and sexual orientation. This will allow the CSO to develop questions which are well understood and acceptable to users, stakeholders and respondents and produce confidential, robust and consistent data.

From a legal perspective, there is an EU requirement on National Statistical Institutes to collect census data on sex with a binary classification scheme but no requirement as yet to collect data on gender identity.

Sexual Orientation

There are no data sets in this audit with a specific question on Sexual Orientation. It can sometimes be deduced from a marker for same-sex marriage; however, this creates its own issues because it does not account for the broad spectrum of sexual orientations.

A programme to test the impact of any new questions on this topic should be completed before they are included in any survey or data collection, as discussed above on Gender Identity.

Nationality vs Ethnicity

Under the Equal Status Acts 2000-2015, discrimination on the basis of race, colour, nationality or ethnicity falls under the ground of race. Only two data sets collected race (see section 2.7 above), with nationality and sometimes ethnicity collected in another 24 data sets. Race, ethnicity and nationality need to be clearly distinguished. For example, an immigrant with Irish citizenship may have a different ethnic background and could be more vulnerable to inequality or discrimination, but this will not be visible in the data collected. In the data sets audited, nationality was collected more frequently than ethnicity or race.

Member of Traveller Community

Irish Travellers were recognised as a distinct ethnic group in March 2017. Membership of the Traveller community was gathered through a question on ethnicity in some audited cases but through a separate question in other data sources.

This can make it difficult to know if a data set contains data relevant to Irish Travellers. As discussed above, nationality is collected more frequently than ethnicity; therefore, data on the Irish Traveller community may be dependent on the availability of ethnicity data.

It is recommended that ‘Irish Traveller’ be included in ethnicity questions but it would also be essential to collect this data separately if there is no ethnicity question.

Disability

Data on disability is often dependent on self-disclosure. Disability can be defined using the very broad definition in equality legislation or the more detailed definitions used by the CSO (in the Census of Population and the Labour Force Survey (LFS)), or it can be inferred from self-reported health surveys, which offer a more medical definition.

It can also be reliant on the degree and nature of self-disclosure (e.g. it is often optional and based on a very basic tick box on a form, which provides little information). Given the particular complexity of disability data considerations, this presents unique challenges that will need to be examined further to generate improved levels and quality of disability data.

Statistical Disclosure

Questions on all of the above topics may also give rise to statistical disclosure control issues due to small population cohorts. These often arise for National Statistical Institutes when considering publishing data on relatively small population cohorts at small area levels, which is a defining geography for census data. Similar confidentiality issues may also arise when preparing cross-tabular outputs. Therefore, care should be taken when collecting and disseminating any data on minority groups. The EU Equality Guidelines have some advice on confidentiality, building trust and the use of booster samples when publishing data on small population cohorts.
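
As a rough illustration of the small-cohort problem described above, the following minimal sketch withholds any cell of a cross-tabular output whose count falls below a threshold. The threshold value, the function name `suppressed_counts` and the records are all hypothetical and invented for illustration; they are not taken from the audit or from CSO practice.

```python
from collections import Counter

# Hypothetical threshold: cells with fewer respondents than this are withheld.
MIN_CELL_COUNT = 5


def suppressed_counts(records, keys, threshold=MIN_CELL_COUNT):
    """Count records per combination of `keys`; small cells become None."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return {cell: (n if n >= threshold else None) for cell, n in counts.items()}


# Invented illustrative records, not real data.
records = (
    [{"area": "A", "ethnicity": "Group 1"}] * 12
    + [{"area": "A", "ethnicity": "Group 2"}] * 3
    + [{"area": "B", "ethnicity": "Group 1"}] * 7
)

table = suppressed_counts(records, ["area", "ethnicity"])
for cell, n in sorted(table.items()):
    print(cell, "withheld" if n is None else n)
```

This covers only the withholding of small cells. If row or column totals were also published, a withheld cell could be recovered by subtraction, so a real output would need further protection.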

Themes

From the summary we can also see that some themes are lacking in data. However, after this report is published, the owners of data sets will be able to contact the CSO if details need to be amended. It is possible that many themes are missing data because the CSO did not have all the necessary contacts in each area. If the next iteration of this data audit continues to find that some themes are lacking in data, then this topic will be revisited by the CSO.

Harmonisation and Classifications

This audit contains a breakdown of the classifications used in about 30% of the audited data sets. It is clear from this small sample that there are many different classifications for each variable. Standardised classifications make comparing and matching different surveys and data sources much easier and allow a broader analysis to be made.

The use of standard classifications at both the collection and dissemination phases of statistical production should be promoted and supported across the Irish national statistical system. Improving the use of common standards will lead to an improvement in the consistency and coherence of outputs. Standard classifications will also ensure that data is comparable over time and will provide a common link between different data collections.
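
To illustrate how a common classification can link different collections, the sketch below maps source-specific codes onto one shared set of categories. The source names, the code lists and the 'Not stated' fallback are hypothetical assumptions made for this example and do not describe any audited data set.

```python
# Hypothetical source-specific code lists mapped to one common classification.
TO_STANDARD = {
    "survey_a": {"1": "Male", "2": "Female"},
    "register_b": {"M": "Male", "F": "Female"},
}


def harmonise(source, code):
    """Map a source-specific code to the common category.

    Unknown sources or codes fall back to 'Not stated' (an assumed category).
    """
    return TO_STANDARD.get(source, {}).get(str(code).strip(), "Not stated")


print(harmonise("survey_a", 1))      # numeric code from one source
print(harmonise("register_b", "F"))  # letter code from another source
```

Keeping the mapping in one table means that a change to the standard classification is made in one place rather than in each data collection.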

Intersectionality

Intersectional data needs to be available so that the impact of policy interventions on sub-groups within larger cohorts can be assessed and information can be cross-tabulated across different cohorts. This allows for measurement of any disproportionate impacts based on the multiple identities of any individual whose data is captured, e.g. women with disabilities as a sub-set of data on gender, or Travellers with disabilities as a sub-set of data on Travellers or ethnicity.
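
The kind of measurement described above can be sketched as an outcome rate calculated for each combination of two equality dimensions. The variable names (`sex`, `disability`, `employed`), the function name `rate_by_group` and the records are hypothetical and invented for illustration only.

```python
from collections import defaultdict


def rate_by_group(records, group_keys, outcome_key):
    """Return the share of records with a true outcome for each group combination."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for r in records:
        group = tuple(r[k] for k in group_keys)
        totals[group] += 1
        if r[outcome_key]:
            positives[group] += 1
    return {g: positives[g] / totals[g] for g in totals}


# Invented illustrative records, not real data.
records = [
    {"sex": "Female", "disability": "Yes", "employed": True},
    {"sex": "Female", "disability": "Yes", "employed": False},
    {"sex": "Female", "disability": "No", "employed": True},
    {"sex": "Male", "disability": "Yes", "employed": True},
]

rates = rate_by_group(records, ["sex", "disability"], "employed")
```

This breakdown is only possible when both dimensions are collected for the same individuals. The resulting sub-groups can be very small, so the disclosure concerns noted earlier also apply here.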

Data Protection and GDPR

The legal basis now exists to allow public bodies to process equality data. The identification of an appropriate legal basis for processing equality data under Article 6 and a permissible condition under Article 9 of the GDPR is a matter for each public body as a data controller. However, the collection and processing of equality data using section 51 of the Data Protection Act, 2018 is also legal for public bodies.

It is clear to the CSO, from discussions related to the EU Equality Data Guidelines and this audit exercise, that some public bodies are unsure about what was changed by GDPR and whether they can collect certain equality data. In some cases, this uncertainty has led to variables being removed from the data collected.

While there is a responsibility to protect personal data and comply with regulations, it is very important that data is collected on all populations, including minority cohorts, so that statistical analysis can inform future legislation, policies and services.
