Background Notes

Online ISSN: 3088-6589
CSO statistical release, , 11am

Data sources

This publication presents statistics on earnings and employment participation based on administrative data sources. The primary data sources are employee tax data from the Revenue Commissioners and social transfer data from the Department of Social Protection. These are linked to data from the CSO and other sources to provide demographic breakdowns of earnings.

Revenue employee tax data

Revenue's employee tax data contains a complete register of all employments and is the most accurate source of remuneration. It provides details of gross annual earnings and number of weeks worked in the year for all employments. The weekly earnings are calculated by dividing the gross annual earnings, as declared to Revenue, by the number of weeks worked in the year for each employment.   
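The weekly earnings calculation described above can be sketched as follows. This is a minimal illustration only: the function name and the sample figures are assumptions made for the example and are not CSO code or data.

```python
def weekly_earnings(gross_annual_earnings: float, weeks_worked: int) -> float:
    """Gross annual earnings for one employment divided by weeks worked in the year."""
    if weeks_worked <= 0:
        raise ValueError("weeks_worked must be positive")
    return gross_annual_earnings / weeks_worked


# Illustrative figures only: 26,000 euro earned over 40 weeks in one employment.
example = weekly_earnings(26_000, 40)
print(example)  # 650.0
```

The calculation is applied per employment, so a person with two employments in a year would have two weekly earnings values.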

For the years 2017-2018, the employee tax data used for the Justice background publication came from the employer end-of-year return, the P35, submitted to Revenue. The P35 was an annual return completed by all registered employers after the end of the tax year, up to 2018.

Since 1 January 2019, Revenue has operated real-time reporting of payroll, known as "PAYE Modernisation" (PMOD). Employers are required to report their employees' pay and deductions to Revenue in real time each time they operate payroll. Information is provided to Revenue at individual payslip level. Earnings Analysis using Administrative Data Sources for 2019 to 2023 is based on the more detailed employee tax data provided from Revenue's PMOD system.

Department of Social Protection (DSP)

The Central Records System of the Department of Social Protection provides information on age, nationality, gender, and county of residence. Using a unique identifier (see 'Protected Identifier Key (PIK)' below) each employee on the employee tax data files can be linked to their individual demographic characteristics on the Department of Social Protection datasets. Therefore, the earnings dataset is enhanced by adding the demographic details.

The Irish Probation Service (IPS)

The Irish Probation Service provides the Central Statistics Office with annual data on individuals who have served probation orders, for the purpose of publishing statistics on re-offending. The information includes personal identity characteristics (name, age, address, etc.) and justice-related characteristics (offence type, release date).

The Irish Prison Service

The Irish Prison Service provides the Central Statistics Office with annual data on individuals who have been released from custodial sentences, for the purpose of publishing statistics on re-offending. The information includes personal identity characteristics (name, age, address, etc.) and justice-related characteristics (offence type, release date).

Protected Identifier Key (PIK)

Before using personal administrative data for statistical purposes, the CSO removes all identifying personal information including the Personal Public Service Number (PPSN). The PPSN is a unique number that enables individuals to access social welfare benefits, personal taxation and other public services in Ireland. The CSO converts the PPSN to a Protected Identifier Key (PIK). The PIK is a unique and non-identifiable number which is internal to the CSO. Using the PIK enables the CSO to link and analyse data for statistical purposes, while protecting the security and confidentiality of the individual data. The Revenue, DSP and CSO records were linked using the PIK for this project. All records in the datasets are anonymised and the results are in the form of statistical aggregates which do not identify any individuals.
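The general idea of replacing an identifying number with a non-identifying linkage key can be sketched as below. The source does not describe how the CSO derives the PIK, so everything here is an assumption for illustration: the keyed hash (HMAC-SHA256), the function name `to_pik`, the secret key, and the made-up identifier are all hypothetical and should not be read as the CSO's method.

```python
import hashlib
import hmac


def to_pik(ppsn: str, secret_key: bytes) -> str:
    """Derive a stable pseudonymous key from an identifier (assumed approach: HMAC-SHA256)."""
    normalised = ppsn.strip().upper()
    return hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key and record; neither is real data.
key = b"hypothetical-secret-held-internally"
record = {"ppsn": "1234567TA", "gross_annual_earnings": 26_000}

# The identifying number is dropped and only the derived key is kept for linkage.
protected = {
    "pik": to_pik(record["ppsn"], key),
    "gross_annual_earnings": record["gross_annual_earnings"],
}
```

Because the same input always yields the same key, records from different sources can be linked on the key without the original identifier being present in the analysis datasets.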

Data matching

The results presented in this release are based on a data-matching exercise of four administrative data sources:

  • Employee tax data from the Revenue Commissioners.
  • The Central Records System of the Department of Social Protection.
  • Probation information from the Irish Probation Service.
  • Prison information from the Irish Prison Service.

The linkage and analysis were undertaken by the CSO for statistical purposes in line with the Statistics Act, 1993 and the CSO Data Protocol.

Matching process

Step 1: Linking data on individuals on probation and individuals released from prison to CSO's administrative data holdings

Individuals linked with probation and custodial justice sanctions during 2020 were initially matched with administrative data from the Central Records System of the Department of Social Protection in order to assign a Protected Identifier Key (PIK). In this process, 2,880 of 3,478 persons associated with probation supervision in 2020 (83%) and 2,166 of 2,604 persons linked with custodial release during 2020 (83%) were successfully assigned a PIK.
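The match rates quoted above follow directly from the counts. A short sketch of the arithmetic, using the counts given in the text (the function name is an assumption for the example):

```python
def match_rate(matched: int, total: int) -> float:
    """Share of persons successfully assigned a PIK, as a percentage."""
    return 100 * matched / total


probation_rate = match_rate(2_880, 3_478)
custodial_rate = match_rate(2_166, 2_604)

print(f"Probation supervision: {probation_rate:.1f}%")  # 82.8%
print(f"Custodial release: {custodial_rate:.1f}%")      # 83.2%
```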

Step 2: Matching linked probation and prison release data to administrative data on employment

Using the matching identifier (PIK), persons associated with justice sanctions during 2020 were then linked with income data from employee tax returns provided by the Revenue Commissioners.

Table 5.1 provides a detailed breakdown of the number of matches between persons who received a prison release or probation order in 2020 and the CSO's administrative data relating to income for each year. On average, 38% of probationers and 30% of former prisoners were linked with a PAYE record for each year, although the persons matched were not always the same from year to year.
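The linkage on PIK and the per-year share of the cohort with an employment record can be sketched as follows. This is an illustrative sketch under stated assumptions: the PIK values, field names and records are invented for the example and are not CSO data, and the function name is hypothetical.

```python
from collections import defaultdict

# Hypothetical cohort of PIKs and employment records; none of these values are CSO data.
cohort_piks = {"A1", "B2", "C3", "D4"}
employment_records = [
    {"pik": "A1", "year": 2019, "gross_annual_earnings": 12_000},
    {"pik": "A1", "year": 2019, "gross_annual_earnings": 3_000},   # second job, same year
    {"pik": "B2", "year": 2019, "gross_annual_earnings": 20_000},
    {"pik": "B2", "year": 2020, "gross_annual_earnings": 21_000},
    {"pik": "Z9", "year": 2020, "gross_annual_earnings": 30_000},  # not in the cohort
]


def linked_share_by_year(cohort, records):
    """Percentage of the cohort with at least one employment record in each year."""
    linked = defaultdict(set)
    for rec in records:
        if rec["pik"] in cohort:
            linked[rec["year"]].add(rec["pik"])  # a set counts each person once per year
    return {year: 100 * len(piks) / len(cohort) for year, piks in sorted(linked.items())}


shares = linked_share_by_year(cohort_piks, employment_records)
print(shares)  # {2019: 50.0, 2020: 25.0}
```

Counting distinct persons per year, rather than records, means that a person with two employments in one year is counted once, and that the set of persons linked can differ from year to year.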

Table 5.1 Persons associated with justice sanctions in 2020 successfully linked to administrative data holdings by year, 2017-2023

In a small number of cases, more than one employment record was linked to a person with a justice sanction (either probation or prison) because they held more than one employment within a tax year (e.g., had a second job or changed jobs during the reference year). In addition, a small number of linked persons were not matched to the administrative employment data used in the analysis because they fell below the earnings thresholds applied (e.g., the individual earned less than €500 in a year or worked less than two weeks).

Criminal justice datasets such as the probation and prison service data do not use a unique identifier, so matching records is more difficult and is unlikely to achieve a 100% match rate, as was the case for this publication. Despite this limitation, the matching performed for this publication achieved a high match rate (83%).

Earnings thresholds and exclusions

For the earnings analysis, the CSO excluded employees earning less than €500 per annum and employments lasting less than two weeks in the year. Also excluded were secondary employments earning less than €4,000 per annum, extremely high earnings values, and records with missing employer or employee reference numbers. Employment activity in NACE sectors A, T, and U was also excluded. Unlike the CSO's current main publication of Earnings Analysis using Administrative Data Sources, this publication includes employments that took place outside of October during the reference year. The main publication is restricted to October employments in order to comply with Eurostat's guidelines; that restriction was not applied in this analysis, so that as many of the matched cohort as possible could be included. Overall, 38% of the probationers and 30% of former prisoners from 2020 were linked to active employments during the reference period from 2017 to 2023. The effects of the October restriction are outlined in the current publication of Earnings Analysis using Administrative Data Sources.
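The exclusion rules above can be sketched as a simple filter. This is a minimal illustration under stated assumptions: the field names and sample records are hypothetical, the €500 threshold is applied per employment record for simplicity, and the rule for extremely high earnings is omitted because the text does not state a cut-off.

```python
EXCLUDED_NACE_SECTORS = {"A", "T", "U"}


def keep_employment(emp: dict) -> bool:
    """Return True if a (hypothetical) employment record passes the exclusions."""
    if not emp.get("employer_ref") or not emp.get("employee_ref"):
        return False  # missing employer or employee reference number
    if emp["gross_annual_earnings"] < 500:
        return False  # earnings below 500 euro per annum
    if emp["weeks_worked"] < 2:
        return False  # duration of less than two weeks in the year
    if emp["is_secondary"] and emp["gross_annual_earnings"] < 4_000:
        return False  # secondary employment below 4,000 euro per annum
    if emp["nace_sector"] in EXCLUDED_NACE_SECTORS:
        return False  # NACE sectors A, T and U
    return True


# Invented records for illustration only.
records = [
    {"employer_ref": "E1", "employee_ref": "P1", "gross_annual_earnings": 18_000,
     "weeks_worked": 40, "is_secondary": False, "nace_sector": "F"},
    {"employer_ref": "E2", "employee_ref": "P1", "gross_annual_earnings": 2_500,
     "weeks_worked": 10, "is_secondary": True, "nace_sector": "I"},
    {"employer_ref": "E3", "employee_ref": "P2", "gross_annual_earnings": 9_000,
     "weeks_worked": 20, "is_secondary": False, "nace_sector": "A"},
    {"employer_ref": "E4", "employee_ref": "P3", "gross_annual_earnings": 400,
     "weeks_worked": 3, "is_secondary": False, "nace_sector": "G"},
]

kept = [r for r in records if keep_employment(r)]
print([r["employer_ref"] for r in kept])  # ['E1']
```

Note that no month-of-employment condition appears in the filter, reflecting the decision described above not to restrict the analysis to October employments.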

NACE Industrial Classification

The sectoral employment figures are based on the EU NACE Rev. 2 (Nomenclature générale des activités économiques dans les Communautés européennes) classification as defined in Council Regulation (EC) No 1893/2006.

For further information please see the CSO standard classification of NACE.
