The CSO Data Protocol, which governs how the CSO manages the combining of CSO and non-CSO data, came into effect in May 2005. The Protocol covers any work undertaken within the CSO to match the individual records contained in two or more data holdings, at least one of which originates outside the Office.

It also covers any assistance the CSO may give to other public authorities to enable them to link data holdings under their control for statistical purposes.

The tables below detail CSO Divisions engaged in data matching, datasets matched and outputs obtained.

Queries may be e-mailed to Dataoffice@cso.ie.

Register of Data Matching Activities

CSO Division: Administrative Data Governance and Analysis

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

Business Register Data, Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables (SPP35)

Pseudonymised QQI Course and Award Details Data (QQI), Pseudonymised Higher Education Student and Course Details (HEA), Pseudonymised Post Primary Pupil Details (DES), Pseudonymised SOLAS Client and Course Details (SOLAS), Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Child Benefit Data (Welfare), Pseudonymised Early Childcare and Education Scheme Data (Children), Pseudonymised National Vehicle and Driver File, Driver Details (DTTAS), Pseudonymised Stamp Duty on Property Transactions Data (Revenue), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Primary Care Reimbursement Service Data (HSE), Pseudonymised Landlord and Tenant Details from the Register of Tenancies (RTB), Pseudonymised Local Property Tax Returns (Revenue), Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised Springboard and ICT Student and Course Details (HEA), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised Grant Application and Payment Data (SUSI), Pseudonymised Linked PAYE Real Time Data Test Data with extra DEASP Variables (Revenue), Pseudonymised Primary Pupil Details (DES), Pseudonymised HSE Drugs Payment Scheme Data (HSE)

To create a Person Activity Register to provide structural analysis of populations and sub-populations, over time.

On-going

Populate activity indicator dataset

Used in Population estimates (PECADO) as input for admin census

Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables 

Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Integrated Short Term Payment System Data (Welfare), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue)

This is an update of a previous project to include PMODAnalysis, as SPP35 is no longer being updated. The aim is to build a register from already pseudonymised data sources in order to allow the creation of a single-source income register.

Ongoing

Creates the dataset called PIR, which is used as an input into various statistical outputs.

None

ITForm11Bus_Analysis and SPP35  datasets

To examine the possibility of matching administrative data with a view to developing a dataset from which farm employee wage statistics can be prepared.

Ongoing

The output will be in tabular format.

SILC dataset

Person Income Register

To investigate the possibility of utilising more administrative income data in the SILC survey.

Annual

To create an experimental SILC product that incorporates a larger level of administrative data.

None

RTB – Residential Tenancies Board register of tenancies, BER – Building Energy Rating from SEAI, CRS and BOMI – DSP Central Register System, NVDF Driver Details (National Vehicle and Driver File), eStamping – Stamp Duty on property transactions, GRO Birth Deaths Marriages (General Register Office, DSP), PPSN Details - Revenue, P35 - Revenue, PCRS – HSE Primary Care Reimbursement Service GMS, and other administrative data sources as required.
An Post Geodirectory

To create geospatial reference data to be added to admin datasets.

On-going

This project will enable the compilation of small area statistics from admin data, including Census-like population estimates.
Pseudonymised geospatial info will be generally available within the CSO for appropriate statistical projects.

Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables,  Pseudonymised Person Income Register Data  

Directory of Irish Property Addresses, including Eircodes (GeoDir), Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Child Benefit Data (Welfare), Stamp Duty on Property Transactions Data (Revenue), Pseudonymised Integrated Short Term Payment System Data (Welfare), Building Energy Rating details for domestic premises (SEAI),  Pseudonymised Stamp Duty on Property Transactions Data (Revenue), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pobal Deprivation Indices Data (TrutzHaa), Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue)

This is an update of a previous project to include an additional data source and update the project purpose. The sole purpose of the project is to create a publication which examines the types of cohort who purchase properties in Ireland.

Once off

Frontier Publication, PXStat files and also potentially a value added dataflow on the ADC portal or a new analysis tier data flow which has been created by pseudonymising eStamping.

 

Directory of Irish Property Addresses, including Eircodes (GeoDir), Central Record System - Client, Payment and Employment Details (Welfare), Local Property Tax Returns (Revenue), Landlord and Tenant Details from the Register of Tenancies (RTB)

The purpose of the project is to develop a dataset with the potential to be used as an occupied residence sampling frame. Such a dataset could be an option as a sampling frame for CSO postal household surveys or could be used as an indicator of occupied properties, to assist Census 2021 enumerators.

Ongoing

The statistical output will be a property dataset containing addresses and names of occupiers. The dataset will be an occupied residence dataset, as indicated by the latest LPT and RTB data instances. The dataset will have the potential to be used as an occupied residence household survey sampling frame and Census 2021-oriented indicator of occupied properties. Whether the output is used for such purposes, and, if so, how it is used, is outside the scope of the current project.

Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables (CensusAnalysis) (CSO)

Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare)

To check the accuracy of the Administrative Data Centre’s (ADC) geocoding process and to identify issues that may need improvement. This will be done by matching persons in both datasets to see whether their place of residence falls in the same geographical areas, i.e. small areas (SA), electoral divisions (ED), etc.

One-Off

Report or paper in aggregated tabular form.

 

PPSN and Personal Details Data (Revenue), Household Sampling Frame (Revenue)

Provide home addresses to DCU for a selection of individuals sampled for the Structure of Earnings Survey, to facilitate the posting of survey notices to those individuals at their place of residence.

One-Off

Dataset containing CSO_ID (identifier created by ADC for SES survey) and home address

Census of Population 2016 Data

Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised HSE Computerised Infectious Disease Reporting System (HSE)

To produce a socio-economic analysis of the COVID-19 pandemic and COVID-19 mortality differentials using the Census of Population 2016, the anonymised Central Record System datasets and data sources of the Health Service Executive (HSE) which have been supplied to the CSO to support analysis of COVID-19 related issues.

One-Off

Tabular  output.
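The geocoding accuracy check described earlier in this table (matching persons across two datasets and comparing the geographical area of their residence) could be sketched roughly as follows. This is a minimal illustration only: the function name, the pseudonymised keys and the small-area codes are all hypothetical, and it does not represent the ADC's actual process.

```python
# Hypothetical sketch: compare the area code assigned to the same
# pseudonymised person in two datasets. All names and values are invented.

def area_agreement_rate(census_areas, welfare_areas):
    """Return (matched persons, agreeing persons, agreement rate).

    Each argument maps a pseudonymised person key to a geographic area
    code (for example a small area or electoral division code).
    Only persons present in both datasets are compared.
    """
    common_keys = census_areas.keys() & welfare_areas.keys()
    agreeing = sum(
        1 for key in common_keys if census_areas[key] == welfare_areas[key]
    )
    # Avoid division by zero when no person appears in both datasets.
    rate = agreeing / len(common_keys) if common_keys else None
    return len(common_keys), agreeing, rate


# Illustrative (made-up) inputs.
census = {"p1": "SA001", "p2": "SA002", "p3": "SA003"}
welfare = {"p1": "SA001", "p2": "SA009", "p4": "SA004"}

matched, agreeing, rate = area_agreement_rate(census, welfare)
print(matched, agreeing, rate)
```

Reporting only the aggregate counts and rate, rather than the matched records, is in keeping with the aggregated tabular output stated for that project.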

 


CSO Division: Agriculture, Transport and Tourism

CSO Dataset Matched

 

Non-CSO Dataset Matched

Reason

 

Frequency

 

Statistical Outputs Obtained

Agriculture Register

Farm database from Department of Agriculture and Food

Update CSO Agriculture register

Annual

Details of farm 'births'

Census of Agriculture 2010

Survey of Agricultural Production Methods 2010

Farm Structure Survey 2013

Annual June Crops & Livestock Survey

Animal Identification & Movement database for cattle & the Single Payment Scheme for crops

December Sheep & Goat Census (DAFM)

To enable CSO to fulfil requirements for Agriculture data under Regulation 1165/2008, Regulation 1166/2008 and Regulation 543/2009.

Annual

Census of Agriculture 2010; 

Annual June Crops & Livestock Results;

Farm Structure Survey 2016 Results (due May 2018)

Annual June Crops & Livestock Survey

CRS Client, ITForm11Per_Analysis, AgriSingleFarm, ITForm11Bus_Analysis, SPP35

Match the Annual June Crops & Livestock Survey 2016 returns with CRS Client data to check the age, marital status and gender of the farm holder in these returns.

2017

Farm Structure Survey 2016 data set


CSO Division: Balance of Payments

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

Balance of Payments Data

Section 110 Revenue Data

To match BOP data to ADL Section 110 company data in order to identify SPEs.

Once Off

Tabular

None

Central Bank S110 Companies

To investigate under-coverage of SPEs in Balance of Payments data. CRO number and name of S110 companies will be matched to a population of SPEs provided by the Central Bank.

Annual

Internal Report

Business Register Data

CRO Accounts Details Data (DandB)

The project will link CRO Accounts data to Business Register data for the purposes of monitoring the population and survey coverage of Irish aircraft leasing companies, informing survey recruitment and validating respondent data.

Ongoing

The project will facilitate the incorporation of CRO Accounts data within an internal database of aircraft leasing companies, facilitating the identification of company groups, asset holdings and turnover data.


CSO Division: Business Statistics, Business Register & Purchasing Power Parities

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

CSO data for BERD, CIS, CIP, ASI and Business Register

The Dept of Jobs, Enterprise and Innovation's Annual Business Survey of Economic Impact (ABSEI)

The purpose of this Data Matching Project is to ascertain from Dept of Jobs, Enterprise and Innovation (DJEI) (formerly Forfás), using data from their Annual Business Survey of Economic Impact (ABSEI), a list of the likely performers of R&D in Ireland. The data matching will be done by CSO in line with the Memorandum of Understanding in place (under the Statistics Act, 1993) between the CSO and DJEI, and the results of the matching will be sent to DJEI.

Ongoing

An anonymised matched file of likely R&D active firms in Ireland.

Business Register

CRO data primarily relating to company ownership and company accounts

Enhance the usefulness of the CRO data by classifying the records by economic activity

Quarterly

Improved economic statistics

Business Register

 

Revenue - VAT; PREM (employer registrations); Income Tax; Corporation Tax; P35 files

Update CSO register

Quarterly and annual

Improved CSO business register

Business Register

 

Companies Registration Office registration file

Improve the quality of the CSO business register

Monthly

Improved CSO business register

Business Register

CRO file containing most recent Annual Return

Help fulfil European requirements and also help with sampling

Continuous

 

Improved quality business register as a basis for statistical surveys, etc.

CSO Business Register

GEO Directory

Improve location of Enterprise

Continuous

Improved quality business register as a basis for statistical surveys, etc.

Business Register

EuroGroups Register

Contribute to the setup and maintenance of the EuroGroups Register as required under EU law.

Continuous

Improved quality of statistical outputs that are affected by multinational groups, e.g. FD statistics, Outward FATS, Inward FATS.

 Business Register Data

Pseudonymised Corporation Tax Data (Revenue) 

Biennial access is required to the Research & Development fields on the CTAnalysis file to identify potential enterprises carrying out R&D in Ireland, to produce statistics in accordance with European Commission Regulation (EC) No 995/2012.

Annual

Biennial results. Principal Variables:

Detailed information on research and development expenditure; Sources of funds for research and development expenditure; Detailed information on research and development personnel; Recruitment of researchers; Research and development collaboration.

 

Business Register Data 

Pseudonymised Integrated Short Term Payment System Data (Welfare), Integrated Short Term Payment System Data (Welfare), VAT Information and Exchange System Acquisitions Data (Revenue), VAT Information and Exchange System Dispatches Data (Revenue), Pseudonymised VAT Trader Returns (VAT3 and RTD) Data (Revenue), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue), Pseudonymised Linked Covid Refund Scheme Data with extra DEASP Variable (Revenue)

To identify signs of activity in the Irish business economy during the COVID-19 period of restrictions on trade and subsequent re-opening.

One-Off

Tabular output, presenting aggregated statistics by various economic and demographic characteristics including economic sector, size class, region.

All outputs will be checked in line with standard CSO practices regarding confidentiality.


CSO Division: Census Management

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

Census 2016 Analysis tier

HEA analysis tier graduate data

To improve the quality of the census and its coherence with external data sources and to remove the need to collect data that is available on administrative registers.

Once-off

Internal report

Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables (CensusAnalysis), Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables (SPP35)  

Pseudonymised QQI Course and Award Details Data (QQI), Pseudonymised Quarterly National Household Survey Data (CSO), Pseudonymised Higher Education Student and Course Details (HEA), Pseudonymised Post Primary Pupil Details (DES), Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Child Benefit Data (Welfare), Pseudonymised Early Childcare and Education Scheme Data (DCYA), Pseudonymised Integrated Short Term Payment System Data (Welfare), Pseudonymised Live Register Claims Data from DEASP Integrated Short Term System (Welfare), Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables (CSO), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Primary Care Reimbursement Service Data (HSE), Pseudonymised Landlord and Tenant Details from the Register of Tenancies (RTB), Pseudonymised Local Property Tax Returns (Revenue), Pobal Deprivation Indices Data (TrutzHaa), Pseudonymised Consolidated Income Tax Forms 11 and 12 and P35L Data (Revenue), Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised Grant Application and Payment Data (SUSI), Pseudonymised PPSN and Personal Details Data (Revenue), Pseudonymised Primary Pupil Details (DES)

The goal of this data matching project is to identify administrative data sources which can be used to impute missing data in the census and to reduce respondent burden by using data already available in administrative registers.

Annual

The output will be in the form of an internal report recommending how administrative data sources could be used to add value and improve the quality of census data in the 2021 census.

Census 2016 Housing Data

Water Consumption Details for Residential Properties (IrishWat), Gas Usage Details for Residential and Commercial Customers (GasNetwk), New Residential Electricity Network Connections (ESBNetwk), Household Sampling Frame (Revenue)

The goal of this data matching project is to address possible issues around vacancy rates that arose in Census 2016, and to explore whether administrative data can be used to provide evidence of household occupancy for Field Supervisors in real time during Census collection 2021.

Ongoing

Final output is an indicator for occupancy for each dwelling using a series of utility datasets. The indicator will be available to census field supervisors through the field case management system to advise the enumerator around unoccupied or vacant dwellings, or as a quality check, in real time during Census collection 2021. Census field staff will not have access to the datasets or any values contained in these datasets.

 

Local Property Tax Returns (Revenue), Landlord and Tenant Details from the Register of Tenancies (RTB)

This project uses remote sensing techniques and data to aid census enumeration. A model has been developed which analyses high resolution aerial imagery and returns the precise location of all objects the model believes are buildings in the image it consumes. The intention of this DMP is to verify whether this model can be used to confirm that secondary residential units exist, by comparing its output to the locations of known tenancies. The result will be a statistical measure of the model's accuracy.

One-Off

By comparing the coordinates returned by the deep-learning model with the locations of possible secondary dwellings on RTB and LPT, it will be possible to count the number of coincident pairs. This count can be taken as an approximate statistical measure of accuracy and is the only anticipated output.

Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables (CSO), Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables (CSO), Pseudonymised Housing Assistance Payment - Analysis Tier (CSO)

Pseudonymised QQI Course and Award Details Data (QQI), Pseudonymised Higher Education Student and Course Details (HEA), Pseudonymised Post Primary Pupil Details (DES), Pseudonymised SOLAS Client and Course Details (SOLAS), Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Child Benefit Data (Welfare), Pseudonymised Early Childcare and Education Scheme Data (Children), Pseudonymised Integrated Short Term Payment System Data (Welfare), Pseudonymised National Vehicle and Driver File, Driver Details (DTTAS), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Primary Care Reimbursement Service Data (HSE), Pseudonymised Landlord and Tenant Details from the Register of Tenancies (RTB), Pseudonymised Local Property Tax Returns (Revenue), Pobal Deprivation Indices Data (TrutzHaa), Pseudonymised Help to Buy Scheme Data (Revenue), Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised Directory of Irish Property Addresses, including Eircodes (GeoDir), Pseudonymised Springboard and ICT Student and Course Details (HEA), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised Grant Application and Payment Data (SUSI), Pseudonymised Primary Pupil Details (DES), Pseudonymised HSE Drugs Payment Scheme Data (HSE), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue)

The goal of this data matching project is to evaluate the Administrative Data landscape from a Census perspective and possibly identify new administrative data sources as we examine the process of developing an Administrative Census.

Annual

A preliminary baseline report to MB on the current administrative data position to compile an Administrative Census and possible additional attributes from Administrative Data sources. A final report to MB on what an Administrative Census file can deliver versus a traditional Census. Potentially publishing experimental statistics into the public domain.

None

PPSN and Personal Details Data (Revenue), Central Record System - Client, Payment and Employment Details (Welfare), Local Property Tax Returns (Revenue), Landlord and Tenant Details from the Register of Tenancies (RTB), Central Record System - Client Details (Welfare), Higher Education Student and Course Details (HEA)

The goal of this data matching project is to geocode administrative data to create a census-like population count at granular levels of geography such as small area and electoral division. The key output from this element of the project will be the addition of Eircodes and/or small areas to administrative data sources.

Annual

The planned final output of the Administrative Census project is a one-person, one-record population file initially relating to a reference period of April 2019, with each record allocated an Eircode or a Small Area code from administrative sources. This file will then be used to publish census-like population counts broken down by small area and/or electoral division.

Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables, Pseudonymised Census of Population, with GeoDirectory and DEASP Variables, Pseudonymised Quarterly National Household Survey Data

Pseudonymised QQI Course and Award Details Data (QQI), Pseudonymised Higher Education Student and Course Details (HEA), Pseudonymised Post Primary Pupil Details (DES), Pseudonymised SOLAS Client and Course Details (SOLAS), Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Child Benefit Data (Welfare), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Primary Care Reimbursement Service Data (HSE), Pseudonymised Landlord and Tenant Details from the Register of Tenancies (RTB), Pseudonymised Local Property Tax Returns (Revenue), Pseudonymised Building Energy Rating details for domestic premises (SEAI), Pobal Deprivation Indices Data (TrutzHaa), Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised Water Consumption Details for Residential Properties (IrishWat), Pseudonymised Springboard and ICT Student and Course Details (HEA), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised Grant Application and Payment Data (SUSI), Pseudonymised PPSN and Personal Details Data (Revenue), Pseudonymised Primary Pupil Details (DES), Pseudonymised HSE Drugs Payment Scheme Data (HSE), Pseudonymised Housing Assistance Payment - Analysis Tier (LCouncil), Leaving Certificate results from SEC (SEC), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue), Pseudonymised Housing Agency social housing waiting lists (DeptHous)

The goal of this data matching project is to identify administrative data sources which can be used to impute missing data in the census and to reduce respondent and processing burden by using data already available in administrative registers.

One-off

The output will be in the form of an internal report recommending how administrative data sources could be used to add value and improve the quality of census data in the 2021 census.
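The coincident-pair count described for the remote sensing project in this table (comparing model-detected building locations with the locations of known tenancies) might be sketched along the following lines. This is only an illustration under stated assumptions: the function name, the 10-metre tolerance and the use of projected (x, y) coordinates in metres are hypothetical and are not taken from the project itself.

```python
import math

# Hypothetical sketch: count known dwelling locations that coincide with at
# least one model-detected building. Tolerance and coordinates are assumptions.

def count_coincident_pairs(detected, known, tolerance_m=10.0):
    """Count known locations with a detected building within tolerance_m.

    Both inputs are lists of (x, y) tuples in metres on a projected grid.
    """
    count = 0
    for kx, ky in known:
        # A known location is "coincident" if any detection is close enough.
        if any(math.hypot(kx - dx, ky - dy) <= tolerance_m for dx, dy in detected):
            count += 1
    return count


# Illustrative (made-up) coordinates.
detected = [(100.0, 200.0), (500.0, 500.0)]
known = [(103.0, 204.0), (900.0, 900.0)]

coincident = count_coincident_pairs(detected, known)
print(coincident, coincident / len(known))
```

The share of known locations with a coincident detection is one simple way to express the count as a rate; the choice of tolerance strongly affects the result, so it would need to be justified in any real analysis.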


 

CSO Division:  Environment & Climate

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

CSO Business Energy Use Survey, Business Register, Census of Industrial Production and Annual Services Inquiry.

 

 

Environmental Protection Agency Emissions Trading Scheme and Sustainable Energy Authority of Ireland Large Industry Energy Network and Public Sector Energy Programme.

To validate the Business Energy Use Survey returns by matching the EPA (Environmental Protection Agency) Emissions Trading Scheme (ETS) file and SEAI (Sustainable Energy Authority of Ireland) LIEN (Large Industry Energy Network) and Public Sector Energy Programme (PSEP) with CSO Business Energy Use Survey, Business Register, Census of Industrial Production and Annual Services Inquiry.

The data matching project will be repeated as required for future publications of the Business Energy Use Surveys.

Validation of the Business Energy Use Survey returns and ultimately publication of the Business Energy Use Survey.

CSO Business Register

Irish Water non-domestic database

The purpose is to obtain data on water consumption by NACE sector to meet Eurostat and other requirements on water statistics e.g. Inland Waters questionnaire and Water Framework Directive.

Ongoing

The output will be in tabular format.

CSO Business Register and CSO Trade Register

EPA Pollution Release and Transfer Register.

 

Dublin City Council National Trans Frontier Shipment Office.

Matched for Waste Statistics in the Environmental Statistics Division

Ongoing

Linkage created between EPA PRTR register and CSO Business Register.

NTFSO matched to CSO external trade statistics register.

Census 2011 housing data and Census 2016 housing data

Gas Networks Ireland 2011 and 2016 residential gas consumption in size classes.

To facilitate examination of factors affecting households that are located near to the mains gas pipeline changing from solid fuel use to gas.

Once-off

Anonymised Research Microdata File

BusReg - Business Register Data

VATREG

To match BusReg with VATREG to obtain the latest email and contact names for the Green Pilot Survey

Once-off

List

Survey on Income and Living Conditions Data, Household Budget Survey Data, Census of Population 2011 Data, Census of Population 2016 Data

Better Energy Warmer Homes Data (SEAI), Electric Meter Data (ESB), Air Quality Data (EPA), Building Energy Rating Details (SEAI), Long and Short Term Social Welfare Payments Data (Welfare), Gas Usage Details for Residential and Commercial Customers (GasNetwk)

To analyse the factors leading to energy poverty; the impact of the environment on health; and related issues.

Ongoing

The statistical output will be a statistical release with an analysis of factors leading to energy poverty, the impact of the environment on health and related issues.

Business Register Data

VAT Registrations Data (Revenue)

The objective is to match the CSO Business Register with the Revenue VAT Register to obtain the latest email and contact names for the Waste Generation Survey and the Environmental Expenditure Survey.

Annual

Replacement of email addresses to conduct e-form survey.

Census of Population 2011 Data, Census of Population 2016 Data, Census 2011 Housing Data, Census 2016 Housing Data

Building Energy Rating Details (SEAI)

The objective of this data matching project is to facilitate research to assess the extent of residential solid fuel use in Ireland and identify the factors that determine households' use of solid fuels.

One-Off

An anonymised RMF will be produced. Access to the RMF has already been requested by University College Cork, for research into residential solid fuel use. Proposed outputs include reports, policy briefs and academic papers.


CSO Division: Growing Up In Ireland

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

Growing Up in Ireland data

Central Applications Office (CAO) 

The opportunity to link the Growing Up in Ireland (GUI) data to Central Applications Office (CAO) data will allow insights into the decision-making processes of Young People (YP) and of how they decide to apply for different courses in Universities and Institutes of Technology.

One-off

Report. Tabular/Aggregated. Publication of findings.


 

CSO Division: International Trade In Goods

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

Trade Register Data

VAT Information and Exchange System Acquisitions Data (Revenue), VAT Information and Exchange System Dispatches Data (Revenue)

The VIES data is matched with Intrastat and VAT Trade data by VAT number to provide partner country for below threshold trade estimates.

Ongoing

The arrivals and disposals data are matched with the below threshold trade to provide a country breakdown at trader level. This data is used in TEC (Trade by Enterprise Characteristics) outputs.


 

CSO Division: Income, Consumption and Wealth

 

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

SILC dataset

SUSI data

 

To assess whether administrative data can be used to replace variables on the Survey on Income and Living Conditions (SILC) to reduce the burden on respondents, particularly with respect to education grants.

Initially once off, possibly annual depending on the results.

Tabular, diagrams and written comment

HFCS Household Finance and Consumption Survey

AgriSingleFarm – Pseudonymised Single Farm Payment Data
Agri - Basic Payment Scheme Area file
DAFM - Sheep and Goat Census
AIMS Analysis – Pseudonymised Animal Identification and Movement Data
BER Analysis – Pseudonymised Building Energy Rating Details
CensusAnalysis – Pseudonymised Census of Population 2016 with Geodirectory and DEASP Variables
CRS_Client – Pseudonymised Central Record System – Client Details
DSPpayments – Pseudonymised Long and Short Term Social Welfare Payments Data
RTBAnalysis – Pseudonymised Landlord and Tenant Details from the Register of Tenancies
SUSIAnalysis – Pseudonymised Grant Application and Payment Data
Revenue’s P35L: “SPP35 – P35L dataset for analysis”
Revenue’s Form 11: “ITForm11Per_Analysis - Income Tax Form11 Person Analysis files”

Verification of data in the Household Finance and Consumption Survey (HFCS), possible imputation of missing values.

We will also assess whether administrative data can be utilised to replace some survey questions and thus lessen the burden on respondents.

It is anticipated that the HFCS will be produced every 3 years from 2020.

Tabular, diagrams, written comment.

HFCS Household Finance and Consumption Survey

Housing Assistance Payment (HAP)

To verify data provided by respondents in the Household Finance and Consumption Survey (HFCS) and to match data in cases of non-response.

Ongoing

The data will provide the monthly rent paid by a HAP household and also the amount paid on their behalf to the landlord. These are core variables in the HFCS used to calculate expenditure and social transfers.

Survey on Income and Living Conditions Data

Pseudonymised Housing Assistance Payment - Analysis Tier (HAP)

To assess whether administrative HAP data can be used to replace variables on the Survey on Income and Living Conditions (SILC) to reduce the burden on respondents, and for data validation.

Annual

Tabular, diagrams, written comment. All information will be published within CSO guidelines for web, electronic and paper dissemination & standard EU templates for Eurostat requirements.
Queries requested will be provided within CSO guidelines for confidentiality.

Survey on Income and Living Conditions Data

Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue)

PMOD will replace P35 administrative data for employee income from SILC 2019. This matching project is for the use of PMOD income data in SILC processing, reducing the burden on survey respondents and increasing the accuracy of SILC data.

Annual

Tabular, diagrams, written comment. All information will be published within CSO guidelines for web, electronic and paper dissemination & standard EU templates for Eurostat requirements.
Queries requested will be provided within CSO guidelines for confidentiality.

Household Finance and Consumption Survey Data, Survey on Income and Living Conditions Data, Business Register Data, Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables, Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables

Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Integrated Short Term Payment System Data (Welfare), Single Farm Payment Data (DAFM), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue), Pseudonymised Linked Covid Refund Scheme Data with extra DEASP Variable (Revenue)

Analyse the effect COVID-19 has had on the financial viability of Irish households and assess the impact income support schemes (TWSS & PUP) have had in supporting households.

Ongoing

Impact of COVID-19 on financial viability of households including Debt Sustainability Rates, Income to Loan Ratios, Negative Equity Rates.

Aggregated statistics presented in tabular form by various economic and demographic characteristics including Economic Sector, Size class, Income Distribution, Gender, Age group, Region.

Regression results presented in tabular form with coefficients, standard errors, P-values and model metrics.

Household Finance and Consumption Survey Data, Survey on Income and Living Conditions Data, Business Register Data, Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables, Pseudonymised Census of Population, with GeoDirectory and DEASP Variables  Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Integrated Short Term Payment System Data (Welfare), Pseudonymised Income Tax Form 11, Business Details Data (Revenue), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised Single Farm Payment Data (DAFM), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue), Pseudonymised Linked Covid Refund Scheme Data with extra DEASP Variable (Revenue), Pseudonymised DEASP Covid 19 Illness Claims (Welfare), Pseudonymised Maternity Benefit Payments Data from DEASP (Welfare), Pseudonymised Linked EWSS Data with extra DEASP Variable (Revenue)

Update to Application ID: 1175 to include EWSS and COVID-19 Illness Benefits data.

Analyse the effect COVID-19 has had on the financial viability of Irish households and assess the impact income support schemes (TWSS, EWSS & PUP) have had in supporting households.

Ongoing

Impact of COVID-19 on financial viability of households including Debt Sustainability Rates, Income to Loan Ratios, Negative Equity Rates.
Aggregated statistics presented in tabular form by various economic and demographic characteristics including Economic Sector, Size class, Income Distribution, Gender, Age group, Region.
Regression results presented in tabular form with coefficients, standard errors, P-values and model metrics.

Survey on Income and Living Conditions Data, Business Register Data, Earnings Hours and Employment Costs Survey Data, Earnings Analysis using Administrative Data Sources Data, Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables, Pseudonymised Census of Population, with GeoDirectory and DEASP Variables

Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Integrated Short Term Payment System Data (Welfare), Pseudonymised Income Tax Form 11, Business Details Data (Revenue), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Landlord and Tenant Details from the Register of Tenancies (RTB), Pseudonymised Local Property Tax Returns (Revenue), Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised VAT Trader Returns (VAT3 and RTD) Data (Revenue), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised Single Farm Payment Data (DAFM), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue), Pseudonymised Linked Covid Refund Scheme Data with extra DEASP Variable (Revenue), Pseudonymised DEASP Covid 19 Illness Claims (Welfare), Pseudonymised Maternity Benefit Payments Data from DEASP (Welfare), Pseudonymised Linked EWSS Data with extra DEASP Variable (Revenue)

The PMOD Analysis Group aims to utilise Revenue's PAYE Modernisation data along with other administrative and survey datasets to develop a standardised approach to the analysis of PMOD linked data and to produce a range of new, timely and informative outputs for the CSO. These outputs will include: population pyramid, earnings and employment analysis of employee cohorts, effect of COVID-19 employment support payments, etc.

Ongoing

Ongoing


CSO Division:   ISS Coordination Horizontal Reports

 

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

Census of Population 2011 Data, Census of Population 2016 Data, Census 2011 Housing Data, Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables (CensusAnalysis), Census 2016 Housing Data, Pseudonymised Person Income Register Data

Pseudonymised QQI Course and Award Details Data (QQI), Pseudonymised Higher Education Student and Course Details (HEA), Pseudonymised Post Primary Pupil Details (DES), Pseudonymised Primary Care Reimbursement Service Data (HSE), Pseudonymised Landlord and Tenant Details from the Register of Tenancies (RTB), Pseudonymised Springboard and ICT Student and Course Details (HEA), Pseudonymised Grant Application and Payment Data (SUSI)

 

This project aims to provide insight into the social and economic characteristics of individuals living across a range of six urban/rural areas, defined by population density and access to services and amenities. CSO data will be the starting point (and make up the majority of the report) but by matching with non-CSO data, additional insights will be achieved.

One-Off

Report. Tabular/Aggregated. Publication of findings. 

Census of Industrial Production Data, Annual Services Inquiry Data, Business Register Data, Pseudonymised Person Income Register Data (PIR), Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables (CensusAnalysis), Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables (SPP35), Annual Business Survey of Economic Impact (ABSEI) Data Corporation Tax Historical Tax Year (April to April) Returns Data (Revenue), Income Tax Form 11 Data (Revenue), Pseudonymised Corporation Tax Data (Revenue), Pseudonymised QQI Course and Award Details Data (QQI), Pseudonymised Higher Education Student and Course Details (HEA), Pseudonymised SOLAS Client and Course Details (SOLAS), CRO Accounts Details Data (DandB), Pseudonymised Corporation Tax Historical Tax Year (April to April) Returns Data (Revenue), Consolidated Income Tax Forms 11 and 12 and P35L Data (Revenue), Pseudonymised Consolidated Income Tax Forms 11 and 12 and P35L Data (Revenue), Pseudonymised Springboard and ICT Student and Course Details (HEA), CT-CRO Linking File (Revenue), Pseudonymised Grant Application and Payment Data (SUSI)  Analysis on skills by sector:
The objective of this project is to identify the key skills and education of workers by the sector in which they work. The sectors will also be subdivided between companies considered productive and non-productive at an aggregate level. It will help identify where there are potential skill gaps/shortages or where certain skills are oversubscribed in non-related sectors.
Ongoing  Publication/report (tabular/aggregated) 
Census of Industrial Production Data, Annual Services Inquiry Data, Business Register Data, Pseudonymised Person Income Register Data (PIR), Annual Business Survey of Economic Impact (ABSEI) Data   Pseudonymised Flows of Jobs and Persons Data (DEASP), Revenue Sources (REVENUE) 

A Network Analysis of Productivity Spillovers via Labour Mobility:

The objective of this research project is to analyse clusters of firms, in terms of their knowledge and skill flows, when workers switch jobs between multinational enterprises and domestic firms (and vice-versa) and assess to what extent positive or negative productivity spillovers may occur, if any.

 
Ongoing

Report/paper (tabular/aggregated), including peer-reviewed working paper.
Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables (CensusAnalysis)

Pseudonymised QQI Course and Award Details Data (QQI), Pseudonymised Higher Education Student and Course Details (HEA), Pseudonymised Post Primary Pupil Details (DES), Pseudonymised SOLAS Client and Course Details (SOLAS), Pseudonymised Flows of Jobs and Persons Data from DEASP and Revenue Sources (CSO), Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised National Vehicle and Driver File, Driver Details (DTTAS), Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables (CSO), Pseudonymised Stamp Duty on Property Transactions Data (Revenue), Pseudonymised Stamp Duty (1980-2009) Data (Revenue), Pseudonymised Primary Care Reimbursement Service Data (HSE), Pseudonymised Landlord and Tenant Details from the Register of Tenancies (RTB), Pseudonymised Local Property Tax Returns (Revenue), Pseudonymised Vehicle Registrations Data (Revenue), Pseudonymised Help to Buy Scheme Data (Revenue), Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised Springboard and ICT Student and Course Details (HEA), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised PPSN and Personal Details Data (Revenue), Pseudonymised Vehicle Licencing Data (DTTAS), Pseudonymised HSE Drugs Payment Scheme Data (HSE), Pseudonymised Housing Assistance Payment - Analysis Tier (HAP), Pseudonymised Housing Agency social housing waiting lists (DeptHous)

This project will develop and build a social and economic aggregate statistical analysis of offenders (before and after prison). It will help with:
o Understanding offenders' interactions, at an aggregate level, with the State before and after release, e.g. are they registering for welfare support, housing, education
o Measuring/gauging reintegration into the community after prison
This information will be used to help inform policy discussions and development regarding the offender population.

One-Off

Report/Paper (tabular aggregated data)
Pseudonymised Person Income Register Data, Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables

Pseudonymised QQI Course and Award Details Data (QQI), Pseudonymised Higher Education Student and Course Details (HEA), Pseudonymised Post Primary Pupil Details (DES), Pseudonymised SOLAS Client and Course Details (SOLAS), Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Child Benefit Data (Welfare), Pseudonymised Integrated Short Term Payment System Data (Welfare), Pseudonymised National Vehicle and Driver File, Driver Details (DTTAS), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Primary Care Reimbursement Service Data (HSE), Pseudonymised Landlord and Tenant Details from the Register of Tenancies (RTB), Pseudonymised Local Property Tax Returns (Revenue), Pseudonymised Help to Buy Scheme Data (Revenue), Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised Springboard and ICT Student and Course Details (HEA), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised Grant Application and Payment Data (SUSI), Pseudonymised PPSN and Personal Details Data (Revenue), Pseudonymised Primary Pupil Details (DES), Pseudonymised HSE Drugs Payment Scheme Data (HSE), Pseudonymised Housing Assistance Payment - Analysis Tier (HAP), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue)

The goal of this data matching project is to identify and analyse migration flows using administrative data sources.

One-Off

Report. Tabular/Aggregated. Publication of findings.
Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables,  Person Income Register Data, Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables   Pseudonymised QQI Course and Award Details Data (QQI), Pseudonymised Higher Education Student and Course Details (HEA), Pseudonymised Post Primary Pupil Details (DES), Pseudonymised SOLAS Client and Course Details (SOLAS), Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Child Benefit Data (Welfare), Pseudonymised Integrated Short Term Payment System Data (Welfare), Pseudonymised National Vehicle and Driver File, Driver Details (DTTAS),  Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Primary Care Reimbursement Service Data (HSE), Pseudonymised Landlord and Tenant Details from the Register of Tenancies (RTB), Pseudonymised Local Property Tax Returns (Revenue), Pseudonymised Help to Buy Scheme Data (Revenue), Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised Springboard and ICT Student and Course Details (HEA), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised Grant Application and Payment Data (SUSI), Pseudonymised PPSN and Personal Details Data (Revenue), Pseudonymised Primary Pupil Details (DES), Pseudonymised HSE Drugs Payment Scheme Data (HSE), Pseudonymised Housing Assistance Payment - Analysis Tier (HAP), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue) The goal of this data matching project is to obtain population activity counts and identify and analyse migration flows using administrative data sources.
Note that this project expands the aims of project ID 1126 (above) by including population counts and including the dataset SPP35 in the data matching proposal.
One-Off Report. Tabular/Aggregated. Publication of findings. 
Pseudonymised Flows of Jobs and Persons Data from DEASP and Revenue Sources, Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables, Pseudonymised Person Income Register Data, Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables

Pseudonymised QQI Course and Award Details Data (QQI), Pseudonymised Higher Education Student and Course Details (HEA), Pseudonymised Post Primary Pupil Details (DES), Pseudonymised SOLAS Client and Course Details (SOLAS), Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised National Vehicle and Driver File, Driver Details (DTTAS), Pseudonymised Stamp Duty on Property Transactions Data (Revenue), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Stamp Duty (1980-2009) Data (Revenue), Pseudonymised Primary Care Reimbursement Service Data (HSE), Pseudonymised Landlord and Tenant Details from the Register of Tenancies (RTB), Pseudonymised Local Property Tax Returns (Revenue), Pseudonymised Vehicle Registrations Data (Revenue), Pseudonymised Help to Buy Scheme Data (Revenue), Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised Springboard and ICT Student and Course Details (HEA), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised PPSN and Personal Details Data (Revenue), Pseudonymised Vehicle Licencing Data (DTTAS), Pseudonymised HSE Drugs Payment Scheme Data (HSE), Pseudonymised Housing Assistance Payment - Analysis Tier (HAP), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue), Pseudonymised Housing Agency social housing waiting lists (DeptHous)

This project will develop and build a social and economic aggregate statistical analysis of offenders (before and after prison). It will help with:
o Understanding offenders' interactions, at an aggregate level, with the State before and after release, e.g. are they registering for welfare support, housing, education
o Measuring/gauging reintegration into the community after prison
This information will be used to help inform policy discussions and development regarding the offender population.

One-Off

Report/Paper (tabular aggregated data)


 

CSO Division: Labour Market and Earnings

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

None

DSFA CRS; P35 file from Revenue Commissioners; CSO Central Business Register 

To investigate the extent to which foreign nationals engaged with and remained in employment

Annual

 

 

 

CSO Statistical release; other aggregate tables

 

 

Census 2016 Analysis tier

EAADS (subset of P35 analysis dataset)

To match Census 2016 analysis level data to the data being used to prepare for the Earnings Analysis using Administrative Data Sources (EAADS) release.

Once-off

Tables, charts

CSO’s Earnings, Hours and Employment Costs Survey (EHECS) data
CSO’s Central Business Register
CSO’s Employer Identification Inquiry (EII) – a small survey run specifically for the EAADS to ensure correct alignment of NACE codes.
CSO’s Structure of Earnings Administrative Data Project (SESADP) 2011-14.
CSO’s Census 2016 – approved for matching to the EAADS 2014 and later (DMP118).

Revenue’s P35L: “SPP35 – P35L dataset for analysis” data flow on ADC
Department of Employment Affairs and Social Protection (DEASP) data:
"CRS Client table from DEASP - Analysis" data flow on ADC
"DSP CRS from DEASP - Analysis" data flow on ADC

It is proposed that several data sources (both administrative and survey) will be used in the creation of the Earnings Analysis using Administrative Data Sources (EAADS) release.

The EAADS provides Structure of Earnings Statistics of employees within Ireland and is predominantly an administrative data project. Matching the proposed data sources will allow for an accurate and detailed EAADS to be produced, in alignment with what was previously released for 2011-14.

 Annual Tabular, diagrams, written comment. All information will be published within CSO guidelines for web, electronic and paper dissemination & standard EU templates for Eurostat requirements.

Queries requested will be provided within CSO guidelines for confidentiality.

Labour Force Survey Data, Business Register Data, Earnings Hours and Employment Costs Survey Data, Earnings Analysis using Administrative Data Sources Data, Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables, Pseudonymised Census of Population, with GeoDirectory and DEASP Variables 

Pseudonymised Higher Education Student and Course Details (HEA), Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Integrated Short Term Payment System Data (Welfare), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue), Pseudonymised Linked Covid Refund Scheme Data with extra DEASP Variable (Revenue), Pseudonymised Linked EWSS Data with extra DEASP Variable (Revenue)

Analysis of the income support schemes put in place in response to COVID 19. Ongoing Tabular output, presenting aggregated statistics by various economic and demographic characteristics including Economic Sector, Size class, Earnings bands, Gender, Age group, Region.

Business Register Data, Earnings Hours and Employment Costs Survey Data 

Business Register Data (CSO), Earnings Hours and Employment Costs Survey Data (CSO)

To match data from EHECS with real time data from Revenue, Business Register data and data in relation to the Temporary Wage Subsidy scheme to investigate the impact of the Covid19 crisis and assess whether administrative data could be used to impute EHECS variables in the context of low response rates. Quarterly Statistical release and aggregate tables.


 

CSO Division: Methodology

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables, Pseudonymised Person Activity Register Data

P35L, Employee Level Data (Revenue), Long and Short Term Social Welfare Payments Data (Welfare)

 

To evaluate new weighting methods to obtain detailed occupation statistics for the population through combining primary administrative data, secondary administrative data and longitudinal survey data.

Ongoing

The anticipated outputs are statistical aggregates for sex, age group, 2-digit Nace sector, weekly income and occupation group.

 

Census of Population 2016, Person and Dwelling Data (CensusNameData)  Directory of Irish Property Addresses, including Eircodes (GeoDir) The purpose of this project is to assess whether the GeoDirectory can be matched with pseudonymised Census data to draw samples for the Household Travel Survey. One-Off Tabular format for internal CSO analysis
Directory of Irish Property Addresses, including Eircodes (GeoDir), Household Sampling Frame (Revenue) To investigate adding Eircode data to the Household Sampling Frame (Occupied Residence Frame dataset). Quarterly A quarterly register of the private residential occupied dwellings of approximately 1.9 million households with the name and address (including Eircode) of the main occupant. Approximately 14,500 will eventually be used to post out HTS survey forms every quarter. Addresses in the above exclusion list will not be included in the sample.
None Directory of Irish Property Addresses, including Eircodes (GeoDir), Household Sampling Frame (Revenue) To create a register of private occupied residential dwellings in Ireland in order to derive a sample of the households for the quarterly Household Travel Survey. Quarterly A quarterly register of the private residential occupied dwellings of approximately 1.9 million households with the name and address (including Eircode) of the main occupant. Approximately 14,500 will be used to post out HTS survey forms every quarter.
Occupied Residence Frame New Residential Electricity Network Connections (ESBNetwk) To try to measure the extent, if any, of undercoverage on the Occupied Residence Frame (ORF). One-Off The project will produce a micro data file for 2019 of all possible domestic addresses, in particular indicating those which are not on the ORF but could be. This is a research project so the only output generated will be a report and associated tables, which will not be disclosive or contain any confidential data. This report may be circulated internally in the CSO to a select number of persons.

 


 

 

 

CSO Division: National Accounts

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

Business Register; Balance of Payments; Census of Industrial Production; Annual Services Inquiry

Revenue Corporation Tax files

Examine consistency between Revenue profits data and relevant data from CSO surveys; derive additional NA variables

Annual

Improved estimates for NA variables (mainly profits)

 

 

Business Register

Revenue Commissioners P35 file and P35LF files

To obtain estimates of wages and salaries, ECSI and Other Labour Costs in the National Income Accounts at A64 and 2-digit Nace level.

Annual

 

Annual Compensation of Employees estimates at overall and detailed Nace level, numbers employed, average wage/ECSI/COE per employee at overall and detailed levels 

Business Register

Revenue Commissioners P35 file and DSFA CRS files

To obtain county based average income data

Annual

To produce regional accounts and county household income

CSO Business Register, CIP, ASI, Trade and BOP data

Revenue Commissioners P35, Corporation Tax files and Dun & Bradstreet (details of all companies on the CRO register) files

To create a datafile for use internally by CSO’s National Accounts and BOP divisions.

Annual and twice yearly.

The data will be disseminated in National Accounts, Financial Accounts and Balance of Payments related aggregate tables.

None

HIQA list of inspected nursing homes and bed numbers.
Revenue Data Files (CT File & IT form 11 data file)

Matching HIQA list of inspected nursing homes and bed numbers to Revenue Data Files (CT File & IT form 11 data file) to estimate average cost of nursing home beds.

Annual

Tabular output.  

Business Register

Pensions Authority Source Dataset (PensionsAuthoritysrc) To code Pensions contributions paid by Employers to Institutional Sector and Nace activity Annual Improved estimates for National Accounts CoE (D1) and labour costs (D12)

Census of Industrial Production Data, Annual Services Inquiry Data, Business Register Data 

Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables (Revenue, DEASP, CSO) To improve and extend the National Accounts supply and use tables by reconciling differences between the CSO Business Statistics and the National Accounts income estimates. Ongoing Improved data quality, distributional national accounts, disaggregated supply and use tables, economic growth accounts.

National Accounts Data

Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue) The Large Cases Unit receives quarterly survey forms with Labour costs itemised from around 60 companies. This project will compare these submitted values with the Gross Pay field from Pseudonymised Linked PAYE Real Time Data with extra DEASP Variables. This will allow consistency checks across the data. Quarterly A comparison of survey data with Pseudonymised Linked PAYE Real Time Data with extra DEASP Variables data for labour costs.


 CSO Division: Prices

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

None

Stamp Duty Returns, Business Energy Rating Certificates, GeoDirectory, Pobal Haase-Pratschke Deprivation Index

 

 

The project is to develop an alternative Residential Property Price Index (aRPPI) based on Stamp Duty Returns (SDRs) as opposed to the existing RPPI based on mortgage data.

Ongoing

A (possible) alternative RPPI. This index would be tabulated and published in the format of the current RPPI electronic release. 

Census of Agriculture 

Stamp Duty Returns, Property Registration Authority of Ireland Data, GeoDirectory

The purpose of this Data Matching Project is to calculate agricultural land prices by region and land type. Annual  Tables for Eurostat and possible future CSO release.

Labour Force Survey Data, Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables 

Live Register Analysis (Welfare) 

The purpose of the project is to estimate unemployment rates at county level using small area estimation techniques. Ongoing Unemployment rates by county

Labour Force Survey Data

Directory of Irish Property Addresses, including Eircodes (GeoDir), Live Register Analysis (Welfare), Central Record System - Client, Payment and Employment Details (Welfare), Central Record System - Client Details (Welfare)

This project explores small area estimation methods that combine data from administrative and survey sources to produce estimates for small areas or domains. Quarterly Dissemination of details on births by nationality


 

CSO Division: Quality Management, Support & Assurance

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

LFS data

 

 

Department of Social Protection Live Register Data, Department of Social Protection client register Data, Department of Social Protection CRS_SRC and An Post Geodirectory Data

This project explores small area estimation methods that combine data from administrative and survey sources to produce estimates for small areas or domains.

Once a suitable methodology has been identified, the unemployment estimates could be produced monthly or quarterly along with either the LFS or Monthly Unemployment releases.

 

Dissemination of details on births by nationality

 

QNHS / LFS (BACKCASTED QNHS SERIES FROM Q1 1998 TO Q2 2017)

JobChurn, Census Analysis, SPP35, Person Activity Register

To evaluate new weighting methods to obtain detailed occupation statistics for the population through combining primary administrative data (Person Activity Register & P35), secondary administrative data (JobChurn) and longitudinal survey data (LFS (BACKCASTED QNHS SERIES FROM Q1 1998 TO Q2 2017), available from the Irish Social Science Data Archive).

Ongoing

SAS datasets 

Pseudonymised Census Data

GeoDirectory  

The purpose of this project is to assess whether the GeoDirectory can be matched with pseudonymised Census data to draw samples for the Household Travel Survey.

Once-off

Tabular format for internal CSO analysis


 

 

CSO Division: Secondary Data Sources & Innovation

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

Census of Population 2016, Person and Dwelling Data (CensusNameData)

 

Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare)

To explore the feasibility of using administrative data lists to evaluate Census coverage

Annual

 

The expected statistical outputs are a report and possibly a dataset of aggregated coverage indicators.

None 

Pseudonymised HSE Computerised Infectious Disease Reporting System (HSE), Pseudonymised HSE coronavirus test referrals and test facilities (HSE), Pseudonymised Hospital Inpatient Discharge Data (HSE), C19 Covid Care Tracker Application Data Analysis Tier (HSE)

Pseudonymised COVID-19 person-based HSE datasets are linked by CSO staff and permitted researchers to undertake statistical analysis to inform the national response to COVID-19.

Ongoing

Statistical outputs that have value in informing the public and national response to COVID-19


 

 

CSO Division: Social Analysis

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

CSO: Mortality Data; Census 2016 Data

 

DEASP: CRS Data

To produce an updated version of the Mortality Differentials in Ireland release using the 2016 Mortality and Census data

Every 5 years in line with Census

 

Tabular

None

Garda Síochána PULSE data; Probation Service Case Tracking System (CTS)

To determine a method for matching these data sources in the absence of a unique identifier.

Annual

Probation Recidivism Cohort;  Prison Recidivism Cohort

Address Matching Tool Sets using GeoDirectory (GeoDirAMToolSets) 

Directory of Irish Property Addresses, including Eircodes (GeoDir), Registered Deaths Data (GRO) 

The purpose of the project is to understand if there are variations in mortality in the Mid-West Region.

One-Off

The researchers are analysing the mortality data to see if there are any variations in mortality in the Mid-West. They intend to produce a report.


 

 

CSO Division: Social Data Collection

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

Survey on Income and Living Conditions Data

 

 

Revenue P35 file, Revenue form 11 file 

 

Verification of income data

 

Ongoing

 

Anonymised micro data and aggregated output tables

Survey on Income and Living Conditions Data

 

 

Farm database from Department of Agriculture and Food, Cattle movements and Single Farm Payments

Calculation of farming income for SILC survey

 

Annual

 

 

Anonymised micro data and aggregated output tables

Survey on Income and Living Conditions Data

Pseudonymised Corporate Customer System Data (DAFM), Pseudonymised Single Farm Payment Data (DAFM)

The purpose of this work is to link respondents to the SILC survey with their Basic Farm Payments, which are used in the calculation of farm income. Basic Farm Payments are also known as "Single Farm Payments".

Annual

We expect to obtain the Basic Farm Payment component of income for farm households in the SILC (the main aim of which is to collect all household income).

Survey on Income and Living Conditions Data

Pseudonymised Housing Assistance Payment - Analysis Tier (HAP)

To obtain rent details for SILC respondents who are renting their homes through the HAP system. The cost to the householder of renting their home and the financial value of the benefit to the householder of being on the HAP scheme will also be obtained. These feed into data on housing costs and housing benefits in the overall SILC results.

Annual

The outputs will be: (1) the amount of rent paid by HAP tenants who responded to the SILC and (2) the value of the benefit to these HAP tenants of being on the HAP scheme. This data will not be used alone; it will be included in the SILC results as a whole.

Survey on Income and Living Conditions Data

Pseudonymised Grant Application and Payment Data (SUSI)

The purpose of the matching exercise is to link SILC respondents to any income they may have obtained through education grants from SUSI. The SILC collects information on all household income, of which education grants may be a component.

Annual

The output will be the component of household income that is obtained from SUSI education grants.

Survey on Income and Living Conditions Data

Pseudonymised Landlord and Tenant Details from the Register of Tenancies (RTB)

The matching exercise will be done for two reasons: (1) to obtain details of rent paid by SILC respondents who are renting their homes and (2) to obtain details of rent received by SILC respondents who are landlords.

Annual

The outputs will be: (1) the amount of rent paid by SILC respondents who are renting their dwellings and (2) the amount of rent received by SILC landlords who are letting dwellings.

Survey on Income and Living Conditions Data

Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue)

The purpose of the matching exercise is to obtain income and deductions made at source (i.e. tax, USC, PRSI, pension contributions, etc.) for SILC respondents who are in the PAYE system.

Annual

Gross income and deductions from income made at source (e.g. tax, PRSI, USC, pension contributions, etc.) for SILC respondents who are paid through the PAYE system.

 


 

CSO Division: Statistical Systems Co-Ordination Unit 

 

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

Census 2016 dataset

Revenue P35 file, Revenue form 11 file, Revenue Local Property Tax file, Revenue PPSN details file

DSP Integrated Short Term Payment System; DSP Central Record System
Residential Tenancies Board File

The project will create two new data products based on linking administrative files to the Census file to demonstrate the advantage of linkable data.

This project is in the early stages and is ongoing

Analysis of Vacant Housing.
Income and welfare dependency maps for small areas
Micro data file available for statistical purposes within the CSO only.

Census 2011 - Census Main Persons Dataset, Census 2016 - COP2016_NDI_DATA_V1

ESB new connections, LPT - Local Property Tax, HTB - Help-to-buy Scheme, BER - Building Energy Rating file, Geodirectory

To produce a new experimental building completions statistical series using additional data from ESB, Census, Revenue and Geodirectory data sets.

Quarterly

Aggregate tabular format

None

Post primary Pupils Database
SPP35 linked employer employee file
IT form 11 (subset to indicate type of activity/trade)
SOLAS PLSS database of further training
QQI analysis dataset of awards
HEA Student Records System
DSP CRS and Jobseekers Longitudinal Database (JLD)

At the request of SOLAS, the CSO and SOLAS have agreed to collaborate on a project to evaluate outcomes of graduates of SOLAS funded further education courses. This data is held by SOLAS in the Programme Learner Support System (PLSS). A statistical product detailing this Outcomes analysis will be jointly produced.

Annual

Report, tabular/aggregated, publication of findings

None

DES Post Primary and Exam Datasets
SPP35 linked employer employee file
IT form 11
SOLAS/FAS database of further training and PLSS
QQI analysis dataset of awards
HEA Datasets on Student Enrolment and Graduation
DSP (JLD, CRS, DSP Payments and unemployment data)
SUSI Dataset

The CSO has recently undertaken a statistical collaboration with the HEA to analyse the outcomes for graduates of higher education courses, in particular mature students and graduates of “Springboard” courses.
Linking data across the datasets listed for this project will allow us to develop profiles of the activities of these graduates from higher education courses, in terms of their employment, unemployment, continued education, earnings, etc.

Annual/Biannual

Report (either hard copy or electronic T4 release), tabular/aggregated

Pseudonymised Person Income Register Data (PIR), Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables (CensusAnalysis), Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables

Pseudonymised QQI Course and Award Details Data (QQI), Pseudonymised Higher Education Student and Course Details (HEA), Pseudonymised Post Primary Pupil Details (DES), Pseudonymised SOLAS Client and Course Details (SOLAS), Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Child Benefit Data (Welfare), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Primary Care Reimbursement Service Data (HSE), Pseudonymised Landlord and Tenant Details from the Register of Tenancies (RTB), Pseudonymised Person Income Register Data (CSO), Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised Directory of Irish Property Addresses, including Eircodes (GeoDir), Pseudonymised Springboard and ICT Student and Course Details (HEA), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised Grant Application and Payment Data (SUSI), Pseudonymised Primary Pupil Details (DES), Pseudonymised HSE Drugs Payment Scheme Data (HSE)

This project will explore the economic and social characteristics of individuals with a disability using the Census and administrative data sources, exploring potential themes related to employment, education/training, housing, health and welfare.

One-Off

A report and/or electronic release exploring the social and economic characteristics of individuals with disabilities.

Business Register Data, Pseudonymised Person Income Register Data (PIR)

Pseudonymised QQI Course and Award Details Data (QQI), Pseudonymised Higher Education Student and Course Details (HEA), Pseudonymised Post Primary Pupil Details (DES), Pseudonymised SOLAS Client and Course Details (SOLAS), Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables (CSO), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Jobseekers Longitudinal Dataset (Welfare), Pseudonymised Springboard and ICT Student and Course Details (HEA), Pseudonymised Grant Application and Payment Data (SUSI), Leaving Certificate results from SEC (SEC), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue), Pseudonymised Pobal Programmes Implementation Platform - Childcare Providers (Children), Pseudonymised Department of Education Teaching Staff Information (DES), Pseudonymised Teaching Council Register of Teachers (DES)

The Educational Longitudinal Database (ELD) is a statistical framework for the compilation and analysis of learner outcomes over many years. The ELD provides the basis for a series of projects that the CSO has established in collaboration with Irish public sector bodies to examine learner outcomes across a range of educational levels and programmes.

Ongoing

Reports with aggregated data in graphs and tables will be produced, as will some tables for Statbank. Reports may be produced in collaboration with other agencies or by agencies working alone (but with oversight from CSO for quality and data protection matters).

Labour Force Survey Data, Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables, Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables, Pseudonymised Person Income Register Data

Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Child Benefit Data (Welfare), Pseudonymised Integrated Short Term Payment System Data (Welfare), Pseudonymised Stamp Duty on Property Transactions Data (Revenue), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Landlord and Tenant Details from the Register of Tenancies (RTB), Pseudonymised Local Property Tax Returns (Revenue), Pseudonymised Building Energy Rating details for domestic premises (SEAI), Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised Directory of Irish Property Addresses, including Eircodes (GeoDir), Pseudonymised Water Consumption Details for Residential Properties (IrishWat), Pseudonymised Domestic Wastewater Treatment System Registrations (LGMA), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised Gas Usage Details for Residential and Commercial Customers (GasNetwk), Pseudonymised New Residential Electricity Network Connections (ESBNetwk), Pseudonymised Meath County Council iHouse (LGMA), Pseudonymised Housing Assistance Payment - Analysis Tier (HAP), Property Registration Authority (PRA) folio, consideration, and other data (PRA), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue), Pseudonymised Housing Agency social housing waiting lists (DeptHous)

We plan to create report(s) on social housing and its occupants, including social renters, through the use of public sector administrative data in order to provide evidence and insights for policy makers in the sector, as well as providing statistical information to assist with the Rebuilding Ireland project.

Ongoing

Report/publication(s) on social housing in Ireland.

Business Register Data, Pseudonymised Linked P35L Employee Level Data with extra DEASP and CSO Variables, Pseudonymised Person Income Register Data, Pseudonymised Census of Population, with GeoDirectory and DEASP Variables

Pseudonymised Central Record System - Client Details (Welfare), Pseudonymised Integrated Short Term Payment System Data (Welfare), Pseudonymised Income Tax Form 11, Business Details Data (Revenue), Pseudonymised Stamp Duty on Property Transactions Data (Revenue), Pseudonymised Income Tax Form 11, Person Details Data (Revenue), Pseudonymised Landlord and Tenant Details from the Register of Tenancies (RTB), Pseudonymised Local Property Tax Returns (Revenue), Pseudonymised Building Energy Rating details for domestic premises (SEAI), Pseudonymised Help to Buy Scheme Data (Revenue), Pseudonymised Consolidated Income Tax Forms 11 and 12 and P35L Data (Revenue), Pseudonymised Long and Short Term Social Welfare Payments Data (Welfare), Pseudonymised Directory of Irish Property Addresses, including Eircodes (GeoDir), Pseudonymised Water Consumption Details for Residential Properties (IrishWat), Pseudonymised Domestic Wastewater Treatment System Registrations (LGMA), Pseudonymised Central Record System - Payment and Employment Details (Welfare), Pseudonymised Gas Usage Details for Residential and Commercial Customers (GasNetwk), Pseudonymised New Residential Electricity Network Connections (ESBNetwk), Pseudonymised Housing Assistance Payment - Analysis Tier (LCouncil), Pseudonymised Property Registration Authority (PRA) folio, consideration data (Bfacts), Pseudonymised Linked PAYE Real Time Data with extra DEASP Variable (Revenue), Pseudonymised Housing Agency social housing waiting lists (DeptHous), Pseudonymised Networks electricity consumption and customer data (ESBNetwk)

We plan to create report(s) and analysis on the rental sector in Ireland, looking at its participants (landlords, renters) and rental properties. This will be undertaken through the use of public sector administrative data and will provide evidence and insights for policy makers in the sector.

Ongoing

Report/publication(s) on the rental sector in Ireland

 


 

CSO Division: Sustainable Development Goals & Indicator Reports

 

CSO Dataset Matched

Non-CSO Dataset Matched

Reason

Frequency

Statistical Outputs Obtained

Census of Population 2016 Data

Directory of Irish Property Addresses, including Eircodes (GeoDir), OSi National Mapping Database (PRIME 2) (OSi)

 

To use the coordinates of the Census 2016 geography dataset and the coordinates of a number of destination points to calculate the shortest-path distance of residential dwellings to various services and infrastructure. This is to examine the effect of proximity to certain day-to-day services relative to where people are living.

One-Off

It is proposed to produce a publication on proximity containing, inter alia, average distance by county and urban-rural, an investigation of settlements with core services and an analysis of isolated dwellings in rural areas.

 

Address Matching Tool Sets using GeoDirectory (GeoDirAMToolSets) (CSO), Census 2016 Housing Data (CSO), Pseudonymised Census of Population 2016, with GeoDirectory and DEASP Variables (CSO)

OSi National Mapping Database (PRIME 2) (OSi)

The objective of this project is to continue the work on the examination of the proximity of the population to everyday services and infrastructure by measuring the shortest-path distance from an origin (the coordinate of a residential dwelling on the Census 2016 dataset) to a destination (the coordinate of a particular facility or infrastructure).

Ongoing

CSO has a central role in the production of indicators for the Sustainable Development Goals (SDGs). There are three indicators: 11.2.1 (Proportion of population that has convenient access to public transport, by sex, age and persons with disabilities), 11.7.1 (Average share of the built-up area of cities that is open space for public use for all, by sex, age and persons with disabilities), and 9.1.1 (Proportion of the rural population who live within 2 km of an all-season road).
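As an illustration of the shortest-path distance measure described in the proximity projects above, the following is a minimal sketch only. The register does not state which algorithm or network data the CSO uses; the sketch assumes a road network represented as a weighted graph and applies Dijkstra's algorithm. The function name, node names and distances are hypothetical.

```python
import heapq


def shortest_path_distance(graph, origin, destinations):
    """Return the network distance from origin to the nearest destination.

    graph maps each node to a dict of {neighbour: edge length in metres}.
    Returns None if no destination can be reached.
    """
    targets = set(destinations)
    best = {origin: 0}
    queue = [(0, origin)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        if node in targets:
            return dist
        for neighbour, length in graph.get(node, {}).items():
            candidate = dist + length
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return None


# Hypothetical toy network: one dwelling, two junctions, one bus stop.
roads = {
    "dwelling": {"junction_a": 400, "junction_b": 1500},
    "junction_a": {"dwelling": 400, "junction_b": 700, "bus_stop": 2000},
    "junction_b": {"dwelling": 1500, "junction_a": 700, "bus_stop": 500},
    "bus_stop": {"junction_a": 2000, "junction_b": 500},
}

distance_m = shortest_path_distance(roads, "dwelling", ["bus_stop"])
```

In a sketch like this, the origin would stand for a dwelling coordinate snapped to the network and the destinations for the facilities of interest; the resulting distances could then be averaged by area or compared against a distance threshold.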

 


 

 Archive of completed Data Matching Activities