Irish Population Estimates from Administrative Data Sources, 2021

CSO Frontier Series Research Paper

CSO research publication, , 11am
Frontier Series Output

CSO Frontier Series outputs may use new methods which are under development and/or data sources which may be incomplete, for example new administrative data sources. Particular care must be taken when interpreting the statistics in this release.
Learn more about CSO Frontier Series outputs.

Key Findings

  • Based on the Irish Population Estimates from Administrative Data Sources (IPEADS) experimental methodology the population of Ireland in April 2021 was estimated to be 5.28 million. The age structure of the population reflects the impact of changing birth rate and migration patterns over recent decades.

  • Statistics on age, sex, nationality, principal economic status and NACE sector can be produced and broken down to varying degrees by local authority, local electoral area (LEA) and electoral division.

  • IPEADS 2021 does not produce population statistics for households or small areas. However, improved inclusion of Eircodes across a number of key administrative datasets has led to 80% of records containing an Eircode, up from 60% in 2020. As a result, an experimental LEA population is presented for the first time.

  • As this is the second iteration of the IPEADS Research Paper, it was possible to identify a newly active population containing those who were not administratively active in 2020. This cohort is presented by sex and nationality.

Statistician's Comment

The Central Statistics Office (CSO) has today (11 July 2023) released Irish Population Estimates from Administrative Data Sources, 2021. Commenting on the release, Rob Kelly, Statistician in the Life Events and Demography Section, said:

"This research paper, as the second iteration of Irish Population Estimates from Administrative Data Sources (IPEADS), continues to demonstrate the policy-relevant research projects the CSO is developing as part of its leadership role in the Irish Statistical System. While serving primarily as a means of demonstrating the potential of administrative data for delivering new classes of statistical products, IPEADS also provides an experimental and evolving platform for the development of population estimates. It must also be emphasised that, as an experimental methodology, IPEADS estimates must be interpreted with caution and are not comparable with official CSO population estimates such as Census data."

Editor's Note

This experimental research paper focuses on the production of population statistics from administrative data. It is intended both to illustrate the potential of administrative data to produce demographic statistics and to highlight the challenges that arise. These are not the official population statistics and should be used with caution.

Introduction

IPEADS 2021, like the previous release IPEADS 2020, features experimental statistics that are based entirely on administrative records. It is important to note that such administrative datasets are designed for the operational needs of Irish public bodies and not as statistical data sources. However, it is generally accepted that the activity of individuals in administrative datasets can be used to indicate their presence in Ireland. IPEADS 2021 aims to estimate the population of Ireland in April of the reference year by applying a CSO-developed methodology for measuring activity in administrative datasets.

The use of administrative data in population estimation is not unique to IPEADS or even the CSO. It has been driven both by the benefits of using administrative data in censuses and by the growing difficulties encountered in traditional censuses. The benefits include reduced cost, reduced burden on respondents, improved timeliness and greater frequency of results. The challenges which administrative data usage can address include difficulties in recruiting field staff as well as establishing contact with householders.

However, the administrative data landscape is constantly developing. This can bring opportunities and challenges. The methodology needs to be flexible to deal with changes in the data sources. It is important to note that when using administrative data to estimate the population, different methodologies will result in different estimated counts. For example, using a ‘signs of life’ approach will result in a higher count than attempting to apply a strict usual residence criterion. See the Methodology section for more details.

Data collection

Currently, population statistics in Ireland are produced through the Census of Population and the Population and Migration Estimates. The primary source of data collection is the Census of Population, which is generally carried out every five years. In between censuses, estimates are calculated by trending forward the previous Census of Population using births and deaths data from administrative records and migration estimates from the CSO Labour Force Survey.
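The idea of trending a census count forward can be illustrated with the standard demographic balancing equation: next population = base population + births - deaths + net migration. The sketch below is a minimal illustration only. The function name and all figures are hypothetical and are not CSO data, and the official Population and Migration Estimates involve more detail (for example, age and sex breakdowns) than is shown here.

```python
def roll_forward(base_population, births, deaths, net_migration):
    """Apply the demographic balancing equation for a single period.

    All inputs are counts of persons for the same period and territory.
    """
    return base_population + births - deaths + net_migration


# Illustrative figures only (hypothetical, not CSO data).
estimate = roll_forward(
    base_population=5_000_000,  # previous census count
    births=60_000,              # from administrative birth records
    deaths=32_000,              # from administrative death records
    net_migration=20_000,       # immigrants minus emigrants, survey-based
)
print(estimate)
```

Repeating the calculation year by year, each time using the previous estimate as the new base, carries the census count forward until the next census provides a fresh benchmark.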

In the Irish Population Estimates from Administrative Data Sources (IPEADS), all statistics are produced using data collected from administrative records only. These records were not initially created for measuring the population, but rather for service delivery and the day-to-day operations of public bodies. However, activity in administrative data can be a sign of presence in the State. This experimental work attempts to estimate the population of Ireland in April 2021 by measuring activity in the administrative data using a new methodology devised by the CSO.

At an international level, an increasing number of countries are moving towards greater use of administrative data in censuses, a move supported by the United Nations Economic Commission for Europe (UNECE) recommendations. This is evident from the UNECE Census Wiki, which compiles information on the 2020 round of censuses as reported by member countries; see Background Notes. UNECE reports note that this move has been motivated by the benefits of using administrative data sources in censuses, including reduced cost, reduced burden on respondents, improved timeliness and greater frequency of results. It is becoming increasingly difficult to conduct traditional censuses, with challenges experienced in recruiting field staff as well as establishing contact with householders.

The Census has been postponed twice in recent years due to unforeseen circumstances. In 2001, an outbreak of foot and mouth disease in Ireland delayed the census by a year. The census originally scheduled for 2021 was delayed by a year due to the global COVID-19 pandemic. Robust administrative population data available on an annual basis would offset some of the risk of an information deficit in the event of delays to the census and insulate users from the delays and postponements which have occurred with the traditional census model.

However, the approach is still experimental, and some aspects of using administrative sources to count the population will require further development. These include the relatively limited range of variables available from administrative sources compared with the Census of Population (variables such as commuting patterns, use of the Irish language and religion are collected in the Census but are missing here) and the production of statistics for small geographic areas. The CSO's access to some outstanding public sector data flows will help with both issues, as will the increased level of Eircodes being collected in the public sector with the move to online public administration of services. The key metrics in assessing these data improvements for the population count are the percentage of records with Eircodes (currently estimated in the region of 80%) and the percentage of records coming from ‘real time’ sources (approximately 70%), where the activity recorded is from a recent time period. The CSO is committed to addressing these issues in partnership with Public Sector Bodies and will work towards consistently improving these estimates based on data improvements over the coming years.
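The two key metrics mentioned above are simple shares of records. The sketch below shows one way such percentages could be computed. The record layout, the field names (`eircode`, `source_type`) and the sample values are all hypothetical placeholders and do not reflect the structure of any CSO dataset.

```python
def share(records, predicate):
    """Return the percentage of records for which predicate(record) is true."""
    if not records:
        return 0.0
    matching = sum(1 for record in records if predicate(record))
    return 100.0 * matching / len(records)


# Hypothetical records: placeholder Eircodes, made-up source types.
records = [
    {"eircode": "X00 AAAA", "source_type": "real_time"},
    {"eircode": None, "source_type": "annual"},
    {"eircode": "X00 BBBB", "source_type": "real_time"},
    {"eircode": "", "source_type": "real_time"},
    {"eircode": "X00 CCCC", "source_type": "annual"},
]

# A record counts as having an Eircode only if the field is non-empty.
pct_eircode = share(records, lambda r: bool(r["eircode"]))
pct_real_time = share(records, lambda r: r["source_type"] == "real_time")
print(pct_eircode, pct_real_time)
```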

Defining the population

There are several ways in which a population can be counted. The most familiar approach is the one taken in the Census of Population, which records the ‘de facto’ population, or the number of persons present in the country on Census Night based upon completion of the census questionnaire. This count can include temporary visitors to the country and exclude persons who usually live in the country but are abroad on the night of the census.

This experimental publication attempts to count the ‘usually resident’ population of Ireland in April 2021. For someone to be usually resident in Ireland in April 2021, they will have lived in the country for a continuous period of at least 12 months including April 2021.

Usual residence is a widely used statistical concept, recognised by many international bodies, including the European Union.

In a survey or census, it is possible to get the required information to determine usual residency by asking people specific questions. This is not possible using administrative data sources, but usual residence can be imputed by looking at the level of activity in different administrative data sources over a period of time. There are challenges with both methods of defining usual residence. For example, in the census respondents may misinterpret the question, while in administratively sourced population estimates there will often be coverage issues in individual administrative datasets and definitional issues.

The IPEADS publication is part of a body of developmental work undertaken by the CSO in recent years to look at alternative methods to count the population. The methodology in this report aims to estimate an administrative count of the population based, as closely as possible, on usual residence by applying a specific set of rules around activity in the 24 months from January 2020 to December 2021.

The estimates are produced by linking administrative records from various datasets that have been pseudonymised to maintain privacy. Rules are then applied to decide who should be included in or excluded from the usually resident population. In general, the rules apply as follows:

Include:

  • persons (and, where identified, their current partners and/or children) active over a period of 12 months or more, including April 2021, in key real-time datasets, for example persons in receipt of weekly and monthly social welfare payments or in the PAYE system for employment and pensions.
  • persons in annual datasets, for example enrolment in education or annual self-employed returns.

Exclude:

  • persons not recorded as active in administrative data between April and December 2021.
  • persons active for a period of less than 12 months around April 2021.
  • persons born on or after 01 April 2021.
  • persons who died before 01 April 2021.

It is important to note that when using administrative data to estimate the population, different rules to the above could be applied. This would result in different estimated counts. For example, applying a rule that any person with any activity recorded in administrative data in 2020 or 2021 would be considered usually resident would result in a higher population count than is recorded in this publication. Conversely, a rule counting only persons with recorded activity in administrative data strictly within 12 months around April 2021 would result in a smaller population count. See also Methodology section.
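The inclusion and exclusion rules listed above, and the looser "any activity in 2020 or 2021" alternative, can be sketched in code to show why the choice of rule changes the count. This is a simplified illustration under stated assumptions, not the CSO's implementation: the `Person` structure, monthly granularity of activity, treating a 2021 annual record as sufficient for inclusion, and the sample data are all hypothetical, and the linking of partners and children is omitted.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional, Set, Tuple

REFERENCE = date(2021, 4, 1)  # reference date taken from the text
REF_MONTH = (2021, 4)


@dataclass
class Person:
    # (year, month) pairs with activity in real-time datasets, 2020-2021 only
    realtime_months: Set[Tuple[int, int]] = field(default_factory=set)
    # years with a record in an annual dataset (e.g. education enrolment)
    annual_years: Set[int] = field(default_factory=set)
    born: Optional[date] = None
    died: Optional[date] = None


def months_from(year, month, count):
    """Build a set of `count` consecutive (year, month) pairs."""
    start = year * 12 + month - 1
    return {((start + i) // 12, (start + i) % 12 + 1) for i in range(count)}


def run_length_including(months, ref):
    """Length of the unbroken run of active months that contains `ref`."""
    idx = {y * 12 + m - 1 for y, m in months}
    r = ref[0] * 12 + ref[1] - 1
    if r not in idx:
        return 0
    lo = hi = r
    while lo - 1 in idx:
        lo -= 1
    while hi + 1 in idx:
        hi += 1
    return hi - lo + 1


def alive_at_reference(p):
    """Exclude persons born on/after, or who died before, 1 April 2021."""
    if p.born is not None and p.born >= REFERENCE:
        return False
    if p.died is not None and p.died < REFERENCE:
        return False
    return True


def included_by_listed_rules(p):
    if not alive_at_reference(p):
        return False
    if 2021 in p.annual_years:  # assumption: an annual 2021 record suffices
        return True
    # 12+ continuous months including April 2021 (implies April-Dec activity)
    return run_length_including(p.realtime_months, REF_MONTH) >= 12


def included_by_any_activity(p):
    """Looser alternative: any recorded activity in 2020 or 2021."""
    if not alive_at_reference(p):
        return False
    return bool(p.realtime_months) or bool(p.annual_years & {2020, 2021})


people = [
    Person(realtime_months=months_from(2020, 1, 24)),            # active throughout
    Person(realtime_months=months_from(2021, 2, 5)),             # 5 months only
    Person(annual_years={2021}),                                 # annual record
    Person(realtime_months=months_from(2020, 1, 24), died=date(2021, 3, 15)),
    Person(realtime_months=months_from(2021, 4, 9), born=date(2021, 4, 10)),
    Person(realtime_months=months_from(2020, 1, 6)),             # early 2020 only
]

count_listed = sum(included_by_listed_rules(p) for p in people)
count_loose = sum(included_by_any_activity(p) for p in people)
print(count_listed, count_loose)
```

On this hypothetical sample the looser rule admits people with short or early-period activity that the listed rules exclude, which is the direction of difference the text describes.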

This publication is categorised as a CSO Frontier Series Output. Particular care must be taken when interpreting the statistics in this release.

CSO Frontier Series outputs may use new methods which are under development and/or data sources which may be incomplete, for example new administrative data sources. Publishing outputs under the Frontier Series allows the CSO to provide useful new information to users and get informed feedback on these new methods and outputs, while at the same time making sure that the limitations are well explained and understood.

This experimental report, the second iteration of IPEADS, produces estimates of the population based on administrative data.

Official population estimates are currently based on the Census of Population and Population and Migration Estimates. This report focuses on activity in administrative data, where engagement with the administrative system is taken as an indication of residence in Ireland.

The underlying assumptions and methodologies in this experimental release differ from those of the official published estimates of the population, and therefore disparities are to be expected and can be seen.

Many attributes available from census data (e.g. religion or health status) are not available from administrative data, and some datasets lack geographical detail. While the methodology attempts to correct for some of these issues, we anticipate improvements in data holdings over time through the continued adoption and implementation of the National Data Infrastructure. The CSO will continue to develop the methodology to estimate the population and will continue to report on future iterations of this project.

It should be noted that the CSO can only measure activity in the available administrative sources. With the exception of children or current spouses/partners connected to active persons, persons who did not have any interaction with the State during the reference period applied by the ‘rules’ will not be included in the population estimates.

In addition to the strict legal protections set out in the Statistics Act, 1993, and other existing regulations, the CSO is committed to protecting individual privacy. All identifiable information from each of the data sources used in this analysis, such as name, date of birth and address, is removed before use, and only anonymised statistical aggregates are produced. For further information on the data sources, linking procedures and limitations of this report, see the Methodology chapter. Further information on privacy can be found in the Background Notes.
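To give a sense of how records can be linked across datasets without retaining identifiable information, the sketch below replaces each raw identifier with a keyed hash and groups activity by the resulting pseudonym. This paper does not describe the technique the CSO actually uses, so the keyed-hash approach (HMAC-SHA256), the function names, the key and the sample identifiers are all assumptions made purely for illustration.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, key: bytes) -> str:
    """Map a raw identifier to a stable pseudonym using a keyed hash."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def link(datasets, key):
    """Group dataset names by pseudonym; raw identifiers are not retained."""
    linked = {}
    for name, rows in datasets.items():
        for row in rows:
            pid = pseudonymise(row["id"], key)
            linked.setdefault(pid, set()).add(name)
    return linked


# Hypothetical key and identifiers, for illustration only.
key = b"illustrative-secret-key"
datasets = {
    "employment": [{"id": "A001"}, {"id": "A002"}],
    "welfare": [{"id": "A002"}, {"id": "A003"}],
}

linked = link(datasets, key)
for pid, sources in linked.items():
    print(pid[:8], sorted(sources))
```

Because the same identifier always maps to the same pseudonym under a given key, a person appearing in several datasets is linked to a single record, while the output contains no raw identifiers.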
