
Australian Bureau of Statistics (ABS)

As the statistical agency of the Government of Australia, the Australian Bureau of Statistics provides key statistics on a wide range of economic, population, environmental and social issues, to assist and encourage informed decision making, research and discussion within governments and the community.

ACEMS and the ABS collaborate on cutting-edge statistical, methodological, and applied challenges, focusing on the use of big data and powerful technology to keep Australia at the forefront of statistical analysis.

The breadth of research conducted includes growing expertise in machine learning and data analytics, exploring developments in estimations with other National Statistical Organisations (NSOs), use of remotely sensed data, data linkage, and recommender systems.

The Australian Bureau of Statistics (ABS) and ACEMS enjoyed a fruitful 2021, with significant developments, engagements, projects, and outputs, to deliver benefits beyond ACEMS, including through enduring collaborations with ACEMS universities and members.

2021 Overview

Highlights for 2021 included:

  • ACEMS Members from multiple Nodes continuing to serve on the ABS Methodology Advisory Committee.
  • Ongoing collaborations with the ABS and ACEMS Members via the new QUT Centre for Data Science.
  • Three new PhD positions co-supervised by ACEMS CIs and the ABS.
  • International engagement and service to the United Nations, including an invited presentation to the UN Global Working Group on Big Data, and development of new software and training.
  • ABS’ participation in the ACEMS Ideas Challenge, which delivered six submissions for new ABS-ACEMS collaborations, including five from the ABS, and resulted in a collaboration to advance knowledge and connections from across disciplines and sectors on the salient issue of responsible Artificial Intelligence (AI), relevant not only to the ABS, and Australian Public Service (APS), but to other sectors and industries.
  • The ABS and ACEMS collaborating to bring together key stakeholders for a Responsible Data Science working dinner, Symposium and workshop, exploring important issues as well as key challenge areas to help ABS with input from across ACEMS and its extended academic and industry network.
  • Advancing knowledge in areas including Synthetic Data, with the establishment of a Consortium supported by ACEMS Research Sprint Scheme, and a workshop bringing together researchers and industry including ABS.
  • Representatives from across the ABS featuring in ACEMS media and events, such as a podcast and webinar, to discuss timely and important topics such as the ABS Census and to share collaborative achievements and plans.

ABS Methodology Advisory Committee

Over the years, ACEMS has supported the ABS in advancing and applying new statistical methods, through service by a group of senior ACEMS researchers on the ABS Methodology Advisory Committee (MAC).

ACEMS Professors James Brown, Scott Sisson, Robert Kohn and Kerrie Mengersen

ACEMS professors who served on the MAC in 2021 were: AI James Brown, CI Scott Sisson, CI Kerrie Mengersen, and CI Robert Kohn. ACEMS members and CIs who have served on the MAC previously include CI Rob Hyndman and CI Louise Ryan.

ABS-QUT Government Systems Domain Launch

On 5 October 2021, the ABS’ industry representative Bernadette Giuffrida from ABS’ Methodology Division joined ACEMS in launching the Government Systems domain of the QUT Centre for Data Science. ACEMS AI Gentry White and QUT’s Kevin Desouza are leading this industry-sponsored domain, building on and integrating work led by the ACEMS-ABS partnership.

Overview of Government Systems Domain and ACEMS Collaboration with ABS

The Government Systems domain is about partnering with government agencies to improve people’s lives, by developing cutting-edge, data-driven solutions to drive innovation in public agencies, public policy processes, and governance systems. Its research:

  • Addresses the urgent challenges facing public agencies trying to make sense of their vast data reservoirs to create and implement evidence-driven public policy.
  • Designs and evaluates computational solutions driving innovation in the business of government.
  • Solves challenges faced by public agencies designing, developing, and deploying data-driven systems.

The aim of this event was to:

  • Share industry perspectives and details on some of the opportunities and current collaborations.
  • Gather individuals and groups across the university that would like to be involved in shaping this domain.
  • Facilitate identification of relevant projects and relationships with government.
  • Provide a platform to workshop some of the directions for the domain.

A Partnership to Support Capability and Evidence Based Policy to Create Greater Public Value (Slides from Dr Giuffrida’s presentation)

The ABS’ Bernadette Giuffrida delivered a talk discussing the role of the ABS’ methodology section, which has a research focus and supports blue-sky thinking, as well as the enabling partnership with ACEMS and future partnership with the QUT Centre for Data Science.

The planned benefits from the partnership – a first of its type for the ABS – include increasing capability, supporting strategic priorities (including timely insights from data and reducing respondent burden) and other research with a future focus, plus ethical framework considerations. It will support the methodology division’s testing and understanding of methods, and proof of concepts, as required prior to the wider adoption of new methods across and beyond the ABS.

The partnership will also provide benefits to the Data Profession Stream of the Australian Public Service, which ABS leads, being a technical forum for people across Government agencies to share experiences, partnerships, projects, and outcomes, and support the recruitment of data science talent to the public service.

Identified areas of interest to pursue from this event included:

  • Responsible use of data science in methods of producing, and in communication of, official statistics.
  • Technical developments including in processing different types of data.
  • Administrative improvements including system interoperability, adoption and implementation.
  • Policy issues including governance, algorithmic regulation and ethical frameworks.
  • Developing internal data science capabilities and data science knowledge in government organisations, including to evaluate different methods.
  • Identifying and addressing hidden backlogs.
  • Supporting different levels of government, from local to national to international organisations, and whole of government.

ABS-ACEMS PhD Positions

The ABS has funded three PhD projects which will be supervised by ACEMS CI Kerrie Mengersen. The three projects are:

  • Automating area design in the Australian Statistical Geography Standard (ASGS). This work involves the development of a set of processes to efficiently update the ASGS.
  • Identifying labour markets and functional areas within the ASGS. This work includes the development of a model or algorithm to identify labour markets and functional areas to support design of the ASGS.
  • Official statistics from combined sources - model development using an agriculture case study. This includes the development of a production process to combine survey data with alternative data sources using deep neural nets, delivering small area outputs and enabling a transition from survey data to administrative inputs.

United Nations Working Group - and Invited Presentation

The ABS has facilitated ACEMS Members including Kerrie Mengersen, Jacinta Holloway-Brown and others working with the United Nations.

A highlight was an invited presentation in March 2021 to the UN Global Working Group on Big Data in New York on the topic of “Training program on earth observations for agriculture statistics”.

ACEMS Ideas Challenge: Supporting the ABS & Responsible Data Science

ACEMS Stakeholder Engagement set a challenge for ACEMS partners and members to propose ideas for collaborations, aimed to help partners by leveraging ACEMS capabilities. The ABS, and ACEMS, responded with five different project ideas which included model building, deep neural networks and machine learning.

Machine learning methods are beginning to be utilised in public service agencies and National Statistical Organisations like the Australian Bureau of Statistics (ABS). There is even more potential for these methods to deliver benefits including cost savings, service quality improvements, and reduced survey respondent burden.

The ABS has strong expertise and experience in traditional sampling and statistical modelling methods, and sought to harness the capabilities and network of ACEMS in relation to machine learning, to ensure that any methods the ABS adopts for its statistics and processes are transparent, justifiable, explainable and rigorous.

Specifically, the ABS sought ACEMS support via the Ideas Challenge to help the ABS to rigorously understand and responsibly and carefully adopt useful machine learning techniques. Particular areas of interest to the ABS, informed by earlier engagements including seminars hosted by and academic literature cited by ACEMS, included:

  • Investigating the use of attention mechanisms for explainability.
  • Estimating uncertainty for deep neural networks (DNN).
  • Investigating applications of contrastive explanations in the ABS.
  • Representivity of Training and Test Data.
  • Efficient Sampling Techniques for Labelling Training Data for Model Building.

ACEMS’ Stakeholder Engagement Officer Angela Dahlke and the Government Systems domain team at QUT led the organisation of a number of engagements customised to deliver knowledge-transfer to the ABS, from a diversity of sources - within the ACEMS researcher and partner network, and extending to different disciplines and sectors/industries.

A number of ABS and ACEMS participants and guests at the Brisbane location of the hybrid Symposium

These included:

  • A hybrid responsible data science symposium attended by 22 ABS staff plus ACEMS members across multiple nodes (both in person in Brisbane and online). The event included cross-disciplinary and cross-sector presentations from academia and industry as well as a keynote presentation by Susan Shaw (ABS).
  • A hybrid workshop linked to the symposium which included research presentations by the ABS, discussions of challenges, and multi-stakeholder interactive feedback (both in person and online).
  • A working dinner, preceding and informing the above events, with three ABS guests, including key ABS collaborator and ABS QUT Government Systems Co-chair Bernadette Giuffrida, together with ACEMS researchers and thought leaders across academia (statistics, computer science, justice, law, mathematics, engineering, philosophy) plus representatives from different industries including Indigenous health, technology, manufacturing, engineering, and defence.

From an ABS perspective, the symposium was a chance to identify areas of emerging interest and further engage with research priorities in the area of responsible data science.
“I had high expectations heading into the day, and they were exceeded. With every speaker, I found myself writing down points that I want to remember or want to look up and read more about. We are going to have a lot of conversations that continue from this.”

Bernadette Giuffrida, ABS Assistant Director and QUT Centre for Data Science ABS Co-Chair of Government Statistics

These events were beneficial to all who attended - including ABS and other industry - as well as academics and students from ACEMS and other disciplines.

“It was very interesting having people from the ABS here, who already have strong governance structures around what they can and cannot do with data. They’ve been very thoughtful about what they might do with artificial intelligence. I think there’s a lot we can learn by following in the footsteps of our best actors in this space.”

Kate Devitt, Chief Scientist, TAS-D CRC; Symposium Speaker & Workshop Participant

Outline of the Responsible Data Science Symposium & Engagement Tool

Learn more about the Symposium here in this article, and watch a selection of the video presentations (excluding those not publicly accessible).

Responsible Data Science - Workshop

Following the Symposium, the ABS together with industry and academic attendees were invited to join the ABS workshop on responsible data science, facilitated by ACEMS. The workshop featured presentations by ABS’ Claire Clarke and Sean Buttsworth, and then explored the suggested projects from the ABS as part of the Ideas Challenge.

The workshop provided a forum to discuss these projects from multiple disciplinary and sector lenses to further inform and support the ABS’ work.

1. Investigate the use of attention mechanisms for explainability

Attention is a feature of some natural language processing deep learning approaches - it focuses the model on relevant words in an input text in order to produce more accurate predictions for applications such as translation. However, attention has recently been proposed as a way of extracting explanations from other deep learning applications - it can be used to highlight which parts of an input a model is focussing on when making decisions. This project proposes investigating the application of attention mechanisms for generating explanations in the type of applications likely to be relevant to the ABS. It is expected that this work will be largely conceptual, focussed around a literature review and making recommendations for more detailed follow-up work, but the ABS is currently conducting a project on NLP using online job advertisement data and it may be feasible for a brief practical application to be done in conjunction with this other project.

2. Continue work on estimating uncertainty for DNNs

A recent internship project at the ABS investigated the use of replication methods for estimating uncertainty from deep neural networks. This work recommended further investigations into extra-neural networks and MC dropout as potentially useful methods for estimating uncertainty. The project went some way towards setting up code to implement these methods so it should be relatively straightforward to conduct some simulations in the short time frame available under the Ideas Challenge. Results from the two methods could then be compared to results from the earlier work.

3. Investigate applications of contrastive explanations in the ABS

Contrastive explanations integrate understandings from the social sciences about how people ask questions and interpret meaning to generate more useful explanations of predictions or decisions made by machine learning models. This project would investigate the applicability of contrastive explanations to the ABS context and make recommendations about possible uses. Given the lack of readily available pre-existing implementations, and the short time frame of this project, it is expected that this project will be largely conceptual.

4. Representivity of Training and Test Data

This project would consider aspects of representivity of training and test data for supervised learning. Some specific questions that the ABS would like to see investigated are:

  • Does training data need to be representative of the application population in the sampling sense or is the appropriate concept one of adequate coverage of predictor and/or target variable distributions?
  • Does the test data need to be representative of the application population - in what sense and for what reason?
  • What techniques are available to help deal with the problem of concept drift - i.e. the training data is not representative of the application population?
  • What is the nature of biases that could arise from lack of representivity?

5. Efficient Sampling Techniques for Labelling Training Data for Model Building

Training data is needed for fitting supervised machine learning models but may not always be plentiful, especially when labelling needs to be performed manually. In such a situation we may wish to target the new sample cases to be labelled to improve model performance for the least additional resource. Active learning techniques make use of the model outcomes to determine such a sampling strategy. This project would explore the effectiveness of active learning techniques including the potential for bias versus the resource savings. A real-world ABS application is available - one that utilises named entity recognition models for classifying web-scraped job advertisements. The bulk of this project would be to evaluate the application of active learning techniques to this real-world application.
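
As a rough illustration of the active learning idea in topic 5, the Python sketch below picks the unlabelled cases a model is least sure about, so that manual labelling effort goes where it is likely to help most. It uses "least confidence" sampling, which is only one of several active learning strategies and is not necessarily the one the ABS project would evaluate. The function names, the three classes, and the probability values are all hypothetical and chosen purely for illustration.

```python
import numpy as np

def least_confidence_scores(probs):
    """Uncertainty per case: 1 minus the highest predicted class probability."""
    probs = np.asarray(probs, dtype=float)
    return 1.0 - probs.max(axis=1)

def select_for_labelling(probs, k):
    """Return the indices of the k unlabelled cases with the highest uncertainty."""
    scores = least_confidence_scores(probs)
    # argsort sorts ascending, so negate the scores to list the most
    # uncertain cases first; a stable sort keeps ties in their original order.
    order = np.argsort(-scores, kind="stable")
    return order[:k].tolist()

# Hypothetical model outputs for five unlabelled cases (rows) over three
# made-up classes (columns). These numbers are illustrative, not ABS data.
pool_probs = np.array([
    [0.98, 0.01, 0.01],
    [0.40, 0.35, 0.25],
    [0.70, 0.20, 0.10],
    [0.34, 0.33, 0.33],
    [0.90, 0.05, 0.05],
])

chosen = select_for_labelling(pool_probs, k=2)
print(chosen)
```

Because cases are chosen by model uncertainty rather than at random, the labelled set is no longer a representative sample of the pool, which is one way the bias mentioned in topic 5 can arise.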

Synthetic Data - Workshop - Research - Consortium

The ABS, amongst other organisations across government and industry, is interested in synthetic data and its use, including to provide safe access to data.

In view of this interest, and limits to current methods/approaches, ACEMS has supported:

  • Research into synthetic data, including an ACEMS Research Sprint Scheme funded project “Flexible generative models for creation of synthetic data”.
  • Co-hosted knowledge-transfer events, including a synthetic data workshop with guest speakers from multiple ACEMS nodes and other institutions.
  • Formation of a synthetic data consortium with external stakeholders including the ABS (amongst several ACEMS partners, affiliates and others) to be involved in and benefit from specific synthetic data use cases.

ABS and other government, industry and academic guests attended the ACEMS co-hosted Synthetic Data workshop with the Australian Data Science Network (ADSN) on 10 September 2021. This workshop featured presenters from and beyond ACEMS, as featured in the table below, and enabled knowledge exchange.

The ABS contributed to workshop discussions, including through Chris Mann (ABS, Methodology Transformation), who has experience in confidentiality and statistical disclosure control methods. He shared the ABS’ interest in synthetic data methods as a means to increase safe access to data as well as to enable experimentation and research using “realistic” but un-real data.

Other industry guests included: Cancer Council Queensland, Sax Institute, Services Australia, Integrity Systems Company / Meat and Livestock Australia, and others as well as additional guests who requested access to the workshop recordings.

To learn more, watch these workshop presentations (or click on the individual videos):

Overview of Synthetic Data Generation by ACEMS PhD Student Conor Hassan

Watch Presentation

GRATIS: GeneRAting TIme Series with diverse and controllable characteristics by ACEMS CI Professor Rob Hyndman

Watch Presentation

Deep Learning Techniques for Dealing with Lack of Data by Associate Professor Richi Nayak, QUT

Watch Presentation

Synthetic data generation using moment-based density estimation by Dr Bradley Wakefield, University of Wollongong (NIASRA)

Watch Presentation

Generating artificial video data to train machine learning algorithms by Dr Anthony Paproki

Watch Presentation

Learning with Limited Spatio-Temporal Data: Generative Adversarial Networks (and Alternative Approaches) by Professor Fiona Salim, RMIT

Watch Presentation

Webinar - Monitoring the nation's pulse: The what, who, how and why of the Australian Census

Guest speakers from the ABS and ACEMS featured in a webinar exploring the Australian Census, the value of this data (including its use in the AusEnHealth atlas), data science, and an industry-academic partnership.

The webinar features the following presenters and talks, followed by open panel discussions:

  • “What’s new in the 2021 Census – the ‘what’” by Mark Harding – Program Manager, 2021 Census Data Operations.
  • “Value of the Census Data – the ‘who, how and why’” by Caroline Deans – Director, 2021 Census Dissemination.
  • “Partners in Data Science” by Gentry White – Data Science and Government Statistics Chair.
  • “AusEnHealth Project: Climate and Air Quality Vulnerability Index Development & ABS Data Use” by Aiden Price – Research Associate - Project Manager for the Australian Environmental Health Atlas.

The panel discussion covered topics such as: planning for the next census, including potential new data sources, understanding Australia’s evolving data needs, ROI, potential challenges and opportunities for data science such as increasing efficiency and data accuracy, reducing error and respondent burden, doing more with less, using other data, getting better information out faster and more efficiently, and avoiding redundancy in collection.

Learn more about the presentations, with links to individual videos, in the table below. Watch the webinar

“Monitoring the nation's pulse: The what, who, how and why of the Australian Census” Overview of Panellists’ Presentations
Panellist Talk
What’s new in the 2021 Census – the “what” – Mark Harding, ABS Watch video
The 2021 Census design has been guided by its overarching objectives: a smooth-running Census that garners strong support from the community and produces high quality data. Mark Harding will talk through what is new about the 2021 Census, and in particular how the ABS has adopted a user-centred design approach to delivering the Census. This year the ABS has faced the added challenge of running a Census during the pandemic. Mark will describe how the ABS has responded to COVID-19 and the impacts on Census field operations.
Value of the Census Data – the “who, how and why” – Caroline Deans, ABS Watch video
Census data is used to inform important decisions about transport, schools, health care, infrastructure and business. While many people are aware of how the Government uses the data, the Census is also heavily relied on by community groups and small businesses to improve the lives of individuals. Caroline Deans will cover some case study examples on the varying uses of Census data.
Caroline will also talk through what happens to the information collected, from when the data is collected through to when it is transformed into meaningful statistics. The 2021 Census is being conducted at a most interesting time and the data from this Census will be very important to show how the pandemic is affecting our economy and society.
Partners in Data Science – Gentry White, ACEMS & QUT Centre for Data Science Watch video
Gentry White discusses the unique partnership between ACEMS, the QUT Centre for Data Science and the ABS, outlining their current program of research and plans for the future.
AusEnHealth Project: Climate and Air Quality Vulnerability Index Development & ABS Data Use – Dr Aiden Price Watch video
The changing nature of many hazards, coupled with growing and ageing populations and infrastructure in exposed areas is leading to increased vulnerability across Australia and internationally.
AusEnHealth is a multi-agency funded project with the aim to provide tools to support the assessment of population vulnerability through an environmental health lens. This has been achieved by combining air quality and climate data with demographics data, the latter being comprised almost entirely of Census data from the ABS. Other collaborators include BOM and the Centre for Air Pollution, Energy and Health Research (CAHA).
A vulnerability index is calculated from exposure plus sensitivity minus adaptive capacity.
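
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration of the stated formula only, not the AusEnHealth implementation: it assumes each component has already been scaled to a comparable 0 to 1 range, and the area names and values are hypothetical.

```python
def vulnerability_index(exposure, sensitivity, adaptive_capacity):
    """Vulnerability = exposure + sensitivity - adaptive capacity."""
    return exposure + sensitivity - adaptive_capacity

# Hypothetical areas with made-up component scores, each assumed to be
# pre-scaled to the range 0 to 1 (the scaling step is not shown here).
areas = {
    "Area A": {"exposure": 0.75, "sensitivity": 0.5, "adaptive_capacity": 0.25},
    "Area B": {"exposure": 0.25, "sensitivity": 0.5, "adaptive_capacity": 0.75},
}

scores = {name: vulnerability_index(**parts) for name, parts in areas.items()}

# Rank areas from most to least vulnerable.
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranked)
```

Under this formula, higher exposure or sensitivity raises an area's score, while higher adaptive capacity lowers it.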

ACEMS Podcast

Every five years, the Australian Bureau of Statistics (ABS) conducts a national Census. The official date of the next Census is Tuesday, 10 August 2021. The last Census, in 2016, ran into big problems on Census night, with the website having to be shut down. In this episode, ahead of the 2021 Census, ACEMS features the Deputy Australian Statistician with the Australian Bureau of Statistics, Ms Teresa Dickinson, who led the 2021 Census. She explains what the ABS has learnt from the 2016 Census and explores how the COVID-19 pandemic has affected the Census, the future of the Census and how things might change.