ABSTRACT

Many datasets of interest to statisticians are subject to privacy conditions, which can constrain access, analysis, sharing and release of results. In this presentation, we will consider two ways in which this issue might be addressed. The first is federated learning, in which the analysis is undertaken in such a way that the data remain in situ and private. The second is synthetic data generation, in which the simulated data retain the salient characteristics of the original data while preserving the required privacy. We provide some extensions to the class of models that can be considered in federated learning, and an overview of synthetic generation of tabular data. The exposition of these ideas will be motivated by the creation of an Australian Cancer Atlas.

This research is in collaboration with QUT colleagues Conor Hassan and Dr Robert Salomone, and is funded by the Australian Research Council and Cancer Council Queensland.
