
Seminars

Data Perturbation

Time: 11:00 AM to 12:00 PM
  • 2022-11-14 (Mon.), 11:00 AM
  • Auditorium, B1F, Institute of Statistical Science.
  • Lecture in English. Online live streaming through Cisco Webex will be available.
  • Prof. Xiaotong Shen
  • School of Statistics, University of Minnesota, USA

Abstract

Data perturbation is a technique for generating synthetic data by adding "noise" to the original data. It has a wide range of applications, primarily in data security, yet it has received little attention within data science. In this presentation, I will describe a fundamental principle of data perturbation that preserves distributional information, thus ensuring the validity of downstream analysis and machine learning tasks while protecting data privacy. Applying this principle, we derive a scheme that allows a user to perturb data nonlinearly while meeting the requirements of both differential privacy and statistical analysis. The scheme yields credible statistical analysis and high predictive accuracy in machine learning tasks. Finally, I will highlight multiple facets of data perturbation through examples.

Please click here to join the talk online.

Download

1111114 Prof. Xiaotong Shen ( 沈曉彤 教授 )(En).pdf
Updated: 2022-11-11 14:46