【Data Science Seminar】Distributed Sequential Federated Learning

The analysis of data stored at multiple sites has become increasingly common, raising new concerns about the security of data storage and communication. Federated learning, which does not require centralizing data, is a common approach to avoiding heavy data transfer, securing valuable data, and protecting personal information. Determining how to aggregate the information obtained from analyzing data at separate local sites has therefore become an important statistical issue. The commonly used averaging methods may not be suitable when data are non-homogeneous and results are not comparable across sites, and applying them may lose information obtained from the individual analyses. Using a sequential method in federated learning with distributed computing can facilitate the integration and accelerate the analysis process. We develop a data-driven method that efficiently and effectively aggregates valuable information from the analysis of local data, without the information security risks and heavy data transfer that data communication entails. In addition, when applied to generalized linear models, the proposed method preserves the properties of the classical sequential adaptive design, such as data-driven sample size and estimation precision. We illustrate the proposed method with numerical studies of simulated data and an application to COVID-19 data collected from 32 hospitals in Mexico.

【Speaker】Research Fellow Yuan-chin Ivan Chang (張源俊), Institute of Statistical Science, Academia Sinica

【Title】Distributed Sequential Federated Learning
【Time】March 23, 2023, 15:30-16:30
【Venue】Audiovisual Classroom 62331, 3rd Floor, Department of Statistics, National Cheng Kung University