Institute of Statistical Science, Academia Sinica [Seminar Feed]
Statistics, Stat, Edu
Tue, 18 Feb 2020 23:30:37 +0800

Novel Statistical Methodologies for the Analysis of Computer and Biology Experiments Abstract

    In this talk, I will present two new methodologies for computer and biology experiments. In computer experiments, a Gaussian process (GP) is a popular choice for approximating a deterministic function, but the role of transformation in GP modeling is not well understood. I will argue that transforming the response can make the deterministic function approximately additive, so that it can then be easily estimated using an additive GP. Such a GP is called a Transformed Additive Gaussian (TAG) process. I will explain efficient techniques for fitting a TAG process and the advantages of using TAG. Some extensions of TAG for modeling large-scale and high-dimensional data will also be discussed. The second methodology is motivated by a biology experiment in the study of T cell signaling. Such experiments share features that are common to many applications, such as random effects and varying coefficients, but a method that can quantify these features has not yet been systematically developed in the literature. To fill this research gap, we propose the local linear varying coefficient frailty method. The method provides a rigorous quantification of an early and rapid impact on T cell signaling from the accumulation of bond lifetime, which can shed new light on the fundamental understanding of how T cells initiate immune responses. The theoretical properties of the estimators from the method, including the bias correction property near the boundary, will be presented along with discussions on the asymptotic bias-variance trade-off.
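As a purely illustrative case of the transformation idea (the function below is an assumption chosen for simplicity, not one taken from the talk), a log transformation of the response turns a deterministic function with interacting inputs into an additive one:

```latex
\begin{align*}
f(x_1, x_2) &= e^{x_1}\,\bigl(1 + x_2^2\bigr), \\
\log f(x_1, x_2) &= \underbrace{x_1}_{g_1(x_1)} + \underbrace{\log\bigl(1 + x_2^2\bigr)}_{g_2(x_2)}.
\end{align*}
```

On the original scale the two inputs interact, whereas on the log scale each component depends on a single input, so an additive GP with one univariate component per input would be enough to represent it.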

Tue, 11 Feb 2020 10:31:25 +0800
TBD
Wed, 8 Jan 2020 16:03:37 +0800

Scalable inference for Bayesian Non-parametrics


Bayesian nonparametric (BNP) models provide elegant methods for discovering underlying latent features within a data set, but inference in such models can be slow.  We partition the latent measure into a finite measure containing only instantiated components, and an infinite measure containing all other components. We then select different inference algorithms for the two components: uncollapsed samplers mix well on the finite measure, while collapsed samplers mix well on the infinite, sparsely occupied tail. The resulting hybrid algorithm can be applied to a wide class of models, and can be easily distributed to allow scalable inference without sacrificing asymptotic convergence guarantees.

Tue, 11 Feb 2020 11:01:40 +0800
Data-driven multistratum designs with the generalized Bayesian D-D criterion for highly uncertain models Abstract

      Multistratum designs have gained much attention recently. Most criteria, such as the D criterion, select multistratum designs based on a given model that the experimenters assume to be true. However, when the true model is highly uncertain, the model used for selecting the optimal design can be seriously misspecified. In that case, the selected multistratum design will not be efficient for fitting the true model. To deal with the problem of highly uncertain models, we propose the generalized Bayesian D-D (GBDD) criterion, which selects multistratum designs based on the experimental data. Under the framework of multistratum structures, we develop theorems and formulas for conducting Bayesian analysis and extracting information about the true model from the data to reduce model uncertainty. The GBDD criterion is easy and flexible to use. We provide several examples to demonstrate how to construct GBDD-optimal split-plot, strip-plot, and staggered-level designs. By comparison with the D-optimal designs and one-stage generalized Bayesian D-optimal designs, we show that the GBDD-optimal designs are more efficient for fitting the true models. Extensions of the GBDD criterion to more complicated cases, such as more than two stages of experiments and more than one class of potential terms, are also developed.
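For readers unfamiliar with the baseline that the abstract compares against, the following is a minimal sketch of the ordinary D criterion for a split-plot design, not the GBDD criterion itself. It assumes the usual mixed model with covariance V = I + eta * Z Z', where Z indicates whole plots and eta is the ratio of whole-plot to subplot variance. The function name `split_plot_log_d`, the toy designs, and the value eta = 1 are all hypothetical choices made for illustration.

```python
import numpy as np

def split_plot_log_d(X, whole_plot, eta=1.0):
    """Return log det(X' V^{-1} X) with V = I + eta * Z Z'.

    X          : model matrix (runs x parameters), assumed given.
    whole_plot : whole-plot label of each run.
    eta        : assumed ratio of whole-plot to subplot error variance.
    """
    X = np.asarray(X, dtype=float)
    whole_plot = np.asarray(whole_plot)
    labels = np.unique(whole_plot)
    # Z[i, j] = 1 if run i belongs to whole plot j
    Z = (whole_plot[:, None] == labels[None, :]).astype(float)
    V = np.eye(len(X)) + eta * Z @ Z.T
    info = X.T @ np.linalg.solve(V, X)  # information matrix X' V^{-1} X
    sign, logdet = np.linalg.slogdet(info)
    return logdet if sign > 0 else -np.inf

# Toy example: 4 whole plots with 2 runs each, model = intercept + w + s.
plots = np.array([0, 0, 1, 1, 2, 2, 3, 3])
ones = np.ones(8)
w = np.array([-1, -1, 1, 1, -1, -1, 1, 1])       # changes only between whole plots

s_within = np.array([-1, 1, -1, 1, -1, 1, -1, 1])   # s varies inside each whole plot
s_between = np.array([-1, -1, 1, 1, 1, 1, -1, -1])  # s constant inside each whole plot

log_d_a = split_plot_log_d(np.column_stack([ones, w, s_within]), plots)
log_d_b = split_plot_log_d(np.column_stack([ones, w, s_between]), plots)
print(log_d_a, log_d_b)
```

In this toy comparison the design that varies `s` within whole plots scores higher, because the effect of `s` is then estimated against the smaller subplot error. A model-based criterion of this kind depends entirely on the assumed model matrix, which is the sensitivity the abstract sets out to address.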


KEY WORDS: Bayesian D criterion, D criterion, split-plot design, staggered-level design, strip-plot design, two-stage experiment.

Sat, 15 Feb 2020 17:23:30 +0800