In this talk, I will present two new
methodologies for computer and biological experiments. In computer
experiments, a Gaussian process (GP) is a popular choice for approximating a
deterministic function, but the role of transformation in GP modeling is not
well understood. I will argue that using a transformation on the response can
make the deterministic function approximately additive, which can then be
easily estimated using an additive GP. Such a GP is called a Transformed
Additive Gaussian (TAG) process. I will explain efficient techniques for
fitting a TAG process and the advantages of using TAG. Some extensions of TAG
for modeling large-scale and high-dimensional data will also be discussed. The
second methodology is motivated by a biological experiment in the study of T cell
signaling. Such experiments share features that arise in many
applications, such as random effects and varying coefficients, but a method
that can quantify these features has not yet been systematically developed in the
literature. To fill this gap, we propose the local linear varying
coefficient frailty method. The method provides a rigorous quantification of an
early and rapid impact on T cell signaling from the accumulation of bond
lifetime, which can shed new light on the fundamental understanding of how T
cells initiate immune responses. The
theoretical properties of the estimators from the method, including the bias
correction property near the boundary, will be presented along with discussions
on the asymptotic bias-variance trade-off.
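As a small illustration of the claim that a response transformation can make a deterministic function additive, the sketch below uses a made-up multiplicative test function (not one from the talk) and checks additivity on a grid by double-centering. The function `f`, the helper `interaction_size`, and the choice of the log transformation are all assumptions for illustration; the sketch does not fit a GP or implement the TAG procedure itself.

```python
import numpy as np

def f(x1, x2):
    # Hypothetical deterministic "simulator": multiplicative, hence non-additive.
    return np.exp(x1) * (1.0 + x2 ** 2)

def interaction_size(values):
    # Double-centre a grid of function values: remove row and column effects.
    # The remainder is zero exactly when values[i, j] = a[i] + b[j].
    row = values.mean(axis=1, keepdims=True)
    col = values.mean(axis=0, keepdims=True)
    resid = values - row - col + values.mean()
    return float(np.abs(resid).max())

grid = np.linspace(0.0, 1.0, 21)
X1, X2 = np.meshgrid(grid, grid, indexing="ij")
Y = f(X1, X2)

raw_interaction = interaction_size(Y)          # clearly non-zero
log_interaction = interaction_size(np.log(Y))  # zero up to rounding error

print(raw_interaction, log_interaction)
```

On the log scale the response is x1 + log(1 + x2^2), a sum of one-dimensional functions, which is the kind of structure an additive GP can estimate one input at a time.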

Abstract

Bayesian nonparametric (BNP) models provide elegant
methods for discovering underlying latent features within a data set, but
inference in such models can be slow. We
partition the latent measure into a finite measure containing only instantiated
components, and an infinite measure containing all other components. We then
select different inference algorithms for the two components: uncollapsed
samplers mix well on the finite measure, while collapsed samplers mix well on
the infinite, sparsely occupied tail. The resulting hybrid algorithm can be
applied to a wide class of models, and can be easily distributed to allow
scalable inference without sacrificing asymptotic convergence guarantees.
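The partition described above can be written out in generic notation. The symbols below (weights \pi_k, atoms \theta_k, and K instantiated components) are assumptions chosen for illustration and are not taken from the abstract; the display only restates the split into a finite and an infinite part.

```latex
\mu \;=\; \sum_{k=1}^{\infty} \pi_k \,\delta_{\theta_k}
    \;=\; \underbrace{\sum_{k=1}^{K} \pi_k \,\delta_{\theta_k}}_{\text{instantiated components (uncollapsed sampler)}}
    \;+\; \underbrace{\sum_{k>K} \pi_k \,\delta_{\theta_k}}_{\text{all other components (collapsed sampler)}}
```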

Multistratum designs have gained much attention recently. Most criteria, such as the D
criterion, select multistratum designs based on a given model that is assumed to
be true by the experimenters. However, when the true model is highly uncertain,
the model used for selecting the optimal design can be seriously misspecified.
If this is the case, then the selected multistratum design will not be
efficient for fitting the true model. To deal with the problem of high
model uncertainty, we propose the generalized Bayesian D-D (GBDD) criterion,
which selects multistratum designs based on the experimental data. Under the
framework of multistratum structures, we develop theorems and formulas that are
used for conducting Bayesian analysis and extracting information about the true
model from the data to reduce model uncertainty. The GBDD criterion is easy and
flexible to use. We provide several examples to demonstrate how to construct
the GBDD-optimal split-plot, strip-plot, and staggered-level designs. By
comparison with the D-optimal designs and one-stage generalized Bayesian D-optimal
designs, we show that the GBDD-optimal designs have higher efficiency in fitting
the true models. Extensions of the GBDD criterion to more complicated cases,
such as more than two stages of experiments and more than one class of
potential terms, are also developed.

KEY WORDS:
Bayesian D criterion, D criterion, split-plot design, staggered-level design,
strip-plot design, two-stage experiment.
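For readers unfamiliar with the baseline that the abstract compares against, the sketch below evaluates the ordinary D criterion for a split-plot design under an assumed model, using the generalized least squares information matrix X' V^{-1} X. It is not the GBDD criterion. The function name, the variance ratio `eta`, and the toy design are assumptions made for illustration.

```python
import numpy as np

def split_plot_d_criterion(X, whole_plot, eta=1.0):
    """Determinant of the GLS information matrix for a split-plot design.

    X          : (n, p) model matrix for the assumed model
    whole_plot : length-n labels giving the whole plot of each run
    eta        : assumed ratio of whole-plot to subplot error variance
    """
    labels = np.asarray(whole_plot)
    # Indicator matrix assigning each run to its whole plot.
    Z = (labels[:, None] == np.unique(labels)[None, :]).astype(float)
    # Error covariance up to the subplot variance: I + eta * Z Z'.
    V = np.eye(len(labels)) + eta * Z @ Z.T
    M = X.T @ np.linalg.solve(V, X)
    return float(np.linalg.det(M))

# Toy design: 4 whole plots of 2 runs; w is the whole-plot factor,
# s is the subplot factor, and the model has an intercept plus main effects.
whole_plot = [0, 0, 1, 1, 2, 2, 3, 3]
w = np.array([-1, -1, 1, 1, -1, -1, 1, 1], dtype=float)
s = np.array([-1, 1, -1, 1, -1, 1, -1, 1], dtype=float)
X = np.column_stack([np.ones(8), w, s])

d_no_wp_error = split_plot_d_criterion(X, whole_plot, eta=0.0)
d_with_wp_error = split_plot_d_criterion(X, whole_plot, eta=1.0)
print(d_no_wp_error, d_with_wp_error)
```

With `eta=0` the value reduces to det(X'X) for a completely randomized design; with `eta>0` the intercept and whole-plot effect are estimated less precisely, so the determinant shrinks. Because the value depends on the assumed model matrix `X`, a design chosen this way can be inefficient when that model is misspecified, which is the concern the abstract raises.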