Institute of Statistical Science, Academia Sinica [Seminar Feed]
Statistics, Stat, Edu
Tue, 20 Aug 2019 15:08:10 +0800

Inference for the Degree Distributions of Preferential Attachment Networks with Zero-Degree Nodes Abstract


    The power law, in which the tail of a network's logarithmic degree distribution decays linearly in the logarithmic degree, is ubiquitous in daily life. A commonly used technique for modeling the power law is preferential attachment (PA), which sequentially joins each new node to existing nodes with conditional probability proportional to a linear function of their degrees. Although effective, PA is tricky to apply to real networks because the number of nodes and the number of edges must satisfy a linear constraint. This paper enables real applications of PA by letting each new node enter as an isolated node that attaches to other nodes according to the PA scheme in some later epoch. This simple and novel strategy provides an additional degree of freedom that relaxes the aforementioned constraint on the observed data, and it uses the PA scheme to compute the implied proportion of unobserved zero-degree nodes. By martingale convergence theory, the degree distribution of the proposed model is shown to follow the power law, and its asymptotic variance is proved to be the solution of a Sylvester matrix equation. These results yield a strongly consistent estimator of the power-law parameter and establish its asymptotic normality. This talk will review the theory of this new modeling approach and illustrate how to apply it to the analysis of large networks.
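    The idea of letting nodes enter with degree zero can be illustrated with a toy simulation. The sketch below is not the model from the talk: the mechanics (one isolated node and one edge per epoch, both endpoints drawn with probability proportional to degree + delta) and the names `weighted_pick`, `simulate_isolated_pa` and `delta` are assumptions made purely for illustration.

```python
import random

def weighted_pick(degrees, delta, rng, exclude=None):
    """Pick a node index with probability proportional to degree + delta."""
    candidates = [i for i in range(len(degrees)) if i != exclude]
    weights = [degrees[i] + delta for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

def simulate_isolated_pa(epochs, delta=0.5, seed=0):
    """Toy PA variant (hypothetical mechanics): each epoch adds one isolated
    node, then one edge whose two distinct endpoints follow the PA rule."""
    rng = random.Random(seed)
    degrees = [0, 0]              # start from two isolated nodes
    n_edges = 0
    for _ in range(epochs):
        degrees.append(0)         # the new node enters with degree zero
        u = weighted_pick(degrees, delta, rng)
        v = weighted_pick(degrees, delta, rng, exclude=u)
        degrees[u] += 1
        degrees[v] += 1
        n_edges += 1
    return degrees, n_edges

degrees, n_edges = simulate_isolated_pa(epochs=2000)
observed = sum(1 for d in degrees if d > 0)   # nodes that appear in edge data
hidden = len(degrees) - observed              # zero-degree nodes
print(len(degrees), n_edges, observed, hidden)
```

    In this toy version the total node count is still tied to the number of epochs, but the count of observed (positive-degree) nodes is not, which is the kind of extra freedom the abstract describes.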

Wed, 7 Aug 2019 16:24:21 +0800
Neuroimaging studies of chronic pain (menstrual pain) and the search for human pain biomarkers Abstract

    Pain is a subjective and multidimensional experience, encompassing sensory-discriminative, affective-motivational, and cognitive-evaluative dimensions. Acute pain is adaptive and protects individuals from further damage. Chronic pain, however, is often maladaptive and leads to functional neuroplastic changes and structural reorganization. We focus on primary dysmenorrhea (PDM), which is menstrual pain in the absence of identifiable pathology. PDM, with its dual characteristics, thus serves as a genuine clinical model of chronic pain.

    Functionally, pain perception occurs in the brain and emerges from neural information processing at varying spatial and temporal scales. Using nonlinear multiscale sample entropy (MSE) analysis, we studied the irregularity (uncertainty) of brain signals across different time scales, which can be regarded as brain complexity and reflects the adaptability of the nervous system. Loss of complexity is thus considered an indicator of pathological dynamics.
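    For readers unfamiliar with MSE, the following is a minimal sketch of the commonly described procedure: coarse-grain the signal by averaging non-overlapping windows, then compute sample entropy at each scale with a tolerance fixed from the original signal. The parameter choices (m = 2, tolerance 0.15 times the standard deviation) and the function names are assumptions for illustration, not the settings used in the studies discussed in the talk.

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """Sample entropy with template length m and absolute tolerance r."""
    x = np.asarray(x, dtype=float)
    n = len(x)

    def count_matches(length):
        # Use the first n - m templates for both lengths so counts are comparable.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        count = 0
        for i in range(len(templates) - 1):
            # Chebyshev distance from template i to all later templates.
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= r))
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("nan")       # undefined when no matches are found
    return -np.log(a / b)

def coarse_grain(x, scale):
    """Average consecutive non-overlapping windows of the given length."""
    x = np.asarray(x, dtype=float)
    n_windows = len(x) // scale
    return x[:n_windows * scale].reshape(n_windows, scale).mean(axis=1)

def multiscale_entropy(x, max_scale=5, m=2, r_factor=0.15):
    """Sample entropy of the coarse-grained series at scales 1..max_scale."""
    r = r_factor * np.std(x)      # tolerance fixed from the original series
    return [sample_entropy(coarse_grain(x, s), m, r)
            for s in range(1, max_scale + 1)]

rng = np.random.default_rng(0)
noise = rng.standard_normal(1000)  # synthetic stand-in for a recorded signal
mse = multiscale_entropy(noise, max_scale=3)
print(mse)
```

    A regular signal such as a sine wave yields a much lower sample entropy than white noise, which matches the reading of entropy as irregularity.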

    In this talk, I will first introduce chronic pain, the clinical significance of studying primary dysmenorrhea (PDM), and the use of PDM as a clinical model of chronic pain. Second, I will outline our recent findings from genetic neuroimaging studies of PDM. Finally, I will briefly share the perspectives of pain experts and the International Association for the Study of Pain (IASP) on the search for human pain biomarkers.

Wed, 31 Jul 2019 15:10:34 +0800
TBD

Mon, 12 Aug 2019 11:12:42 +0800
Information content of Multi-Class Classification Abstract

    In Multi-Class Classification (MCC), each label is attached to a possibly high-dimensional and large point cloud. I will start by nonparametrically building a label embedding tree and then derive a label predictive graph. Both the label embedding tree and the predictive graph reveal the nature of the information content of MCC: heterogeneity. This is the platform for Data-driven Intelligence (D.I.). D.I. is shown to achieve nearly perfect, if not perfect, predictions. We then argue that achieving perfect prediction is indeed the prerequisite of all data analysis in general. Throughout our computational developments, data from the PITCHf/x database are used. I will also mention how to scale our algorithmic paradigm to the setting of Extreme MCC, which involves many hundreds or thousands of labels.


    At the end, if time allows, I will mention issues related to Multi-Label Classification (MLC) and the Multiple Response problem in order to shed some light on the future competition between D.I. and A.I. (Artificial Intelligence).

Wed, 24 Jul 2019 11:38:42 +0800
Sensitivity Analysis and Visualization for Functional Data Abstract 

    In functional data analysis, the presence of outliers can greatly influence the results of modeling and forecasting, which may lead to inaccurate conclusions. Hence, detecting such outliers becomes an essential task. Data visualization not only plays a vital role in discovering the features of data before statistical models and summary statistics are applied, but also serves as an auxiliary tool for identifying outliers. Research on visualization and sensitivity analysis for functional data has not yet received much attention in the literature, so it is the focus of this paper. To this end, we propose a method that combines the influence function with the iteration scheme motivated by Zou et al. (2012) for identifying outliers in functional data, and we develop new visualization tools for displaying features and detecting outliers in functional data. Furthermore, our proposed methods are compared with existing methods. Finally, we illustrate the proposed methods with simulation studies and real data examples.

Wed, 31 Jul 2019 11:08:15 +0800
Regression Tree for Counts Abstract

    A regression tree method for count data, called CORE, is introduced. At each node, a count regression model is fitted: Poisson regression or, to accommodate over-dispersion and/or excess zeros, negative binomial, hurdle, or zero-inflated regression. The likelihood function is used to guide the selection of split variables and split sets. We then use node deviance in the tree pruning process to avoid overfitting. CORE is free of variable selection bias. It is shown to have an edge over existing methods in simulation and real data studies.
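    As a rough illustration of a likelihood-guided split for counts, the sketch below scans thresholds on a single numeric predictor and picks the one with the largest reduction in Poisson deviance, fitting only a constant mean in each child. This is a simplified stand-in, not the CORE algorithm: CORE fits regression models at the nodes and avoids variable selection bias, neither of which is attempted here. The names `poisson_deviance`, `best_split` and `min_leaf` are hypothetical.

```python
import numpy as np

def poisson_deviance(y):
    """Deviance of a constant-mean Poisson fit to the counts y."""
    y = np.asarray(y, dtype=float)
    mu = y.mean()
    if mu == 0:
        return 0.0                # all zeros: the fit is exact
    pos = y > 0                   # y * log(y / mu) is taken as 0 when y == 0
    return 2.0 * np.sum(y[pos] * np.log(y[pos] / mu))

def best_split(x, y, min_leaf=5):
    """Threshold on x maximizing the deviance reduction (likelihood gain)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    x, y = x[order], y[order]
    parent = poisson_deviance(y)
    best_threshold, best_gain = None, 0.0
    for i in range(min_leaf, len(x) - min_leaf + 1):
        if x[i] == x[i - 1]:
            continue              # cannot split between tied values
        gain = parent - poisson_deviance(y[:i]) - poisson_deviance(y[i:])
        if gain > best_gain:
            best_threshold, best_gain = (x[i - 1] + x[i]) / 2.0, gain
    return best_threshold, best_gain

# Synthetic counts whose mean jumps from 2 to 8 at x = 0.5.
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 200)
y = rng.poisson(np.where(x < 0.5, 2.0, 8.0))
threshold, gain = best_split(x, y)
print(threshold, gain)
```

    The same node deviance could in principle be reused as a pruning criterion, as the abstract mentions, though the pruning step itself is not sketched here.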

Fri, 16 Aug 2019 17:27:57 +0800