**Abstract**

Despite many exciting empirical successes in data science, significant gaps remain between theory and theory-inspired practice. For example, the standard lasso (\ell_1-penalized least squares regression) is not appropriate for variable selection in non-linear statistical models; the uniformly random sampling suggested by theory is not adopted in practical compressive medical imaging; and the standard bounded gradient/curvature condition in convex optimization does not hold in some crucial applications. In this talk, I will present how we addressed these gaps rigorously, either by identifying more general conditions on the problem settings or by developing completely new theoretical frameworks.

**Abstract**

We propose a general index model for survival data, which generalizes many commonly used semiparametric survival models and belongs to the framework of dimension reduction. Combining the geometric approach in semiparametrics with the martingale treatment in survival data analysis, we devise estimation procedures that are feasible and do not require covariate-independent censoring, an assumption made by many dimension reduction methods for censored survival data. We establish the root-n consistency and asymptotic normality of the proposed estimators and derive the most efficient estimator in this class for the general index model. Numerical experiments demonstrate the empirical performance of the proposed estimators, and an application to an AIDS data set further illustrates the usefulness of the work.

**Abstract**

In this talk, we consider a nested family of multivariate baseline proportional hazards models for analyzing survival data. The family contains the Cox proportional hazards model and the continuously stratified proportional hazards model as special cases. It maintains the practically desirable hazard-ratio interpretation of the target parameters, while allowing multi-dimensional covariates to be controlled for in a nonparametric manner. The model also allows data-adaptive dimension reduction to mitigate the curse of dimensionality. Our goal is to strike a balance between flexibility and parsimony. Under the proposed model, we characterize the semiparametric efficiency bound for the parameters of interest. Further, we propose a complete estimation procedure for the parameters, coupled with partial sufficient dimension reduction. We also show that the proposed pseudo maximum likelihood estimator is semiparametrically efficient.
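
For orientation, the two special cases named above can be written in their usual textbook forms. The symbols are assumptions introduced here, not notation from the talk: $X$ denotes the covariates of interest with coefficient vector $\beta$, $Z$ the covariates to be controlled for, and $\lambda_0$ an unspecified baseline hazard. The third line is only one plausible way an intermediate member of such a family might look, with a hypothetical dimension-reduction matrix $B$.

```latex
% Cox proportional hazards model: the baseline hazard depends on time only
\lambda(t \mid X, Z) = \lambda_0(t)\, \exp(\beta^\top X)

% Continuously stratified proportional hazards model:
% the baseline hazard varies nonparametrically with Z
\lambda(t \mid X, Z) = \lambda_0(t, Z)\, \exp(\beta^\top X)

% Illustrative intermediate case (an assumption, not taken from the abstract):
% the baseline hazard depends on Z only through a low-dimensional index B^\top Z
\lambda(t \mid X, Z) = \lambda_0(t, B^\top Z)\, \exp(\beta^\top X)
```

In each form, $\exp(\beta_j)$ keeps its interpretation as a hazard ratio for a one-unit change in the $j$-th component of $X$, holding the remaining covariates fixed.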

**Abstract**

In this study, we use a vector functional auto-regressive model to analyze the supply and demand (S-D) curves of a Limit Order Book (LOB) simultaneously. The S-D curves are represented by a linear combination of multi-resolution B-spline basis functions. The corresponding coefficients of the basis functions are shown to follow a vector auto-regressive model, which can be applied to the prediction of future S-D curves. Numerical results indicate that the proposed method has satisfactory performance and the areas under the S-D curves are capable of improving the classification of market trends.
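
The autoregressive step described above can be illustrated with a minimal sketch. It assumes the B-spline coefficients have already been extracted into an array with one row per time step, and it fits a first-order vector auto-regressive model by ordinary least squares. The order, the function names `fit_var1` and `forecast_next`, and the synthetic data are all assumptions made for illustration; the abstract does not specify the model order or the estimation method.

```python
import numpy as np


def fit_var1(coefs):
    """Least-squares fit of c_t = a + A c_{t-1} + e_t.

    coefs: array of shape (T, k), one row of basis coefficients per time step.
    Returns (a, A) with shapes (k,) and (k, k).
    """
    coefs = np.asarray(coefs, dtype=float)
    past, future = coefs[:-1], coefs[1:]
    # Design matrix with an intercept column: each row is [1, c_{t-1}].
    X = np.hstack([np.ones((past.shape[0], 1)), past])
    beta, *_ = np.linalg.lstsq(X, future, rcond=None)
    # beta has shape (k + 1, k): first row is the intercept,
    # the remaining rows are the transpose of the transition matrix.
    return beta[0], beta[1:].T


def forecast_next(coefs, a, A):
    """One-step-ahead forecast of the coefficient vector."""
    return a + A @ np.asarray(coefs, dtype=float)[-1]


# Synthetic coefficient series (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.1], [0.0, 0.3]])
a_true = np.array([1.0, -0.5])
c = np.zeros((500, 2))
for t in range(1, 500):
    c[t] = a_true + A_true @ c[t - 1] + 0.01 * rng.standard_normal(2)

a_hat, A_hat = fit_var1(c)
c_next = forecast_next(c, a_hat, A_hat)
```

The forecast coefficient vector `c_next` would then be multiplied by the basis functions to obtain a predicted curve; that reconstruction step is omitted here because it depends on the particular basis used.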
