Zhao, P. (2020). Statistical Analysis for Complex Data by Generalized Indirect Dependency Learning and Slack Empirical Likelihood. Retrieved from https://purl.lib.fsu.edu/diginole/2020_Summer_Fall_Zhao_fsu_0071E_16060
Dependence is one of the most important concepts in probability and statistics. To detect and measure dependency between response variables and predictors, various models have been constructed, from simple models like least-squares regression to complex ones like deep neural networks. However, less of the literature focuses on detecting the dependency structures among responses and incorporating this structural information to facilitate other analyses. Dependency among responses can appear when the responses are multivariate or the data are observed in groups. Markov Random Fields (MRF) and Generalized Estimating Equations (GEE) have been proposed to detect the dependency structures and improve the efficiency of the analysis of mean effects under these circumstances. However, these methods may not perform well when the data are complex, such as non-Gaussian, heavy-tailed or skewed. In Chapter 1, we consider a generalized indirect learning with dependency (GIDL) framework to detect and apply the dependency structure between responses under both multivariate and grouped dependencies, when the data are non-Gaussian, heavy-tailed or skewed. We focus on statistical analysis that covers the asymptotic distribution and signal selection of dependency structures, and the asymptotic distribution of the coefficients of predictors with the assistance of the structural information. In Chapter 2, we develop a slack empirical likelihood (SEL) inference framework that can handle non-Gaussian, heavy-tailed or skewed data. Empirical likelihood is powerful because an inference problem is transformed into an optimization problem. Moreover, it requires fewer distributional assumptions than traditional likelihood-based inference methods. Modern statistical models often implicitly impose nonsmooth constraints on the possible solutions for parameter estimation through regularization, like the restricted cone induced by ℓ1-penalized regression.
Traditional empirical likelihood cannot handle such nonsmooth constraints induced by regularization when the solutions may lie on the boundary of the constrained region, because the estimating equation is not sample additive. By carefully examining the directional derivatives at the nonsmooth region and introducing slack variables such that the modified estimating equations are sample additive, we extend the empirical likelihood framework so that the nonsmooth constraints can be handled by a joint optimization over the parameters and the slack variables. We show through examples that our framework works for some traditional constrained empirical likelihood problems, in the sense that the correct asymptotic distribution can be derived. In addition, we provide some exploration of modern statistical problems, including high-dimensional regression with regularization.
A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy.
Bibliography Note
Includes bibliographical references.
Advisory Committee
Yiyuan She, Professor Directing Dissertation; Zhenghao Zhang, University Representative; Fred W. Huffer, Committee Member; Jonathan R. Bradley, Committee Member.
Publisher
Florida State University
Identifier
2020_Summer_Fall_Zhao_fsu_0071E_16060