Some of the material in is restricted to members of the community. By logging in, you may be able to gain additional access to certain collections or items. If you have questions about access or logging in, please use the form on the Contact Page.
Evaluating the performance of models predicting a binary outcome can be done using a variety of measures. While some measures intend to describe the model's overall fit, others more accurately describe the model's ability to discriminate...
A major objective in modern human genetics research is to better understand the molecular mechanisms underlying human complex traits. Although genome-wide association studies (GWASs) have been successful in detecting thousands of trait...
Sequence data are extensively studied in numerous domains, such as time series forecasting, audio analysis, Natural Language Processing (NLP), etc. However, the observed sequences show high non-stationarity and long-term dependence in...
A longitudinal study is a research design that collects observations measured repeatedly from particular individuals over prolonged periods of time. Nowadays, longitudinal studies are widely used in health sciences, social science, ...
Meta-analysis is a widely used tool to combine research findings from multiple studies in many disciplines. In this thesis we develop novel methods to deal with two critical issues in systematic reviews and meta-analyses, such as...
Volatility modeling and forecasting are crucial in risk management and pricing derivatives. High-frequency financial data are dynamic and affected by the microstructure noise. For the univariate case, we define the two-scale realized...
Motivated by understanding the devastating financial crisis in 2008 that was partially caused by underestimation of financial risk, we propose a class of time-varying mixture models for risk analysis and management. There are various...
Statistical depth, a commonly used analytic tool in non-parametric statistics, has been extensively studied for multivariate and functional observations over the past few decades. Although various forms of depth were introduced, they are...
This paper considers a few problems in the area of bioinformatics. First, we consider the problem of comparing distributions of chromosomal shapes estimated from wild-type and gene knock-out Hi-C data. For each contact data matrix, we estimate...
We review the Sorvali dilatation of isomorphisms of covering groups of Riemann surfaces and extend the definition to groups containing glide-reflections. Then we give a bound for the distance between two surfaces, one of them resulting...
Dependence is one of the most important concepts in probability and statistics. To detect and measure dependency between response variables and predictors, various models have been constructed, from simple models like least square...
Meta-analysis is a widely used tool for synthesizing results from multiple studies. As such, it plays an essential role in evidence-based medicine and clinical decision-making. However, a few concerns may seriously affect the validity of...
Recent advances in computing and measurement technologies have led to an explosion in the amount of data that are being collected in many areas of application. Much of these data have network or graph structures, and they are common in...
With the increasing popularity of information technology, especially electronic imaging techniques, large amounts of high-dimensional data such as 3D shapes have become pervasive in science, engineering and even people's daily life, in the...
This dissertation includes four research projects. The first two projects mainly focus on the depth method for temporal point process data. The third project applies the depth method to spatial point processes. This is an extension of the first...
Over the past 30 years, magnetic resonance imaging has become a ubiquitous tool for accurately visualizing the change and development of the brain subcortical structures (e.g., hippocampus) across time and group. Although subcortical...
Longitudinal studies are widely used in various fields, such as public health, clinical trials and financial data analysis. A major challenge for longitudinal studies is repeated measurements from each subject, which cause time dependent...
Statistical depth functions have been well studied for multivariate data and functional data but remained under-explored for point processes until very recently, when Liu and Wu made a first attempt. Generally, neither depth functions for...
The prediction of financial time series is an essential topic in quantitative investment. In this dissertation, we propose two types of new models. They are bidirectional encoder representations from Transformers-based financial...
Spatial boundary analysis has attracted considerable attention in several disciplines including engineering, spatial statistics, and computer science. The inferential question of interest is to identify rapid surface changes of an...
Since the introduction of anti-hypertensive medications in the mid-1950s, there has been an increased use of blood pressure medications in the US. The growing use of anti-hypertensive treatment has affected the distribution of blood...
Estimation of functions is an extremely rich and well-researched topic with broad applications spanning several scientific fields. We develop a shape-based framework for probability density and general function modelling. The...
In this dissertation, I will present four research projects that I have been working on during my doctoral study at Florida State University. The first three projects focus on statistical modeling and analysis of neuroscience data. The...
Spatial datasets have become increasingly common in recent decades. Rapid developments in technology have brought an abundance of information and data. Big spatial datasets produce many computational challenges. In this...
This dissertation consists of three projects in two research areas: 1) modeling of temporal point processes (with one project) and 2) protein design in computational structural biology (with two projects). My research work in these three...
The ready availability of public-use data from various large national complex surveys has immense potential for the assessment of population characteristics--means, proportions, totals, etcetera. Using a model-based approach, complex...
This research provides theoretical and computational developments in statistical shape analysis of shape graphs, and demonstrates them using analysis of complex network-type object data such as retinal blood-vessel (RBV) networks. The...
This dissertation studies statistical shape analysis of planar objects. The focus is on two different representations. The first one considers only the boundary of planar shapes, a comprehensive analysis framework including...
In genetic studies, advances in high-throughput technology have allowed scientists to collect data on a larger scale and with more complexity. Thus, it is common that the collected data are high-dimensional and heterogeneous, i.e. the number of...
Though there are many feature selection methods for learning, they might not scale well to very large datasets, such as those arising in computer vision. Furthermore, it can be beneficial to capture and model the variability...
This dissertation contains two related parts. The first part focuses on affine lines, and the second part is about affine planes. In the first part, it develops tools for statistical analysis on spaces of affine lines. Motivated by the...
Extraction and Analysis of 2D and 3D Data from Images and of Phylogenetic Tree Data from RNA and DNA Sequences and Consistency of Spherical Depth on Object Spaces
2D and 3D images form the bulk of object data. 3D configurations are extracted from digital camera images, and the surface of the scene pictured is physically reconstructed up to a similarity, via 3D printing. The virtual reconstructions...
Bayesian additive regression trees (BART) are a Bayesian machine learning tool for nonparametric function estimation, which has been shown to have outstanding performance in terms of variable selection and prediction accuracy. Unmodified...
The forecasting of financial time series has been very important in both the finance industry and academia for several years due to the volatile and unstable nature of financial systems. However, researchers believe that financial time...
In modern statistics, many data sets have complex structure, including but not limited to high dimensionality, higher-order structure, and heterogeneity. Recently, there has been growing interest in developing valid and efficient statistical...
Replicability is the cornerstone of scientific research. In this dissertation, we study replicability analysis of multiple studies from high throughput experiments, where tens of thousands of features are examined simultaneously. In the...
Skewed data are ubiquitous in various research fields, including environmental, financial, and biomedical areas. Skewed data greatly challenge the classical statistical analysis especially when we perform classification in high...