This thesis aims to identify the underlying structures of multivariate time series and to propose a methodology for constructing predictive VAR models. Due to the complexity of high dimensions in multivariate time series, forecasting a target...
Shape analysis is a widely studied topic in modern Statistics with important applications in areas such as medical imaging. Here we focus on two-sample hypothesis testing for both finite and infinite extrinsic mean shapes of...
Feature selection is an important technique for high dimensional statistics and machine learning. It has many applications in computer vision, natural language processing, bioinformatics, etc. However, most of the feature selection...
Clustering is an essential data mining tool that aims to discover inherent cluster structure in data. As such, the study of clusterability, which evaluates whether data possesses such structure, is an integral part of cluster analysis....
The past several decades have seen great advances in the field of organizational politics. At the individual level, political skill has garnered the majority of the scholarly focus, whereas its motivational counterpart, political will, ...
We present a preliminary test for nonlinear structure in large data sets. This procedure consists of transforming the data to remove the correlations, then discretizing the data and finally, studying the cell counts in the resulting...
In this dissertation, we study joint sparsity pursuit and its applications in variable selection in high dimensional data. The first part of dissertation focuses on hierarchical variable selection and its application in a two-way...
Thanks to the advancement of data-collecting technology in brain imaging, genomics, financial econometrics, and machine learning, scientific data tend to grow in both size and structural complexity, which are not amenable to traditional...
Statisticians often encounter data in the form of a combination of discrete and continuous outcomes. A special case is zero-inflated longitudinal data where the response variable has a large portion of zeros. These data exhibit...
Evaluating the performance of models predicting a binary outcome can be done using a variety of measures. While some measures intend to describe the model's overall fit, others more accurately describe the model's ability to discriminate...
On nonparametric regression for current status data
Description:
In some studies, it is not possible to observe directly the time at which an event of interest occurs, instead each experimental unit is examined at one time only, and it is noted whether or not the event of interest has occurred. This... We show that the estimate of the conditional distribution function from the LNPML method can be characterized as a solution to an isotonic regression problem and hence easily computed. This estimate does well in our simulation studies....
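The isotonic regression characterization mentioned above can be illustrated with a minimal sketch. It assumes the simplest, covariate-free setting rather than the conditional (LNPML) estimator studied in the thesis: with examination times sorted, a monotone estimate of the distribution function is the nondecreasing least-squares fit to the event indicators, which the pool-adjacent-violators algorithm computes. The function names `pava` and `current_status_npmle` are hypothetical and chosen only for this illustration.

```python
def pava(values, weights=None):
    """Weighted least-squares nondecreasing fit (pool-adjacent-violators)."""
    if weights is None:
        weights = [1.0] * len(values)
    # Each block holds [weighted mean, total weight, number of points pooled].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Pool backwards while neighbouring block means are out of order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def current_status_npmle(exam_times, event_seen):
    """Monotone estimate of F at each distinct examination time.

    exam_times: one examination time per unit.
    event_seen: 1 if the event had occurred by that time, else 0.
    Illustrative assumption: no covariates (not the thesis's LNPML method).
    """
    # Pool tied examination times: keep the indicator sum and the count.
    totals = {}
    for t, d in zip(exam_times, event_seen):
        s, c = totals.get(t, (0, 0))
        totals[t] = (s + d, c + 1)
    times = sorted(totals)
    means = [totals[t][0] / totals[t][1] for t in times]
    counts = [totals[t][1] for t in times]
    return times, pava(means, counts)


if __name__ == "__main__":
    times, estimate = current_status_npmle([3, 1, 2, 5, 4], [0, 0, 1, 1, 1])
    for t, f in zip(times, estimate):
        print(t, f)
```

Pooling tied examination times before the fit keeps the estimate single-valued at each time; the weights passed to `pava` are the number of units examined at that time.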
Analytical models developed using field data can provide useful information with acceptable confidence to evaluate and predict the operational characteristics of a highway. As such, this study presents statistical models that can be used...
Genome-wide association studies (GWAS) have significantly contributed to the identification of genetic variants by leveraging thousands of loci associated with complex traits and diseases, leading to breakthroughs in human genetics...
Recurrent events data are rising in all areas of biomedical research. We present a model for recurrent events data with the same link for the intensity and mean functions. Simple interpretations of the covariate effects on both the...
It is often assumed that all uncensored subjects will eventually experience the event of interest in standard survival models. However, in some situations when the event considered is not death, it will never occur for a proportion of...
Polychotomous quantal response models are widely used in medical and econometric studies to analyze categorical or ordinal data. In this study, we apply the Bayesian methodology through a mixed-effects polychotomous quantal response...
Image analysis often requires dimension reduction before statistical analysis, in order to apply sophisticated procedures. Motivated by eventual applications, a variety of criteria have been proposed: reconstruction error, class...
Most of the data encountered is bounded nonlinear data. The Universe is bounded, planets are sphere-like objects, and life growing on Earth comes in various shapes and colors that can hardly be represented as points on a linear...
Due to the large variance in the stochastic nature of evolution and DNA, it can be difficult to be sure that inferences that use genetic data are perfectly correct. However, as we incorporate more data into our analysis, the...
Over the last three decades, traffic crashes have been one of the leading causes of fatalities and economic losses in the world. Since vulnerable road users have specific characteristics, they are perceived to have a higher risk of being...
TESTING WHETHER NEW IS BETTER THAN USED OF A SPECIFIED AGE
Description:
This research contributes to the theory and methods of testing hypotheses for classes of life distributions. Two classes of life distributions considered in this dissertation are: (1) The New Better Than Used (NBU) Class: The life... The NBU and NBU-$t_0$ classes have dual classes (New Worse Than Used and New Worse Than Used At $t_0$, respectively) defined by reversing the inequality. The NBU-$t_0$ class is a new class of life distributions and contains the NBU class. We study the basic properties of the NBU-$t_0$ class and propose a test of $H_0$: $\bar{F}(x+t_0) = \bar{F}(x)\bar{F}(t_0)$ for all $x \geq 0$, versus $H_A$... We extend our test of $H_0$ versus $H_A$ to accommodate randomly censored data. For the censored data situation our test is based on the statistic (DIAGRAM, TABLE OR GRAPHIC OMITTED...PLEASE SEE DAI) where F is the Kaplan-Meier (1958, J. Amer. Statist. Assoc. 53, 457-481) estimator of $\bar{F}$. Under mild regularity conditions on the amount of censoring, a consistent test of $H_A$ for the randomly censored model is obtained. In Chapter III we develop a two-sample NBU test of the null hypothesis that two distributions F and G are equal, versus the alternative that F is "more NBU" than is G. Our test is based on the statistic (DIAGRAM, TABLE OR GRAPHIC OMITTED...PLEASE SEE DAI) where m and n are the sample sizes from F and G, and $F_m$ and $G_n$ are the empirical distributions of F and G. Asymptotic normality of $T_{m,n}$, suitably normalized, is a direct consequence of Hoeffding's (1948, Ann. Math. Statist. 19, ... Our test of $H_A$ utilizes the Kaplan-Meier estimator. However, there are other possible estimators of the survival function for the randomly censored model. . . . (Author's abstract exceeds stipulated maximum length....
Spatio-temporal surveillance has found increasing applications recently in various domains including disease outbreak detection, urban crime monitoring, medical image analysis and computer network intrusion detection. The objective of...
The goal of this dissertation is to improve the predictive performance of a Bayesian hierarchical statistical model by incorporating an estimate of the expected out-of-sample error and extending this method for big data. In Chapter 2, we...
Due to the importance of detecting profile changes in devices such as medical apparatus, measuring the change point in the variability of different functions is important. In a sequence of functional observations (each of the same length), ...
Sequence data are extensively studied in numerous domains, such as time series forecasting, audio analysis, Natural Language Processing (NLP), etc. However, the observed sequences show high non-stationarity and long-term dependence in...
ESTIMATION AND PREDICTION FOR EXPONENTIAL TIME SERIES MODELS
Description:
This work is concerned with the study of stationary time series models in which the marginal distribution of the observations follows an exponential distribution. This is in contrast to the standard models in the literature where the...
A comparison of two methods of bootstrapping in a reliability model
Description:
We consider bootstrapping in the following reliability model which was considered by Doss, Freitag, and Proschan (1987). Available for testing is a sample of iid systems each having the same structure of m independent components. Each...
PARTIAL ORDERINGS, WITH APPLICATIONS TO RELIABILITY (PARTIAL ORDERINGS, SCHUR-OSTROWSKI THEOREM, INEQUALITIES)
Description:
This dissertation is a contribution to the use of inequalities in reliability theory. Specifically, we study three partial orderings, develop some useful properties of these orderings, and apply them to obtain several applications in... The first partial ordering is the notion of convex-ordering among life distributions. This is in the spirit of Hardy, Littlewood, and Polya (1952) who introduced the concept of relative convexity. Many parametric families of distribution... The second partial ordering is the ordering of majorization among integrable functions. This ordering is a generalization of the majorization ordering of Hardy, Littlewood, and Polya (1952) for vectors in n-dimensional Euclidean spaces.... The third partial ordering is the ordering of unrestricted majorization among integrable functions. This partial ordering is similar to majorization but does not involve the use of decreasing rearrangements. We establish another analogue...
First, we present two novel semiparametric survival models with log-linear median regression functions for right censored survival data. These models are useful alternatives to the popular Cox (1972) model and linear transformation...
AN INVESTIGATION OF THE EFFECT OF THE SWAMPING PHENOMENON ON SEVERAL BLOCK PROCEDURES FOR MULTIPLE OUTLIERS IN UNIVARIATE SAMPLES
Description:
Statistical outliers have been an issue of concern to researchers for over two centuries, and are the focus of this study. Sources of outliers, and various means for dealing with them are discussed. Also presented are general... Specifically, the primary aim of this study is to assess the susceptibility to swamping of four block procedures for multiple outliers in univariate samples. Pseudo-random samples are generated from a unit normal distribution, and varying numbers of upper outliers are placed in them according to specified criteria. A swamping index is created which reflects the relative vulnerability of each... The results of this investigation reveal that the four block tests disagree in their respective susceptibilities to swamping depending upon sample size and the prespecified number of outliers assumed to be present. Rank orderings of... Recommendations concerning the appropriate application of the four block procedures under differing situations, and proposals for further research, are advanced.
RANKING AND SELECTION PROCEDURES FOR EXPONENTIAL POPULATIONS WITH CENSORED OBSERVATIONS
Description:
Let $\Pi_1, \Pi_2, \ldots, \Pi_k$ be k exponential populations. The ranking and selection problem for these k populations is formulated in order to accommodate censored observations. The data under study are assumed to be... Let $X_{i[1]}$ be the minimum order statistic in the sample of size n from the population $\Pi_i$, $i = 1, 2, \ldots, k$. A selection procedure for selecting the largest location parameter, $\lambda_{[k]}$, under Type-I censoring is defined in... The ranking and selection for scale parameters based on Type-II censored data are investigated under two formulations, i.e., Bechhofer's indifference zone approach and Gupta's subset selection approach. The selection rule proposed under... The scale parameter problem, subjected to Type-I censoring, is also examined. We introduce the idea of using the total time on test (TTOT) statistic as the selection statistic. The exact distribution of the TTOT statistic is found and... Finally, the selection problem under random censorship is studied. The maximum likelihood estimate (MLE) $T_i$ of the scale parameter $\theta_i$ is obtained from the randomly censored data. A selection procedure is proposed based on T(...
The computation of probabilities which involve spacings, with applications to the scan statistic
Description:
We develop a methodology for evaluating probabilities which involve linear combinations of spacings and then present some applications of this methodology. The basic idea underlying our method was given by Huffer (1988): A recursion is...