L'Artisan, J. (2022). Examining the Effects of Class Imbalance with Missing Data on the Logistic Regression Model. Retrieved from https://purl.lib.fsu.edu/diginole/LArtisan_fsu_0071E_17637
Class imbalance has become a frequent problem, and missing data are also a common predicament in the social sciences. In the literature, the two problems have been discussed extensively but separately. Class imbalance has been shown to adversely affect the estimation of logistic regression coefficients, and missing data have been shown to bias parameter estimates in various types of statistical analyses. This simulation study focuses on logistic regression and investigates how class imbalance interacts with missing data to affect the estimation of the coefficients and test procedures such as the Wald and likelihood ratio tests. Three factors are considered in the simulation: (1) missing data mechanism (MCAR, MAR, MNAR); (2) sample size (small, 100; medium, 500; large, 1000); and (3) class imbalance ratio (10%, 20%, 30%, 40%, and 50%). The results demonstrate that the combination of class imbalance and missing data affects the performance of logistic regression, particularly when the imbalance ratio is as low as 10% and the sample size is also small. Potential remedies are suggested.
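The three design factors described in the abstract can be illustrated with a minimal sketch of a single simulation cell. This is not the dissertation's code. The single standard-normal predictor, the true slope of 1.0, the 20% baseline missingness rate, the specific forms of the MAR and MNAR mechanisms, and the use of listwise deletion are all assumptions made here for illustration, and every function name is hypothetical.

```python
import math
import random

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def logit(p):
    return math.log(p / (1.0 - p))

def calibrate_intercept(target_rate, slope, rng, n_draws=20000):
    """Bisection on the intercept so the average P(y=1) is near target_rate."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
    lo, hi = -10.0, 10.0
    for _ in range(50):
        mid = (lo + hi) / 2.0
        rate = sum(sigmoid(mid + slope * x) for x in xs) / n_draws
        if rate < target_rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def simulate(n, intercept, slope, rng):
    """Draw n (x, y) pairs from a one-predictor logistic model."""
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1 if rng.random() < sigmoid(intercept + slope * x) else 0
        data.append((x, y))
    return data

def impose_missingness(data, mechanism, rng, base_rate=0.2):
    """Make x missing under an assumed mechanism; return complete cases only."""
    kept = []
    for x, y in data:
        if mechanism == "MCAR":    # unrelated to any variable
            p_miss = base_rate
        elif mechanism == "MAR":   # depends on the observed outcome y
            p_miss = sigmoid(logit(base_rate) + (y - 0.5))
        elif mechanism == "MNAR":  # depends on the unobserved value of x
            p_miss = sigmoid(logit(base_rate) + x)
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        if rng.random() >= p_miss:
            kept.append((x, y))
    return kept

def fit_logistic(data, n_iter=25):
    """Newton-Raphson maximum likelihood for intercept and slope."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in data:
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:  # information matrix is singular; stop updating
            break
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

if __name__ == "__main__":
    rng = random.Random(2022)
    true_slope = 1.0  # assumed value, not taken from the dissertation
    for rate in (0.10, 0.30, 0.50):
        b0 = calibrate_intercept(rate, true_slope, rng)
        for n in (100, 500, 1000):
            for mech in ("MCAR", "MAR", "MNAR"):
                kept = impose_missingness(simulate(n, b0, true_slope, rng), mech, rng)
                est0, est1 = fit_logistic(kept)
                print(rate, n, mech, len(kept), round(est1, 3))
```

A full study would repeat each cell many times and summarize bias and test rejection rates across replications. With small samples and a 10% event rate, individual replications can approach separation, which is one way the instability noted in the abstract can surface.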
class imbalance, logistic regression, missing data
Date of Defense
November 14, 2022.
Submitted Note
A Dissertation submitted to the Department of Educational Psychology and Learning Systems in partial fulfillment of the requirements for the degree of Doctor of Philosophy.
Bibliography Note
Includes bibliographical references.
Advisory Committee
Yanyun Yang, Professor Directing Dissertation; Fred Huffer, University Representative; Qian Zhang, Committee Member; Shengli Dong, Committee Member; Insu Paek, Committee Member.
Publisher
Florida State University
Identifier
LArtisan_fsu_0071E_17637