Heart disease, influencing factor, logistic regression, random forest
Abstract
Cardiovascular disease, the leading cause of morbidity and mortality, affects more than 523 million people worldwide. It is characterized by an insidious onset, a long latency period, and a complex, variable presentation, all of which make recovery after onset difficult. Patients often suffer irreversible organ damage because of delayed treatment, which not only severely threatens their lives but also increases the medical and economic burden. In this study, logistic regression and random forest were used to identify the factors that influence heart disease. The dataset comprises 303 clinical cases and 14 variables, including chest pain type, fasting blood sugar level, and serum cholesterol. The results show that the overall prediction accuracy of the model is 85.15%. The model can therefore help primary care personnel recognize heart disease earlier, more easily, and more accurately, and can provide a scientific basis for clinical prevention and treatment, reducing the medical and economic burden on patients.
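As a rough illustration of how logistic regression can point to influencing factors, the following minimal sketch fits a model by gradient descent and compares the fitted coefficients. It is not the study's implementation. The data are synthetic stand-ins, not the 303 clinical cases, and the two features, the learning rate, the number of epochs, and the helper names (`fit_logistic`, `accuracy`) are all assumptions made for the example. The random forest part of the analysis is not sketched here.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one case: predicted probability minus label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def accuracy(X, y, w, b):
    """Share of cases whose predicted class (threshold 0.5) matches the label."""
    hits = sum(
        (sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5) == (yi == 1)
        for xi, yi in zip(X, y)
    )
    return hits / len(X)

# Synthetic stand-in data (NOT the study's dataset): 303 rows, two
# standardized hypothetical features.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(303)]
# By construction, the outcome depends on feature 0 only; feature 1 is noise.
y = [1 if rng.random() < sigmoid(2.0 * x[0]) else 0 for x in X]

w, b = fit_logistic(X, y)
acc = accuracy(X, y, w, b)
print("coefficients:", w, "bias:", b, "training accuracy:", acc)
```

In this toy setting the coefficient of the informative feature should be clearly larger in magnitude than that of the noise feature, which is the sense in which coefficient size (on standardized inputs) can suggest an influencing factor. The accuracy printed here is measured on the training data and says nothing about the 85.15% reported above.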