Identification of high death risk coronavirus disease-19 patients using blood tests

Document Type: Original Article


1 Department of Biology, Faculty of Basic Sciences, Azarbaijan Shahid Madani University, Tabriz, Iran

2 Department of Internal Medicine, Faculty of Medicine, Tabriz University of Medical Sciences, Tabriz; Emam Hossein Hospital, Tabriz University of Medical Sciences, Hashtrood, Iran

3 Emam Hossein Hospital, Tabriz University of Medical Sciences, Hashtrood; Immunology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran


Background: The coronavirus disease 2019 (COVID-19) pandemic has had a great impact on health-care services. Predicting the severity of the disease helps reduce mortality by prioritizing the allocation of hospital resources. The main aim of this study is the early prediction of mortality from this disease using key biomarkers. Materials and Methods: This retrospective study included a total of 205 confirmed COVID-19 patients hospitalized from June 2020 to March 2021. Demographic data, levels of important blood biomarkers, and patient outcomes were investigated using machine learning and statistical tools. Results: Random forests, the best-performing model for mortality prediction (Matthews correlation coefficient = 0.514), were employed to find the dataset features most relevant to mortality. Aspartate aminotransferase (AST) and blood urea nitrogen (BUN) were identified as important death-related features. The decision tree method identified cutoff values of BUN >47 mg/dL and AST >44 U/L as decision boundaries for mortality (sensitivity = 0.4). The data mining results were compared with those obtained through statistical tests. The statistical analyses also identified these two factors as the most significant ones, with P values of 4.4 × 10⁻⁷ and 1.6 × 10⁻⁶, respectively. The demographic trait of age, some hematological changes (thrombocytopenia, increased white blood cell count, neutrophils [%], RDW-CV, and RDW-SD), and blood serum changes (increased creatinine, potassium, and alanine aminotransferase) were also identified as mortality-related features (P < 0.05). Conclusions: These results could help physicians detect COVID-19 patients at higher risk of mortality in a timely manner and better manage hospital resources.
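The two performance figures quoted in the Results, the Matthews correlation coefficient (MCC) and sensitivity, are both derived from a binary confusion matrix. The following Python sketch shows how they are computed from the four cell counts. The function names and the counts are hypothetical and chosen only for illustration; they are not the study's data, and the study's own analysis may have been carried out differently.

```python
import math


def matthews_corrcoef(tp: int, fp: int, tn: int, fn: int) -> float:
    """MCC from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of actual positives (here: deaths) that were detected."""
    total_pos = tp + fn
    return tp / total_pos if total_pos else 0.0


# Hypothetical counts for illustration only (NOT taken from the study):
# tp = deaths predicted as deaths, fn = deaths predicted as survivors, etc.
tp, fp, tn, fn = 8, 4, 36, 12

mcc = matthews_corrcoef(tp, fp, tn, fn)
sens = sensitivity(tp, fn)
print(f"MCC = {mcc:.3f}, sensitivity = {sens:.2f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the outcome classes are imbalanced, as is typical when deaths are the minority class.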

