Coronavirus Pandemic
A comprehensive analysis of systematically screened laboratory tests: based on a COVID-19 cohort

Introduction: This study aimed to screen indicators with differential diagnostic value and to investigate the characteristics of laboratory tests in COVID-19 patients. Methodology: All laboratory tests from COVID-19 and non-COVID-19 patients in this cohort were included. Test values from the two groups during the whole course, days 1-7, and days 8-14 were analyzed. The Mann-Whitney U test, univariate logistic regression analysis, and multivariate regression analysis were performed. Regression models were established to verify the diagnostic performance of the indicators. Results: 302 laboratory tests were included in this cohort, and 115 indicators were analyzed; the values of 61 indicators differed significantly (p < 0.05) between groups, and 23 indicators were independent risk factors of COVID-19. During days 1-7, the values of 40 indicators differed significantly (p < 0.05) between groups, and 20 indicators were independent risk factors of COVID-19. During days 8-14, the values of 45 indicators differed significantly (p < 0.05) between groups, and 23 indicators were independent risk factors of COVID-19. In total, 10, 12, and 12 indicators showed significant differences (p < 0.05) in multivariate regression analysis in the three courses, respectively, and the accuracy of the models built from them was 74.9%, 80.3%, and 80.8%, respectively. Conclusions: The indicators obtained through systematic screening have good differential diagnostic value. Compared with non-COVID-19 patients, the screened indicators indicated that COVID-19 patients had more severe inflammatory responses, organ damage, electrolyte and metabolic disturbances, and coagulation disorders. This screening approach can identify valuable indicators among a large number of laboratory test indicators.


Introduction
Coronavirus disease 2019 (COVID-19) is a newly emerging and highly contagious respiratory disease [1,2] that has been declared a Public Health Emergency of International Concern by the World Health Organization [3]. COVID-19 is still spreading in many countries and regions around the world and poses a serious threat to people's health. COVID-19 patients may suffer fever, cough, dyspnea, diarrhea, and other symptoms [4][5][6], and the disease involves various organs such as the lungs, gastrointestinal tract, heart, liver, kidneys, and brain [7][8][9], which causes changes in the composition or content of blood cells, electrolytes, enzymes, urine, and other body fluids. Laboratory tests can objectively reflect changes in the internal environment and in organ function, and they play an important role in understanding the pathological processes of COVID-19 and improving the level of diagnosis and therapy in  Current studies on laboratory tests for COVID-19 mainly focus on single or multiple indicators [10][11][12], and a systematic and comprehensive analysis is lacking. Some studies have reported the characteristics of laboratory tests in severe and non-severe patients [13,14], but comparative studies between COVID-19 and other respiratory diseases are limited [15]. Therefore, the existing studies are not conducive to comprehensively understanding COVID-19, nor to the

Statistical analysis
Statistical analyses were performed with SPSS 25. All tests were two-sided, and p < 0.05 was considered statistically significant. Age and time were expressed as mean ± standard deviation (SD), and the independent two-sample t-test was used to compare the groups. Continuous laboratory data were expressed as median and interquartile range (IQR), and the Mann-Whitney U test was used to compare the groups. Categorical data were expressed as frequency (%), and Fisher's exact test or the chi-square test was used to compare the groups. Missing values were filled by multiple imputation. The risk factors and diagnostic performance of laboratory indicators in COVID-19 were analyzed by univariate and multivariate binary logistic regression.
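The analyses in this study were run in SPSS. As a minimal sketch of what a univariate binary logistic regression estimates, the following pure-Python example fits an intercept and one slope by Newton-Raphson. The function name `fit_univariate_logistic` and the small data set are hypothetical and serve only as an illustration; they are not taken from the cohort.

```python
import math

def fit_univariate_logistic(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Score (gradient of the log-likelihood) and 2x2 information matrix.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g (explicit 2x2 inverse).
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: one indicator value per patient and the group label
# (1 = COVID-19, 0 = non-COVID-19). Illustrative only.
values = [30, 35, 40, 45, 50, 55, 60, 65, 70, 75]
labels = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

intercept, slope = fit_univariate_logistic(values, labels)
odds_ratio = math.exp(slope)  # odds ratio per one-unit increase in the indicator
```

The sketch omits standard errors and p-values, which a statistics package would report alongside the coefficient.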
The area under the receiver operating characteristic (ROC) curve (AUC) and its 95% confidence interval (95% CI) were used to evaluate the ability of laboratory indicators to predict COVID-19. A logistic regression model was established to verify the diagnostic performance of the selected indicators.
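The AUC can be read as the probability that a randomly chosen COVID-19 patient receives a higher model score than a randomly chosen non-COVID-19 patient, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name `roc_auc` and the scores are hypothetical, and the confidence interval is not computed here.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical predicted probabilities (illustrative only).
covid_scores = [0.9, 0.8, 0.4]
non_covid_scores = [0.7, 0.4, 0.2, 0.1]

auc = roc_auc(covid_scores, non_covid_scores)  # 10.5 correctly ordered pairs out of 12
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based computation would scale better.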

General information and clinical characteristics of the patients
A total of 133 patients were included in the COVID-19 group: 4 mild cases, 89 moderate cases, 21 severe cases, and 19 critical cases (including 10 deaths); 63 patients had chronic diseases. A total of 249 patients were included in the non-COVID-19 group: 171 with pulmonary infection, 65 with upper respiratory tract infection, 8 with bronchitis, and 5 with non-respiratory diseases. In the non-COVID-19 group, 10 patients died and 117 patients had chronic diseases. The chronic diseases in the two groups mainly included hypertension, coronary heart disease, diabetes, cerebrovascular disease, etc. There were no significant differences in gender or chronic disease between the two groups. The patients in the COVID-19 group were older than those in the non-COVID-19 group (p < 0.001), and the univariate regression coefficient of age was 0.021 (Table 1).
In the COVID-19 group, the mean time from onset to admission was 4.52 ± 3.11 days, from onset to the severe type was 8.15 ± 5.29 days, and from onset to the critical type was 11.58 ± 5.79 days.
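To make the reported age coefficient easier to interpret, a logistic regression coefficient can be converted to an odds ratio by exponentiation. The short sketch below does this for the value 0.021 and also checks that the reported subgroup counts add up to the group totals. It assumes that age was entered in years, which the text does not state explicitly.

```python
import math

beta_age = 0.021  # univariate regression coefficient of age, as reported in the text

# Odds ratio for COVID-19 per additional year and per additional 10 years of age
# (assumption: age was modeled in years).
odds_ratio_per_year = math.exp(beta_age)
odds_ratio_per_decade = math.exp(10 * beta_age)

# Consistency check of the group composition reported in the text.
covid_total = 4 + 89 + 21 + 19       # mild, moderate, severe, critical
non_covid_total = 171 + 65 + 8 + 5   # pulmonary, upper respiratory, bronchitis, non-respiratory
```

Under this assumption, the odds of belonging to the COVID-19 group increase by roughly 2% per year of age, or by roughly 23% per decade.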

Collection of the laboratory test indicators
A total of 352 laboratory test indicators were recorded in the two groups. After 36 drug sensitivity tests, 10 etiological tests related to the disease diagnosis of this cohort, and 4 descriptive indicators were excluded, 302 indicators remained. We then removed the indicators tested fewer than 13 times. Finally, the COVID-19 group included 122 indicators and the non-COVID-19 group included 126 indicators, and the two groups shared 115 identical indicators. As shown in Figure 1, several representative laboratory test indicators exhibited different trends in the COVID-19 and non-COVID-19 groups during the disease.
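The screening steps above can be sketched as a small filter. The example assumes that the minimum of 13 tests was applied within each group separately, which the text suggests but does not state outright; the helper `frequently_tested` and the per-indicator counts are hypothetical.

```python
MIN_TESTS = 13  # indicators tested fewer than 13 times were removed (from the text)

def frequently_tested(test_counts, min_tests=MIN_TESTS):
    """Return the set of indicator names tested at least `min_tests` times."""
    return {name for name, n in test_counts.items() if n >= min_tests}

# Exclusion arithmetic reported in the text:
# 352 indicators minus drug sensitivity, etiological, and descriptive indicators.
remaining = 352 - 36 - 10 - 4

# Hypothetical per-group test counts (illustrative only, not cohort data).
covid_counts = {"WBC": 410, "CRP": 350, "ESR": 90, "Ferritin": 12}
non_covid_counts = {"WBC": 600, "CRP": 20, "ESR": 9, "PCT": 45}

# Indicators retained in both groups are the ones that can be compared.
shared = frequently_tested(covid_counts) & frequently_tested(non_covid_counts)
```

In this toy example only `WBC` and `CRP` pass the threshold in both groups, which mirrors how the 122 and 126 retained indicators reduce to 115 shared ones.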

Result of test values
The Mann-Whitney U test of the 115 laboratory indicators showed that 61 indicators differed significantly (p < 0.05) between the COVID-19 and non-COVID-19 groups over the whole course (Table 2), and that 40 and 45 indicators differed significantly during days 1-7 and days 8-14, respectively. The results for all 115 indicators are shown in Supplementary Table 1.
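For readers unfamiliar with the test, the Mann-Whitney U statistic is obtained by ranking the pooled values of both groups and summing the ranks of one group. The sketch below computes the statistic with midranks for ties; the function name and sample values are hypothetical, and the p-value (which the authors obtained from SPSS) is not computed here.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b), using midranks for tied values."""
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of equal values starting at position i.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    rank_sum_a = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n_a, n_b = len(sample_a), len(sample_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a

# Hypothetical indicator values in two groups (illustrative only).
group_a = [1.2, 0.9, 1.5, 0.8]
group_b = [1.6, 1.5, 2.0, 1.9, 1.7]

u_a, u_b = mann_whitney_u(group_a, group_b)
```

A value of U close to 0 for one group means that almost all of its values lie below those of the other group, as in this example.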

The univariate regression analysis
We calculated the mean value of each of the 115 indicators for each patient in the two groups and removed indicators with > 30% of their values missing (Supplementary Table 2). A total of 62, 60, and 41 indicators in the three courses (whole course, days 1-7, and days 8-14), respectively, were then analyzed by univariate logistic regression. In these courses, 23, 20, and 23 indicators, respectively, differed significantly between the COVID-19 and non-COVID-19 groups (p < 0.05) (Table 3). The accuracy of these indicators in classifying the COVID-19 and non-COVID-19 groups is shown in Table 4. The complete results of the univariate logistic regression analysis and diagnostic performance are shown in Supplementary Tables 3-5.
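The preprocessing described above (one mean value per patient per indicator, then removal of indicators with more than 30% missing values) can be sketched as follows. The helper names and the measurements are hypothetical, and the sketch treats a patient who was never tested as one missing value, which is an assumption about how missingness was counted.

```python
MAX_MISSING_FRACTION = 0.30  # threshold stated in the text

def patient_means(records):
    """Map each patient to the mean of their repeated values; None if never tested."""
    return {pid: (sum(vals) / len(vals) if vals else None)
            for pid, vals in records.items()}

def keep_indicator(means, max_missing=MAX_MISSING_FRACTION):
    """Keep the indicator unless more than `max_missing` of the patients lack a value."""
    missing = sum(1 for m in means.values() if m is None)
    return missing / len(means) <= max_missing

# Hypothetical repeated measurements per patient (illustrative only).
crp = {"p1": [12.0, 30.0], "p2": [5.0], "p3": [], "p4": [8.0, 10.0]}
ferritin = {"p1": [400.0], "p2": [], "p3": [], "p4": []}

crp_means = patient_means(crp)            # one of four patients is missing (25%)
ferritin_means = patient_means(ferritin)  # three of four patients are missing (75%)
```

Here `crp` would be retained and `ferritin` removed; the remaining gaps would then be filled by multiple imputation, which is not shown.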

The multivariate regression analysis
Indicators from each course with p < 0.1 in the univariate regression analysis were divided into three groups according to the value of the regression coefficient, and multivariate regression analyses were conducted on them for each course. As shown in Table 5, a total of 10, 12, and 12 indicators differed significantly in the whole course, days 1-7, and days 8-14, respectively.

The output of the regression model
We established a logistic regression model for each course using the indicators with p < 0.05 in the multivariate regression analysis of that course. The results indicated that the accuracy in classifying COVID-19 and non-COVID-19 patients, sensitivity, specificity, and AUC were 74.9%, 52.6%, 86.7%, and 0.797 (95% CI 0.751-0.843) in the whole course, 80.3%, 62.2%, 89.3% and 0.84 (95% CI 0.791- (Figure 2).
Table 3. The indicators with significant differences in univariate regression analysis between two groups.
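Accuracy, sensitivity, and specificity all derive from the same confusion matrix. The sketch below shows the relationship using counts that are consistent with the whole-course percentages and with the group sizes of 133 and 249 patients. The individual cell counts are not reported in the text; they are an assumption chosen to reproduce the published figures.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # COVID-19 patients correctly classified
    specificity = tn / (tn + fp)   # non-COVID-19 patients correctly classified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity

# Assumed counts (NOT reported in the source): 70 + 63 = 133 COVID-19 patients
# and 216 + 33 = 249 non-COVID-19 patients.
accuracy, sensitivity, specificity = classification_metrics(tp=70, fn=63, tn=216, fp=33)
```

Because the non-COVID-19 group is almost twice as large as the COVID-19 group, the overall accuracy is weighted toward specificity, which is why a model with a sensitivity near 53% can still reach an accuracy near 75%.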

Discussion
In total, we enrolled 302 laboratory tests in this study; after indicators with low test frequency were removed, 115 indicators, which covered most of the common indicators in respiratory disease, were analyzed. The Mann-Whitney U test, univariate regression analysis, and multivariate regression analysis were then performed on them consecutively to show the differences between the groups at multiple levels. In the multivariate regression analysis, 10, 12, and 12 indicators were independent risk factors in the whole course, days 1-7, and days 8-14, respectively. The models built from these indicators had good accuracy, which indicates that, based on comprehensive collection and multilevel analysis, this screening approach can not only show the differences between groups comprehensively but also identify indicators with good diagnostic value from a large number of laboratory tests. Most reports have studied the characteristics of laboratory tests in COVID-19, as well as their comparison with non-COVID-19 patients or healthy volunteers, but they investigated the changes and diagnostic value of only some of the indicators [17,18]. The patients in the non-COVID-19 group of this cohort were suspected COVID-19 patients who were admitted to the hospital because of fever, respiratory symptoms, etc., but were finally ruled out by nucleic acid tests. In fact, 97.08% of them had a pulmonary infection, upper respiratory tract infection, or bronchitis. A comprehensive comparison with the non-COVID-19 group is beneficial for improving the level of differential diagnosis and understanding the pathophysiological changes of COVID-19.
The Mann-Whitney U test showed that 61 indicators differed significantly between the COVID-19 and non-COVID-19 groups in the whole course. Among them, changes in leukocyte differential count and inflammatory indexes indicated more severe inflammation in COVID-19 patients, consistent with studies reporting that neutrophils (NEUT), basophils (BASO), lymphocytes, etc. were reduced in COVID-19 patients [19,20]. As reports have demonstrated that COVID-19 patients suffer multiorgan damage [21], the changes in indicators of organ function and damage and in urinalysis in this study also suggested that injury might exist in the lungs, heart, liver, and kidneys. A study reported that the partial pressure of oxygen was lower in non-survivors of COVID-19 than in survivors [22]. In this study, changes in arterial blood gas indicated severe lung damage and impaired gas transfer, which induce hypoxemia, increased blood lactic acid, and reactive polycythemia. Moreover, severe inflammation and multiorgan damage result in electrolyte disorders and in dysfunction of glucose metabolism, protein metabolism, and coagulation [23].
Studies have demonstrated that white blood cell (WBC) count and urine protein differed between COVID-19 and COVID-19-negative patients in the early stage, but no differences in platelets were found between them [24,25]. In this study, most of the indicators with significant differences during days 1-7 and days 8-14 were consistent with those in the whole course. However, days 1-7 showed decreased WBC and platelets, and no significant indicators related to urinalysis or hypoproteinemia. During days 8-14, no significant differences between groups were found in red blood cell (RBC)-associated indicators, which indicated no reactive polycythemia. There were also no differences in urinalysis, but there were differences in platelet and platelet-associated indicators.
In the univariate regression analysis, we removed the indicators with > 30% of their values missing, including arterial blood gas, urinalysis, blood lactic acid, ferritin, etc., which might include indicators with differential diagnostic value. During the whole course, the categories of significant indicators were similar to those from the Mann-Whitney U test, although the number of indicators decreased, especially those associated with organ damage; this might be due to the regression analysis itself or to the removal of indicators with > 30% missing values. Compared with the whole course, the results for days 1-7 indicated decreased WBC, NEUT, platelets, and carbon dioxide combining power (CO2-CP), and increased RBC, but no differences in inflammatory response or metabolism-related indicators. During days 8-14, no differences were found in coagulation-related indicators, but differences were found in platelet and its associated indicators. Other researchers conducted univariate analysis on blood routine tests and blood chemistry and then built a multivariate regression model from the variables that were significant in univariate analysis; they found that C-reactive protein (CRP) and platelet were predictive of COVID-19 [26].
Multivariate regression revealed a decrease in leukocyte differential count after infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), accompanied by inflammation and cell damage, as indicated by the increased erythrocyte sedimentation rate (ESR) and alkaline phosphatase (ALP). Moreover, decreased coagulation function and hypoproteinemia were found. During days 1-7, COVID-19 patients showed a decreased percentage of eosinophils (EOS%) and disorders in electrolytes, acid-base balance, and coagulation, but no hypoproteinemia. During days 8-14, changes in leukocyte differential count and enzymes were similar to those in the whole course; increased CRP indicated an inflammatory response, and COVID-19 patients showed low potassium and increased blood glucose. Sun et al. performed multivariate logistic regression analysis using variables that differed significantly between COVID-19 and influenza patients, and the accuracy of their diagnostic model was 69.64% [27].
After comprehensive collection, elimination of low-frequency tests, and multilevel screening, we built models for the whole course, days 1-7, and days 8-14 to discriminate COVID-19 from non-COVID-19 patients. The diagnostic accuracies were good, although the models showed low sensitivity and high specificity, which demonstrated that this approach could show the differences in laboratory tests between groups. Applying these indicators could help discriminate COVID-19 from non-COVID-19 patients and deepen the understanding of the development and pathophysiology of COVID-19 and of the pathophysiological differences between COVID-19 and COVID-19-like non-COVID-19 diseases.
This study has several limitations. (1) The sample size of this cohort is limited, and during the screening and regression analysis of the laboratory tests we removed the indicators that were tested fewer than 13 times or had > 30% missing values, which may have omitted some valuable indicators. (2) COVID-19 is a new disease lacking targeted tests and systematic follow-up, which may affect the evaluation of indicators. (3) The study is an overall analysis of laboratory tests, and we did not conduct further analysis of single indicators.

Conclusions
The indicators obtained through comprehensive collection and systematic screening have good differential diagnostic value. This systematic screening approach can identify valuable indicators. Compared with non-COVID-19 patients, the screened indicators indicated that COVID-19 patients had more severe inflammatory responses, organ damage, electrolyte and metabolic disturbances, and coagulation disorders.
NEUT%: percentage of neutrophil; PDW: platelet distribution width; RBC: red blood cell; RDW-CV: red blood cell distribution width-coefficient of variation; RDW-SD: red blood cell distribution width-standard deviation; TBil: total bilirubin; TP: total protein; WBC: white blood cell.