Coronavirus Pandemic COVID-19 cases, hospitalizations, outpatients, and deaths in Mexico by ethnicity and state-level income

Introduction: Mexico is one of the countries most affected by mortality due to COVID-19. Once infected, the indigenous population living in the lower-income states had worse outcomes. Our objectives were to analyze outcomes by ethnic group and to determine the association between state-level income and the incidence, hospitalization, outpatient, and death rates per 100,000 population. Methodology: We analyzed 1,037,567 confirmed COVID-positive cases recorded in the Mexican COVID-19 cases database from February 29 to November 13, 2020. Sociodemographic characteristics, comorbidities, and outcomes were analyzed. Data was allocated according to the state where the patients were treated. Statistical associations of age-adjusted incidence and death rates with state-level GDP per capita (as a measure of income) were ascertained using Spearman correlations. Kruskal-Wallis tests examined the association of cumulative incidence, hospitalization, outpatient, and death rates with income quartiles. When significant, a follow-up analysis (Mann-Whitney) was conducted. Results: Cumulative incidence rates per 100,000 population were 900.3 (non-indigenous) and 94.4 (indigenous); the corresponding death rates were 87.1 (non-indigenous) and 13.9 (indigenous). Spearman correlation coefficients of income with age-standardized incidence and death rates were 0.657 and 0.607 (p < 0.001 for both). Kruskal-Wallis H values indicate significant median differences by income in total population rates: cumulative incidence 13.47 (p < 0.01), hospitalizations 11.67 (p < 0.01), outpatients 12.86 (p < 0.01), and deaths 8.92 (p < 0.05). Conclusions: Cumulative incidence, hospitalization, outpatient, and mortality rates presented a reversed socioeconomic status health gradient in Mexico. Less adverse outcomes were observed in the lowest-income states compared to higher-income states.

of the 32 states in Mexico. If the SES health gradient applies to the COVID-19 pandemic in Mexico, worse outcomes would be observed in states with lower GDP per capita than in states with higher GDP per capita.

Data sources
The data utilized in this study was extracted from the Mexican Ministry of Health COVID-19 cases open-access database [7], containing data accumulated from the start of the pandemic in Mexico on February 29, 2020 to the cut-off point on November 13, 2020. The data originated from the Epidemiological Surveillance System of Viral Respiratory Diseases (SISVER) implemented on April 5, 2020 [20], whereby the healthcare system in Mexico conducts epidemiological surveillance by laboratory (sentinel surveillance) under a sampling rate of 10% for outpatient cases of viral respiratory disease, and 100% for severe hospitalized cases. Up to the cut-off point, the database contained information on 2,630,258 patients from health care units in the 32 Mexican states, of which 1,037,567 were COVID-19 positive cases, confirmed by Reverse Transcription Polymerase Chain Reaction (RT-PCR). We focus our analysis on such confirmed positive cases; no cases were excluded.
The variables from the database used in this study were: sociodemographic characteristics (age, sex, "indigenous" [chosen over "speaks indigenous tongue" to include patients of all ages when determining ethnicity], and location of medical unit), type of patient (hospitalized or outpatient), reported comorbidities (diabetes, chronic obstructive pulmonary disease, asthma, immunodeficiency, hypertension, cardiovascular disease, obesity, chronic kidney disease, tobacco use, and "other comorbidities"), and outcomes (ICU admission, invasive ventilation, recovery, or death). The variable "severe cases" was used to refer to the occurrence of the outcomes ICU admission, invasive ventilation, or death. The variable "died at home" refers to ambulatory patients whose outcome was death. Data was allocated geographically according to the location, by state, of the hospitals where the patients were treated.
Gross Domestic Product (GDP) per capita by state, expressed in current United States dollars, was calculated using 2019 state-level GDP data extracted from the National Institute of Statistics and Geography (INEGI) website [21], 2019 state population estimates extracted from the National Population Council (CONAPO) website [22], and the 2019 Mexican peso to United States dollar exchange rate extracted from the Banco de México (Mexico's central bank) website [23]. The 32 Mexican states were classified by quartiles of GDP per capita in current United States dollars, and this information was incorporated into the database for analysis.
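The GDP per capita conversion and the quartile grouping described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the state labels, GDP figures, populations, and exchange rate are hypothetical placeholders rather than INEGI, CONAPO, or Banco de México values, and the rank-based split into four equal groups is an assumption, since the text does not say how quartile boundaries were handled.

```python
def gdp_per_capita_usd(gdp_mxn, population, mxn_per_usd):
    """Convert a state's GDP in pesos to GDP per capita in current US dollars."""
    return gdp_mxn / mxn_per_usd / population


def assign_quartiles(values_by_state):
    """Rank states by value and split them into four equal-sized groups.

    Assumes the number of states is divisible by four (32 in the study).
    Returns {state: "Q1".."Q4"}, with Q1 the lowest-income group.
    """
    ordered = sorted(values_by_state, key=values_by_state.get)
    group_size = len(ordered) // 4
    return {state: f"Q{i // group_size + 1}" for i, state in enumerate(ordered)}


# Hypothetical inputs: (GDP in million pesos, population) for eight made-up states.
MXN_PER_USD = 19.26  # placeholder exchange rate, not the official 2019 figure
raw = {
    "A": (300_000, 5_000_000), "B": (900_000, 3_000_000),
    "C": (450_000, 4_000_000), "D": (1_200_000, 2_500_000),
    "E": (200_000, 4_500_000), "F": (700_000, 3_500_000),
    "G": (1_500_000, 9_000_000), "H": (600_000, 1_500_000),
}
per_capita = {
    state: gdp_per_capita_usd(gdp * 1_000_000, pop, MXN_PER_USD)
    for state, (gdp, pop) in raw.items()
}
quartiles = assign_quartiles(per_capita)
```

With 32 states, the same function would place eight states in each quartile.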
Cumulative incidence, hospitalizations, outpatients, and deaths per 100,000 population were calculated using data from the 2020 Population Census [24] for non-indigenous population, and data from the National Institute of Indigenous People (2017) [17] for indigenous population. To standardize the state-level incidence rates and death rates by age, we used the direct method with the 2018 National Survey on Population Dynamics [25] as reference.
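As a rough sketch of the direct method mentioned above, each age-specific rate in a state is weighted by the share of that age group in the reference population. The age bands and all counts below are hypothetical; the source does not list the age groups or the reference distribution used.

```python
def direct_age_standardized_rate(events_by_age, population_by_age, reference_by_age):
    """Directly standardized rate per 100,000 population.

    Each age-specific rate is weighted by the reference population's
    share of that age group, then the weighted rates are summed.
    """
    total_reference = sum(reference_by_age.values())
    rate = 0.0
    for age_group, reference_count in reference_by_age.items():
        age_specific_rate = events_by_age[age_group] / population_by_age[age_group]
        rate += age_specific_rate * reference_count / total_reference
    return rate * 100_000


# Hypothetical state: cases and population by (made-up) age band.
cases = {"0-29": 100, "30-59": 600, "60+": 300}
population = {"0-29": 500_000, "30-59": 400_000, "60+": 100_000}
# Hypothetical reference age distribution (only the proportions matter).
reference = {"0-29": 50, "30-59": 38, "60+": 12}

crude_rate = sum(cases.values()) / sum(population.values()) * 100_000
adjusted_rate = direct_age_standardized_rate(cases, population, reference)
```

In this made-up example the state is younger than the reference population, so the adjusted rate comes out slightly above the crude rate.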

Statistical analysis
Numerical variables contained in the database are presented as patient counts and means; categorical variables are presented as patient counts and percentages.
Correlation graphics were used to ascertain the statistical association between age-adjusted incidence rates and age-adjusted death rates with GDP per capita by state. Ryan-Joiner tests were performed to examine the distribution of the variables involved, which showed that neither state-level GDP per capita (RJ = 0.882, p < 0.01) nor age-adjusted incidence rates (RJ = 0.952, p < 0.05) were normally distributed. The Ryan-Joiner test did not show evidence of non-normality (RJ = 0.987, p > 0.05) for age-adjusted death rates. Thus, Spearman correlations (95% CI) were calculated. A p-value < 0.05 criterion was used for determining a significant correlation between variables.
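For illustration, a Spearman coefficient is the Pearson correlation of the ranked values. The sketch below uses only the Python standard library and hypothetical state-level numbers; the study itself used Minitab, and the confidence intervals and p-values reported there are not reproduced here.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical state-level values (not the study data).
gdp_per_capita = [2400, 3200, 3400, 6100, 9800, 18000]
incidence_rate = [150, 420, 380, 700, 650, 2100]
rho = spearman_rho(gdp_per_capita, incidence_rate)
```

Because only ranks enter the calculation, the coefficient does not depend on GDP per capita or the rates being normally distributed.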
The data were further analyzed using the Kruskal-Wallis test to examine associations of COVID-19-related variables (crude rates of cumulative incidence, hospitalized patients, outpatients, and deaths per 100,000 population), with state-level GDP per capita categories (quartiles). Where the Kruskal-Wallis test results were significant, a follow-up analysis using the Mann-Whitney statistic (W value) was conducted. All statistical analyses were performed using Minitab software version 19.1.
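A minimal sketch of the Kruskal-Wallis H statistic for four income quartiles follows. The rates are hypothetical, the function assumes no tied values (so no tie correction is applied), and significance is judged against the standard chi-square critical values for 3 degrees of freedom (7.815 at the 0.05 level, 11.345 at the 0.01 level). It is not a reproduction of the Minitab output.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a list of samples.

    Assumes all pooled values are distinct, so no tie correction is applied.
    """
    pooled = sorted(value for group in groups for value in group)
    rank_of = {value: rank for rank, value in enumerate(pooled, start=1)}
    n = len(pooled)
    # Sum over groups of (rank sum squared) / (group size).
    rank_term = sum(
        sum(rank_of[v] for v in group) ** 2 / len(group) for group in groups
    )
    return 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)


# Hypothetical crude incidence rates per 100,000, three states per quartile.
q1 = [100, 150, 200]
q2 = [300, 350, 400]
q3 = [500, 550, 600]
q4 = [700, 750, 800]

h_value = kruskal_wallis_h([q1, q2, q3, q4])
CHI2_CRIT_DF3_P05 = 7.815   # chi-square critical value, 3 df, alpha = 0.05
CHI2_CRIT_DF3_P01 = 11.345  # chi-square critical value, 3 df, alpha = 0.01
significant_at_05 = h_value > CHI2_CRIT_DF3_P05
significant_at_01 = h_value > CHI2_CRIT_DF3_P01
```

When H exceeds the critical value, pairwise comparisons between quartiles (Mann-Whitney in the study) identify which groups differ.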
No ethical review was required for this study, since the open-access database used contains anonymized information.

Infected population characteristics and outcomes
With a population of 126.0 million [24] (114.0 million non-indigenous and 12.0 million indigenous), the overall cumulative COVID-19 incidence rate in Mexico was 823.4 cases per 100,000 inhabitants between February 29, 2020 and November 13, 2020. Cumulative cases per 100,000 population were 900.3 for non-indigenous people and 94.4 for indigenous people.
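The rates above can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the text; because the populations are rounded to the nearest 0.1 million, the results match the reported 823.4 only approximately.

```python
def rate_per_100k(events, population):
    """Events per 100,000 population."""
    return events / population * 100_000


# Overall rate from the confirmed case count and the rounded total population.
overall_rate = rate_per_100k(1_037_567, 126.0e6)

# The overall rate is also the population-weighted mean of the group rates.
non_indigenous_pop, indigenous_pop = 114.0e6, 12.0e6
weighted_rate = (
    900.3 * non_indigenous_pop + 94.4 * indigenous_pop
) / (non_indigenous_pop + indigenous_pop)
```

Both values fall within a fraction of a case per 100,000 of the reported overall rate.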
The age group 30 to 59 accounted for 60.5% of the cumulative cases in the non-indigenous population, and 55.5% of cumulative cases in the indigenous population (Table 1).
In both population groups, more than 80% of deaths were concentrated in the age groups 50 and above. The case fatality rate (CFR), or the proportion of patients who died among confirmed COVID-19 cases, was lower for the non-indigenous population (9.7%) than for the indigenous population (14.7%), as was the percentage of patients who died at home (respectively: 10.9% and 12.0% overall; 13.3% and 29.6% for the 30 to 39 year age group).
The overall cumulative deaths per 100,000 population were substantially higher for the non-indigenous population (87.1) than for the indigenous population (13.9). Death rates rose markedly in the older age groups in both populations, reaching 572.8 and 104.4 respectively in the 80 years or older group. The mean age of the confirmed cases in the non-indigenous population was 44.3 years, whereas confirmed cases in the indigenous population were older, with a mean age of 48.2 years. There was an approximately even split between males and females in confirmed cases of the non-indigenous population, but in the indigenous population there was a higher proportion of males (55.8%) (Table 2). Of the total confirmed cases for each population, the proportion of patients who reported having at least one comorbidity was 43.0% for the non-indigenous population, and 49.8% for the indigenous population. In both population groups, patients with at least one comorbidity were older (50+ years) than the average age of confirmed cases.
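Since both the cumulative incidence and the cumulative deaths are expressed per 100,000 population, the case fatality rate for each group can be recovered as their ratio. The short check below uses only figures reported in this section.

```python
def case_fatality_rate(deaths_per_100k, cases_per_100k):
    """Proportion of confirmed cases that ended in death, as a percentage."""
    return deaths_per_100k / cases_per_100k * 100


cfr_non_indigenous = case_fatality_rate(87.1, 900.3)
cfr_indigenous = case_fatality_rate(13.9, 94.4)
```

Rounded to one decimal place, these ratios agree with the reported CFRs of 9.7% and 14.7%.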
Severe cases, grouped here in a composite endpoint [26,27] consisting of admission to an intensive care unit (ICU), invasive ventilation, or death, represented 11.0% of confirmed cases among the non-indigenous population, and 16.6% of confirmed cases in the indigenous population. Confirmed deaths represented 9.7% of confirmed cases in the non-indigenous population, and 14.7% of confirmed cases in the indigenous population. In both instances, the male population comprised 60% or higher of the total COVID-19 cases. The mean ages of the severe cases and the confirmed deaths were, respectively, 61.6 and 62.7 years for the non-indigenous population, and 62.5 and 64.0 years for the indigenous population. Comorbidities, which have been suspected to be at least partly responsible for COVID-19-related deaths since the start of the pandemic [28], were present in 73% of the deceased, both in non-indigenous and indigenous populations. The proportion of deceased females presenting comorbidities was 78.4% for the non-indigenous population, and 82.9% for the indigenous population, compared to the corresponding 69.9% and 67.8% for deceased males.
The prevalence of any of the three main comorbidities (hypertension, obesity, or diabetes) was higher in the indigenous group than in the non-indigenous group (42.7% vs. 35.3%). The prevalence of tobacco use was lower in the indigenous group than in the non-indigenous group (5.9% vs. 7.3%). The prevalence of the rest of the comorbidities was similar for both population groups, except for chronic obstructive pulmonary disease (COPD), which in the indigenous group was more than double that of the non-indigenous group (2.9% vs. 1.3%) (Figure 1).
Of the 46,558 patients that were admitted to an ICU or received invasive ventilation, 32,737 (70.3%) died and 13,821 (29.7%) recovered (Table 3). However, not all of the severe cases were admitted to an ICU or received invasive ventilation: 68,268 of 114,826 severe cases (59.5%) resulted in death in this situation; 57,284 (49.9%) of these patients had been hospitalized, but 10,984 (9.6%) died at home (outpatient deaths). Regarding indigenous patients, 63.4% of severe cases culminated in death without being admitted to an ICU or having received invasive ventilation (52.8% had been hospitalized, but 10.6% died at home). The presence of comorbidities was more common in severe cases than in non-severe cases (72.1% vs. 39.5%). In severe cases resulting in death, 73% of the patients had at least one comorbidity; in severe cases where the patient recovered, this proportion dropped to 65.3%.
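The breakdown of severe cases for the total population can be verified directly from the counts quoted above. The snippet below uses only those counts; the variable names are illustrative.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(part / whole * 100, 1)


# Counts as reported in the text (total population).
icu_or_ventilated = 46_558
died_after_icu_or_ventilation = 32_737
recovered_after_icu_or_ventilation = 13_821
severe_total = 114_826
died_without_icu_or_ventilation = 68_268
died_hospitalized = 57_284
died_at_home = 10_984

shares = {
    "died, among ICU or ventilated": pct(died_after_icu_or_ventilation, icu_or_ventilated),
    "recovered, among ICU or ventilated": pct(recovered_after_icu_or_ventilation, icu_or_ventilated),
    "died without ICU or ventilation, among severe": pct(died_without_icu_or_ventilation, severe_total),
    "died hospitalized, among severe": pct(died_hospitalized, severe_total),
    "died at home, among severe": pct(died_at_home, severe_total),
}
```

The two subgroups (ICU or ventilated, and died without either) add up to the 114,826 severe cases, and hospitalized plus at-home deaths add up to the 68,268 deaths in the second subgroup.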
Two or more comorbidities were observed in 40.1% of hospitalized patients who were admitted to an ICU or received invasive ventilation, compared with 35.6% of hospitalized patients who were neither admitted to an ICU nor ventilated, and 12.3% of ambulatory patients.
Mexico spans a vast territory of nearly 2 million square kilometers [29], divided into 32 states. Each state presents a unique combination of socioeconomic and demographic characteristics, as well as different incidence and mortality rates. State-level Gross Domestic Product (GDP) per capita, as a measure of income, ranged from 2,395 US dollars in the state of Chiapas to 27,324 US dollars in the state of Campeche. Classified by quartiles (Q) of GDP per capita, six of the eight states in the lowest income quartile (Q1) are part of the group of eleven states with the most significant presence of indigenous population, i.e., the poorest states in the country frequently have a considerable proportion of indigenous population (Figure 2).
Upon visual inspection, an upward trend in both cumulative incidence and death rates was observed, corresponding to increasing GDP per capita by state. The lowest age-standardized incidence and age-standardized death rates were observed in the remote state of Chiapas, bordering Guatemala, which has the lowest GDP per capita; these rates were, respectively, 14.4 times and 5.6 times lower than those observed in Mexico City, the nation's capital, which has the second-highest GDP per capita in the country (18,048 US dollars), the highest population density, and the most important airport hub in the country. Oaxaca and Guerrero, the next-lowest states in terms of GDP per capita (3,178 and 3,427 US dollars, respectively), with relative proximity to the nation's capital and with important tourist destinations located in specific areas, exhibited incidence rates 3.8 and 3.2 times lower than Mexico City, while the corresponding death rates were 3.4 and 2.2 times lower.
Five of the eight states in the highest income quartile, Q4 (Baja California Sur, Sonora, Coahuila, and Nuevo León in the northern part of the country, as well as Mexico City), had incidence rates above 1,000 and death rates of 100 or above (except Nuevo León, with 79). The state of Campeche, with the highest GDP per capita in the country due to crude oil production concentrated in two specific areas, had a death rate of 108 deaths per 100,000 population, but its incidence rate was one of the lowest in this quartile (753). It must be noted, however, that the state is sparsely populated and has a high presence of indigenous population (> 20%) dedicated to agriculture, living in remote regions.
Using state-level results, a Spearman correlation analysis was performed to determine the relationship between GDP per capita and the cumulative incidence rate, as well as the death rate in each state. There was a strong, positive correlation between income and both the age-standardized incidence rate and the age-standardized death rate, i.e., the higher the state-level income, the higher the COVID-19 incidence and death rates; the respective Spearman correlation coefficient values were 0.657 and 0.607 (p < 0.001 in both cases) (Figures 3A and 3B).

Discussion
Once infected, the indigenous population, being older and more likely to have comorbidities, had worse outcomes than the non-indigenous population, as indicated by the corresponding proportion of severe illness cases (16.6% vs. 11.0%) and proportion of outpatient deaths (12.0% vs. 10.9%), yet the cumulative incidence and death rates per 100,000 population were considerably lower for the indigenous population compared to the non-indigenous population (94.4 vs. 900.3, and 13.9 vs. 87.1, respectively). Spearman correlation analysis indicates that the higher the state-level income, the higher the COVID-19 age-standardized cumulative incidence and death rates by state for the population as a whole. Kruskal-Wallis test results and Mann-Whitney follow-up tests indicate that crude rates of cumulative incidence, hospitalized patients, outpatients, and deaths differ significantly by state-level income quartiles. However, when analyzed by ethnic group, only cumulative incidence and outpatient rates were significantly different between quartiles (originating from the different rates that result once we separate indigenous and non-indigenous populations): significantly lower for income quartile Q1 compared to quartiles Q2, Q3, and Q4 for the indigenous population, and significantly lower for income quartile Q1 compared to quartiles Q3 and Q4 in the case of the non-indigenous population. These results indicate that the socioeconomic status health gradient for the COVID-19 pandemic is reversed in the case of Mexico.
To our knowledge, this is the first study to examine the differences by state-level income in COVID-19 cumulative incidence, deaths, hospitalizations, and outpatient rates in Mexico, in addition to presenting results by ethnic group. Ortiz-Hernández and Pérez-Sastré analyze the risk of severe COVID-19 outcomes in Mexico by ethnicity and municipality-level marginalization indices dating from 2015 and conclude that the risk is highest among the low socioeconomic status indigenous population living in the country's southern states [30]. Argoty-Pantoja et al. compare fatality rates between the indigenous and non-indigenous populations in Mexico, concluding that fatality is higher in the former population group, particularly among outpatients [31]. Ibarra-Nava et al. study the fatality rate of the indigenous population in Mexico, concluding that indigenous people have a higher risk of death from COVID-19 [32]. Our study confirms such findings regarding fatality by means of a case fatality rate, which is higher for the indigenous population compared to the non-indigenous population (14.7% vs. 9.7%); this variable is not analyzed further because the true number of infected persons is underestimated in the case of Mexico due to limited testing (the cumulative number of COVID-19 laboratory tests performed by November 13, 2020 was 18.29 people tested per 1,000 population, as compared to 526.90 tests performed per 1,000 population in the United States of America [33]), resulting in an inflated rate. Furthermore, our results regarding a higher proportion of outpatient deaths for the indigenous population (12.0%) compared with the non-indigenous population (10.9%) are in line with the conclusion of Argoty-Pantoja et al. that indigenous outpatients are at a higher risk of death than non-indigenous outpatients.
It is likely that in both population groups, the severity of the COVID-19 infection in such outpatients was misjudged, given the narrow operational definition of a suspected COVID-19 case that was utilized. At the outset of the pandemic in Mexico, a suspected COVID-19 case was one where, in the previous 7 days, the patient presented at least two of the following: cough, fever, headache, and one or more of either dyspnea, arthralgia, myalgia, odynophagia, rhinorrhea, conjunctivitis, or chest pain. This definition was updated on August 24, 2020, and COVID-19 was suspected when, in the previous 10 days, the patient presented one or more of the following: cough, fever, dyspnea, headache, and one or more of either arthralgia, myalgia, odynophagia, chills, rhinorrhea, conjunctivitis, chest pain, anosmia, or dysgeusia [34]. Additionally, the higher proportion of indigenous patients in this situation might indicate a language barrier, where the symptoms of the indigenous patient are not clearly understood by the attending health professional, since 12.3% of the indigenous population that speaks an indigenous language is monolingual [17].
We found that beyond the potentially higher risk of more severe outcomes in the indigenous population in Mexico due to older age, proportionally more comorbidities, and higher health vulnerability (approximately 80% of the indigenous population that speaks an indigenous language lives in poverty or extreme poverty, and more than 80% of older adults lack social security [19]), the fact that they live mainly in states with lower GDP per capita was associated with better COVID-19 outcomes per 100,000 population than those of the non-indigenous population.
Our study has several strengths. The large sample size (over one million confirmed cases) confers reliability to our results, particularly descriptive statistics. The study uses the 2020 Population Census, published by the National Institute of Statistics and Geography in January 2021, allowing for more precise rates per 100,000 population. Another key strength is that by November 2020, the Mexican Ministry of Health had incorporated the "indigenous" variable into the COVID-19 cases database to determine ethnicity (previously, the database only included a "speaks indigenous tongue" variable), allowing us to include patients of all ages, and thus utilize population data from the National Institute of Indigenous People, which we consider to be the most reliable source at present.
One limitation of our study is that the COVID-19 cases database suffers from underreporting due to the sentinel surveillance system, which only tests a proportion of patients and leaves out mild or asymptomatic cases. Another limitation is the use of data aggregated at the state level due to the level of aggregation of GDP information; thus, possible variations at the municipality level are not captured. Further studies at the municipality level are also necessary to analyze the effect of population density on the spread of COVID-19.
While higher health literacy in high-income states contributes to the SES health gradient by facilitating the population's adherence to outbreak containment measures [14], the fact that we are observing a reversed SES health gradient in Mexico suggests that other factors are outweighing its effect. One such factor could be that testing for COVID-19 is even more limited in low-income states, resulting in a deceptively low incidence rate [35]. However, the sheer differences in age-adjusted incidence and death rates by state-level income, as observed in Figure 2, indicate the presence of other factors as well. In this regard, Hamidi et al. found that in the United States, connectivity is a more important contributor to the spread of the COVID-19 pandemic than population density [36]. In Mexico, the states with the lowest GDP per capita, such as Chiapas, Guerrero, and Oaxaca, tend to have important tourist destinations located in specific areas. The majority of the population, including a high presence of indigenous population, live in deprived, low-density rural areas with only local economic activities. In states with higher GDP per capita (many of them located in the northern part of Mexico and with low population density), economic activities tend to be more tightly linked together, with more business-related interaction and travel. It could be assumed that the more frequent economic and social interaction among the population in states with higher GDP per capita contributes to the higher incidence rates and death rates of COVID-19. Up to the November 13 cut-off point, connectivity and more frequent testing in states with higher GDP per capita apparently outweighed the effect of higher health literacy.

Conclusions
COVID-19 cumulative incidence, hospitalization, outpatient, and mortality rates presented a reversed socioeconomic status health gradient in Mexico according to GDP per capita by state, characterized by less adverse outcomes in the lowest-income states compared to states with higher income. A health strategy should be designed to ensure that positive health outcomes are maintained in regions with low cumulative incidence rates even after they become more integrated with the rest of the economy, particularly low-income areas with a significant presence of indigenous population.