R/data.R
nytexcess.Rd
All-cause mortality is widely used by demographers and other researchers to understand the full impact of deadly events, including epidemics, wars and natural disasters. The totals in this data include deaths from Covid-19 as well as those from other causes, likely including people who could not be treated or did not seek treatment for other conditions.
nytexcess
A tibble with 7,258 rows and 12 columns
country
character Country Name
placename
character Place Name
frequency
character Reporting period. Weekly or monthly, depending on how the data is recorded.
start_date
date The first date included in the period.
end_date
date The last date included in the period.
year
character Year of data. This variable is character rather than integer because several values are not single years but notes indicating that the row is an average of two years.
month
integer Numerical month.
week
integer Numerical week.
deaths
integer The total number of confirmed deaths recorded from any cause.
expected_deaths
integer The baseline number of expected deaths, calculated from a historical average. See details below.
excess_deaths
integer The number of deaths minus the expected deaths.
baseline
character The years used to calculate expected_deaths.
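Because `year` is stored as character, it usually needs a guarded conversion before it can be used numerically. The following is a minimal sketch in base R, not part of the package: the `toy` data frame, its values, and the wording of the note in the third row are invented for illustration, and the real notes in `nytexcess` may be worded differently.

```r
# Toy stand-in for a few nytexcess rows; every value here is invented.
toy <- data.frame(
  country = c("Exampleland", "Exampleland", "Exampleland"),
  year    = c("2019", "2020", "average note (hypothetical)"),
  deaths  = c(1000L, 1200L, 1010L),
  stringsAsFactors = FALSE
)

# Treat a value as a plain year only if it is exactly four digits.
is_single_year <- grepl("^[0-9]{4}$", toy$year)

# Convert the plain years; rows that hold a note stay NA.
toy$year_int <- NA_integer_
toy$year_int[is_single_year] <- as.integer(toy$year[is_single_year])

# Keep only the rows that refer to a single calendar year.
single_years <- toy[is_single_year, ]
```

Converting only the matching rows avoids the coercion warning that `as.integer()` would raise on the note text, and keeps the original `year` column available for inspecting the rows that were set aside.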
The New York Times, https://github.com/nytimes/covid-19-data/tree/master/excess-deaths
Table: Data summary
Name | nytexcess |
Number of rows | 7258 |
Number of columns | 12 |
_______________________ | |
Column type frequency: | |
Date | 2 |
character | 5 |
numeric | 5 |
________________________ | |
Group variables | None |
Variable type: Date
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
start_date | 768 | 0.89 | 2010-01-09 | 2020-12-23 | 2018-02-05 | 1267 |
end_date | 768 | 0.89 | 2010-01-15 | 2020-12-29 | 2018-02-11 | 1267 |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
country | 0 | 1.00 | 4 | 14 | 0 | 35 | 0 |
placename | 6883 | 0.05 | 6 | 8 | 0 | 4 | 0 |
frequency | 0 | 1.00 | 6 | 7 | 0 | 2 | 0 |
year | 0 | 1.00 | 4 | 17 | 0 | 15 | 0 |
baseline | 5990 | 0.17 | 20 | 25 | 0 | 7 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
month | 0 | 1.00 | 6.60 | 3.36 | 1 | 4.00 | 7.0 | 9.0 | 12 | ▇▆▆▆▇ |
week | 666 | 0.91 | 26.77 | 14.58 | 2 | 14.00 | 27.0 | 39.0 | 52 | ▇▇▇▇▇ |
deaths | 0 | 1.00 | 7968.24 | 14334.14 | 455 | 1460.00 | 2395.5 | 10486.0 | 141292 | ▇▁▁▁▁ |
expected_deaths | 5990 | 0.17 | 9237.09 | 15850.00 | 548 | 1443.00 | 2423.0 | 10771.5 | 139343 | ▇▁▁▁▁ |
excess_deaths | 5990 | 0.17 | 1195.43 | 3242.72 | -6721 | -42.25 | 76.5 | 926.0 | 30400 | ▇▂▁▁▁ |
Expected deaths for each area are calculated from historical data for the same time of year. These expected deaths are the basis for our excess death calculations, which estimate how many more people have died this year than in an average year.
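The idea of a same-time-of-year baseline can be sketched in base R as follows. This is an illustration under stated assumptions, not necessarily how The New York Times computed the `expected_deaths` column: the `weekly` data frame, its values, and the 2017 to 2019 baseline window are all invented.

```r
# Invented weekly death counts for one hypothetical area.
weekly <- data.frame(
  year   = rep(2017:2020, each = 2),
  week   = rep(c(14L, 15L), times = 4),
  deaths = c(100L, 110L, 104L, 108L, 96L, 112L, 150L, 170L)
)

# Assumed baseline window; the real data uses different years per country.
baseline_years <- 2017:2019
baseline_rows  <- weekly[weekly$year %in% baseline_years, ]

# Expected deaths: mean over the baseline years for each week of the year.
expected <- aggregate(deaths ~ week, data = baseline_rows, FUN = mean)
names(expected)[names(expected) == "deaths"] <- "expected_deaths"

# Excess deaths: observed deaths minus the expected deaths for the same week.
current <- merge(weekly[weekly$year == 2020, ], expected, by = "week")
current$excess_deaths <- current$deaths - current$expected_deaths
```

In this toy data the baseline is 100 deaths for week 14 and 110 for week 15, so the excess for the two 2020 weeks comes out to 50 and 60. A plain mean like this does not adjust for population growth or age structure.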
The number of years used in the historical averages varies with the availability and reliability of the data and with underlying demographic changes. See Data Sources for the years used to calculate the baselines. The baselines do not adjust for changes in age or other demographics, and they do not account for changes in total population.
The number of expected deaths is not adjusted for how non-Covid-19 deaths may change during the outbreak, which will take some time to determine. As countries impose control measures, deaths from causes like road accidents and homicides may decline. In addition, people who die from Covid-19 cannot die later from other causes, which may reduce deaths from those causes. Both of these factors, if they play a role, would lead these baselines to understate, rather than overstate, the number of excess deaths.
For further details on these data see https://github.com/nytimes/covid-19-data/tree/master/excess-deaths