If 1% of COVID-19 cases result in death, does that mean you have a 1% chance of dying if you catch it? A mathematician explains the difference between a population statistic and your personal risk
As of April 2023, about 1% of people who contracted COVID-19 ended up dying. Does that mean you have a 1% chance of dying from COVID-19?
That 1% is what epidemiologists call the case fatality rate, calculated by dividing the number of confirmed COVID-19 deaths by the number of confirmed cases. The case fatality rate is a statistic, or something that is calculated from a data set. Specifically, it is a type of statistic called a sample proportion, which measures the proportion of data that meets some criterion – in this case, the proportion of COVID-19 cases that ended with death.
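As a minimal sketch of that division, the Python snippet below computes a sample proportion from two counts. The function name and the counts are made up for illustration; they are not real COVID-19 figures.

```python
def case_fatality_rate(confirmed_deaths: int, confirmed_cases: int) -> float:
    """Return the sample proportion of confirmed cases that ended in death."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return confirmed_deaths / confirmed_cases


# Hypothetical counts, chosen only so the result comes out to 1%.
rate = case_fatality_rate(confirmed_deaths=1_000, confirmed_cases=100_000)
print(f"Case fatality rate: {rate:.1%}")
```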
The goal of calculating a statistic like case fatality rate is normally to estimate an unknown proportion. In this case, if every person in the world were infected with COVID-19, what proportion would die? However, some people also use this statistic as a guide to their personal risk.
It is natural to think of such a statistic as a probability. For example, popular statements that you are more likely to get struck by lightning than die in a terrorist attack, or die driving to work than get killed in a plane crash, are based on statistics. But is it accurate to take these statements literally?
I’m a mathematician who studies probability theory. During the pandemic, I watched health statistics become a national conversation. The public was inundated with ever-changing data as research unfolded in real time, calling attention to specific risk factors such as preexisting conditions or age. However, using these statistics to accurately determine your own personal risk is nearly impossible since it varies so much from person to person and depends on intricate physical and biological processes.
The mathematics of probability
In probability theory, a process is considered random if it has an unpredictable outcome. This unpredictability could simply be due to difficulty in getting the necessary information to accurately predict the outcome. Random processes have observable events that can each be assigned a probability, or the tendency for that process to give that particular result.
A typical example of a random process is flipping a coin. A coin flip has two possible outcomes, each assigned a probability of 50%. Even though most people might think of this process as random, knowing the precise force applied to the coin can allow an observer to predict the outcome. But a coin flip is still considered random since measuring this force is not practical in real-life settings. A slight change can result in a different outcome for the coin flip.
A common way to think about the probability of heads being 50% is that, when a coin is flipped many times, you would expect about 50% of those flips to be heads. For a large number of flips, in fact, very close to 50% of the flips will be heads. A mathematical theorem called the law of large numbers guarantees this, stating that the running proportion of outcomes will get closer and closer to the actual probability when the process is repeated many times. The more you flip the coin, the closer the running percentage of heads will get to 50%, essentially with certainty. This depends on each repeated coin flip happening in essentially identical conditions, though.
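One way to see the law of large numbers at work is to simulate it. The sketch below flips a simulated fair coin under identical conditions and tracks the running proportion of heads. The function name, the seed and the number of flips are arbitrary choices for illustration.

```python
import random


def running_heads_proportion(num_flips: int, seed: int = 0) -> list[float]:
    """Simulate fair coin flips and return the running proportion of heads."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    heads = 0
    proportions = []
    for flip_number in range(1, num_flips + 1):
        heads += rng.random() < 0.5  # True counts as 1, False as 0
        proportions.append(heads / flip_number)
    return proportions


proportions = running_heads_proportion(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>7,} flips: {proportions[n - 1]:.4f}")
```

The early values can wander well away from 0.5, while the later ones should sit very close to it.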
The 1% case fatality rate of COVID-19 can be thought of as the running percentage of COVID-19 cases that have resulted in death. It doesn’t represent the true average probability of death, though, since the virus, and the global population’s immunity and behavior, have changed so much over time. The conditions are not constant.
The case fatality rate would get closer to the true average probability of death over time, by the law of large numbers, only if the virus stopped evolving, everyone’s immunity and risk of death were identical and unchanging over time, and there were always people available to become infected.
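To illustrate why changing conditions matter, here is a rough simulation in which the chance of death per case drops partway through. The two phases, their sizes and their probabilities are invented for the example and do not describe the actual pandemic.

```python
import random


def cumulative_fatality_rate(phases, seed=0):
    """Simulate cases phase by phase and return the overall share ending in death.

    phases is a list of (number_of_cases, death_probability) pairs.
    """
    rng = random.Random(seed)
    deaths = 0
    cases = 0
    for num_cases, death_probability in phases:
        deaths += sum(rng.random() < death_probability for _ in range(num_cases))
        cases += num_cases
    return deaths / cases


# Hypothetical phases: the risk per case falls from 2% to 0.5%.
phases = [(200_000, 0.02), (200_000, 0.005)]
overall = cumulative_fatality_rate(phases)
print(f"cumulative rate: {overall:.2%}")
```

In this made-up setup the cumulative rate should land near 1.25%, the case-weighted average of the two phases. That matches neither the earlier risk nor the current one, so the running percentage does not settle on any single underlying probability.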
A 1% chance of dying?
The biological process of a disease leading to death is complex and uncertain. It is unpredictable and therefore random. Each person has a real physical risk of dying from COVID-19, though this risk varies over time and place and between individuals. So, at best, 1% could be the average probability of death within the population.
Health risks vary among demographic groups, too. For example, elderly individuals have a much higher risk of death than younger individuals. Tracking COVID-19 infections and how they end for a large number of people that are demographically similar to you would give a better estimate of personal risk.
Case fatality rate is a probability, but only when you look at the specific data set it was directly calculated from. If you were to write the outcome of every COVID-19 case in that data set on a strip of paper and randomly select one from a hat, you would have a 1% chance of selecting a case that ended in death. Doing this only for cases from a particular group, such as a group of older adults with a higher risk or young children with a lower risk, would cause the percentage to be higher or lower. This is why 1% may not be a great estimate of personal risk for every person across all demographic groups.
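The hat analogy can be sketched in a few lines of Python. The data set below is entirely hypothetical: the group labels and counts were picked so that the overall share of deaths is 1%, while the two groups differ.

```python
import random

# Hypothetical data set: one (group, outcome) record per recorded case.
cases = (
    [("older", "death")] * 50 + [("older", "recovery")] * 950
    + [("younger", "death")] * 50 + [("younger", "recovery")] * 8_950
)


def death_share(records):
    """Proportion of records whose outcome is death."""
    return sum(outcome == "death" for _, outcome in records) / len(records)


overall = death_share(cases)
older = death_share([c for c in cases if c[0] == "older"])
younger = death_share([c for c in cases if c[0] == "younger"])
print(f"overall {overall:.2%}, older {older:.2%}, younger {younger:.2%}")

# Drawing one strip from the hat: every case is equally likely to be picked.
rng = random.Random(0)
draws = [rng.choice(cases)[1] for _ in range(100_000)]
print(f"share of draws ending in death: {draws.count('death') / len(draws):.2%}")
```

With these invented counts, the overall share is 1%, but restricting the hat to the older group gives 5% and restricting it to the younger group gives roughly 0.56%. The same data set yields different percentages depending on whose strips go into the hat.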
We can apply this logic to car accidents. The chance of getting into a car crash on a 1,000-mile road trip is about 1 in 366. But if you are never anywhere near roads or cars, then you would have a 0% chance. This is really a probability only in the sense of drawing names from a hat. It also applies unevenly across the population – say, due to differences in driving behavior and local road conditions.
Although a population statistic is not the same thing as a probability, it might be a good estimate of one, but only if everyone in the population is demographically similar enough that the statistic doesn’t change much when calculated for different subgroups.
The next time you’re confronted with such a population statistic, recognize what it actually is: It’s just the percentage of a particular population that meets some criterion. Chances are, you’re not average for that population. Your own personal probability could be higher or lower.