“A Matter of Numbers” by Dilip D’Souza is one of my favourite columns in the Mint daily. It is about my favourite subject, Mathematics, and tries to decode things around us through the lens of mathematics, statistics etc. It often stimulates thinking or provides some scintillating insights.
Dilip’s latest column, “Life expectancies and tennis racquets”, is very interesting and hence I thought of sharing it here. The article talks about life expectancy, the expected number of deaths based on life expectancy, and the actual number of deaths, which is much lower (by about 40%-50%). Then it poses two interesting questions: (1) What explains the discrepancy between the expected number of deaths and the actual number of deaths? and (2) How does the analogy of tennis racquets, mentioned by the author in the initial paragraph, relate to this discrepancy?
Let me attempt to share my thoughts on this.
Firstly, the approximation of (1 divided by Life Expectancy) to estimate the expected number of deaths is too simplistic. It would probably hold true when Life Expectancy and Population are stagnant or changing at a low and steady rate. That’s not true in the case of India. Look at the charts below of “Life Expectancy vs annual % change in population” and “Population vs annual % change in population”.
It is incorrect to use point estimates of “Life expectancy” and “Total Population” and take their ratio to arrive at the expected number of deaths, because both these parameters have been changing at a considerable rate for the last 6-7 decades.
Consider life expectancy. A 70-year-old person who dies today was born in the early 1950s, when life expectancy at birth was 32-33 years and the population was 38-39 Crores. So if we divide 39 Crores by 32, then approximately 1.2 Crore people (about 12 million) would have been estimated to die per year at that time. Because a person born in 1950-52 lived till 2022, it’s incorrect to use only today’s point estimates. We should also consider the life expectancy and population at the time he was born.
When a data set is undergoing frequent change, it is not appropriate to use point estimates. It is important to consider moving averages and to analyse the impact of paradigm shifts.
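The division above can be checked with a few lines of Python. This is only a quick arithmetic sketch using the figures quoted in the paragraph (39 Crores and 32 years); the variable names are my own, and the unit conversions assume 1 Crore = 10 million and 1 Lakh = 100,000.

```python
# Unit conversions in the Indian numbering system.
CRORE = 10_000_000
LAKH = 100_000

# Figures quoted in the text for the early 1950s.
population_early_1950s = 39 * CRORE
life_expectancy_years = 32

# Naive estimate: population divided by life expectancy.
estimated_deaths = population_early_1950s / life_expectancy_years

print(f"{estimated_deaths:,.0f} deaths per year")
print(f"= {estimated_deaths / CRORE:.2f} crore")
print(f"= {estimated_deaths / LAKH:.1f} lakh")
print(f"= {estimated_deaths / 1_000_000:.2f} million")
```

Printing the same number in crore, lakh and million side by side makes it harder to slip by a factor of ten when moving between the units.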
The logical argument for the discrepancy between estimated deaths (based on Population divided by Life expectancy) and actual deaths is the following, as per my understanding. The population growth rate is positive, even though it has been declining. This means that there is a net addition to the population every year. At the same time, life expectancy has been steadily increasing because of a better standard of living, advancements in medical sciences etc. This means that a smaller share of the population dies each year. The combined effect is that the numerator (Population) is increasing, but the denominator (Life expectancy) is increasing as well. Population has increased from 37 Crores in 1950 to 140 Crores today, an increase of 3.78x. Life expectancy has increased from 35 years in 1950 to 70 years today, an increase of 2x. So if we continue to use the same ratio, the expected death count would be overestimated (the ratio itself has grown by 3.78/2 = 1.89x). However, it is obvious, logically, that a person born in 1950 was likely to live for fewer years than a person born today.
Hence we need a better metric that accounts for a changing data set, rather than the simple and constant formula of Population divided by Life expectancy.
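The growth factors in the paragraph above can be reproduced as follows. This is a small sketch that uses only the four figures quoted there; the variable names are mine, and the "naive" values are simply Population divided by Life expectancy, expressed in Crores of deaths per year.

```python
# Figures quoted in the text (population in Crores, life expectancy in years).
population_1950 = 37
population_today = 140
life_expectancy_1950 = 35
life_expectancy_today = 70

population_growth = population_today / population_1950
life_expectancy_growth = life_expectancy_today / life_expectancy_1950
ratio_growth = population_growth / life_expectancy_growth

# Naive estimate of yearly deaths (in Crores) at each point in time.
naive_deaths_1950 = population_1950 / life_expectancy_1950
naive_deaths_today = population_today / life_expectancy_today

print(f"Population growth:      {population_growth:.2f}x")
print(f"Life expectancy growth: {life_expectancy_growth:.2f}x")
print(f"Growth of the ratio:    {ratio_growth:.2f}x")
print(f"Naive deaths 1950:  {naive_deaths_1950:.2f} crore/year")
print(f"Naive deaths today: {naive_deaths_today:.2f} crore/year")
```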
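To see why a point estimate can overshoot, here is a deliberately crude toy model in Python. It is my own illustration, not something from the column: it assumes births grow at a fixed yearly rate and that every person lives exactly the same number of years, and it ignores rising life expectancy and migration altogether. The function name `simulate` and all its parameters are hypothetical.

```python
def simulate(growth_rate: float, lifespan: int, years: int = 300) -> tuple[float, float]:
    """Return (actual_deaths, naive_estimate) for the final simulated year.

    Toy assumptions: births grow by a fixed rate each year and every
    person lives exactly `lifespan` years.
    """
    births = [1000.0 * (1 + growth_rate) ** year for year in range(years)]
    final_year = years - 1
    # Alive in the final year: the cohorts born in the last `lifespan` years.
    population = sum(births[final_year - lifespan + 1 : final_year + 1])
    # Dying in the final year: the cohort born `lifespan` years earlier.
    actual_deaths = births[final_year - lifespan]
    # Point estimate: Population divided by Life expectancy.
    naive_estimate = population / lifespan
    return actual_deaths, naive_estimate


for rate in (0.0, 0.01, 0.02):
    actual, naive = simulate(rate, lifespan=70)
    print(f"growth {rate:.0%}: actual deaths / naive estimate = {actual / naive:.2f}")
```

In this toy setting the two numbers agree only when births are flat. As soon as births grow, the people dying today come from an older, smaller cohort than the average cohort alive, so Population divided by Life expectancy comes out higher than the actual deaths.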
Now coming to the second question, the Tennis Racquets analogy. Playing fewer games is equivalent to increased life expectancy. Stringing is equivalent to deaths. So as games reduce (life expectancy increases), the stringing (deaths) reduces.
This argument of the data set undergoing changes is true for India. What about the US? Their birth rate is much more stable. True, but there is the immigration factor. The US population is increasing due to immigration. How would you apply the same ratio (Population divided by Life expectancy) to immigrants? For example, if someone migrated in 1950 from Iran to the US (Andre Agassi’s father actually did so), would you consider the life expectancy in Iran in 1950 for that migrant, or the US data? In short, it’s much more complicated.
I still don’t have any explanation or analysis for why the discrepancy between expected deaths and actual deaths is so huge, or whether there is any trend. Maybe there are a few other factors (such as inaccurate data on deaths, natural calamities etc).