“If you torture the data long enough, it will confess.” This statement by British economist Ronald Coase is often used to question the validity of the assertions economists make in their papers.
Data, an apparently objective set of figures, has been used as a weapon to build narratives in favour of political regimes of every hue and colour across the world, and India is no exception. From the days of celebrated statistician P.C. Mahalanobis under Prime Minister Jawaharlal Nehru, India has both innovated on and struggled with data-collection mechanisms to capture economic activity in the country and the living standards of its people. After decades of hits and misses, Indian statisticians seem to have entered a new era of innovation, where political economy is interpreted on the basis not just of what data is available but also of what is missing.
Where data is unavailable, too old or too inconsequential to offer definitive conclusions, economists have found a way out. They extrapolate. That is what happened in a recent paper, Pandemic, Poverty, and Inequality: Evidence from India, written by economists including Surjit Bhalla, which argues that extreme poverty (calculated at purchasing power parity of $1.9 per day of income) had been eradicated from India during the Covid-19 outbreak. The paper’s estimates are in stark contrast to a study on the same subject published a few months ago by the US-based Pew Research Center, which claimed that 75 million people in India fell below the poverty line of $2 a day income due to the pandemic.
Why Economists Disagree
How can two studies, done by two sets of reputed economists, be so different in their conclusions? The answer lies in the non-existence of data. Unlike advanced economies, India does not have any income data for more than 85% of its workforce. The reason for this gap is not hard to guess. “A large part of our economy is unorganised. It is difficult to calculate agricultural income, because most farmers depend upon other jobs to make a living. Then there are construction workers and people who get income in cash. That data cannot be captured,” says P.C. Mohanan, economist and former head of the National Statistical Commission (NSC).
While government bodies get regular data from the organised sector, that data has its own set of problems. For one, only a very small segment of the workforce is employed in the organised sector. For another, the data collected from the organised sector is not open for researchers to investigate. This leaves researchers to develop their own methodologies, which can yield conclusions that politically suit certain quarters.
Yashwant Sinha, former finance minister of India, says, “As an administrative officer posted in Patna, I had realised how poor India was at collecting data on its economy. This is the reason India had established the National Statistical Commission. But, then the government junked its reports in 2019. Instead of making the NSC and its data collection [methods] strong, the government is looking for ways to get only that data out which supports its narrative of growth.”
He argues that without having quality data, it is not possible for the government to make the right kind of policies. The government he was part of had appointed an expert committee under economist C. Rangarajan to study the system of data collection, storage and dissemination in the country. The NSC was set up in 2006 on the recommendation of the C. Rangarajan Commission.
Although India is the fifth-largest economy in the world, its record in data collection has been poor. Take, for example, the gross domestic product (GDP) calculations made by the Ministry of Statistics and Programme Implementation. All quarterly and yearly estimates of GDP growth are based on the formal sector, which accounts for about 50% of the Indian economy. Data for the informal sector is extrapolated, on the assumption that the informal economy grows at the same pace as the formal economy. In 2017, this method of calculating GDP stoked controversy when government estimates suggested that demonetisation had no negative impact on the Indian economy.
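The effect of that assumption can be illustrated with simple arithmetic. The sketch below is not the ministry's actual procedure: the 50% formal share comes from the description above, while the function name `headline_growth` and both growth rates are hypothetical figures chosen only to show how the extrapolation can hide a shock confined to the informal sector.

```python
def headline_growth(formal_growth, formal_share=0.5, informal_growth=None):
    """Share-weighted growth of the whole economy.

    If informal growth is not measured (None), assume it equals formal
    growth, which is the extrapolation described in the text.
    """
    if informal_growth is None:
        informal_growth = formal_growth
    return formal_share * formal_growth + (1 - formal_share) * informal_growth


# Hypothetical numbers, for illustration only.
reported = headline_growth(0.07)                        # informal sector assumed to match
actual = headline_growth(0.07, informal_growth=-0.02)   # informal sector actually shrinks

print(f"reported: {reported:.1%}, actual: {actual:.1%}")
```

With these made-up inputs, the extrapolated estimate shows 7% growth, while the share-weighted figure would be 2.5% if the informal half had in fact contracted by 2%.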
The most contentious gap in data collection lies in the case of the Consumer Expenditure Survey (CES). Considered a reliable gauge of consumption levels in the Indian economy, the CES used to be released every five years. Unfortunately, the last set of CES data for 2017-18 was junked by the Centre, after a leaked report from the survey suggested the average amount of money spent by an Indian fell by 3.7% to Rs 1,446 per month in 2017-18 compared to 2011-12. The survey findings, if accepted, would have been an indictment of the demonetisation of high-value currency notes. Mohanan, who then headed the NSC that was responsible for conducting the CES, says: “The government did not say what fault lied in the [CES] methodology, even though the survey had the same methodology that was used in earlier surveys. There have been times in the past when surveys have gone against incumbent governments, but they did not call them faulty.”
Subhash Chandra Garg, who held the positions of secretary of economic affairs and finance between 2017 and 2019, says, “It was an unfortunate decision by the government to reject the estimates of the Consumer Expenditure Survey. It did more harm than good for the government in terms of gaining trust of experts and global agencies that track the Indian economy.”
Since the last official CES data is for 2011–12, it cannot be used directly to calculate poverty in 2021. Researchers like Bhalla are therefore adjusting the findings of the 2011–12 survey with other datasets to arrive at national and state-level poverty figures. For the poverty estimates, Bhalla has taken the estimates from CES 2011–12 and adjusted them using state-level GDP and population data, along with administrative data on various consumption items, adjusted in turn to national private final consumption expenditure.
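A toy example shows why such adjusted estimates are so sensitive to the assumptions behind them. This is not the method used in the paper: the consumption figures are synthetic, the names `headcount_ratio` and `extrapolate` are hypothetical, and the sketch assumes every household's consumption grows by the same factor. Only the $1.9-a-day line is taken from the text.

```python
def headcount_ratio(consumption, poverty_line):
    """Share of people whose daily consumption is below the poverty line."""
    poor = sum(1 for c in consumption if c < poverty_line)
    return poor / len(consumption)


def extrapolate(consumption, growth_factor):
    """Scale every base-year figure by one growth factor
    (a distribution-neutral assumption made for this sketch)."""
    return [c * growth_factor for c in consumption]


# Synthetic base-year consumption in dollars per day, not survey data.
base_year = [1.2, 1.5, 1.8, 2.0, 2.4, 3.0, 3.5, 4.0, 5.0, 8.0]
LINE = 1.9  # extreme-poverty line cited in the text

optimistic = headcount_ratio(extrapolate(base_year, 1.6), LINE)
pessimistic = headcount_ratio(extrapolate(base_year, 1.1), LINE)

print(f"base year: {headcount_ratio(base_year, LINE):.0%}")
print(f"assuming 60% growth: {optimistic:.0%}")
print(f"assuming 10% growth: {pessimistic:.0%}")
```

Starting from the same base-year figures, the higher growth assumption eliminates poverty entirely while the lower one leaves a fifth of this made-up population below the line.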
Rishabh Kumar, who teaches economics at the University of Massachusetts Boston, says that in the absence of representative income data in India, it is not possible to calculate how much an average Indian earns. “It is easy to know the income of the richest Indians through the income tax data provided by the tax department. But, every economist has to rely on their own mathematical models to calculate the income levels of the bottom 40%. It not only affects the policy making of the government but also the political choice of the electorate in a democracy,” he says.
It is interesting, though, that even as Bhalla, who is perceived to be backing the government on economic issues, has said that India has eradicated extreme poverty, Bibek Debroy, head of the Economic Advisory Council to the Prime Minister, accepted recently that it was not possible to estimate the extent of poverty in India without considering the CES data. He added that the government could start collecting the CES data later this year.
Disagreements Within and Outside
The debate over the quality of data has intensified under the current government, which has on many occasions dilly-dallied on providing data to back its claims of developmental and other achievements. In January this year, the Supreme Court asked the government to submit proof of its claim that during Covid-19 “nobody died of starvation”. The Centre avoided taking responsibility for the claim by submitting to the court that no state had provided it data on starvation deaths, thus passing the buck.
The Centre manages to get away with not having data while dealing with domestic institutions of accountability. But the situation changes in the international arena. In April, a World Health Organization (WHO) report claimed that India had the highest Covid-19 mortality in the world, with 4.7 million dying of the disease between March 2020 and December 2021. The figure was 10 times the official government count. In the absence of all-cause mortality data at the national level in India, the WHO used its own assumptions to arrive at the figure of 4.7 million deaths.
India has been at loggerheads with international organisations on other occasions as well. Earlier this year, finance minister Nirmala Sitharaman dismissed the World Inequality Report, published by the World Inequality Lab, as flawed, questioning its methodology. The report stated that India was a poor and unequal country, with the top 1% of the population holding more than one-fifth of the total national income in 2021, while the bottom half held just 13% of it.
Another report that did not go down well with the Union government was the Global Hunger Index, which ranked India below its poorer neighbours Pakistan, Nepal and Bangladesh.
India’s Antiquated Data Machine
In 2019, two members of the NSC, including Mohanan, resigned after the government refused to publish a study conducted by the National Sample Survey Office that estimated unemployment in India to be at a 45-year high. The NSC had vetted the study.
Even Gita Gopinath, chief economist for the International Monetary Fund, had raised the issue of “transparency” over data collection with the government in 2019.
Mohanan says that the nature of the Indian economy has changed drastically as consumption habits and patterns of its citizens have changed. He adds that to improve the quality of its data, the country needs to increase the resources available for collecting it. “The NSSO has the same strength that it had in the 1980s. The sample size is also static at 1.15 lakh, even though our population has doubled since the time we changed our methodology. … [To] have a better methodology that is representative of a larger population, we need to increase resources to collect data,” he says.
To put the controversies surrounding poverty estimates in India to rest, the government has decided to launch a new CES from July 1, using a different methodology from the earlier ones. But as things stand today, this proposal is already being questioned by economists. Under the new survey, there will be additional questions on the government’s free food grain programme, and instead of one visit, the survey will be completed in three visits to the respondent’s home. Since the new methodology will be based on an additional set of questions and two additional visits, the findings will not be comparable to previous surveys and estimates of poverty in the country.
If India wants to find a solution to the problem of poverty, it must allow relevant data to be collected and analysed on time in a structured way with definitional uniformity. And, while dealing with the academics working on poverty data, policymakers will do well to remember another statement by Ronald Coase: “I hate when people ask me to massage the data.”