It has been a productive time in India thus far, and a homecoming of sorts for me, a researcher based in the US and working on India. The productivity of Indian researchers (i.e. researchers on India) can be measured through their projects and, more importantly, their publication records, and a cursory glance at any of the many social science journals shows that Indian researchers based outside India are very productive. But I always wondered how they managed to contextualize their work so closely to India and to seem so well-informed and rooted. After about three weeks, I can submit that engaging and collaborating with researchers in India are likely important elements of being a good Indian researcher working from elsewhere. One of the biggest lessons I have learned is the need to disaggregate analysis by region: to look at trends in demographic events of interest separately by region or state, because they are all very diverse. And more often than not, I've been told that a more descriptive or qualitative research piece may be needed to substantiate findings based on number-crunching. Fertility, the subject area of my research, is well known to vary regionally, particularly sex ratios at birth and child sex ratios. So I always knew that any analysis I do would need to be done at the sub-national level. But when it comes to the factors determining fertility or son-preference behaviors, I am hearing that even state-level analysis would be too broad-brush.

As a quantitative researcher in Demography, I deal with aggregates. If my analysis uses sample data, I attempt to claim that my results are representative of a larger population: the population from which the sample is drawn. Of course, if I could use Census data, the problem of representativeness would never arise. But Census data is scarcely available at the disaggregated level to begin with, and my statistical techniques require, more often than not, micro- or individual-level data. So my use of Census data is limited to district- or state-level indicators of health and mortality, and does not extend to individual or household characteristics and behaviors. So, I fall back on survey data, as do many, if not most, other researchers. But imagine being told that my use of survey data is inherently flawed, because I am attempting to generalize using characteristics that were never part of the original sampling methodology. Here is an example: let's say I want to analyze differences between Hindus and Muslims in a particular state of India. The survey that I think of using is a survey of thousands of households that is representative at the national and state levels. But a senior statistician who deals with surveys throws the biggest spanner I have ever seen into my relatively small works: he tells me that the survey never attempted to generalize for the Hindu and Muslim populations of the state or country, and so nothing I say is actually likely to be 'generalizable'. Aha, I say. Well, I respond, this would be true only if the Hindu and Muslim populations differed substantially on the indicators on the basis of which the sample was drawn. In the case of the survey I am using, the sample was drawn on the basis of the population of ever-married women in the age group of 14-49 and urban/rural residence, both according to the Census. I don't remember the differences off the top of my head, but I say that they may be small.
At this point, the senior statistician sits back, smiles, and says this is the problem with most analyses using surveys, but that neither he nor his organization ever brings it up. I ask why, and what about the use of sampling weights, and he dismisses sampling weights and says they don't say anything because everybody-is-like-this-only!
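For readers unfamiliar with sampling weights, here is a minimal sketch of what applying them to a subgroup comparison looks like. Everything in it is hypothetical: the records, the weights (taken to be inverse selection probabilities within urban/rural strata), and the function name `subgroup_estimates` are made up for illustration and do not come from the survey discussed here. It only shows how weighted and unweighted shares can diverge for groups that were not part of the sample design.

```python
from collections import defaultdict

# Hypothetical toy records: (stratum, religion, sampling_weight, outcome).
# outcome is 1 if the respondent reports the behavior of interest, else 0.
# Rural respondents carry larger weights because, in this made-up design,
# each one stands in for more people in the population.
records = [
    ("urban", "Hindu", 100.0, 1),
    ("urban", "Hindu", 100.0, 0),
    ("urban", "Muslim", 100.0, 1),
    ("rural", "Hindu", 300.0, 0),
    ("rural", "Hindu", 300.0, 0),
    ("rural", "Muslim", 300.0, 1),
    ("rural", "Muslim", 300.0, 0),
]


def subgroup_estimates(rows):
    """Return {group: (unweighted_share, weighted_share, n)} per group."""
    n = defaultdict(int)          # respondents per group
    y_sum = defaultdict(float)    # unweighted sum of outcomes
    w_sum = defaultdict(float)    # sum of weights
    wy_sum = defaultdict(float)   # weighted sum of outcomes
    for _stratum, group, weight, outcome in rows:
        n[group] += 1
        y_sum[group] += outcome
        w_sum[group] += weight
        wy_sum[group] += weight * outcome
    return {
        g: (y_sum[g] / n[g], wy_sum[g] / w_sum[g], n[g])
        for g in n
    }


if __name__ == "__main__":
    for group, (raw, weighted, size) in subgroup_estimates(records).items():
        print(f"{group}: n={size}, unweighted={raw:.3f}, weighted={weighted:.3f}")
```

The weighted share for a group is the ratio of the weighted outcome total to the weight total within that group. Note what this sketch leaves out: it produces point estimates only, and says nothing about how precise they are when a subgroup is small or unevenly spread across strata, which is closer to the statistician's objection and would need design-based variance estimation.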

So, it may be true that this is the way it is and has been, and I take assurance from the fact that numerous studies published before mine have used survey data without even mentioning this issue. But I haven't been able to dismiss the idea. I need to investigate this whole issue a little further. I also need to find the 'appropriate' level of aggregation for my research. As luck would have it, I am sitting with a team fielding one of the biggest nationally representative surveys in the country. Representation, though, would be state-level. More on that later.



About avisaria