Essential health information available for India in the public domain on the internet

Background: Health information and statistics are important for planning, monitoring and improvement of the health of populations. However, the availability of health information in developing countries is often inadequate. This paper reviews the essential health information readily available in the public domain on the internet for India in order to broadly assess its adequacy and inform further development.
Methods: The essential sources of health-related information for India were reviewed. An extensive search of relevant websites and the PubMed literature database was conducted to identify the sources. For each essential source the periodicity of the data collection, the information it generates, the geographical level at which information is reported, and its availability in the public domain on the internet were assessed.
Results: The available information related to non-communicable diseases and injuries was poor. This is a significant gap as India is undergoing an epidemiological transition with these diseases/conditions accounting for a major proportion of disease burden. Information on infrastructure and human resources was primarily available for the public health sector, with almost none for the private sector, which provides a large proportion of the health services in India. The majority of the information was available at the state level, with a negligible amount at the district level, which is a limitation for the practical implementation of health programmes at the district level under the proposed decentralisation of health services in India.
Conclusion: This broad review of the essential health information readily available in the public domain on the internet for India highlights that the significant gaps related to non-communicable diseases and injuries, the private health sector and district level information need to be addressed to further develop an effective health information system in India.


Background
Health information and statistics are important for planning, monitoring and improvement of services for the health of populations [1]. These are essential for policymakers and programme planners to inform their decisions about what actions to take and what services to provide in order to improve the health of the populations they serve. Though developing countries account for the majority of the global burden of disease, the availability of health information is not adequate in many of these countries [1][2][3]. The lack of quality health information has become more apparent in recent years with the Millennium Development Goals and demand from international organisations for monitoring and evaluation data on health programmes supported by them [4].
The Government of India in its National Health Policy of 2002 acknowledged the absence of systematic and scientific population health statistics as a major deficiency in India [5]. There have been a few recent efforts to strengthen some of the data sources in India [6-9]. While the data sources themselves require strengthening, it is important that the information generated by them is made available in the public domain so that it can be utilised by a variety of stakeholders. Against this background, as an initial step, we reviewed the essential health information readily available in the public domain on the internet for India.

Methods
We reviewed the essential sources of health-related information for India as identified by AbouZahr and Boerma [10]. The search strategy included extensive searches of the various websites of the Government of India and the relevant national and international organisations using Google [11] and search of the PubMed literature database [12].

Identification of available data
The ten essential sources of information included in this report are: census; birth and death registration; surveillance and response systems; household surveys; service generated data; mapping of health facilities; behavioural surveillance; national health accounts; financial and management information; and modelling, estimates and projections [10]. These information sources provide data at a variety of levels: household, patient, health facility, district, state and national. Health research, one of the other essential sources of information [10], was not included in this review as it is a major topic by itself and we have previously reported on health research output from India [13].
As a first step, extensive searches were carried out for the essential sources and their outputs on websites of the government, national health programmes, nongovernmental institutions, and international agencies relevant for the health information system in India. These searches provided further useful links to other relevant websites that provided information on essential sources of population health information in India. In addition, for each essential source a Google search was carried out using "India" AND the essential source related terms. For example, for birth and death registration, the search combinations of "birth registration, death registration, or sample registration system AND India" were used. A similar strategy was used to search PubMed for publications from 1990 onwards on essential sources of health information in India. From these various searches, the organisations or programmes that yielded usable information regarding the content, process or outputs of essential sources of health information for India are shown in Table 1.
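The search combinations described above can be sketched as a small helper that builds one boolean query string per essential source. This is only an illustrative sketch: the function name `build_queries`, the quoting style and the dictionary layout are assumptions made here, not the authors' actual tooling, and the 1990 date restriction used for PubMed is not modelled.

```python
def build_queries(terms_by_source, country="India"):
    """Build one boolean query per essential source.

    terms_by_source maps a source name to its related search terms.
    The terms are OR-ed together and the result is AND-ed with the country,
    mirroring the '"India" AND source-related terms' pattern in the text.
    (Helper name and query syntax are hypothetical.)
    """
    queries = {}
    for source, terms in terms_by_source.items():
        joined = " OR ".join(f'"{term}"' for term in terms)
        queries[source] = f'({joined}) AND "{country}"'
    return queries


# Example taken from the text: birth and death registration.
example = build_queries({
    "birth and death registration": [
        "birth registration",
        "death registration",
        "sample registration system",
    ],
})
print(example["birth and death registration"])
```

Keeping the terms in a dictionary keyed by essential source makes it straightforward to reuse the same term lists for both the web search and the literature search.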

Assessment of available data
For each essential information source, the periodicity of data collection, the information it generated, the geographical level at which this information is reported, and the extent of its availability in the public domain were assessed. Additional searches and reviews were conducted, if needed, of the relevant reports and publications accessible in the public domain to assist with these assessments.
Periodicity was defined as the frequency with which data collection was carried out. The latest year of data collection is reported. The information generated by each essential source was classified into these five categories: mortality and causes of death, morbidity and health status, risk factors, service provision, and health resources [4]. Information on health resources was sub-categorised based on whether it was related to infrastructure, human resources or financing [14].
The available information was further assessed to understand how it related to the major causes of disease burden in India as estimated by the Global Burden of Disease Project [15]. The rationale for the use of the leading causes of disease burden was based on the idea that relevant health information for these causes at least should be available. All available information produced by the essential sources was examined. Mortality, morbidity, risk factor and service provision information related to the conditions was listed. A risk factor for each condition was considered if it was listed in the major publications examining risk factors or a review of risk factors in India [16][17][18]. Service provision indicators considered were the interventions that addressed the risk factors or treatment of a disease or condition.
Geographical level of information reported was assessed for the administrative level for which the information was available: national, state, district or city/town. The extent of availability in the public domain on the internet was rated based on whether all the information produced was freely available including reports, microdata and metadata where appropriate (+++); all types of information were available but with restrictions such as purchase cost (++); only some information was available, for example reports but no microdata (+); or no information was available in the public domain (0). Availability of microdata was not used as a criterion for the documents or sources that dealt with secondary data or were based on other sources.
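The four-level availability rating can be expressed as a simple decision rule. The sketch below is a hedged illustration only: the `Availability` record, its field names and the `rate_availability` function are hypothetical, and it assumes that metadata is expected from every source while microdata is expected only from primary data collection sources.

```python
from dataclasses import dataclass


@dataclass
class Availability:
    """What a source makes available online (field names are illustrative)."""
    reports: bool
    microdata: bool
    metadata: bool
    restricted: bool = False      # e.g. a purchase cost applies
    primary_source: bool = True   # False for secondary or derived sources


def rate_availability(a: Availability) -> str:
    """Return '+++', '++', '+' or '0' following the rubric in the text."""
    # Nothing at all in the public domain.
    if not any([a.reports, a.microdata, a.metadata]):
        return "0"
    # Assumption: microdata is only expected from primary sources.
    expected = [a.reports, a.metadata]
    if a.primary_source:
        expected.append(a.microdata)
    if all(expected):
        # Everything expected is available; restrictions lower the rating.
        return "++" if a.restricted else "+++"
    # Only some of the expected information is available.
    return "+"


print(rate_availability(Availability(reports=True, microdata=False, metadata=False)))
```

Separating "what is expected" from "what is available" keeps the microdata exemption for secondary sources explicit rather than buried in the rating thresholds.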

Results
The findings related to the characteristics of health information for India available from essential sources in the public domain on the internet are summarized in Table 2 [6, 8, 9, 16, ...].

Information generated
The essential sources generate all five categories of information, i.e. mortality and causes of death, morbidity and health status, risk factors, service provision, and health resources, but to varying degrees [...] [16]. Service provision information, such as coverage of interventions, is generated by a surveillance and monitoring system [26] [46,47,49,50] and financial and management information [55][56][57] (Table 2). Much of this deals with the numbers of public health facilities and the availability of equipment in these facilities. Estimates of [...]

Table footnotes:
¶ Only for the states of Uttar Pradesh and Bihar.
** Part of a regular household consumption survey, but these are the only reports on maternal and child health, and tobacco consumption available.
†† District level only for service provision; infrastructure and human resources information available for state level.
‡‡ At the time of writing this paper, reports, metadata and microdata available for the 2003 survey.
§§ Maps produced in 2007 using 2001 census data.
|||| Only for the states of Andhra Pradesh, Gujarat, Karnataka, Kerala, Madhya Pradesh, Maharashtra, Orissa, Tamil Nadu, Uttar Pradesh, West Bengal.
¶¶ Only reports available; these are not primary data collection sources, so microdata not expected.
*** Some results have been released in the form of a journal publication [53].
††† National Health Accounts for 2004/5 are being computed but the results were not available at the time of writing this paper.
‡‡‡ Regional estimates for 2004 were made available in 2008.

Mortality information generated by the essential sources other than modelled projections is available to some degree for the leading causes of disease burden in India except lower respiratory tract infections and HIV/AIDS. However, it is not complete for many conditions. For example, the perinatal mortality rate is available but there are no data available on the causes of perinatal death. Additionally, for many of the conditions, the only mortality information available has been generated by the World Health Organisation World Health Survey, which used verbal autopsy for sibling deaths. These estimates have limitations in the detail of the cause of death, the small sample of deaths examined, and the fact that for 46% of female deaths and 40% of male deaths the cause of death was not determined [34]. This lack of primary data on mortality is being partly addressed by the addition of verbal autopsy to the sample registration system, which is likely to provide mortality data for all major causes in India [6].
There are more sources generating information on morbidity for maternal and child health and communicable diseases than for non-communicable diseases. All of the available morbidity information is sourced from self-reporting of conditions in household surveys. The only exception to this is HIV prevalence. Although self-reporting of conditions in surveys is problematic, the use of biological tests in surveys to estimate population prevalence can be tedious for many conditions [4]. The use of other sources such as service records can help fill some of these gaps, although these estimates are prone to bias [4]. As with mortality information, modelled morbidity estimates for India have been produced by the Global Burden of Disease and Risk Factors study [15]. Additionally, the National Commission on Macroeconomics and Health produced modelled morbidity estimates for tuberculosis, HIV/AIDS, diarrhoeal diseases, blindness, mental health and cardiovascular disease [57].
Risk factor information is generated by a large number of essential sources in addition to the estimates by the Global Burden of Disease and Risk Factors Study. The majority of this information is generated by household surveys. Information on risk factors is more substantial for perinatal and maternal conditions than for the other conditions. The proposed Integrated Disease Surveillance Project is expected to enhance the risk factor information for non-communicable diseases and injuries, but outputs from this initiative are not available yet [9].
Service provision information is lacking for the leading non-communicable diseases and road traffic injuries. Only one essential source generates information on services for ischaemic heart disease, unipolar depression and road traffic injuries, and these estimates are based on data from a household survey conducted in only six states in India [34]. There is no information on services for cerebrovascular disease.
Overall, there is a significant lack of relevant information on non-communicable diseases and injuries, which now account for a major proportion of the disease burden in India [15]. Household surveys are the main source of primary information on the leading causes of disease burden in India. Their main focus is on communicable diseases and maternal and child health. The other essential sources from which information is available on the leading causes of disease burden in India are a surveillance system [8], service generated data [43][44][45].

Geographical level of information
The majority of information produced by the essential sources is reported at the state level (Table 2). The census produces information at the town and city level including demographic information, access to clean water, sanitation and use of cooking fuels [19]. The National Cancer Registry Programme generates information mainly at the city level [26]. The Reproductive and Child Health District Level Household Survey was designed to provide monitoring of the Reproductive and Child Health programme at the district level [29]. The Bulletin on Rural Health Statistics generates some information at the district level on numbers of public health facilities in each district [46]. The Revised National Tuberculosis Control Programme provides performance indicators for the programme at the district level [45].

Availability in the public domain
The availability of the information produced by the essential sources in the public domain on the internet was variable.

Discussion
Ready availability of essential health information is imperative for the development of informed and effective systems for improving health of societies. This paper provides a broad overview of the data readily available in the public domain on the internet related to essential health information in India. It highlights a number of issues that need to be addressed to improve the scope and availability of health information in India.
There is a lack of primary data on mortality and cause of death information for the majority of the leading causes of disease burden in India. While there are modelled estimates of causes of death generated by the Global Burden of Disease and Risk Factors project, there are minimal primary data on causes of death. These data would normally be generated by a complete death registration system, which is not present in India. The national birth and death registration system is estimated to cover about half of deaths in India [20]. The recent addition of verbal autopsy to the sample registration system is expected to provide all-cause mortality information to some degree in the near future [6]. The Integrated Disease Surveillance Project is also expected to contribute to this information, although no data are available yet [9].
There were substantial gaps in the available information on non-communicable diseases and injuries. This is significant as the epidemiological transition is well underway in India. Whereas previously maternal and child conditions and communicable diseases were responsible for the majority of the disease burden, more recently the rising burden of non-communicable diseases is being documented in India and is projected to increase [68,69]. While there is still need for information on maternal and child conditions and communicable diseases, there is also now additional need for information on non-communicable diseases and injuries. There have been some recent efforts to address this gap. The Integrated Disease Surveillance Project is planned to include surveillance of risk factors for non-communicable diseases and information on road traffic accidents [9]. The sample registration system is also expected to contribute information on a number of non-communicable disease risk factors [6].
Information on the health infrastructure and human resources is not complete as it focuses primarily on the public health system, while in the most recent National Family Health Survey 65% of the households in India reported seeking health care from the private sector [58]. Information is available for the public health system at all levels of health care including primary care, and the detailed distribution of human resources is also available. However, information available on the private health sector is restricted to the number of hospital beds and estimates of the size of the entire health workforce. These estimates of the total number of health workers are inadequate [70] and do not allow a good understanding of the distribution of the health workers within the private sector.
The lowest level for which the majority of the information generated by the essential sources is available is the state level. Although this information is useful, there are wide variations between districts within the states. It should be noted that there may be information available at the district level to administration and management officials, which has not been captured in this review as it is not readily available in the public domain.
The availability in the public domain of the information generated by the essential sources varied. The ready availability of health information, including primary data, informs a range of actions to improve population health and the health system and thus there is an increasing momentum for it to be available in the public domain [3]. There are some considerations which need to be taken into account when making data available in the public domain, one of which is maintaining confidentiality. The availability of data sets in the public domain from over 150 demographic and health surveys, including those for India, demonstrates that this is feasible [3].
A limitation of our analysis is that it covers information on the essential sources readily available in the public domain on the internet. However, as stated above, the ready availability of information generated by the health information system is crucial for use by all stakeholders to efficiently improve population health. Therefore, the findings in this paper are significant in highlighting what health information is readily available in the public domain in India and what is not, which would help bring attention to the major deficiencies that exist currently. Although we did extensive web searches for the essential sources of health information that were available in the public domain, it is possible that we could have missed some sources that were not readily available. However, it seems unlikely that this would have led to a substantially different message from this paper.
One of the essential sources, health research, was not examined in this paper [10]. The contribution of research to health information cannot be overstated as it often fills gaps in the information generated by the other sources and can guide conceptual development of health policy and systems [71].
All essential sources of health information complement each other and together produce a full picture of the health of a population. Thus a coordinated overall approach should be taken when strengthening health information systems, taking into account all essential sources. This will aid the effective use of scarce resources [4]. It has been suggested that streamlining of surveys and careful planning of a national survey programme will ensure all priority health topics are covered and costs minimised by avoiding duplication [4]. For example, there is overlapping information produced by a variety of sources on reproductive and child health in India. If this were streamlined, the resources saved could be utilised to generate the basic health information that is missing for non-communicable diseases and injuries.
The essential sources and the information they produce are just one component of a national health information system. The policy and leadership environment, infrastructure, and the dissemination and utilisation of information are the other important aspects of a national health information system [72]. A comprehensive framework for the assessment of all these components has recently been developed by the WHO Health Metrics Network [73]. The findings reported in this paper provide an initial broad understanding of the essential health information readily available in the public domain in India. Further detailed understanding of the major gaps identified would be needed in order to develop strategies to address them for strengthening the health information system of India.

Conclusion
This broad overview of the essential health information readily available in the public domain on the internet for India identified several weaknesses such as the lack of information on non-communicable diseases and injuries, primary data on causes of death, the private health sector and district level information. While some recent initiatives will help enhance the health information system of India, a systematic approach is needed to develop a