The 10/66 Dementia Research Group's fully operationalised DSM-IV dementia computerized diagnostic algorithm, compared with the 10/66 dementia algorithm and a clinician diagnosis: a population validation study

Background: The criterion for dementia implicit in DSM-IV is widely used in research but not fully operationalised. The 10/66 Dementia Research Group sought to do this using assessments from their one phase dementia diagnostic research interview, and to validate the resulting algorithm in a population-based study in Cuba.
Methods: The criterion was operationalised as a computerised algorithm, applying clinical principles, based upon the 10/66 cognitive tests, clinical interview and informant reports; the Community Screening Instrument for Dementia, the CERAD 10 word list learning and animal naming tests, the Geriatric Mental State, and the History and Aetiology Schedule – Dementia Diagnosis and Subtype. This was validated in Cuba against a local clinician DSM-IV diagnosis and the 10/66 dementia diagnosis (originally calibrated probabilistically against clinician DSM-IV diagnoses in the 10/66 pilot study).
Results: The DSM-IV sub-criteria were plausibly distributed among clinically diagnosed dementia cases and controls. The clinician diagnoses agreed better with the 10/66 dementia diagnosis than with the more conservative computerized DSM-IV algorithm. The DSM-IV algorithm was particularly likely to miss less severe dementia cases. Those with a 10/66 dementia diagnosis who did not meet the DSM-IV criterion were less cognitively and functionally impaired than the DSM-IV confirmed cases, but still grossly impaired compared with those free of dementia.
Conclusion: The DSM-IV criterion, strictly applied, defines a narrow category of unambiguous dementia characterized by marked impairment. It may be specific but incompletely sensitive to clinically relevant cases. The 10/66 dementia diagnosis defines a broader category that may be more sensitive, identifying genuine cases beyond those defined by our DSM-IV algorithm, with relevance to the estimation of the population burden of this disorder.


Background
Unlike its predecessor DSM-III-R [1], DSM-IV [2] does not specify a criterion for the diagnosis of dementia. However, this can be inferred from the common elements of the DSM-IV criteria for each of the dementia sub-type diagnoses. These are as follows:

The criteria (A-D) must all be satisfied:
A. The development of multiple cognitive deficits manifested by both
   1. memory impairment (impaired ability to learn new information or to recall previously learned information)
   2. one (or more) of the following cognitive disturbances:
      a. aphasia (language disturbance)
      b. apraxia (impaired ability to carry out motor activities despite intact motor function)
      c. agnosia (failure to recognize or identify objects despite intact sensory function)
      d. disturbance in executive functioning (i.e., planning, organizing, sequencing, abstracting)
B. The cognitive deficits in Criteria A1 and A2 each
   1. cause significant impairment in social or occupational functioning, and
   2. represent a significant decline from a previous level of functioning.
C. The deficits do not occur exclusively during the course of a delirium.
D. The disturbance is not better accounted for by another axis I disorder (for example, major depressive disorder, schizophrenia).

This criterion has been widely used in both clinical and epidemiological research. There is strong face validity. It defines a progressive and relatively pervasive disorder. It seeks to distinguish between dementia on the one hand, and potentially remediable cognitive impairment arising from delirium or mental disorder on the other. Its elements are, for the most part, objectively verifiable. The main weakness is the lack of operational definition. What constitutes memory impairment, or cognitive disturbance? What is a significant impairment in functioning? What represents a significant decline in functioning? When is the disturbance better accounted for by another axis I disorder? The usual practice is for these questions to be settled according to clinical judgment (in research often by a consensus panel of expert diagnosticians). However, even when structured assessments have been used, the lack of clarity in these areas introduces much scope for unreliability. For the older DSM-III-R criteria, reliability was poorer between international research teams than within them [3]. A WHO assessment of cross-national reliability of application of the DSM-III-R criteria identified worryingly poor reliability for certain elements [4].
The 10/66 Dementia Research Group has as one of its main objectives to compare the distribution and determinants of dementia in different world regions. We have already described an approach for identifying persons with probable dementia in a one phase study design, using a probabilistic predictive algorithm for '10/66 dementia' derived and tested against the criterion of DSM-IV dementia [5]. The algorithm seemed relatively 'education-fair', that is, the false positive rate among those with low levels of education was low, and validity was established for a variety of countries and cultures. 10/66 DRG population-based studies are now underway in Cuba, Brazil, Dominican Republic, Venezuela, Mexico, Argentina, Peru, India, China and Nigeria. We also wished to apply the DSM-IV criterion directly, and had extended the scope of our one phase assessment to ensure that the necessary data were recorded. The purpose of this paper is to describe our full computerized operationalisation of the DSM-IV criterion, and then to test its validity against local clinician dementia diagnosis and the 10/66 dementia diagnosis, using data from the Cuban population-based study.

Design
We have previously published a comprehensive description of the full protocol for the 10/66 population-based surveys [6]. In the Cuban 10/66 population-based study participants were recruited from eight catchment areas defined by proximity to polyclinics in Havana city and Matanzas. All those aged 65 and over living within the catchment area boundaries were included. Eligible participants were identified from polyclinic registers backed up by systematic door knocking of all households within the catchment area. The DSM-IV dementia and 10/66 dementia survey diagnoses were derived from the structured assessments described below using computerized algorithms, which were run when the full survey was completed. In six of the eight Cuban catchment areas, all participants were interviewed by polyclinic psychiatrists and physicians (one for each catchment area) who were experienced dementia diagnosticians. They were asked to make clinical dementia diagnoses according to DSM-IV criteria, at the end of each individual assessment. At this point, the clinician could not have been aware of the output of the computerized 10/66 and DSM-IV algorithms, and, given their complexity, would have found it difficult to guess the result. This clinical diagnosis was then used as the criterion validation for the computerized DSM-IV algorithm, described below. We also examined the correspondence between the 10/66 Dementia Diagnostic algorithm and clinical diagnosis, and, in the full sample, between the DSM-IV computerized algorithm and the 10/66 Dementia Diagnostic algorithm.

Measures
The individual cognitive tests used in the 10/66 population-based studies were all validated against a dementia criterion in the earlier pilot phase of the 10/66 programme [5].
1) The Community Screening Interview for Dementia (CSI 'D') [7] consists of a 32 item cognitive test administered to the participant (20 minutes) and a 26 item informant interview, enquiring after the participant's daily functioning and general health (15 minutes). Three summary scores can be generated from the CSI 'D': a) the cognitive score (COGSCORE), a summary score from the participant cognitive test, with different weightings applied to some items; b) the informant score (RELSCORE), an unweighted total score from the informant interview; and c) the discriminant function score (DFSCORE), combining COGSCORE and RELSCORE into a single score, using the formula DFSCORE = 0.452322 - (0.01669918 * COGSCORE) + (0.03033851 * RELSCORE). For the COGSCORE and DFSCORE there are validated cutpoints suggestive of probable and possible cases of dementia.
2) The animal naming verbal fluency task [8] from the Consortium to Establish a Registry for Alzheimer's Disease (CERAD) test battery. Participants are encouraged to name as many different animals as they can in the space of one minute.
3) The adapted CERAD ten word list learning task improved the discrimination of the Hindi Mental State Examination in the Indo-US Ballabgarh dementia study [9]. Six words were taken from the original CERAD battery English language list; butter, arm, letter, queen, ticket, and grass. Pole, shore, cabin, and engine were replaced with corner, stone, book and stick, which were deemed more culturally appropriate [9]. In the learning phase, the list is read out to the participant, who is then asked to recall straight away the words that they remember. This process is repeated three times, giving a total learning score out of 30. Five minutes later the participant is again asked to recall the 10 words, giving a delayed recall score out of 10.
4) The Geriatric Mental State (GMS/AGECAT) [10,11]. This clinical interview generates, from a computerised algorithm (AGECAT), levels of psychopathology (0 = noncase; 1 and 2 = subcase; 3, 4 and 5 = mild, moderate and severe case) within nine diagnostic clusters: organic brain syndrome (dementia), schizophrenia, mania, neurotic and psychotic depression, obsessional, hypochondriacal, phobic and anxiety neuroses.
These assessments were supplemented by a structured physical and neurological assessment and an extended informant interview, the History and Aetiology Schedule - Dementia Diagnosis and Subtype (HAS-DDS), a modification of the earlier HAS [12] to focus specifically on information relevant to DSM-IV dementia diagnostic criteria [2], and for dementia subtype diagnosis. With reference to the DSM-IV dementia criteria the HAS-DDS covered the onset and course of the reported cognitive/functional impairment, evidence of impairment in gnosis, praxis and executive function, and the presence or absence of delirium.
Three further assessments not included in either the DSM-IV or 10/66 algorithms were used to test for concurrent validity. These were: 1) activity limitation and participation restriction measured by the WHO-DAS II [13], developed by the WHO as a culture-fair assessment tool for use in cross-cultural comparative epidemiological and health services research; 2) behavioural and psychological symptoms of dementia assessed by the Neuropsychiatric Inventory (NPI-Q) [14]; 3) dependency (whether the participant needed no care, some care or much care) ascertained by a series of open-ended questions administered to the informant.
We assessed the severity of dementia in all participants using a computerised operationalisation of the Clinical Dementia Rating (CDR) [15].
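The CSI 'D' discriminant function score quoted in the Measures list is a simple linear combination, and can be sketched directly from the published formula. The function name and the example inputs are illustrative assumptions, not part of the 10/66 software, and the validated cutpoints are not reproduced here because they are not given in this paper.

```python
def dfscore(cogscore: float, relscore: float) -> float:
    """CSI 'D' discriminant function score, using the formula quoted in the text.

    Higher COGSCORE (better cognition) lowers the score; higher RELSCORE
    (more informant-reported impairment) raises it.
    """
    return 0.452322 - (0.01669918 * cogscore) + (0.03033851 * relscore)


# Illustrative inputs only (not study data).
example = dfscore(cogscore=25.0, relscore=4.0)
print(f"DFSCORE = {example:.4f}")
```

Because the two coefficients have opposite signs, a participant with a poor cognitive test score and a high informant score receives a markedly higher DFSCORE than one with only one source of evidence.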

10/66 dementia diagnosis
10/66 dementia diagnosis is defined as those scoring above a cutpoint of predicted probability of DSM-IV Dementia syndrome [2] from the logistic regression equation developed in the 10/66 international pilot study, using coefficients from the GMS, CSI-D informant and cognitive test interviews and the modified CERAD 10 word list learning tasks [5].

Operationalisation of the DSM-IV criteria for the computerised DSM-IV algorithm
We settled on a 'top down' approach in which we reviewed the data available in the 10/66 population-based study interview and mapped the information collected onto the DSM-IV criterion applying clinical principles. We sought to operationalise the decision-making process of a competent clinician.

Criterion A1 (memory impairment)
Memory was assessed objectively by several items in the CSI'D', which were summed to form a memory subscale, and by the CERAD 10 word list with immediate learning and delayed recall. Three memory scores were thus derived. Additionally, at the end of the GMS interview the interviewer makes an objective rating of memory function. Finally, in the informant section of the CSI'D', the informant is asked about declining memory in general, and the frequency of six specific and characteristic memory lapses; forgetting where s/he has put things, where things are kept, names of friends, names of family, when s/he last saw informant, and what happened the day before.
To define impairment on memory tests, we used the threshold applied in the criterion for mild cognitive impairment; 1.5 standard deviations below the mean for non-demented persons. We differ from the MCI criteria in using education as well as age specific norms. Memory function in older people free of dementia is strongly influenced by level of education, and it seemed important to identify memory impairment only among those performing worse than most of their peers. 1.5 standard deviations below the mean corresponds approximately to the 7th centile of a normal distribution. While it was judged important to set the threshold at the level of mild impairment to ensure sensitivity for detection of early dementia, impaired performance on one test may be explained by extraneous factors; consistent poor performance is much more likely to reflect genuine impairment. In our algorithm, the presence of more unequivocal cognitive impairment on multiple objective tests lowers the degree of corroboration required from the informant report, and vice versa. At the extremes, consistent cognitive impairment on testing means that the criterion is met even in the absence of informant report, and very high levels of informant reported impairment mean that the criterion is met even in the absence of evidence of impairment on formal testing.
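The threshold described above can be illustrated with a short sketch. It checks the stated correspondence between 1.5 standard deviations below the mean and roughly the 7th centile, and applies the cutoff to a score given an age and education specific norm. The norm values in the example are hypothetical, and treating a score strictly below the cutoff as impaired is an assumption, since the text does not say how a score exactly at the cutoff is handled.

```python
from statistics import NormalDist


def impairment_cutoff(norm_mean: float, norm_sd: float, k: float = 1.5) -> float:
    """Score lying k standard deviations below the norm for the peer group."""
    return norm_mean - k * norm_sd


def is_impaired(score: float, norm_mean: float, norm_sd: float, k: float = 1.5) -> bool:
    """True if the score falls below the cutoff (strict inequality is an assumption)."""
    return score < impairment_cutoff(norm_mean, norm_sd, k)


# Proportion of a normal distribution lying below -1.5 SD (about 0.067).
centile = NormalDist().cdf(-1.5)
print(f"Proportion below -1.5 SD: {centile:.3f}")

# Hypothetical norm for one age and education stratum: mean 5.0, SD 2.0
# on a delayed recall score out of 10. The cutoff is then 2.0.
print(is_impaired(score=1, norm_mean=5.0, norm_sd=2.0))
print(is_impaired(score=4, norm_mean=5.0, norm_sd=2.0))
```

Using stratum-specific norms means the same raw score can count as impaired in a highly educated participant and unimpaired in a participant with little education, which is the stated intention of the approach.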

Criterion A2 (at least one of aphasia, apraxia, agnosia or disturbance in executive functioning)
With the 10/66 assessments, impairment in each of these domains of cognitive functioning can be established through objective cognitive tests, and informant opinion of recent decline.

a) Criterion A2a. Aphasia
Both language expression and comprehension are tested in the CSI'D', yielding a six point language subscale. In our pilot study this was somewhat lacking in variance, with a pronounced ceiling effect, and with little effect of age and education. Therefore the same cutpoints were applied to all participants, but as for memory impairment, more convincing evidence of impairment on formal testing lowered the threshold for supporting informant reports, and vice versa.

b) Criterion A2b. Apraxia
Praxis was tested by three items in the CSI'D'; taking a sheet of paper, folding it in half and placing it in the lap, and copying interlocking circles and pentagons. These items were used to construct a five point apraxia scale. Two informant report items in the CSI'D' were relevant, those referring to difficulty feeding and difficulty dressing, not explained by physical disability. Here, given the generally poor performance on the praxis test items of those with little education, apparent impairment on formal testing only fulfils the criterion if corroborated by an informant report of at least some degree of impairment.

c) Criterion A2c. Agnosia
In the CSI'D' the participant is asked to recognize and name four objects (pencil, watch, chair, and shoe) and three parts of the body (knuckle, elbow and shoulder) generating a seven point agnosia scale. In addition, in our extended informant interview, the informant is asked if the participant misidentifies family or friends. The criterion is fulfilled if the informant reports misidentification and/or if two or more errors are made on formal testing.

d) Criterion A2d. Disturbance in executive functioning
Executive functioning was tested using the palm-fist-hand test from the Luria battery of frontal lobe tasks, and also by the animal naming test (testing verbal fluency) from the CERAD battery. In the case of the animal naming test, with much of the variance explained by age and education in non-demented persons, we again used 1.5 standard deviations below the age and education norm as the threshold for impairment. In the informant section of the CSI'D' the informant was asked whether the participant experienced difficulty in adjusting to change in routine (frequently or occasionally) and whether there was a change in their ability to think and reason. As part of our extended informant interview we also enquired after difficulty in making decisions, and whether thinking had seemed muddled. Informant reports on these four aspects of executive function were formed into an ad hoc informant scale. Given the lack of domain specificity of the animal naming test (impaired for example by poor concentration) the sub-criterion was not considered to be fulfilled unless impairment in this test was corroborated by some degree of executive impairment reported by the informant. Failure on the palm-fist-hand test was considered sufficient in itself.
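The combination rules stated in prose for the A2 sub-criteria can be sketched as boolean logic. This is a minimal illustration rather than the group's algorithm: the field names are hypothetical, aphasia (A2a) is passed in as an already evaluated flag because its graded test and informant thresholds are not encoded here, 'some degree' of informant-reported impairment is reduced to a single boolean, and only the routes described in the text above are encoded, so the full algorithm may contain further ones.

```python
from dataclasses import dataclass


@dataclass
class A2Evidence:
    aphasia_met: bool                   # A2a, evaluated elsewhere
    praxis_test_impaired: bool          # A2b, formal CSI'D' praxis items
    informant_praxis_impaired: bool     # A2b, difficulty feeding or dressing
    agnosia_test_errors: int            # A2c, errors on the seven naming items
    informant_misidentifies: bool       # A2c, misidentifies family or friends
    palm_fist_hand_failed: bool         # A2d, Luria palm-fist-hand test
    animal_naming_impaired: bool        # A2d, below the 1.5 SD norm
    informant_executive_impaired: bool  # A2d, some impairment on informant scale


def apraxia_met(e: A2Evidence) -> bool:
    # Test impairment counts only if corroborated by the informant.
    return e.praxis_test_impaired and e.informant_praxis_impaired


def agnosia_met(e: A2Evidence) -> bool:
    # Informant-reported misidentification and/or two or more test errors.
    return e.informant_misidentifies or e.agnosia_test_errors >= 2


def executive_met(e: A2Evidence) -> bool:
    # Palm-fist-hand failure suffices alone; animal naming needs corroboration.
    return e.palm_fist_hand_failed or (
        e.animal_naming_impaired and e.informant_executive_impaired
    )


def criterion_a2_met(e: A2Evidence) -> bool:
    # A2 requires at least one of the four cognitive disturbances.
    return e.aphasia_met or apraxia_met(e) or agnosia_met(e) or executive_met(e)
```

The asymmetry in `executive_met` mirrors the text: a test with poor domain specificity needs informant corroboration, whereas a palm-fist-hand failure is accepted on its own.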

Criterion B1
This criterion requires that the cognitive deficits in Criteria A1 and A2 each cause significant impairment in social or occupational functioning. Several of the Geriatric Mental State items elicit self-reported memory impairment, including impact on functioning. The participant is asked if memory impairment is a problem for him/her, if they forget names of family or close friends, if they forget where they have placed things, and if they have to make a greater effort to remember things. These items are only rated positive if they occur frequently and cause regular inconvenience. One informant (CSI'D') item is also relevant, where the informant is asked whether memory is a particular problem for the participant. Several CSI'D' items address general impairment in social and occupational functioning likely to have arisen as a consequence of impairment in language, praxis, gnosis or executive function; that is, diminution in range of activities and/or reduced ability to carry out activities, loss of hobby or skills, problems with household chores (not accounted for by physical disability), change in ability to handle money, getting lost inside of the home or in the community. The B1 criterion was considered to be present where the participant or the informant reported memory impairment that was hindering real-life functioning, and the informant reported one or more examples of general social/occupational impairment.

Criterion B2
This criterion requires that the cognitive deficits described in A1 and A2 represent a significant decline from a previous level of functioning. In the absence of repeated measures of cognitive function, the only recourse is to the informant history. In the CSI'D' the informant is asked if there has been 'a change in activities', and a 'general decline in mental function'. The HAS-DDS, introduced for the population-based studies to define course and outcome more precisely, asks the informant whether overall deterioration, overall improvement or no change has occurred since the onset of the condition, whether there has been gradual decline over a period of two or more years, and whether the intellectual impairment dates from birth or pathology earlier in life. The criterion is established if the informant reports either 'a change in activities', or a 'general decline in mental function' or an overall deterioration or gradual decline over two or more years, provided that the impairment is not considered to date from birth or early life.
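The B1 and B2 rules described above reduce to two short boolean functions. The sketch below assumes each interview item has already been coded as true or false (or, for the general impairment items, counted); the parameter names are hypothetical and do not correspond to variable names in the 10/66 data sets.

```python
def criterion_b1_met(
    participant_reports_memory_problem: bool,
    informant_reports_memory_problem: bool,
    n_general_impairments_reported: int,
) -> bool:
    """Memory impairment hindering functioning (participant or informant)
    AND at least one informant-reported example of general
    social/occupational impairment."""
    memory_impact = (
        participant_reports_memory_problem or informant_reports_memory_problem
    )
    return memory_impact and n_general_impairments_reported >= 1


def criterion_b2_met(
    change_in_activities: bool,
    general_decline_in_mental_function: bool,
    overall_deterioration: bool,
    gradual_decline_two_plus_years: bool,
    dates_from_birth_or_early_life: bool,
) -> bool:
    """Any informant report of decline, provided the impairment is not
    considered to date from birth or early life."""
    declined = (
        change_in_activities
        or general_decline_in_mental_function
        or overall_deterioration
        or gradual_decline_two_plus_years
    )
    return declined and not dates_from_birth_or_early_life
```

Note that B2 depends entirely on informant history, so in this sketch a lifelong intellectual impairment overrides any report of decline.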

Criterion C
This criterion requires that the deficits do not occur exclusively during the course of a delirium. Information relevant to this criterion is gathered in the HAS-DDS from an informant; time of onset, type of onset, onset as a result of stroke, presence or absence of clouding of consciousness and confusion worse towards the end of the day. Our algorithm determines that the deficits do occur exclusively in the course of a delirium if the onset was in the last month with sudden onset over 1-3 days, and with either clouding of consciousness (the HAS-DDS item 'changeable over 24 hours, alert at one time, drowsy and confused the next') or confusion worse towards night or evening.

Criterion D
This criterion requires that the disturbance is not better accounted for by another axis I disorder (for example, major depressive disorder, schizophrenia). Our algorithm determines that the disturbance is better accounted for by another axis I disorder if the stage 2 GMS/AGECAT diagnosis is either schizophrenia or depression, and dementia diagnosis is not confirmed by the 10/66 dementia diagnostic algorithm.
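The exclusion rules for Criteria C and D, and the requirement that all of A to D be satisfied, can be sketched as follows. The function and parameter names, and the string labels used for the GMS/AGECAT diagnosis, are assumptions made for illustration; the sub-criterion flags are taken as already evaluated.

```python
def delirium_only(
    onset_in_last_month: bool,
    sudden_onset_1_to_3_days: bool,
    clouding_of_consciousness: bool,
    confusion_worse_at_night: bool,
) -> bool:
    """Criterion C exclusion: recent, sudden onset with either clouding of
    consciousness or confusion worse towards night or evening."""
    return (
        onset_in_last_month
        and sudden_onset_1_to_3_days
        and (clouding_of_consciousness or confusion_worse_at_night)
    )


def better_explained_by_axis1(agecat_diagnosis: str, ten66_dementia: bool) -> bool:
    """Criterion D exclusion: AGECAT schizophrenia or depression, and dementia
    not confirmed by the 10/66 algorithm. Labels are hypothetical."""
    return agecat_diagnosis in {"schizophrenia", "depression"} and not ten66_dementia


def dsm_iv_dementia(
    a1: bool, a2: bool, b1: bool, b2: bool, delirium: bool, other_axis1: bool
) -> bool:
    """All of A1, A2, B1 and B2 must be met, with neither exclusion applying."""
    return a1 and a2 and b1 and b2 and not delirium and not other_axis1
```

Because every sub-criterion must be met, a single false negative on any one of them is enough to reject a case, which is one way of seeing why a strictly applied conjunctive rule tends to be specific but less sensitive.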

Analyses
10/66 dementia algorithm cases and DSM-IV cases are compared with the Cuban clinician interviewers' judgment of dementia caseness as an external criterion, with sensitivity, specificity and kappa for agreement.
The prevalence of the DSM-IV sub-criteria and their individual elements is compared between clinically diagnosed dementia cases and high education and low education controls, free of dementia, in the 10/66 Cuban population-based study data sets.
10/66 dementia cases not confirmed by the DSM-IV computerized algorithm are compared with confirmed DSM-IV cases and controls free of dementia according to the following characteristics: mean CSI'D' COGSCORE and RELSCORE, NPI-Q (behavioural and psychiatric symptoms of dementia) severity and distress scores [14], WHO-DAS II disability [13], Clinical Dementia Rating (CDR) [15] and the frequency of reports of needing much, some or no care; using independent sample t-tests, and Chi squared tests for trend as appropriate.

Results
In Cuba, 2909 interviews were completed with 96.4% of all eligible persons responding. All non-response was accounted for by refusals.
In six of the eight Cuban catchment areas, all participants (n = 1887) were interviewed by clinicians who were experienced dementia diagnosticians. 147 (7.8%) were identified by the clinicians as dementia cases, leaving 776 low education controls (completed primary or less) and 958 high education controls (completed secondary or more). After processing our survey data, we identified among these 1887 participants 192 dementia cases according to the 10/66 algorithm (10.2%) and 114 cases according to the computerised DSM-IV dementia algorithm (6.0%). Agreement with the clinician diagnosis was better for 10/66 dementia (Kappa 0.79 [95% CI 0.74-0.83], sensitivity 93.2%, specificity 96.8%) than for the DSM-IV computerised algorithm (Kappa 0.63 [95% CI 0.56-0.69], sensitivity 57.8%, specificity 98.3%). Across the six polyclinic centres, Kappas ranged from 0.63 to 0.86 for 10/66 dementia (median 0.81) and from 0.55 to 0.77 for DSM-IV dementia (median 0.60). According to the CDR severity of the clinically diagnosed cases, the DSM-IV criterion confirmed 86.7% of severe cases, 78.1% of moderate cases, 65.4% of the mild cases and none of the questionable cases. The 10/66 algorithm identified 100.0% of severe cases, 100.0% of moderate cases, 98.1% of mild cases and 77.4% of questionable cases.
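The agreement statistics reported above can be recomputed from a 2x2 table. The cell counts below are not given in the paper; they are back-calculated here from the reported totals (1887 participants, 147 clinician cases, 192 and 114 algorithm cases) and the reported sensitivities, so they should be read as an assumption. The sketch uses the standard formulas for sensitivity, specificity and Cohen's kappa.

```python
def agreement(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float, float]:
    """Sensitivity, specificity and Cohen's kappa for a 2x2 table, with the
    clinician diagnosis as the reference standard."""
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of the two classifications.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# Back-calculated counts (assumption): clinician cases = 147, non-cases = 1740.
tables = {
    "10/66": dict(tp=137, fp=55, fn=10, tn=1685),   # 192 algorithm cases
    "DSM-IV": dict(tp=85, fp=29, fn=62, tn=1711),   # 114 algorithm cases
}
for name, cells in tables.items():
    sens, spec, kappa = agreement(**cells)
    print(f"{name}: sensitivity {sens:.1%}, specificity {spec:.1%}, kappa {kappa:.2f}")
```

With these counts the sketch gives sensitivity 93.2%, specificity 96.8% and kappa 0.79 for 10/66 dementia, and 57.8%, 98.3% and 0.63 for the DSM-IV algorithm, matching the reported point estimates. Confidence intervals are not computed.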
Seventy-one percent of the clinically diagnosed cases met the A1 memory impairment criterion, an essential requirement for a DSM-IV diagnosis; 95% had impairments in one or more other domains of cognitive function (A2). All of the sub-criteria for cognitive impairment (A1 and A2) had a very low prevalence in the control groups, suggesting high specificity (Table 1). The only exception was the palm-fist-hand test from the Luria battery testing executive function, which was frequently 'failed' by those free of dementia. Consequently, more than one third of low education controls and nearly one fifth of high education controls also met the A2 sub-criterion. However, this had little impact on the overall specificity of the A criterion given the high specificity of the memory sub-criterion. 85% met the DSM-IV sub-criterion for social and occupational impairment (B1), and 82% met the sub-criterion for decline (B2) (Table 2). Combining these two elements, 73% of clinically diagnosed cases met the B criterion compared with around one tenth of high and low education controls. No participants were identified as suffering from delirium (Table 3). Only 1% of the clinically diagnosed cases were classified by the DSM-IV algorithm as 'better accounted for by another Axis 1 disorder' (Table 4). The distribution of DSM-IV sub-criteria by 10/66 dementia caseness was very similar to that previously observed for the clinician diagnoses (Table 5).
In the full sample of 2909 participants, 181 of the 315 10/66 dementia cases (58%) were confirmed as cases by the computerised DSM-IV algorithm. Compared with 10/66 dementia cases not meeting DSM-IV criteria, those with DSM-IV dementia were more cognitively impaired, had more cognitive and functional impairment according to the informant, were more disabled, had more Behavioural and Psychological Symptoms of Dementia and more consequent caregiver distress, and greater needs for care (Table 6). The Clinical Dementia Rating severity of the DSM-IV cases was also more severe: 0.5% had questionable dementia, 43.9% mild dementia, 31.2% moderate dementia and 24.3% severe dementia, compared with 50.8% questionable, 34.1% mild, 9.1% moderate and 6.1% severe for the 10/66 cases not confirmed by DSM-IV. However, the 10/66 dementia cases that were not confirmed by DSM-IV were much more similar in all of these respects to the DSM-IV cases than to the non-demented controls. Using Scheffe's test for multiple group comparisons and Kruskal Wallis (non-parametric test based on group ranks), all group differences were statistically significant at p < 0.001 (Table 6).

Discussion
The sub-criteria for DSM-IV dementia, derived from a computerised algorithm were plausibly distributed in dementia cases (both those identified by clinician diagnosis, and by the 10/66 dementia algorithm) and non-cases.

A2a. Language impairment
Score of <= 4 on CSI'D' language subscale, and/or | 32 | 2 | 2
Score of <= 5 on CSI'D' language subscale and informant reports that participant 'sometimes' or 'frequently' uses wrong words or has difficulty saying words, and/or | 22 | 1 | 1
Informant reports that participant 'frequently' uses wrong words or has difficulty saying words | 23 | 1 | 1
A2a OVERALL | 52 | 2 | 3

A2b. Apraxia
Unable to dress, or wrong sequence/forgets items (DRESS >= 2), and/or | 22 | 0 | 1
Cannot eat cleanly with proper utensils (FEED >= 1), and/or | 20 | 1 | 1
Apraxia score <= 3 and some impairment in dressing or feeding (DRESS or FEED >= 1) | 29 | 1 | 2
1. Only ascertained among those with a current episode characterized by functional and cognitive decline

Over 80% of cases met the key A1, A2, B1 and B2 sub-criteria, the sole exception being that of A1, met by 71% of 10/66 algorithm cases. The requirement that each of the A1, A2, B1 and B2 sub-criteria is met accounts for the relatively low proportion, 58%, of clinically diagnosed dementia cases meeting all elements of the computerised DSM-IV criterion. On the one hand the stringency of this approach ensures high specificity; on the other, misclassification error will tend to accumulate, reducing the sensitivity of the overall DSM-IV algorithm. We acknowledge a potential weakness of our approach in that the clinical diagnosis, the computerised DSM-IV diagnosis and the 10/66 diagnosis were all derived from elements of the same structured clinician assessment. This may have led to an overestimate of their concordance. Formal validation would have required independent clinical assessments of survey participants by expert clinicians, applying DSM-IV criteria but without relying upon our survey assessments to do so; however, this approach would still not address the problem of unreliability in clinician application of the DSM-IV criteria.
We have intentionally avoided describing any of the diagnostic outcomes in this study as a 'gold standard', preferring instead to explore the underlying dementia construct and its measurement through the relationships observed between the three outcomes; the 10/66 dementia and DSM-IV dementia survey diagnoses and the clinician diagnosis. In reality, each has its strengths and its weaknesses. The DSM-IV criteria are the dominant paradigm in current research and clinical practice, but have been critiqued both for the primacy accorded to memory impairment (a necessary component, which is not, however, a prominent early feature of many dementias, for example vascular dementia and frontotemporal dementia) and for the lack of specificity of the secondary cognitive criteria (aphasia, apraxia, agnosia, disturbance in executive functioning) which are arguably of more direct relevance to stroke than to dementia [16]. We are optimistic that our 10/66 dementia diagnosis provides a robust alternative given its careful cross-cultural development and calibration [5]; however, it has not previously been validated in a community setting. Local clinician diagnoses have strong ecological validity, reflecting as they do local clinical practice. Nevertheless, lack of standardisation may lead to problems with reliability and validity. In the absence of an unimpeachable gold standard, our approach has therefore been closer to that of construct validation as described by Cronbach and Meehl [17] than to criterion validation.
What then have we learnt from this exercise regarding the comparative utility of the 10/66 and DSM-IV diagnoses? Comparison of the characteristics of 10/66 Dementia Algorithm cases confirmed and not confirmed by our computerised DSM-IV algorithm does raise some questions regarding the sensitivity of the DSM-IV criterion, in that many unconfirmed 10/66 cases were still grossly impaired compared with the controls who met neither set of criteria. Furthermore, in Cuba, clinician diagnoses matched more closely to our 10/66 dementia category than to the more restrictive DSM-IV computerised criterion. In principle, the clinician diagnoses were made using the DSM-IV criterion. Clinicians may be less stringent in the thresholds they set for clinical significance, and less rigorous than a computerised system in the application of the algorithm. In either event, our Cuban data tend to suggest that clinically relevant dementia may be prevalent beyond the confines of the narrowly defined DSM-IV criterion when, as in our study, it is strictly applied using a fully-operationalised computerised algorithm. One of the few population-based studies to examine this issue directly, the Canadian Study of Health and Aging [18], reported a prevalence of 20.9% for those aged 65 and over according to clinical consensus compared with 13.7% according to the DSM-IV criterion. Mild cases were selectively excluded by the DSM-IV criterion. We found the same. This may have important implications for previous assessments of the global prevalence of dementia [19,20], relying as they do, to a large extent, upon studies that have used the DSM-IV criterion. The more inclusive 10/66 dementia diagnosis may help to establish the true population burden of the dementia syndrome. Conversely, our 10/66 Dementia algorithm may be overdiagnosing dementia.
Our earlier pilot study to develop and test the 10/66 algorithm against a DSM-IV criterion suggests that this is possible; we achieved very high sensitivity (94%), but with a 6% false positive rate in low education controls and a 3% false positive rate in high education controls. We will examine these issues carefully in each of the 10/66 population-based study centres. The relative validity of the two computerised diagnoses used in the 10/66 survey can probably best be settled when their predictive validity is addressed in the forthcoming incidence phase of the 10/66 investigations. Those with dementia would be expected to have declined further, or to have died. Stability or improvement in cognitive function and functional status would argue against the validity of the case definitions.

Conclusion
We have at least partly established the validity of the computerised DSM-IV algorithm in Cuba. While we cannot assume that this will extend to other 10/66 centres, core face validity is also strong. Clinical principles have been applied by experienced clinicians in the 10/66 group. The operationalisation, and the basis of the decisions that were taken, are transparent. The algorithm is described in this paper, and the full computerised algorithm can be obtained on request from the authors.