Validation of the diagnosis of autism in general practitioner records

Background We report on the validity of the computerized diagnoses of autism in a large case-control study investigating the possible association between autism and the measles, mumps and rubella vaccine in the UK using the General Practice Research Database (GPRD). We examined anonymized copies of all relevant available clinical reports, including general practitioners' (GP) notes and consultant, speech therapy and educational psychologists' reports, on 318 subjects born between 1973 and 1997 with a diagnosis of autism or a related disorder recorded in their electronic general practice record. Methods Data were abstracted to a case validation form allowing for the identification of developmental symptoms relevant to the diagnosis of pervasive developmental disorders (PDDs). Information on other background clinical and familial features was also abstracted. A subset of 50 notes was coded independently by 2 raters to derive reliability estimates for key clinical characteristics. Results For 294 subjects (92.5%) the diagnosis of PDD was confirmed after review of the records. Of these, 180 subjects (61.2%) fulfilled criteria for autistic disorder. The mean age at first recording of a PDD diagnosis in the GPRD database was 6.3 years (SD = 4.6). Consistent with previous estimates, the proportion of subjects experiencing regression in the course of their development was 19%. Inter-rater reliability for the presence of a PDD diagnosis was good (kappa = .73), and agreement on clinical features such as regression, age of parental recognition of first symptoms, language delay and presence of epilepsy was also good (kappas ranging from .56 to 1.0). Conclusions This study provides evidence that the positive predictive value of a diagnosis of autism recorded in the GPRD is high.


Background
Of 32 epidemiological surveys of autism and pervasive developmental disorders (PDDs) included in a recent review [1], 13 were published within the last 5 years. Increased research activity in this field of neuropsychiatry has led to a refinement of the definition of autism, which involves a combination of qualitative impairments in language/communication, in social interaction and in patterns of play behaviours and interests. Improved operationalisation of diagnostic definitions within nosographies has occurred (American Psychiatric Association, 1994; World Health Organization, 1992), in parallel with the development of more precise diagnostic instruments such as the Autism Diagnostic Interview [2] and the Autism Diagnostic Observation Schedule [3]. There has also been increasing public concern about this group of disorders, prompted in part by fears that the rates of PDD may have increased in recent decades [4][5][6], and that the increase may be due to the side effects of vaccination [7][8][9] or to increased exposure of young infants to neurotoxins such as methylmercury or thimerosal [10].
The investigation of risk factors for autism in epidemiological surveys has been limited by the small size of many studies. The median number of cases identified in the 32 surveys reviewed was 50 children [1]. Some investigators examined the effects of specific environmental exposures using large samples of subjects obtained from educational or hospital services [11][12][13]. Other sources of large samples have included computerized databases obtained through research networks of general practitioners, such as the General Practice Research Database (GPRD) [14] and the Doctor Independent Network Database in the UK [15], national registers [16] and memberships of consumer associations [17]. The use of these databases is problematic because the validity of the case diagnoses is uncertain. Often there is no information on specific clinical characteristics that might allow identification of subgroups of individuals within the same diagnostic category (e.g. children with autism who have regressed in the course of their development), which precludes investigation of hypotheses proposing an association between subtypes of PDDs and specific exposures [18].
To test the hypotheses of a link between autism and exposure to combined measles, mumps and rubella vaccines, or to other infectious agents, we designed a study based on cases identified in the GPRD database in the UK [19,20]. The research protocol included evaluating the quality of diagnoses in the GPRD database by obtaining clinical reports on a sub-sample of children included in the study. We report the results of this validation study on a subset of 318 cases of autism, based on the reports in the medical files of the general practitioners through whom these cases were identified.

The General Practice Research Database
The GPRD (previously called the VAMP (Value Added Medical Products) Research Bank) was set up in 1987 and is now held on behalf of the Department of Health by the Medicines Control Agency [21]. It consists of the computerised general practice medical records for around 3 million people in the United Kingdom. The electronic record includes demographic information such as age and sex, details of every consultation with a general practitioner, all prescribed drugs and vaccinations given, and details of referrals to hospital or specialist services. It is possible to obtain from general practitioners copies of hospital letters regarding specific patients (in anonymised form), although not all participating general practices provide this service.
Practices in the GPRD originally used a modification of the Oxford Medical Information Systems coding system to record diagnoses [22]. Through the 1990s an increasing number of practices changed to the READ coding system, which is now used throughout the United Kingdom National Health Service [23].

Selection of cases and data obtained
Of patients with a recorded diagnosis of PDD in the GPRD, including prevalent cases when first registered, 446 were registered with 203 general practices willing to provide copies of patient records. For 80 of these patients, records were not available as the patient was no longer registered with the general practitioner. We obtained complete case records including copies of hospital clinic letters and specialist reports for 318 (87%) of the remaining 366 patients.

Abstraction of clinical data
The case validation form included a section on sociodemographic data, an assessment of current level of language and of educational status, together with estimates of associated levels of learning disabilities. The assessment of learning disabilities was based on available psychometric data and, when these were not available, on a best estimate of intellectual functioning, classified into broad bands after review of all the available information. A section on health covered the lifetime occurrence of epilepsy, treatment with psychoactive drugs, associated medical disorders, body measurements (head circumference, weight and height), dysmorphic syndromes, and any significant non-autistic symptom reported in the course of the child's development (such as gastrointestinal symptoms, infections, sleeping difficulties or immune medical conditions). Symptoms were rated as being either reported or not reported, since information allowing a more detailed coding (based on severity of the symptom, age of onset and of offset) was not available for most cases where symptoms were reported. A section on pregnancy and birth covered the occurrence of maternal illness and infection during pregnancy, the mode of delivery, length of labour, birth weight and birth order. A section on the early development of the child covered major milestones (coded as normal versus delayed), the age of first words and phrases (coded either as an age in months or as an approximate age band), language delay (defined as single words not occurring until after 24 months of age or phrase speech not occurring until after 36 months of age), any regression or loss of skill at any point in the course of development and, when present, the type of skill lost.
For those cases with some regression/loss of skills, a global judgment was provided by the rater on whether the developmental pattern was suggestive of a definite regression/loss of acquired skills as opposed to fluctuating development with an uneven rate of skill acquisition. As age of onset of first symptoms is a key diagnostic criterion, this was operationalised in three different ways. First, we recorded the age at which the parents first recognized signs of developmental delay or variation in their child and the type of symptoms that first triggered their concerns. Second, we recorded the child's age at the date of the first letter on file expressing concerns about a developmental problem (e.g. a referral letter from the GP to a specialist to gain an opinion). Third, we recorded the rater's own judgment of the age of first symptom onset, irrespective of actual parental and/or professional recognition of these symptoms at the time. Specific or global developmental disorders were identified in first- and second-degree relatives, along with specific medical (especially autoimmune) and psychiatric disorders.
The overall diagnostic rating of a child was made using two approaches, one algorithmic and one based on the judgment of the rater. First, reports were searched for evidence of 12 specific DSM-IV symptoms for PDDs, and a computer diagnostic algorithm using the DSM-IV symptom ratings was devised. Instead of using the typical DSM-IV algorithm (2 social symptoms, 1 communication/language symptom, 1 repetitive behaviour symptom, together with at least 6 symptoms out of the 12 possible), we generated an algorithm that takes account of the uneven quality of the data across the subjects. The algorithm generated a PDD diagnosis when at least 3 out of the 12 DSM-IV symptoms were scored, with the further constraint that there be at least 1 symptom in the social domain and 1 symptom in either the communication/language or the repetitive behaviour domain. This algorithm is consistent with that used in a recent survey using a comparable record review approach [5]. Second, when all the documentation had been reviewed, the rater was asked to make a global judgement regarding the presence or absence of a PDD in the child and, when present, to provide a diagnosis for the specific subtype of PDD whenever possible. A global index of confidence in the rater's judgment about the PDD diagnosis was also derived.
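The two symptom-count rules described above can be sketched in a few lines of Python. This is only an illustrative reading of the text, not the study's actual program: the class and function names are hypothetical, the input is assumed to be the number of DSM-IV symptoms scored in each of the three domains, and non-symptom criteria such as age of onset are ignored.

```python
from dataclasses import dataclass


@dataclass
class SymptomCounts:
    """Number of DSM-IV PDD symptoms scored per domain (hypothetical structure)."""
    social: int          # social interaction symptoms
    communication: int   # communication/language symptoms
    repetitive: int      # repetitive behaviour symptoms

    @property
    def total(self) -> int:
        return self.social + self.communication + self.repetitive


def relaxed_pdd(c: SymptomCounts) -> bool:
    """Record-review rule: >= 3 symptoms overall, >= 1 social symptom,
    and >= 1 symptom in either the communication or the repetitive domain."""
    return (
        c.total >= 3
        and c.social >= 1
        and (c.communication >= 1 or c.repetitive >= 1)
    )


def standard_dsm_iv(c: SymptomCounts) -> bool:
    """Typical DSM-IV symptom rule as summarised in the text: >= 6 symptoms
    overall, with >= 2 social, >= 1 communication and >= 1 repetitive."""
    return (
        c.total >= 6
        and c.social >= 2
        and c.communication >= 1
        and c.repetitive >= 1
    )


# A sparsely documented case: one symptom noted in each domain.
sparse_case = SymptomCounts(social=1, communication=1, repetitive=1)
meets_relaxed = relaxed_pdd(sparse_case)       # True under the relaxed rule
meets_standard = standard_dsm_iv(sparse_case)  # False under the standard rule
```

The sparse case shows the intent of the relaxed rule: a record mentioning only a few symptoms can still be classified as PDD, provided a social symptom is among them.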

Raters and inter-rater reliability
The two raters were a child psychiatrist (EF) and a psychologist (LH) both with long experience in the field of autism. LH reviewed and coded all the files. In order to obtain reliability estimates on the rating procedure, a subset of 50 medical notes chosen at random amongst the 318 records was rated blindly by EF. Records that posed particular coding difficulties were identified and consensus ratings were derived by the two raters at the end of the study. These records were eligible to be selected for the inter-rater reliability study. All ratings were made blind to the child's history of immunization.

Statistical analysis
All data were analyzed with SPSS and SAS with conventional chi-square and Fisher exact tests for categorical variables and Student's t test for continuous variables. Interrater reliability was measured with the kappa coefficient for categorical ratings and with the intraclass correlation coefficient (ICC) for continuous measures [24]. Throughout, a p value of .05 was chosen as the level of statistical significance. Missing data occurred at high rates for many variables included in the case validation forms and, as a result, we report both absolute and relative frequencies.
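For readers unfamiliar with the kappa coefficient, the following is a minimal sketch of Cohen's kappa for two raters coding the same records. It is not the SPSS or SAS procedure used in the study; the function name and the ten example ratings are invented for illustration only.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of records")
    n = len(rater_a)
    # Proportion of records on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[k] * freq_b[k] for k in categories) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1.0 - expected)


# Invented ratings for ten records (NOT study data): one disagreement.
rater_a = ["PDD"] * 8 + ["no PDD"] * 2
rater_b = ["PDD"] * 7 + ["no PDD"] * 3
kappa = cohen_kappa(rater_a, rater_b)
```

Because kappa discounts agreement expected by chance, it is lower than the raw proportion of matching codes whenever one category dominates, as the PDD category does in a sample like this one.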

Sample characteristics
Medical notes for 318 subjects were obtained. They varied in quality and exhaustiveness. For some children, GP records included several consultant reports, speech and language assessments, and educational psychology reports. For other children, the information available was scanty, with sometimes the only available data consisting of one, or a few, letters between the GPs and consultants. A high proportion of records had missing data on parental age, socio-economic status, and detailed psychometric assessment of the child, and therefore the frequencies of these variables are not described here. Of the 318 children whose medical notes were obtained, the raters confirmed a diagnosis of PDD in 294 children (92.5%). Compared to children with a confirmed PDD diagnosis, children for whom the diagnosis was not confirmed (n = 24) had significantly fewer PDD symptoms (2.1 vs 6.2; p < .001), a higher language level that fell just short of statistical significance (phrase speech: 80% vs 45%; p = .051), and more frequent parental concern arising for the first time after the age of 3 years (20% vs 2.9%; p = .024). No significant differences were found with respect to gender, birth year, presence of epilepsy or regression, or in the average age at first diagnosis in the GPRD database.
The main characteristics of the 294 children with a confirmed diagnosis of PDD are shown in Table 1. The male/female ratio was 4.25:1. A third of the children had no phrase speech when language level was recorded (at a mean age of 7.9 years). About a third of children had estimated intellectual skills falling into the normal range. Clear evidence of regression and loss of acquired skills was found in 55 children (19%), and a further 34 had a developmental pattern consistent with an uneven and slow rate of acquisition of new skills as they grew up. The rates of regression and of epilepsy (18%) are consistent with those described in other surveys of autistic children. The mean number and pattern of DSM-IV symptoms were consistent with the diagnostic concepts of autism, especially as symptoms of social deficits appeared to be reported more frequently (Table 1). The computer-based algorithm identified 237 (80.6%) of the 294 cases as having a PDD.
It was possible to allocate a more specific diagnosis to 217 of the 294 children with PDD. This was autistic disorder in 180 children (82.9%), Asperger Disorder (AD) in 18 children (8.3%), and PDDNOS (Pervasive Developmental Disorder Not Otherwise Specified) in 19 children (8.8%). The confidence level in the diagnostic subtype was generally high (high in 67.1%, medium in 19.7%, and low in 13.1%). In the remaining 77 children (26.2%), the quality of the data did not allow for the diagnosis of a specific PDD subtype. A comparison of the PDD children with and without a more specific diagnosis showed that children without a PDD subtype were comparable to children with autism with respect to language level and intellectual functioning, but closer to the children with either PDDNOS or AD with respect to age at first electronic diagnosis and rate of regression. Compared to both other groups, they had significantly fewer PDD symptoms, most likely reflecting the poorer quality of their notes, which is precisely what prevented the raters from reaching a final sub-typing.
The mean age at first parental concern regarding their child's development was 16.8 months (SD = 9.8) in the 142 children for whom a precise age could be estimated. Age of first recognition of symptoms in medical records could be estimated in broad age bands in 207 subjects and occurred before age 3 years in 201 subjects (97.1%). Onset of first symptoms was also determined by the rater's judgment, based on the medical records, in 91 subjects; the mean was 12 months (SD = 8.5). In the 88 subjects for whom both a parental and a rater age of onset were available, the rater mean age of onset was significantly younger than the age at parental concern (12.1 months vs 13.3 months; paired t-test; p = 0.02). Finally, the presence of a PDD in a first-degree relative of the index child was reported in 7.8% of the sample, consistent with other surveys of PDDs [4].
We compared children with an autistic disorder diagnosis with children with another PDD diagnosis (Table 2). The PDDNOS/AD group had significantly fewer language and intellectual impairments and were on average 2.3 years older than their autistic counterparts when recorded in the GPRD database. Regression was less often reported in the PDDNOS/AD group.
Since regression and loss of skills is a clinical feature of potential interest for our main study, we examined further the clinical correlates of regression. As regression was infrequent in the PDDNOS/AD group and as these children were different from children with autism with respect to age at diagnosis and severity, we restricted this analysis to those children with a diagnosis of autistic disorder (Table 3). The regressive and non-regressive groups differed with respect to language level and intellectual functioning, with children with regression exhibiting lower levels of functioning at the final assessment. They also had a significantly lower age at the first referral letter on file mentioning a developmental problem.
Trends over time in clinical features that are known to indicate autism severity were also examined for the autism group (Table 4). Birth years were grouped into 5-year intervals. There was a significant trend for decreasing levels of mental retardation and for an increasing proportion of males, suggesting that clinical presentation became less severe over time. Age differences between the birth cohorts made the trends for phrase speech and for epilepsy more difficult to interpret. Other clinical features, including regression, did not change significantly with time.
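The confirmation rate reported above is the positive predictive value (PPV) of a recorded diagnosis: confirmed cases divided by all reviewed cases. The sketch below reproduces that arithmetic from the counts given in the text. The Wilson score interval is an addition for illustration only and is not reported in the study; the function names are hypothetical.

```python
import math


def positive_predictive_value(confirmed: int, reviewed: int) -> float:
    """PPV = confirmed cases / all cases with a recorded diagnosis that were reviewed."""
    if reviewed <= 0 or not 0 <= confirmed <= reviewed:
        raise ValueError("need 0 <= confirmed <= reviewed and reviewed > 0")
    return confirmed / reviewed


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% coverage).
    Assumption: this interval is not part of the original analysis."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts from the text: 294 confirmed PDD diagnoses among 318 reviewed records.
ppv = positive_predictive_value(294, 318)
low, high = wilson_interval(294, 318)
print(f"PPV = {100 * ppv:.1f}%")  # PPV = 92.5%
```

Note that this calculation says nothing about sensitivity, since children with a PDD but no recorded diagnosis were never sampled.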

Interrater reliability
Interrater reliability was examined on the subset of 50 randomly selected children. Agreement between the two raters was good for the presence/absence of a PDD in the child (Kappa = .73), and there were only 2 cases where raters originally disagreed. The agreement on the number of DSM-IV symptoms was excellent (ICC = .92). PDD symptom scores for each of the three domains separately showed high intra-class correlations as well, with ICC values of .87 for the social domain, .75 for the communication/language domain, and .91 for repetitive behaviours. The agreement was also good to excellent on the presence/absence of language delay (Kappa = 1.0), of regression or loss of skills in the course of development (Kappa = .58), of epilepsy (Kappa = .84), on overall language level (Kappa = .62), on estimate of intellectual functioning coded on 3 levels (normal range, mild retardation, moderate to profound retardation) (Kappa = .72), and on the presence/absence of any developmental disorder amongst first-degree relatives (Kappa = .74). Reliability was lower for regression because of the difficulty in differentiating loss of skills from developmental stagnation and in establishing the language level before the reported loss occurred.

Discussion
We have shown that the positive predictive value of a diagnosis of autism recorded in the electronic health record of patients in the General Practice Research Database is high, and higher than in a previous study where the diagnosis of autism was confirmed in 80% of 83 subjects with a GPRD computer record of autism [25]. A high positive predictive value has also been found for other morbidity data recorded in the GPRD. For example, 94% of cases of cataract identified had their diagnosis confirmed by a review of hospital eye service discharge summaries [26], and a recorded diagnosis of myocardial infarction was confirmed in over 90% of cases [27]. In our study, the diagnosis of PDD was confirmed by expert review of the notes in 92.5% of the cases. Amongst the unconfirmed cases were several records with poor quality data which precluded a positive confirmation of the diagnosis. A North London study based on a disability register identified a similar proportion of confirmed cases (89%) in their diagnostic validation [28]. The study design precluded an estimate of the sensitivity of a GP's diagnosis of a PDD, that is, the percentage of children with a PDD who had this recorded in the GP records. This would have required a much more extensive study.
The relatively wide range of regression rates across studies reflects the different definitions and methods of data collection used in these studies. Therefore, on a range of indices, our sample characteristics were typical of studies including well characterized PDD children.
Within the PDD spectrum, a relatively small proportion was identified as having a PDDNOS or Asperger Disorder. This would not have been surprising in earlier years, since the diagnosis of Asperger Disorder was not defined until 1992, and therefore many of these children will not have received a diagnosis that led to ascertainment in the sample. However, although their number increased, there were still few children with these diagnoses among the most recent birth cohorts in the GPRD. This was unexpected: evidence from epidemiological surveys suggests that the prevalence of PDDNOS is higher than that of autistic disorder [1,4]. This is consistent with three alternative explanations: first, that a precise differentiation between autism and PDDNOS has not been possible in our study (maybe due to our particular mode of data collection), leading to an inclusion of children with atypical forms of autism in the autistic disorder group; second, that some children with atypical autism were not diagnosed as having a PDD at all; and third, that among those with a PDD diagnosis, the recording of diagnosed PDDNOS in the GPRD is less complete than that of autism. The fact that the severity of autism, as indicated by gender ratio and by intellectual and language levels, decreased over time supports the first interpretation. It could be the case, however, that this trend also reflects a genuine change in the association between autistic disorder and mental retardation, possibly due to earlier diagnosis and intervention.