Sample selection
This was a population-based cross-sectional postal survey of the reproductive histories of adult women living in the United Kingdom in 2001, designed to enable the construction of a retrospective population-based reproductive cohort and a case-control study of risk factors for miscarriage. A sample of women was randomly selected from electronic electoral registers for England, Wales, Scotland and Northern Ireland held by the company Eurodirect [19]. All UK citizens aged 18 and over are eligible to vote; registration is voluntary, but in 2001 around 98% of the entire resident population were on the electoral register [20], the remainder being largely non-UK citizens and the itinerant population. At the time of the survey there was no opt-out clause for those who did not wish to be on an electronic version of the electoral register, so the sampling frame contained all UK residents eligible (and registered) to vote.
In order to reduce possible biases associated with memory, we aimed for a sample aged 55 years and under at survey. Date of birth is not, however, routinely recorded on the electoral register. To avoid unnecessary mailing and expense, we therefore made use of a probabilistic process offered by Eurodirect based on forename, whereby the sampling frame was restricted to women thought likely to be aged 55 and under on the basis of their name. This process was based on empirical data from birth certificates going back to the beginning of the 20th century, from which it could be calculated that, for example, those named "Elsie" are likely to be aged over 55, and those named "Kylie" under 55 years. Predictions are further refined by examining combinations of names within a household (a "Jane" married to or living with an "Alfred" is likely to be older than a "Jane" married to or living with a "Darren") and length of residency (e.g. someone registered to vote at the same address for 12 years has to be over 30). We requested a random sample of 61,000 women likely to be aged 55 and under (the sample size was based on achieving at least 80% power for key risk factors in the case-control analysis, and on cost). After removing those known to be under 18 at the time of the study (those turning 18 in the year of registration are allowed to register early, giving their date of birth), the final sample consisted of 60,814 women.
The study received approval from the Trent Multi-Centre Research Ethics Committee and the Ethics Committee of the London School of Hygiene & Tropical Medicine.
Postal survey
The postal survey had two stages. Stage 1 consisted of a single-page "screening" questionnaire which asked for details of all pregnancies experienced by study participants, as well as periods of infertility and infertility treatment. This form was sent to the whole sample and included "opt-out" boxes to be ticked if the recipient had never been pregnant and had never attempted to have children, and/or was over age 55, and/or did not wish to take part. Stage 2 consisted of a longer postal questionnaire which was sent to all those responding to Stage 1 who had ever been pregnant or who reported ever attempting to conceive, and who agreed to be re-contacted. Excluded from this second stage were women who had had one or more terminations for non-medical reasons (i.e. for reasons other than that a defect had been identified in the fetus or that continuing the pregnancy would put the mother at risk) and no other pregnancies. The Stage 2 questionnaire requested more general detail about the women (including height, age at menarche, educational level, marital status and, if appropriate, details of infertility problems, treatment and diagnosis); detailed information on all pregnancies (including whether the pregnancy was planned, whether it was the result of infertility treatment, the father's date of birth and whether the father had remained the same); and socio-demographic and behavioural details relating to the most recent pregnancy. These details included questions relating to weight at start of pregnancy, nausea, smoking, coffee and alcohol consumption, diet, vitamin intake, ill health, air travel, sexual intercourse, occupation and stress levels. The most recent pregnancy was selected to minimise biases related to recall. Because it could fall at the start, middle or end of the reproductive careers of these women, whose ages at survey ranged from 18 to 55 years, potential biases relating to ending reproductive careers on a "success" were not expected to be large.
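The Stage 2 eligibility rule described above can be sketched in code. This is a minimal illustration only: the `Stage1Response` record and its field names are hypothetical and do not come from the study's data files, and the exclusion is applied literally (a woman whose only pregnancies were non-medical terminations is excluded even if she reported attempting to conceive).

```python
from dataclasses import dataclass


@dataclass
class Stage1Response:
    """Hypothetical summary of one returned Stage 1 form (illustrative field names)."""
    ever_pregnant: bool
    ever_tried_to_conceive: bool
    agreed_to_recontact: bool
    n_pregnancies: int
    n_nonmedical_terminations: int


def eligible_for_stage2(r: Stage1Response) -> bool:
    """Return True if the respondent would be sent the Stage 2 questionnaire."""
    # Respondents had to agree to be re-contacted.
    if not r.agreed_to_recontact:
        return False
    # Respondents had to have been pregnant or to have attempted to conceive.
    if not (r.ever_pregnant or r.ever_tried_to_conceive):
        return False
    # Exclusion: one or more non-medical terminations and no other pregnancies.
    only_nonmedical_terminations = (
        r.n_pregnancies > 0
        and r.n_nonmedical_terminations == r.n_pregnancies
    )
    return not only_nonmedical_terminations
```

For example, a respondent with two pregnancies, one of which was a non-medical termination, remains eligible, whereas a respondent whose single pregnancy was a non-medical termination does not.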
For those whose most recent pregnancy had ended in miscarriage (defined as fetal death at <24 weeks gestation), brief information relating to clinical management of the miscarriage and the advice given was also requested. Permission to access clinical notes relating to outcomes reported in the questionnaire, and to contact the women for further study if needed, was also requested. In order to increase the number of cases for the case-control analysis of risk factors for miscarriage, women who had had a miscarriage recently (since 1995) but whose last pregnancy was not a miscarriage were sent a third questionnaire. This was a shortened version of the Stage 2 questionnaire, containing only those questions relating to biological, socio-demographic and behavioural details of the most recent pregnancy, but now requesting these details in relation to the most recent miscarriage. Such women therefore contributed two pregnancies to the case-control analyses, and standard errors were computed using a robust method based on the "sandwich estimate" to account for the resulting non-independence of observations.
A free telephone helpline ran throughout the study to answer queries and, where appropriate, to refer callers on to other organizations for professional help; it was well used.
Statistical methods
All analyses in this paper were performed using Stata statistical software [21]. To investigate possible selection bias we compared stillbirth and multiple delivery rates with rates in the general population. For this we obtained annual registered stillbirth risks and registered multiple delivery rates by maternal age for England and Wales, 1980–2001 [22] (data for 2002 was estimated from that for 2001). Standardised registered stillbirth ratios (SRSR) and standardised multiple delivery rates (SMDR) were then calculated using logistic regression analysis (offsetting the log odds of the population risk) [23]. The unit of analysis for stillbirths was a registered birth. A registered livebirth is defined as a baby born alive at any gestation, registered stillbirth being defined as a fetal death at 28 weeks or more gestation until the end of 1992, and at 24 weeks or more gestation from 1993 onwards. Where gestational age was not available from Stage 2 data, a pregnancy was considered to be a stillbirth if it was so described. Forty-one (40%) of the total 102 stillbirths in the analysis fell into this category. For multiple delivery, the unit of analysis was a pregnancy containing at least one livebirth or registered stillbirth (as described above). For the purposes of the analyses presented in this paper (comparisons with the general population), a pregnancy was only considered multiple if it contained two or more babies who were liveborn or (registered) stillborn in order to be consistent with the definitions used in the national data. Thus, for example, a twin pregnancy occurring before 1993 and resulting in a livebirth and a fetal death at less than 28 weeks was considered to be a singleton pregnancy in this analysis. Average maternal age at first birth, if live, was also compared with that in the general population. 
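The standardisation step can be illustrated with a small sketch. It shows one plausible reading of "offsetting the log odds of the population risk": an intercept-only logistic regression in which each birth carries, as a fixed offset, the log odds of the national risk for its year and maternal age band, so that the exponentiated intercept compares observed with expected odds (and approximates a standardised ratio when the outcome is rare). The analyses in this paper were run in Stata; the Python function below, its name and its Newton-Raphson fitting loop are assumptions made for illustration, not the authors' code.

```python
import math


def logit(p: float) -> float:
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x: float) -> float:
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def standardised_odds_ratio(outcomes, population_risks, tol=1e-10, max_iter=50):
    """Fit logit(p_i) = logit(population_risk_i) + beta by maximum likelihood.

    outcomes         : 0/1 indicator per birth (1 = event, e.g. stillbirth)
    population_risks : national risk matched to each birth (illustrative input)
    Returns (exp(beta), lower 95% limit, upper 95% limit).
    Assumes at least one event and one non-event, otherwise beta is not finite.
    """
    offsets = [logit(p) for p in population_risks]
    beta = 0.0
    for _ in range(max_iter):
        fitted = [expit(o + beta) for o in offsets]
        # Score (first derivative) and information (minus second derivative).
        score = sum(y - f for y, f in zip(outcomes, fitted))
        info = sum(f * (1.0 - f) for f in fitted)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    # Model-based standard error of beta from the information at the estimate.
    fitted = [expit(o + beta) for o in offsets]
    se = 1.0 / math.sqrt(sum(f * (1.0 - f) for f in fitted))
    return (math.exp(beta),
            math.exp(beta - 1.96 * se),
            math.exp(beta + 1.96 * se))
```

With made-up inputs in which the sample's event frequency equals the population risk for every birth, the function returns a ratio of 1 with a confidence interval spanning 1; a ratio above 1 would indicate more events in the sample than expected from national rates.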
Annual average maternal age at first (registered) birth, if live, was obtained with denominators for England and Wales, 1980–2001 [22] and re-calculated for 5-year periods. This national data was available for births within marriage only. Marital status of mother at time of birth was known only for the most recent pregnancy (or most recent miscarriage since 1995) in this dataset. For the NWHS, average maternal age was therefore calculated for all first registered births, if live. No formal statistical comparisons of maternal age were made, partly because the numbers were so large that slight, non-meaningful differences would give a statistically significant result, and partly because average ages in the general population data were expected to be similar to, but slightly older than, those in this study, since the national data related to births within marriage only. Births where the date of birth or maternal age were not known were excluded from all comparisons with population data.
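Re-calculating annual averages for 5-year periods requires the annual denominators, since each year's mean has to be weighted by its number of births. The sketch below illustrates that arithmetic under stated assumptions: the function name is hypothetical, the period boundaries (1980–84, 1985–89 and so on) are assumed rather than taken from the paper, and any figures passed to it in testing are invented, not national data.

```python
from collections import defaultdict


def pool_into_periods(annual, period_length=5, start_year=1980):
    """Pool annual mean ages into multi-year means weighted by births.

    annual : iterable of (year, mean_age, n_births) tuples
    Returns {(first_year, last_year): weighted mean age}.
    """
    weighted_sums = defaultdict(float)
    births = defaultdict(float)
    for year, mean_age, n in annual:
        # First calendar year of the period that contains this year.
        first = start_year + ((year - start_year) // period_length) * period_length
        key = (first, first + period_length - 1)
        weighted_sums[key] += mean_age * n
        births[key] += n
    return {key: weighted_sums[key] / births[key] for key in sorted(weighted_sums)}
```

A simple unweighted average of the annual means would give the same answer only if the number of first births were identical in every year of a period.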