Geographical and time-related variation
Fisch and co-workers examined whether geographic variations in sperm count might bias the conclusions drawn from studies of semen quality. Briefly, of the 61 studies in the meta-analysis of Carlsen et al., only 20 included more than 100 individuals. A reanalysis of these 20 studies revealed that the majority of studies published before 1970 were from the USA, mainly New York. These studies represented locations with the highest sperm counts. In contrast, after 1970, most of the studies were from locations not included earlier. Selection bias due to geographical and ethnic variations could therefore account for the observed decline in sperm quality. However, when the subgroup of studies from the USA (28 studies, including data on 8,329 men) was analyzed separately, a similar trend of decreasing mean sperm values was observed. On the other hand, Lipshultz argued that when all the larger studies containing only New York data are excluded from the meta-analysis, a second linear regression analysis detects no decline in sperm density.
The sperm count issue has been extensively investigated since 1992 and the decrease is supported by additional studies. For example, it was reported that sperm counts of 1,351 fertile men (semen donors) in Paris decreased by 2.1 percent per year from 89 × 10⁶/ml in 1973 to 60 × 10⁶/ml in 1992 (p < 0.001). Sperm motility also decreased in these individuals. The mean seminal volume was 3.8 ml and did not change during the period from 1973 to 1992. Irvine et al. also examined sperm quality in 571 semen donors in Scotland by birth cohort groups. They reported a significant decrease in median sperm concentration, total number of sperm in the ejaculate and total number of motile sperm in donors born between 1970 and 1974 compared with men born before 1959.
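As a rough consistency check on the Paris figures, the reported annual decline can be compounded over the 19 years from 1973 to 1992. The sketch below assumes a constant compound (geometric) decline, which the source does not specify; the function names are hypothetical and only the three reported numbers are taken from the text.

```python
# Consistency check on the reported Paris figures, assuming a constant
# compound (geometric) decline; the source does not state the model used.

START_CONC = 89.0       # 10^6/ml in 1973, as reported
END_CONC = 60.0         # 10^6/ml in 1992, as reported
ANNUAL_DECLINE = 0.021  # 2.1 percent per year, as reported
YEARS = 1992 - 1973


def compound_decline(start: float, rate: float, years: int) -> float:
    """Value after `years` of a constant relative decline `rate` per year."""
    return start * (1.0 - rate) ** years


def implied_annual_rate(start: float, end: float, years: int) -> float:
    """Constant annual relative decline that takes `start` to `end`."""
    return 1.0 - (end / start) ** (1.0 / years)


projected_1992 = compound_decline(START_CONC, ANNUAL_DECLINE, YEARS)
implied_rate = implied_annual_rate(START_CONC, END_CONC, YEARS)

print(f"projected 1992 concentration: {projected_1992:.1f} x 10^6/ml")
print(f"implied annual decline: {implied_rate * 100:.2f} percent")
```

Under this assumption the projected 1992 value falls just below 60 × 10⁶/ml, so the reported start value, end value and annual rate are mutually consistent.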
On the other hand, several studies confirm that in some locations sperm counts have not decreased over the past 20 to 25 years. More importantly, the results clearly show a remarkable variability in sperm counts at different geographical locations. The mean sperm counts in 302 men in the Toulouse area were unchanged over the period from 1977 to 1992, and their mean sperm count (83 × 10⁶/ml) was significantly higher than that observed in the Paris study. Furthermore, it was reported that the highest sperm counts recorded in Finland were found in men from rural areas, accompanied by a low incidence of testicular cancer. These findings suggest that urban lifestyle or environmental factors might be important etiological factors in testicular malfunction and disease.
Fisch et al. conducted a study comprising 1,283 men who banked sperm before vasectomy in three different sperm banks in the United States from 1970 to 1994. A slight but significant increase in mean sperm concentration from 77 × 10⁶/ml to 89 × 10⁶/ml over this 25-year period was found. Furthermore, marked differences in semen characteristics between the New York, Minnesota, and California sperm banks were found. Sperm concentration and motility were highest in New York and lowest in California.
It was argued that the data set analyzed by Carlsen et al. was not equally distributed between the decades: 79% of the publications and 88% of the volunteers of all studies were clustered between 1970 and 1990. If the analysis were restricted to studies from this period only, an increase in sperm concentrations would become evident. Thus, only 21% of the studies and 12% of the volunteers, those analyzed before 1970, caused the regression to have a statistically significant negative slope [15, 16].
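The leverage argument can be illustrated with a small weighted regression. The sketch below uses entirely hypothetical study-level values (years, mean concentrations and sample sizes are invented for illustration and are not the Carlsen et al. data); it assumes each study is weighted by its number of men and compares the slope fitted to all studies with the slope fitted to studies from 1970 onwards.

```python
import numpy as np

# HYPOTHETICAL study-level data, invented for illustration only.
# A few early studies with high means, many later studies with a mild rise.
year = np.array([1940, 1950, 1960, 1972, 1975, 1978, 1981, 1984, 1987, 1990], dtype=float)
mean_conc = np.array([115, 110, 105, 62, 64, 66, 68, 70, 72, 74], dtype=float)  # 10^6/ml
n_men = np.array([100, 150, 200, 500, 400, 600, 300, 500, 450, 400], dtype=float)


def weighted_slope(x: np.ndarray, y: np.ndarray, n: np.ndarray) -> float:
    """Slope of a least-squares line with each study weighted by its size n.

    np.polyfit applies weights to the unsquared residuals, so sqrt(n) is
    passed to weight the squared residuals by n. Years are centred at 1970
    for numerical stability; this does not change the slope.
    """
    slope, _intercept = np.polyfit(x - 1970.0, y, deg=1, w=np.sqrt(n))
    return float(slope)


late = year >= 1970
slope_all = weighted_slope(year, mean_conc, n_men)
slope_late = weighted_slope(year[late], mean_conc[late], n_men[late])

share_early_men = n_men[~late].sum() / n_men.sum()
print(f"share of men in pre-1970 studies: {share_early_men:.1%}")
print(f"slope, all studies:   {slope_all:+.2f} x 10^6/ml per year")
print(f"slope, 1970 onwards:  {slope_late:+.2f} x 10^6/ml per year")
```

In this constructed example the early studies contribute only a small share of the men, yet they alone turn a positive slope into a negative one, which is the pattern the critics described.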
Looking at sperm count studies which have been published since 1992, it becomes evident that over the past 20 years, sperm quality has decreased in some but not all locations. The results also show that sperm counts can vary widely between and within countries. These major geographical differences suggest that the results of the former meta-analyses of sperm count may be biased by geographical confounding.
The WHO recommends that in order to determine normal ranges of semen quality "specimens should be evaluated from men who have recently achieved a pregnancy, preferably within 12 months of the couple ceasing contraception". In the former meta-analyses [1–3] a precise definition of "proven fertility" is not given. Nevertheless, a significant decline in mean sperm concentration was seen when the 39 papers reporting data on men with "proven fertility" were analyzed separately. In the remaining 22 publications, men were unselected with respect to fertility and were therefore "considered to represent the normal male population".
It is well known that, due to the personal and potentially embarrassing manner of collection, men are reluctant to provide semen samples unless currently concerned about their fertility. In other words, highly selected volunteers constitute a biased sample of the population with regard to their perceived fertility. Furthermore, the socio-demographic profile of sperm donors is commonly shifted to more educated classes. As a result, studies of men providing semen samples involve mainly self-selected volunteers with various non-neutral motivations.
Hence, in most studies on human semen characteristics the populations under study are insufficiently defined and the study participants are not a representative population sample. The former meta-analyses [1–3] included study participants who were examined before vasectomy; some were recruited while their partners were attending antenatal clinics; some were volunteer donors participating in artificial insemination programmes; some were recruited as part of an occupational study; and others were recruited from infertility clinics but were included only if their partners subsequently became pregnant. Data from such study populations cannot be legitimately extrapolated to their apparent source population unless they originate from or constitute a representative sample of that reference population.
The "normal male population" included fertile, subfertile as well as infertile men. The lack of external validity of studies using data from self-selected volunteers might further question the hypothesis of a decline in sperm quality with time.
Two important factors influencing semen characteristics are age and duration of sexual abstinence (prior to semen donation). Older age contributes significantly to a decline in sperm concentration and sperm motility. The variable for the year of semen donation is composite because it combines each man's age at the time of donation with his year of birth. The relation of each semen characteristic to these independent variables (age at donation and year of birth) therefore has to be tested with multiple regression analysis. Longer sexual abstinence is associated with older age, with an increase in sperm concentration and with a decrease in the percentage of motile spermatozoa. In general, sperm concentrations reflect sperm production, age and abstinence time. In the former meta-analyses [1–3], the important information on age was not extracted from the original publications and hence has not been considered in the statistical analysis.
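The point about the composite year-of-donation variable can be sketched with a simulated data set. Everything below is hypothetical: the donors, the effect sizes and the noise level are invented, and ordinary least squares via NumPy stands in for whatever regression software a given study used. The sketch shows that year of donation is an exact sum of year of birth and age, so the three cannot be entered together, whereas age and year of birth can be separated by multiple regression.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# HYPOTHETICAL donors: year of birth and age at donation drawn independently.
birth_year = rng.integers(1935, 1970, size=n).astype(float)
age = rng.integers(20, 50, size=n).astype(float)
donation_year = birth_year + age  # composite variable

# HYPOTHETICAL true effects on sperm concentration (10^6/ml per year).
TRUE_AGE_EFFECT = -0.5
TRUE_BIRTH_EFFECT = -1.0
conc = (100.0
        + TRUE_AGE_EFFECT * (age - 35.0)
        + TRUE_BIRTH_EFFECT * (birth_year - 1950.0)
        + rng.normal(0.0, 20.0, size=n))

# Entering all three time variables gives a rank-deficient design matrix,
# because donation_year is an exact linear combination of the other two.
X_all = np.column_stack([np.ones(n), age - 35.0, birth_year - 1950.0,
                         donation_year - 1985.0])
rank_all = int(np.linalg.matrix_rank(X_all))

# Multiple regression on the two independent variables only.
X = np.column_stack([np.ones(n), age - 35.0, birth_year - 1950.0])
coef, *_ = np.linalg.lstsq(X, conc, rcond=None)
age_effect, birth_effect = float(coef[1]), float(coef[2])

print(f"rank with all three time variables: {rank_all} of 4 columns")
print(f"estimated age effect:        {age_effect:+.2f}")
print(f"estimated birth-year effect: {birth_effect:+.2f}")
```

A regression on year of donation alone would return a single slope that mixes the two effects, which is why information on age is needed to interpret a time trend.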
It has been shown that sperm counts in volunteers vary with abstinence time. Moreover, only 66% of the men investigated were found to have complied with the required abstinence time of three to five days. Thus, if a secular trend in sperm production is to be analyzed, well-controlled abstinence times are a prerequisite.
WHO recommends non-manual methods for semen analysis such as haemocytometer or so-called counting chambers, since manual methods rely on subjective judgements and are not reliable. WHO has laid the foundation for quality control by introducing a laboratory manual for semen analysis, currently in its fourth edition. WHO recommendations for laboratory techniques are slowly becoming accepted as the international standard, but are not yet universally applied. None of the data obtained in early studies, as considered in the former meta-analyses [1–3], were subjected to quality control.
Furthermore, reports on external quality control have shown noticeable variations in the determination of semen characteristics between laboratories related to differences in technique. Consequently, it is very difficult to compare data from different laboratories if subtle changes in sperm concentration are to be detected.
Olsen et al. presented different statistical approaches to the data, suggesting that alternative models including the quadratic, spline fit and stair step provided a better fit than the linear regression used by Carlsen et al. and might lead to different interpretations. For example, the use of linear or quadratic models tends to suggest that any change in semen quality is gradual and may be continuing, whereas the stair-step model tends to suggest an isolated event or set of events which is not continuing after, say, 1970. A sudden apparent fall in semen quality may be due to substantial changes in analytical methodology, subject selection criteria, study selection criteria or widespread introduction of a global environmental factor.
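How the choice of model shapes the interpretation can be seen in a minimal comparison of fits. The sketch below is not a reproduction of the analysis by Olsen et al.: the data are hypothetical and deliberately constructed with a single drop, the breakpoint of the stair-step model is fixed at 1970 as an assumption, and the spline model is omitted for brevity. Goodness of fit is compared by the residual sum of squares.

```python
import numpy as np

# HYPOTHETICAL data constructed with a single drop at 1970 plus a small
# alternating wiggle; invented for illustration only.
year = np.arange(1940, 1991, 5, dtype=float)
wiggle = 3.0 * (-1.0) ** np.arange(year.size)
conc = np.where(year < 1970, 110.0, 70.0) + wiggle  # 10^6/ml

BREAKPOINT = 1970.0  # assumed, not estimated from the data


def rss_polynomial(x: np.ndarray, y: np.ndarray, degree: int) -> float:
    """Residual sum of squares of a least-squares polynomial of given degree."""
    xc = x - x.mean()  # centring improves numerical conditioning
    fitted = np.polyval(np.polyfit(xc, y, degree), xc)
    return float(np.sum((y - fitted) ** 2))


def rss_stair_step(x: np.ndarray, y: np.ndarray, breakpoint: float) -> float:
    """Residual sum of squares of a two-level step with a fixed breakpoint."""
    before = x < breakpoint
    fitted = np.where(before, y[before].mean(), y[~before].mean())
    return float(np.sum((y - fitted) ** 2))


rss = {
    "linear": rss_polynomial(year, conc, 1),
    "quadratic": rss_polynomial(year, conc, 2),
    "stair_step": rss_stair_step(year, conc, BREAKPOINT),
}
for name, value in rss.items():
    print(f"{name:>10}: RSS = {value:.1f}")
```

On data of this shape a straight line still yields a clearly negative slope, but the step model fits far better and implies no ongoing decline after the breakpoint. With real data the models differ in their number of parameters, so a fair comparison would also need a penalty for model complexity.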
Gold standard in measurements of semen quality
The definition of a "normal" sperm concentration has changed from 60 million/ml in 1940 to the present value of 20 million/ml [1, 17]. In order to avoid bias, Carlsen et al. restricted their study to men with proven fertility (39 studies) or to "normal" men of unknown fertility (22 studies). It was speculated that much of the apparent change in semen quality could be accounted for by a change in the "accepted" definition of the lower limit of "normal" from around 60 million/ml in 1940 to the value of 20 million/ml accepted today. This change might have led to the exclusion of men with sperm concentrations of 20-60 million/ml in the earlier studies, as these studies set out to include "normal men" with sperm counts not below 60 million/ml. This possibility is underlined by a theoretical calculation showing that such a change of the reference data would indeed result in a drop in average sperm concentration. This prediction is in good accordance with the actual data [11, 16, 25]. However, it has been argued by the authors of the original meta-analysis that at least some of the earlier papers did in fact include men with semen quality in the range below 60 million/ml.
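The logic of such a theoretical calculation can be sketched by applying two different lower limits to one and the same population. The sketch below is not the published calculation: it assumes a log-normal distribution of sperm concentration with hypothetical parameters (median 70 million/ml, log-scale standard deviation 0.7) and simply compares the mean of men at or above 60 million/ml with the mean of men at or above 20 million/ml.

```python
import numpy as np

rng = np.random.default_rng(1)

# HYPOTHETICAL population: log-normal sperm concentrations (million/ml).
# The median and spread are assumptions chosen for illustration only.
population = rng.lognormal(mean=np.log(70.0), sigma=0.7, size=100_000)

OLD_LOWER_LIMIT = 60.0  # lower limit of "normal" around 1940, from the text
NEW_LOWER_LIMIT = 20.0  # lower limit of "normal" today, from the text


def mean_above(values: np.ndarray, lower_limit: float) -> float:
    """Mean concentration among men at or above the given lower limit."""
    return float(values[values >= lower_limit].mean())


mean_old_definition = mean_above(population, OLD_LOWER_LIMIT)
mean_new_definition = mean_above(population, NEW_LOWER_LIMIT)

print(f"mean with 60 million/ml cut-off: {mean_old_definition:.1f}")
print(f"mean with 20 million/ml cut-off: {mean_new_definition:.1f}")
print(f"apparent drop: {mean_old_definition - mean_new_definition:.1f} million/ml")
```

The underlying population is identical in both cases, so any difference between the two means is produced purely by the inclusion criterion. The size of the apparent drop depends on the assumed distribution and should not be read as an estimate.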
Sperm concentration, sperm motility and sperm morphology are related to each other: factors that cause deterioration of one of them usually also have a negative impact on the other two. Sperm motility, followed by sperm concentration, is the best predictor of fertility. The WHO has given normal limits for semen characteristics. These values are higher than those associated with infertility. They are not based on population samples of fertile men or unselected men, but rather reflect the lower range of men still being fertile. With regard to semen analyses in epidemiological studies, the definition of a genuine reference range for human sperm output based on empirical sampling is needed (considering geographical region, ethnicity and age). Furthermore, for clinical or epidemiological studies, it would be preferable to investigate more than one outcome parameter.
Jorgensen et al. investigated 1,082 male partners of pregnant women in four European cities (Copenhagen, Denmark; Paris, France; Edinburgh, Scotland; and Turku, Finland). Semen analysis was standardized, inter-laboratory differences in assessment of sperm concentration were evaluated, and morphology assessment was centralized. The lowest sperm concentrations and total counts were detected for Danish men, followed by French and Scottish men. Finnish men had the highest sperm counts. Furthermore, the semen quality of a 'standardized' man (30 years old, fertile, ejaculation abstinence of 96 h) was estimated. These data may also serve as a reference point for future studies on time trends in semen quality in Europe. In another study, 968 young men from the general population in Denmark, Norway, Estonia and Finland were examined according to the same study protocol. Possible confounders including age, year of birth, year of participation in the study, season of the year and duration of abstinence were evaluated, and included in the statistical analysis when appropriate. Inter-laboratory differences in assessment of sperm concentrations were controlled by an external quality control programme and morphology assessment was performed by one person. The men examined were recruited from groups attending a compulsory medical examination (military service), and were not selected for known fertility or semen quality. Moreover, the majority of participants had no prior knowledge of their fertility potential. This cross-sectional study shows that a geographical gradient exists in the Nordic-Baltic area with regard to semen parameters, somewhat similar to the gradient in incidence of testicular cancer. However, it has to be kept in mind that the participation rate in this study was below 20%.
Carlsen et al. speculate that environmental rather than genetic factors might be responsible for the secular trend in semen quality, which seems to be supported by the observation of increased testicular cancer incidence in many countries [31, 32]. However, a possible causal relationship between sperm quality and environmental factors cannot be investigated by description of time trends and correlation studies. Individual-level data on both semen quality and exposure to environmental factors have to be analysed in analytic case-control or cohort studies. Maternal factors during pregnancy deserve further investigation because they have changed over time and because they might influence male reproductive disorders like cryptorchidism, hypospadias, low sperm counts and testicular cancer. This would correspond to a life-course approach to the investigation of semen quality and its determinants. Furthermore, fertility status and semen quality might be predictors of male health and life expectancy, which gives an important public health perspective for further studies.