Bullying (Bull) at schools in Mexico is reaching epidemic proportions. According to the First National Survey of Exclusion, Intolerance and Violence in Public Schools, conducted by the Ministry of Public Education (SEP) in 2008, 44.6% of men and 26.2% of women between 15 and 19 years old acknowledged having abused their peers. In addition, the National Commission on Human Rights (CNDH) estimates that 40% of all Mexican elementary school students, in both public and private schools, are victims of bullying. Previous studies have also reported specific cultural conflicts and discrimination between ethnic groups due to differences in acculturation (e.g. language barriers), an additional factor seen in Mexican immigrants and in those settled in Mexico-US border cities (like the one studied here). However, very few systematic studies on Bull epidemiology have been performed in Mexico, and none of them has reviewed the impact of Bull on public health costs or its implications for community health.
Studies in social psychology indicate that Bull aggressors and victims display recognizable behaviors. However, from an epidemiological standpoint, the psychosocial distress in both parties is not easy to detect because it depends largely on the complicity and anonymity provided by peers. As a consequence, Bull epidemiology and early detection, at least in Mexico, are limited by the lack of validated tests that ensure the respondents’ anonymity. In this study, the preliminary study and the experts’ opinions showed that being interviewed about school bullying can be intimidating, which can seriously hinder the correct identification of both aggressors and victims. Adding a single, less intimidating question at the beginning, although it had a low correlation with the overall scale (rs = 0.10; Table 1), did not affect the reliability of Bull-M. To our knowledge, no previous report has analyzed the intimidating effect of individual questions on the truthfulness of the overall response to any test that assesses bullying.
In addition, the validation statistics used in this study (reliability, construct validity, and convergent validity) demonstrated that Bull-M is a useful tool for population studies. This assertion is supported by the following:
First, the internal consistency (Cronbach’s α; CA) and reproducibility (test-retest) showed that Bull-M is reliable. The general consensus on the interpretation of CA is that α < 0.70 means that a test is not uniform, while α > 0.90 suggests the existence of redundant items. Cronbach’s α for Bull-M was 0.75, indicating acceptable consistency. Baldry and Cerezo found similar CA results with Bull-S and a self-report anonymous questionnaire, respectively. Also, a high correlation was found between the first and second applications [test-retest (rs = 0.91; p < 0.001)]. Although a higher score (p < 0.05) was observed in the retest for items 3, 5, 7, and 9, this difference is not meaningful: once the Likert scale (never, rarely, sometimes, often, and every day) is converted into numbers (0, 1, 2, 3, 4), these items are identified as “rarely” at both times. It should be kept in mind that test-retest results may be influenced by the complexity and ambiguity of the items; however, both the expert panel and the pilot study agreed that Bull-M is simple and clear, so complexity and ambiguity should not be an issue.
Second, PCA and CFA correctly identified the bi-factorial character of Bull-M, as proposed from the beginning. PCA revealed that the two factors (“Bullying me” and “Bullying others”) together explained 44.6% of total variance. Although there are tests that explain up to 75% of total variance, it is more common to find lower percentages. A plausible explanation for this is that no single test can contain all possible representations of bullying; Bull-M, for instance, reflects only twelve situations (Additional file 2), which were gathered from in-depth interviews and focus groups with experts and teachers. Also, PCA cannot demonstrate the structure of each factor or the dimensionality of the model, but CFA can; it confirmed the bi-factorial structure suggested by PCA, providing more evidence not only of the robust structure of Bull-M but also of the strong association between the two factors.
Third, Bull-M was designed to assess the prevalence of bullying in schools, which it does, at least when compared with Bull-S (convergent validity). To our knowledge, comparisons of the psychometric properties of two tests designed to explore Bull at schools have not yet been reported.
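For readers who wish to reproduce the reliability reasoning in the first point above, the following is a minimal sketch and not the procedure or software used in this study. It assumes item responses are coded 0 to 4 as described, computes Cronbach’s α with the standard variance-based formula, applies the 0.70 and 0.90 thresholds quoted above, and maps a mean item score back to its nearest Likert label. The function names, the toy responses, and the nearest-label rule are illustrative assumptions.

```python
from statistics import variance

# Likert coding as described in the text (never .. every day -> 0 .. 4).
LIKERT = {"never": 0, "rarely": 1, "sometimes": 2, "often": 3, "every day": 4}


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(scores[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of each respondent's summed scale score.
    total_variance = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


def interpret_alpha(alpha):
    """Apply the thresholds quoted in the text (0.70 and 0.90)."""
    if alpha < 0.70:
        return "not uniform"
    if alpha > 0.90:
        return "possibly redundant items"
    return "acceptable"


def nearest_label(mean_score):
    """Assumed rule: map a mean item score to the closest Likert label.

    Exact ties (e.g. 0.5) resolve to the lower label.
    """
    return min(LIKERT, key=lambda label: abs(LIKERT[label] - mean_score))


# Hypothetical toy answers: 5 respondents x 4 items (not data from the study).
answers = [
    ["never", "rarely", "never", "rarely"],
    ["rarely", "rarely", "sometimes", "never"],
    ["sometimes", "often", "sometimes", "rarely"],
    ["often", "sometimes", "every day", "sometimes"],
    ["every day", "often", "often", "often"],
]
toy = [[LIKERT[answer] for answer in row] for row in answers]

print(round(cronbach_alpha(toy), 2), interpret_alpha(cronbach_alpha(toy)))
print(interpret_alpha(0.75))  # the value reported for Bull-M -> "acceptable"
# Two hypothetical mean scores that differ numerically but share one label.
print(nearest_label(0.8), nearest_label(1.3))
```

The last line illustrates why a statistically higher retest score need not be meaningful: under the assumed nearest-label rule, two different mean scores can both correspond to “rarely.”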
The authors recognize that many validation factors still need to be addressed in order to further support the reliability and criterion validity of Bull-M. First, internal consistency and reproducibility, although commonly used in the social sciences, are just two of many reliability criteria. External validation conducted on different populations with different social conditions could further support the reliability of Bull-M. Second, given the inherent subjectivity of any test that attempts to evaluate Bull behaviors, calibration or criterion validity is difficult to evaluate, since there is no “gold standard” criterion against which to compare; nevertheless, it has to be analyzed somehow. Third, although Bull-M was originally designed to be applied to subjects between the ages of 8 and 15, it also has to be validated with elementary school students because Bull at this age is somewhat different