Inter-reviewer reliability for the decision to include or exclude an article was calculated using the kappa statistic.
Unlike the observed agreement, the kappa statistic expresses the degree of agreement while correcting for agreement expected simply by chance (12).
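To make the chance correction concrete, the following minimal Python sketch contrasts observed agreement with Cohen's kappa; the include/exclude decisions are hypothetical illustrative data, not data from this study:

```python
# Illustrative sketch: Cohen's kappa for two reviewers' include/exclude
# decisions, contrasted with raw observed agreement.
# The decision lists below are hypothetical example data.
from collections import Counter

reviewer_a = ["include", "exclude", "include", "include", "exclude", "exclude"]
reviewer_b = ["include", "exclude", "exclude", "include", "exclude", "include"]

n = len(reviewer_a)
# Observed agreement: proportion of articles on which both reviewers agree.
p_o = sum(a == b for a, b in zip(reviewer_a, reviewer_b)) / n

# Chance agreement: expected agreement if each reviewer assigned labels
# independently at their own marginal rates.
counts_a = Counter(reviewer_a)
counts_b = Counter(reviewer_b)
p_e = sum((counts_a[c] / n) * (counts_b[c] / n) for c in ("include", "exclude"))

# Kappa rescales agreement so that 0 is chance-level and 1 is perfect.
kappa = (p_o - p_e) / (1 - p_e)
print(f"observed = {p_o:.2f}, chance = {p_e:.2f}, kappa = {kappa:.2f}")
```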
A one-sample t test was conducted to test whether the kappa values differed significantly from zero.
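For illustration only, such a test might be run as below; the kappa values shown are hypothetical placeholders, not results reported here:

```python
# Hypothetical sketch: one-sample t test of a set of kappa values against
# a population mean of zero, as described in the text.
from scipy import stats

kappas = [0.62, 0.71, 0.58, 0.80, 0.66]  # hypothetical kappa per reviewer pair
t_stat, p_value = stats.ttest_1samp(kappas, popmean=0.0)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```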
As a second method of validation, the kappa statistic was used to judge concordance between the cluster solution and the teachers' classification of the children.
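A minimal sketch of this validation step, using hypothetical labels; raw cluster IDs are arbitrary and are assumed here to have already been mapped onto the teacher categories:

```python
# Hypothetical sketch: concordance between a cluster solution and teacher
# classifications via Cohen's kappa. Cluster labels are assumed to have been
# relabelled to match the teacher categories before comparison.
from sklearn.metrics import cohen_kappa_score

teacher_labels = [1, 1, 2, 2, 3, 3, 1, 2]   # teacher classification per child
cluster_labels = [1, 1, 2, 3, 3, 3, 1, 2]   # cluster assignment, relabelled

print(f"kappa = {cohen_kappa_score(teacher_labels, cluster_labels):.2f}")
```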
Similarly, for case definition, kappa scores are good for both instruments, but the confidence intervals are wide, which could result in variable prevalence estimates.
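To illustrate how such wide intervals arise, a common large-sample approximation takes the standard error of kappa as sqrt(p_o(1 - p_o) / (n(1 - p_e)^2)) and the 95% confidence interval as kappa ± 1.96·SE; all figures below are hypothetical:

```python
# Hypothetical sketch: approximate 95% confidence interval for kappa using a
# common large-sample standard-error formula. Numbers are illustrative only.
import math

p_o, p_e, n = 0.85, 0.55, 40          # observed/chance agreement, sample size
kappa = (p_o - p_e) / (1 - p_e)
se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
lo, hi = kappa - 1.96 * se, kappa + 1.96 * se
print(f"kappa = {kappa:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

With a modest sample size, even a good kappa of about 0.67 carries an interval spanning roughly 0.42 to 0.91, which is consistent with the variability in prevalence estimates noted above.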
A minimum kappa of 0.75 for rating the presence of psychotic and mood items is required of all interviewers trained in the programme.
The kappa values for agreement were significantly greater than zero at both distances.
Agreement at the level of any anxiety or depressive disorder was also good (kappa=0.80).