During training sessions, the research team members reviewed the coding form and manual, coded two sample studies, and discussed divergent ratings in detail. Two reviewers then independently reviewed and scored each included study. Inter-rater agreement on the quality checklist was 76.9% across the included studies. The two reviewers discussed any discrepancies in their ratings to reach a consensus rating. Where consensus could not be reached on a particular item, a third reviewer provided a rating.
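
The text does not state how the inter-rater agreement figure was computed. As a minimal sketch, assuming simple percent agreement (the share of checklist items on which both reviewers gave the same rating), the calculation could look like the following. The function name `percent_agreement`, the rating labels, and the sample ratings are all hypothetical and are not data from this review.

```python
from typing import Sequence


def percent_agreement(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Return the percentage of items on which two raters gave the same rating."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must score the same items.")
    if not rater_a:
        raise ValueError("At least one rated item is required.")
    # Count the items where the two ratings are identical.
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical ratings for eight checklist items (assumed, not from the review).
reviewer_1 = ["yes", "yes", "no", "yes", "no", "yes", "yes", "no"]
reviewer_2 = ["yes", "no", "no", "yes", "no", "yes", "no", "no"]

agreement = percent_agreement(reviewer_1, reviewer_2)
print(f"{agreement:.1f}%")  # prints "75.0%" (6 of 8 items match)
```

Percent agreement is easy to interpret but does not correct for agreement expected by chance. If the reported figure was instead a chance-corrected statistic, this sketch would not apply.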