What We Know and Don’t Know About Risk Assessment with Offenders of Indigenous Heritage

by Leticia Gutierrez, L. Maaike Helmus, R. Karl Hanson

The over-representation of Indigenous offenders in the Canadian criminal justice system highlights the need for research on the applicability of risk assessment for this group. Given that most decisions throughout an offender’s progression through the criminal justice system are guided by the outcomes of risk assessment, it is essential that risk assessments be structured, objective, reliable and transparent. Furthermore, it is imperative that these risk assessments be empirically validated in order to defend their use with a diverse offender population. Meta-analyses and large-sample studies have demonstrated that the major risk factors and commonly used risk assessment scales predict recidivism for Indigenous offenders, but the predictive accuracy is weaker for Indigenous compared to non-Indigenous offenders. Given the consequences of risk assessment for offenders and matters of public safety, the reasons for these differences remain an important topic of research. Despite the evidence gaps, the available research supports the use of empirically validated structured risk assessments with offenders of Indigenous heritage, until there is more research done to better understand differences in predictive accuracy.

Authors' Note

The views expressed are those of the authors and do not necessarily reflect those of Public Safety Canada. Correspondence concerning this report should be addressed to:

Research Division, Public Safety Canada
340 Laurier Avenue West
Ottawa, Ontario
K1A 0P8
Email: PS.CPBResearch-RechercheSPC.SP@ps-sp.gc.ca


We thank Dr. Andrew Haag and the University of Alberta for hosting the webinar upon which the current article was based.


The applicability of commonly used risk assessments to Indigenous offenders has been a topic of considerable debate in the Canadian justice system for decades. Some practitioners and academics argue that the inappropriate assessment of risk of Indigenous offenders is yet another contributing factor to their over-representation in the criminal justice system (Laprairie, 1997; Martel, Brassard, & Jaccoud, 2011). For example, in Canada, Indigenous peoples represent 21% of the federal inmate population (Public Safety Canada, 2015), while only accounting for 4.3% of the Canadian adult population (Statistics Canada, 2013). Compared to non-Indigenous offenders, Indigenous offenders are also more likely to be incarcerated for violent offences (Trevethan, Moore, & Rastin, 2002), placed in administrative segregation (Helmus, 2015), incarcerated in maximum security institutions (Public Safety Canada, 2015), released later in their sentence (Public Safety Canada, 2015), and revoked while on parole (Office of the Correctional Investigator, 2014). Despite both government and Supreme Court attempts to address this over-representation (e.g., Canadian Criminal Code § 718.2(e); R. v. Gladue, 1999), it has increased since the late 1990s (Public Safety Canada, 2015). Given the important role of risk assessment in guiding the management and treatment of offenders in the criminal justice system, we have a responsibility to examine the applicability of these assessments to a group that is over-represented in our system.

The purpose of this paper is to discuss the applicability of structured risk assessment scales with Indigenous offenders. This includes a brief overview of risk assessment and possible reasons why we may or may not expect risk scales to perform the same or differently with Indigenous offenders. We subsequently review research on the applicability of risk scales with Indigenous offenders, focusing on large-sample studies and meta-analytic reviews of both general offenders as well as sex offenders. Our coverage of the research is meant to be illustrative but not exhaustive. It should also be noted that although we refer to Indigenous peoples as a subgroup in this review, it is important to acknowledge the diversity of Indigenous cultures within the broader population and the varied histories for each group (i.e., First Nations, Métis, and Inuit). For example, in Canada there are approximately 617 First Nations communities, representing 50 distinct nations and over 50 Indigenous languages (Indigenous and Northern Affairs Canada, 2015). Although a discussion of the varieties of Indigenous cultures and the implications for risk assessment is beyond the scope of this article, recognition of this heterogeneity is warranted.

Risk Assessment: What is Its Purpose and Why is It Important?

Most decisions throughout an offender’s progression through the criminal justice system involve risk assessment, including sentencing, security classification, parole decisions, treatment needs, and supervision conditions/intensity. Recent decades have yielded considerable advances in the field of risk assessment, with the development of dozens of scales that can predict recidivism with moderate to high levels of accuracy (Hanson, 2005; Singh et al., 2014). Research has demonstrated that structured risk assessments perform better than unstructured approaches (Ægisdóttir et al., 2006; Dawes, Faust, & Meehl, 1989; Grove, Zald, Lebow, Snitz, & Nelson, 2000; Hanson & Morton-Bourgon, 2009). Given the consequences of risk assessment for offenders, and the implications for public safety, it is imperative that risk assessments be empirically based, objective, transparent, and reliable.

As highlighted in the recent court decision (Ewert v. Canada, 2015), it is particularly important that risk scales be empirically validated in order to defend their use. Empirical support enables the application of cumulative knowledge about factors (linked to recidivism) to the individual offender in order to assess their likelihood of recidivism. An underlying assumption, however, to the proper application of cumulative knowledge is that offenders being assessed are not meaningfully different (in risk-relevant ways) from those included in the development and validation research. It goes without saying that no two offenders are exactly alike; however, the extent to which differences matter will depend on whether those differences impact the predictive accuracy of the risk assessment scale.

Understanding the accuracy of risk scales first requires an understanding of what risk assessment scales are designed and intended to assess (i.e., their purpose). Importantly, risk assessment tools are criterion-referenced scales, as opposed to norm-referenced scales. Most scales in psychology are norm-referenced, as they attempt to assess how individuals display varying amounts of a specific construct (e.g., Aiken, 1985). Examples include tests of intelligence, ability, or personality. In contrast, criterion-referenced scales (e.g., offender risk scales) are designed specifically to predict an outcome of interest. Norm-referenced and criterion-referenced scales are meaningfully different, with some elements of test reliability and validity not applicable to the latter (e.g., internal consistency; Aiken, 1985). Namely, in norm-referenced scales, reliability (e.g., high item-total correlations) increases when multiple items are assessing the same construct. This may be achieved by saturating the scale with similar items with different wordings or reversed scoring. The abundance of items measuring similar constructs also easily lends itself to analyses of the underlying factor structure of norm-referenced scales (e.g., to organize patterns of relationships between items into distinct factors measured by the scale; Aiken, 1985).

In contrast, criterion-referenced scales are often developed atheoretically and their most important goal is to predict the outcome (Joint Committee on Standards for Educational and Psychological Testing, 2014, p. 96). Consequently, it may be undesirable to measure only one construct or to include multiple items assessing the same construct. Given practical constraints in applied use, an optimal criterion-referenced scale may be one that includes the smallest possible number of items measuring as many distinct constructs as possible, to maximize both accuracy and efficiency. Such a design would be expected to decrease internal consistency. High item-total correlations indicate increased reliability in a norm-referenced scale but may indicate redundancy in a criterion-referenced scale. Similarly, it may not be desirable to explore the factor structure of a risk scale, because there would rarely be enough items measuring a factor to allow for reliable factor analyses (Babchishin, 2013; Brouillette-Alarie, Babchishin, Hanson, & Helmus, 2015).

Given that risk scales are designed to maximize predictive accuracy, it is important to distinguish between two types of predictive accuracy: discrimination and calibration. Discrimination refers to a scale’s ability to distinguish between recidivists and non-recidivists. This assesses the extent to which higher risk scores are associated with higher levels of recidivism (regardless of the actual rate of recidivism), which means that the scale is able to effectively rank order offenders in their relative risk for recidivism. Calibration, on the other hand, focuses on the accuracy of predicted recidivism rates. In other words, if a scale predicts that 20% of offenders with a particular score will reoffend, calibration examines whether 20% of offenders with that score reoffend in new samples (or among subgroups). Discrimination can be examined for any type of structured risk scale (e.g., Structured Professional Judgement [SPJ] or actuarial), whereas calibration can only be examined for actuarial scales, as they are the only method that includes empirically-derived estimates of the probability of recidivism.
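The distinction between discrimination and calibration can be made concrete with a small sketch. The code below is illustrative only: the function names are hypothetical, the scores and outcomes are invented toy values rather than data from any study, and the expected/observed ratio is just one common way of summarizing calibration. Discrimination is computed as the area under the ROC curve (AUC), that is, the probability that a randomly selected recidivist has a higher score than a randomly selected non-recidivist.

```python
def auc(recidivist_scores, non_recidivist_scores):
    """Discrimination: share of recidivist/non-recidivist pairs in which
    the recidivist has the higher score (ties count as half)."""
    wins = 0.0
    for r in recidivist_scores:
        for n in non_recidivist_scores:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(recidivist_scores) * len(non_recidivist_scores))


def expected_over_observed(predicted_rate, outcomes):
    """Calibration: expected number of recidivists (from the scale's
    predicted rate) divided by the number actually observed.
    A value of 1.0 is perfect; below 1.0 means the scale underpredicts."""
    observed = sum(outcomes)
    if observed == 0:
        raise ValueError("no recidivists observed; ratio is undefined")
    return (predicted_rate * len(outcomes)) / observed


# Hypothetical toy data, not taken from any study.
recidivist_scores = [7, 5, 6, 4]
non_recidivist_scores = [2, 5, 3, 1, 4, 2]
toy_auc = auc(recidivist_scores, non_recidivist_scores)

# The scale predicts 20% recidivism for a given score (1 = reoffended).
well_calibrated = expected_over_observed(0.20, [1, 1, 0, 0, 0, 0, 0, 0, 0, 0])
underpredicted = expected_over_observed(0.20, [1, 1, 1, 1, 0, 0, 0, 0, 0, 0])

print(f"AUC: {toy_auc:.3f}")
print(f"E/O when 2 of 10 reoffend: {well_calibrated:.2f}")
print(f"E/O when 4 of 10 reoffend: {underpredicted:.2f}")
```

Note that the AUC depends only on the rank ordering of scores, so a scale can discriminate well while still predicting the wrong absolute rate for a subgroup. This is why the two properties need to be examined separately.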

Should We Expect Risk Scales to Perform Differently for Indigenous and Non-Indigenous Offenders?

One of the most robust findings in the literature on Indigenous offenders is that they tend to score significantly higher than non-Indigenous offenders on most risk factors. On average, Indigenous offenders are younger (Babchishin, Blais, & Helmus, 2012; Statistics Canada, 2006); have lengthier criminal histories, particularly early onset (Babchishin et al., 2012; Dell & Boe, 2000; Holsinger, Lowenkamp, & Latessa, 2003; Shepherd, Adams, McEntyre, & Walker, 2014) and report more negative childhood histories (Ellerby & MacPherson, 2002; Johnston, 1997; Trevethan, Auger, Moore, MacDonald, & Sinclair, 2001). In adulthood, Indigenous offenders are rated as higher need in the domain of family and/or marital problems (Shepherd et al., 2014; Trevethan et al., 2002), as well as education/employment and substance abuse (Ellerby & MacPherson, 2002; Shepherd et al., 2014). Indigenous sex offenders have been found to have significantly higher lack of concern for others, impulsivity, poor cognitive problem-solving, and problems cooperating with supervision (Helmus, Babchishin, & Blais, 2012); they were also more likely to abuse substances during the commission of the offence (Ellerby & MacPherson, 2002; Nahanee, 1996; Rastin & Johnson, 2002; Rojas & Gretton, 2007). In contrast, however, Indigenous sex offenders may have similar or lower levels of sexual deviance compared to non-Indigenous sex offenders (Babchishin et al., 2012; Ellerby & MacPherson, 2002; Helmus et al., 2012).

Indigenous offenders have also been found to have higher recidivism rates than non-Indigenous offenders (Gutierrez, Wilson, Rugge, & Bonta, 2013; Sioui & Thibault, 2002). Among sex offenders, Indigenous offenders show higher rates of sexual recidivism (Rastin & Johnson, 2002; Rojas & Gretton, 2007; Williams, Vallée, & Staubi, 1997), violent recidivism (Rojas & Gretton, 2007), and general recidivism (Rastin & Johnson, 2002; Rojas & Gretton, 2007).

Importantly, however, it does not necessarily follow that because Indigenous offenders are higher risk than non-Indigenous offenders, risk factors (or scales) will predict recidivism differently for Indigenous offenders. Although higher risk scores among Indigenous offenders should be a call for greater resources (e.g., treatment) for this group, it is not in itself a form of test bias (Warne, Yoon, & Price, 2014). The main issue regarding the suitability of risk scales for Indigenous offenders concerns whether the predictive accuracy of the scale (discrimination and calibration) differs between Indigenous and non-Indigenous offenders. Rather than focussing on the stability of the factor solution, the validity of prediction tools should focus on the validity of the regression equations (or comparable discrimination statistics) linking scores to recidivism rates (Reynolds, 2000). Furthermore, it is important to consider how any observed differences could lead to harmful impacts for already disadvantaged groups. Certainly, the over-representation of Indigenous offenders and the higher prevalence of risk factors among this subgroup make this an important research question.

Research on Risk Assessment with Indigenous Offenders

Most risk assessment scales tend to incorporate at least some information from the Central Eight risk factors for recidivism (Andrews & Bonta, 2010): history of criminal behaviour, procriminal personality, procriminal associates, procriminal attitudes, family/marital problems, education/employment problems, poor use of leisure/recreation time, and substance abuse. A recent meta-analysis of 49 independent samples (n = 57,315 Indigenous and 204,977 non-Indigenous offenders) found that all Central Eight risk factors were significantly predictive of general and violent recidivism for Indigenous offenders (with Cohen’s d values ranging from 0.11 to 0.56; Gutierrez et al., 2013). However, for all but two of the domains (procriminal attitudes and leisure/recreation problems), predictive accuracy was lower for Indigenous offenders compared to non-Indigenous offenders.

In another meta-analysis, Wilson and Gutierrez (2014) examined different versions of the Level of Service (LSI) risk scales in 15 unique samples (n = 21,807 Indigenous and 42,515 non-Indigenous offenders). LSI total scores significantly predicted general recidivism for Indigenous offenders with moderate accuracy (d = 0.62), and all subscales also predicted recidivism (d values from 0.24 to 0.60). Similar to Gutierrez et al. (2013), however, five of the eight subscales (criminal history, employment/education, companions, alcohol/drugs, and procriminal attitude-orientation) had lower predictive accuracy for Indigenous offenders compared to non-Indigenous offenders.

Wilson and Gutierrez (2014) also examined the calibration of the LSI (Ontario revision) in a single sample of 1,692 Indigenous offenders and 24,758 non-Indigenous offenders from Wormith and Hogg (2012). Following Reynolds (2000), this study computed separate logistic regression equations (intercept, slope) for the two groups. As seen in Figure 1, the recidivism rates predicted from the LSI-OR were well-calibrated for moderate and high scoring Indigenous offenders, but underestimated the absolute recidivism rates of low scoring Indigenous offenders. In other words, recidivism rates for Indigenous offenders with low scores on the LSI-OR were higher than what would be predicted by the risk scale. This suggested that actuarial risk scales may actually under-classify Indigenous offenders.

Figure 1: Expanded logistic regression model lines of best fit (in black) and absolute recidivism base rates (in grey) by LSI-OR total score for each group. Vertical lines indicate risk categories based on LSI-OR total scores: very low (0-4), low (5-10), medium (11-19), high (20-29), and very high (30-43). Adapted from “Does one size fit all?: A meta-analysis examining the predictive ability of the Level of Service Inventory (LSI) with Indigenous offenders,” by H.A. Wilson & L. Gutierrez, 2014, Criminal Justice and Behavior, 41, p. 211.
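The group-wise calibration check described above, in which a separate logistic regression (intercept and slope) is fitted to each group, can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis used by Wilson and Gutierrez (2014): the helper names are hypothetical, the two groups ("group_a" and "group_b") and their counts are invented, and the model is fitted with a plain Newton-Raphson routine rather than a statistical package.

```python
import math


def fit_logistic(scores, outcomes, iterations=25):
    """Fit P(recidivism) = 1 / (1 + exp(-(b0 + b1 * score)))
    by Newton-Raphson and return (intercept, slope)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(scores, outcomes):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient w.r.t. intercept
            g1 += (y - p) * x      # gradient w.r.t. slope
            h00 += w               # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system for the parameter update.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def predicted_rate(b0, b1, score):
    """Recidivism probability implied by a fitted equation."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * score)))


def expand(counts):
    """Turn {score: (recidivists, total)} into score and 0/1 outcome lists."""
    scores, outcomes = [], []
    for score, (recid, total) in counts.items():
        scores += [score] * total
        outcomes += [1] * recid + [0] * (total - recid)
    return scores, outcomes


# Hypothetical counts: recidivists out of 10 offenders at each score.
group_a = {5: (2, 10), 15: (4, 10), 25: (6, 10), 35: (8, 10)}
group_b = {5: (4, 10), 15: (5, 10), 25: (6, 10), 35: (8, 10)}

b0_a, b1_a = fit_logistic(*expand(group_a))
b0_b, b1_b = fit_logistic(*expand(group_b))

print(f"group_a: intercept={b0_a:.3f}, slope={b1_a:.3f}")
print(f"group_b: intercept={b0_b:.3f}, slope={b1_b:.3f}")
print(f"predicted rate at score 5: a={predicted_rate(b0_a, b1_a, 5):.2f}, "
      f"b={predicted_rate(b0_b, b1_b, 5):.2f}")
```

In this invented example the second group has a higher intercept and a flatter slope, so an equation derived from the first group would underestimate recidivism for low-scoring members of the second group while remaining close at higher scores. Comparing the fitted intercepts and slopes across groups is the essence of the regression-based approach to test bias.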

Regarding sex offenders, there is one meta-analysis (albeit small) available, examining Static-99R and Static-2002R with Indigenous offenders (Babchishin et al., 2012). This study included five Static-99R samples (n = 319 Indigenous and 1,269 non-Indigenous sex offenders) and three Static-2002R samples (n = 209 Indigenous and 955 non-Indigenous sex offenders). Static-99R was found to predict sexual recidivism with similarly high levels of predictive accuracy for both Indigenous and non-Indigenous offenders (AUC of .71 vs .74). Static-2002R, however, predicted sexual recidivism for Indigenous offenders (AUC = .61), but the effect size was small and was lower than the accuracy found for non-Indigenous offenders (AUC = .76).
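The studies reviewed here report accuracy using two different effect sizes, Cohen's d and the AUC. As a rough aid to comparing them, the sketch below converts between the two using the relationship AUC = Φ(d / √2), which holds only under the assumption that scores are normally distributed with equal variance among recidivists and non-recidivists. The function names are hypothetical, and the converted values are approximations rather than figures reported by the original authors.

```python
import math
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def d_to_auc(d):
    """Approximate AUC implied by Cohen's d (normal, equal-variance case)."""
    return _STANDARD_NORMAL.cdf(d / math.sqrt(2))


def auc_to_d(auc):
    """Approximate Cohen's d implied by an AUC (normal, equal-variance case)."""
    return math.sqrt(2) * _STANDARD_NORMAL.inv_cdf(auc)


# Effect sizes mentioned in the text, converted for comparison only.
print(f"d = 0.62 corresponds to AUC of about {d_to_auc(0.62):.2f}")
print(f"AUC = .71 corresponds to d of about {auc_to_d(0.71):.2f}")
print(f"AUC = .61 corresponds to d of about {auc_to_d(0.61):.2f}")
```

Because real risk scores are often skewed, these conversions should be read as approximate guides to relative magnitude, not as exact equivalences.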

Although not meta-analytic in nature, there are two related studies with large sample sizes validating the Static Factors Assessment (SFA), an SPJ risk tool used by the Correctional Service of Canada (CSC) for all federal offenders (CSC, 2014; Motiuk, 1993). The SFA has 137 dichotomous items in three subscales: criminal history, offence severity, and sex offence history (though the latter subscale has not been validated).

Examining the construct validity of the scale, Helmus and Forrester (2014a) analyzed all SFA assessments completed from 1997 to 2012, which included 12,265 assessments for Indigenous offenders and 52,340 for non-Indigenous offenders. Overall, 59% of Indigenous offenders were rated as high risk according to the final professional judgement, compared to 38% of non-Indigenous offenders. The higher ratings of risk given to Indigenous offenders could only be partly explained by the risk factors. Controlling for the sum of all criminal history and offence severity items, the odds of being declared high risk were still 1.3 times higher for Indigenous offenders. In other words, given the same criminal history and offence severity as a non-Indigenous offender, CSC staff were still more likely to label an Indigenous offender as high static risk.

In a second study of the SFA, Helmus and Forrester (2014b) examined five-year follow-up data (for revocations, readmissions, and readmissions for a violent offence) for a subset of 1,649 Indigenous offenders and 7,061 non-Indigenous offenders. Similar to previous meta-analytic findings, both the subscale total scores on the SFA and the final rating (using SPJ) generally predicted reoffending for Indigenous offenders (AUCs ranged between .47 and .76) and non-Indigenous offenders (AUCs ranged between .54 and .83). Predictive accuracy was quite small for Indigenous offenders (though it was better for Indigenous women) and was lower than for non-Indigenous offenders; this was true for both the subscale total scores and the SPJ risk rating. Interestingly, the mechanical sum of the criminal history items in the SFA had meaningfully higher predictive accuracy than the overall SPJ rating for all subgroups examined, suggesting the use of this subscale may be preferable to the SPJ rating.

Possible Explanations for Differences in Findings

Although the research generally finds empirical support for the predictive accuracy of risk scales with Indigenous offenders, there is a fairly consistent pattern of lower accuracy compared to non-Indigenous offenders. The reason for the pattern, however, is not clear. Gutierrez and colleagues (2013) hypothesized that this could be due to restriction of range from a ceiling effect. For example, it could be difficult to discriminate between recidivists and non-recidivists if all Indigenous offenders score high risk. This explanation, however, is unlikely. Evenly split distributions (e.g., a 50% endorsement rate) maximize statistical power and provide the strongest protection from restriction of range. In contrast, recidivism rates are often well below 50% and risk scales are positively skewed, with fewer offenders scoring in the highest risk ranges. Examining descriptive data for most of the research discussed above, higher risk scores and recidivism rates among Indigenous offenders often create a distribution closer to optimal for predictive accuracy, as opposed to a restriction of range (in other words, non-Indigenous offenders are more likely to display a floor effect than Indigenous offenders are to display a ceiling effect).
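The point about evenly split distributions follows from the variance of a dichotomous variable, which equals p(1 - p) for an endorsement or base rate p and peaks at p = 0.5. A brief sketch, using hypothetical rates chosen only for illustration:

```python
def dichotomous_variance(rate):
    """Variance of a 0/1 variable with the given endorsement or base rate."""
    return rate * (1.0 - rate)


# Hypothetical rates; variance is largest at an even split.
for rate in (0.10, 0.30, 0.50, 0.70, 0.90):
    print(f"rate = {rate:.2f} -> variance = {dichotomous_variance(rate):.2f}")
```

When rates start well below 50%, moving them upward toward 50% increases variance rather than restricting it, which is why higher base rates in one group would not by themselves be expected to reduce predictive accuracy.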

More recently, Wilson and Gutierrez (2014) synthesized existing literature on this topic and proposed four possible explanations for the pattern of lower discrimination for risk scales among Indigenous offenders. The first is racial discrimination in the criminal justice system. It may be harder for risk scales to discriminate between low risk and high risk offenders if recidivism rates are inflated because of systemic bias, rendering common risk assessments less predictive with Indigenous offenders. In addition to artificially increasing prior offences and recidivism for Indigenous offenders, this may also alter the thresholds for risk factors. In other words, if Indigenous offenders are more exposed to risk factors and more likely to be detected and prosecuted for criminal behaviour, it is possible that a greater potency of risk factors is needed to predict recidivism for Indigenous offenders.

The second possible explanation is that although the risk factors for recidivism are the same for Indigenous and non-Indigenous offenders, Indigenous offenders exhibit many more risk factors, largely due to historical, social, and economic disadvantages. Predictive accuracy for one factor may be low for Indigenous offenders because low-scoring individuals may still be high risk on other factors compared to non-Indigenous offenders (Wilson & Gutierrez, 2014). This hypothesis implies that reduced accuracy for Indigenous offenders should be less of a problem on total scores of risk scales because they would presumably incorporate the other factors, or at least many of them. The more comprehensive the risk scale is, the more this problem should be ameliorated.

The third possible explanation reviewed by Wilson and Gutierrez (2014) is that the unique present and historic circumstances of Indigenous peoples are neglected in risk factors. For example, Wilson and Gutierrez (2014) hypothesize that broader conceptualizations of family in Indigenous communities may not be incorporated when assessing risk factors in the family/marital domain. This is similar to the argument of Helmus and colleagues (2012) that the risk-relevant constructs may be the same for Indigenous and non-Indigenous offenders, but that the indicators of those constructs may differ, or the meaning of those indicators may be different. For example, whereas substance abuse may reflect self-regulation problems for non-Indigenous offenders, it may reflect self-medication to cope with trauma or other adverse conditions among Indigenous offenders. This would mean that it could be possible to develop risk scales with equivalent accuracy for Indigenous and non-Indigenous offenders if the indicators of the underlying constructs are defined in a way that is culturally generalizable. This may involve coding manuals that specifically indicate how these risk constructs may be particularly manifested for Indigenous offenders.

A fourth hypothesis is that there are risk factors unique to Indigenous offenders that are not adequately captured in current risk scales. This suggests that risk scales specific for Indigenous offenders should be developed, or that culturally-specific risk factors should be incorporated into current assessments. For example, Heckbert and Turkington (2001) suggested that cultural or spiritual isolation (e.g., reserve system and effects of assimilation due to residential school experience) is a prominent issue for Indigenous peoples in the justice system, and that it plays a significant role in the healing and successful reintegration of offenders back into the community. Other examples of culturally-specific factors/domains that have been raised in the literature include the following: loss of native language (Ellerby & McPherson, 2005; Laprairie, 1996; Mann, 2009); impact of residential schools (Royal Commission on Aboriginal Peoples, 1996; Mann, 2009); and lack/loss of pride in heritage (Heckbert & Turkington, 2001). Unfortunately, as observed by Gutierrez et al. (2013), little research has empirically tested how these potential risk factors/domains relate to recidivism; therefore, our knowledge regarding their utility in risk prediction is limited. It may be that Indigenous offenders score high on culturally-specific items that are not currently captured in risk assessment, which would account for the poorer predictive ability for Indigenous offenders.

Conclusions and Recommendations

Correctional best practice involves the use of empirically validated structured risk assessment scales to guide decision-making. Use of these scales, however, requires some assumption that the offenders being assessed are similar to those on which the scale was developed or validated. Often this assumption is appropriate. Although each offender is unique, risk scales should be applicable as long as offenders are not meaningfully different from the research base in a risk-relevant way. Given that Indigenous offenders are over-represented at virtually every stage of the criminal justice system and demonstrate elevated risk and need, special consideration of the application of risk tools with this group is not only warranted, it is necessary. Regardless of whether actuarial or SPJ tools are being used, it is necessary to validate these tools with Indigenous offenders.

What we know from current research is that the standard risk factors and some of the commonly used structured risk scales (e.g., the LSI family of scales and Static-99R) predict recidivism with Indigenous offenders, and are, consequently, appropriate for use in correctional practice. Much of this research is based on very large sample sizes and/or meta-analytic studies. However, it is also important to note that although these tools are valid for use with Indigenous offenders, their accuracy is lower compared to non-Indigenous offenders, and this appears to be true for both actuarial and SPJ scales. This finding necessitates additional caution in assessments with this group, particularly for life-changing decisions (e.g., Dangerous Offender designations). Nonetheless, given that these scales still predict recidivism with moderate accuracy, abandoning their use is not defensible, unless they are replaced with a method empirically demonstrated to have superior accuracy. For Canadian Indigenous offenders, there is currently sufficient evidence to support the applied use of the Level of Service instruments for general recidivism, and Static-99R for sexual recidivism.

Given that Indigenous offenders face disadvantage in virtually every criminal justice decision (e.g., security placements, parole grant rates) and have more extensive criminal histories, we cannot rule out the possibility of discrimination against Indigenous offenders in correctional decision-making (Truth and Reconciliation Commission of Canada, 2015). One of the best ways to protect against bias in decision-making is to rely on objective, structured, and empirically defensible methods. Conceptually, it is possible that SPJ scales have more flexibility to consider cultural differences, but there is no empirical evidence to support the benefit of this approach in terms of improved accuracy or decreased bias. In contrast, the available evidence from CSC suggests SPJ methods may actually increase bias against Indigenous offenders by assessing them as higher risk than non-Indigenous offenders with the same risk factors (Helmus & Forrester, 2014a). Use of actuarial scales may not eliminate this bias (e.g., the scoring of individual items still requires some subjectivity), but should decrease it, enhancing objectivity, transparency, and consistency among raters.  

What we don’t know is why predictive accuracy is lower for Indigenous offenders (both in terms of lower discrimination and greater errors in calibration). A better understanding of these differences is necessary to improve risk assessment for this subgroup of offenders and to inform how to intervene to reduce risk. To improve risk assessment with this group, future research needs to better understand the meaning of commonly used risk factors (to inform why their accuracy may be lower with Indigenous offenders) and should also explore the possibility of culturally-informed risk factors for this group.

