New research findings by computer scientists “cast significant doubt on the entire effort of algorithmic recidivism prediction”
This new research article in the latest issue of Science Advances provides a notable perspective on the debate over risk assessment instruments. The article is authored by computer scientists Julia Dressel and Hany Farid and is titled “The accuracy, fairness, and limits of predicting recidivism.” Here are parts of its introduction:
In the criminal justice system, predictive algorithms have been used to predict where crimes will most likely occur, who is most likely to commit a violent crime, who is likely to fail to appear at their court hearing, and who is likely to reoffend at some point in the future.
One widely used criminal risk assessment tool, Correctional Offender Management Profiling for Alternative Sanctions (COMPAS; Northpointe, which rebranded itself to “equivant” in January 2017), has been used to assess more than 1 million offenders since it was developed in 1998. The recidivism prediction component of COMPAS — the recidivism risk scale — has been in use since 2000. This software predicts a defendant’s risk of committing a misdemeanor or felony within 2 years of assessment from 137 features about an individual and the individual’s past criminal record.
Although the data used by COMPAS do not include an individual’s race, other aspects of the data may be correlated to race that can lead to racial disparities in the predictions. In May 2016, writing for ProPublica, Angwin et al. analyzed the efficacy of COMPAS on more than 7000 individuals arrested in Broward County, Florida between 2013 and 2014. This analysis indicated that the predictions were unreliable and racially biased. COMPAS’s overall accuracy for white defendants is 67.0%, only slightly higher than its accuracy of 63.8% for black defendants. The mistakes made by COMPAS, however, affected black and white defendants differently: Black defendants who did not recidivate were incorrectly predicted to reoffend at a rate of 44.9%, nearly twice as high as their white counterparts at 23.5%; and white defendants who did recidivate were incorrectly predicted to not reoffend at a rate of 47.7%, nearly twice as high as their black counterparts at 28.0%. In other words, COMPAS scores appeared to favor white defendants over black defendants by underpredicting recidivism for white and overpredicting recidivism for black defendants….
While the debate over algorithmic fairness continues, we consider the more fundamental question of whether these algorithms are any better than untrained humans at predicting recidivism in a fair and accurate way. We describe the results of a study that shows that people from a popular online crowdsourcing marketplace — who, it can reasonably be assumed, have little to no expertise in criminal justice — are as accurate and fair as COMPAS at predicting recidivism. In addition, although Northpointe has not revealed the inner workings of their recidivism prediction algorithm, we show that the accuracy of COMPAS on one data set can be explained with a simple linear classifier. We also show that although COMPAS uses 137 features to make a prediction, the same predictive accuracy can be achieved with only two features. We further show that more sophisticated classifiers do not improve prediction accuracy or fairness. Collectively, these results cast significant doubt on the entire effort of algorithmic recidivism prediction.
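To make the article's central claims concrete, here is a minimal sketch (not the authors' code) of the two ideas in the excerpt above: a simple linear classifier trained on only two features of the kind the paper describes (age and number of prior convictions), and the per-group false positive and false negative rates at the heart of the ProPublica analysis. The column names (age, priors_count, two_year_recid, race) follow the publicly released ProPublica Broward County data, but the data below is synthetic stand-in data, so the printed numbers are illustrative only.

```python
# Hedged illustration: a two-feature logistic regression and per-group error
# rates, using synthetic data in place of the real Broward County records.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "age": rng.integers(18, 70, n),
    "priors_count": rng.poisson(2.5, n),
    "race": rng.choice(["African-American", "Caucasian"], n),
})
# Synthetic two-year recidivism outcome loosely tied to the two features,
# for demonstration only.
p = 1 / (1 + np.exp(-(-1.0 + 0.25 * df["priors_count"] - 0.03 * (df["age"] - 18))))
df["two_year_recid"] = rng.binomial(1, p)

X = df[["age", "priors_count"]]
y = df["two_year_recid"]
X_tr, X_te, y_tr, y_te, df_tr, df_te = train_test_split(
    X, y, df, test_size=0.3, random_state=0
)

# A plain linear classifier on just two features.
clf = LogisticRegression().fit(X_tr, y_tr)
pred = clf.predict(X_te)
print(f"overall accuracy: {(pred == y_te).mean():.3f}")

# False positive rate (non-recidivists predicted to reoffend) and false
# negative rate (recidivists predicted not to), broken out by group --
# the disparity ProPublica reported for COMPAS.
for group, idx in df_te.groupby("race").groups.items():
    yt = y_te.loc[idx].to_numpy()
    yp = pred[df_te.index.get_indexer(idx)]
    fpr = ((yp == 1) & (yt == 0)).sum() / max((yt == 0).sum(), 1)
    fnr = ((yp == 0) & (yt == 1)).sum() / max((yt == 1).sum(), 1)
    print(f"{group}: FPR={fpr:.3f}  FNR={fnr:.3f}")
```

On real data, the paper reports that a classifier this simple matches the predictive accuracy of COMPAS's 137-feature score, which is what motivates the authors' skepticism quoted in the headline.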
A few (of many) prior related posts on risk assessment tools:
- ProPublica takes deep dive to identify statistical biases in risk assessment software
- “Assessing Risk Assessment in Action”
- Thoughtful account of what to think about risk assessment tools
- “The Use of Risk Assessment at Sentencing: Implications for Research and Policy”
- Wisconsin Supreme Court rejects due process challenge to use of risk-assessment instrument at sentencing
- “In Defense of Risk-Assessment Tools”
- Parole precogs: computerized risk assessments impacting state parole decision-making
- Thoughtful look into fairness/bias concerns with risk-assessment instruments like COMPAS
- “Gender, Risk Assessment, and Sanctioning: The Cost of Treating Women Like Men”
- Expressing concerns about how risk assessment algorithms learn
- “Under the Cloak of Brain Science: Risk Assessments, Parole, and the Powerful Guise of Objectivity”