Empirical Sentencing: Weapons of Math Destruction?

Can math be racist? Efforts to change the means by which defendants are sentenced, to end the voodoo of gut sentencing in favor of empiricism, have been around since the Sentencing Reform Act of 1984, which birthed the dreaded United States Sentencing Guidelines. And deeper efforts, championed by Senior Nebraska Judge Richard Kopf, are still in the works, even after the Supreme Court backed off its misbegotten Mistretta mandate in its punchline opinion in Booker.

The attraction is obvious, particularly in a world that has come to adore binary thinking, the belief that data doesn’t lie. It may not make us happy, because it reveals truths we would prefer to hide behind flowery words, but it is truth nonetheless. Whose truth, however, has remained an issue, as the empirical research of J.C. Oleson shows a significant correlation between poverty and recidivism.

To be clear, the criterion wouldn’t be that black defendants would be sentenced longer because they were black, for example, but that poor people whose parents had criminal histories and didn’t graduate high school would be sentenced longer because, empirically, they would be more likely to be recidivists. Entirely different?

Poverty as a proxy fails, in my opinion, for a variety of reasons, ranging from correlation not proving causation, to its inadequacy as a proxy (say, a 74% reoffense correlation rate, statistically significant for empirical purposes, still means 26% of defendants will be sentenced to longer terms than parsimony would allow, which is just plain wrong; see the sketch below), to its placing the full weight of recidivism on the defendant. “What,” you ask?

Among the factors imposed by law, and embraced by sentencing theory, to justify the legitimacy of imposing a sentence of imprisonment are protection of the community, specific deterrence and rehabilitation. When Judge Kopf uses pre-incarceration factors as empirical proxies to ascertain a defendant’s likelihood of committing crimes upon release, it leaves a gaping hole in the analysis: the prediction is locked in before the sentence is ever served, so whatever deterrence or rehabilitation the sentence itself is supposed to accomplish never enters the equation.
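To put numbers on that 74% point, a back-of-the-envelope sketch (the figures are invented for illustration, not drawn from Oleson or anyone else):

```python
# Hypothetical illustration: a proxy that correlates with reoffense at 74%
# still sweeps up the 26% who would never reoffend.

def wrongly_lengthened(cohort_size: int, reoffense_rate: float) -> int:
    """Defendants sentenced longer on the proxy who would NOT reoffend."""
    return round(cohort_size * (1 - reoffense_rate))

cohort = 100          # a hypothetical cohort, all flagged by the proxy
correlation = 0.74    # the assumed reoffense rate among those flagged

print(wrongly_lengthened(cohort, correlation))  # -> 26
```

Statistically significant or not, those 26 are flesh-and-blood defendants doing extra time for crimes they will never commit, not a rounding error.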

Not being statisticians, computer scientists or mathematicians, lawyers are constrained to respond from positions of logic rather than with numbers. As strong as a well-conceived logical argument may be, there is usually a similarly strong counter-argument. Even if you disagree with it, no honest assessment can deny its existence. We just pick the side that best suits our views or purpose and do our best to pound it to death. But numbers, bright, shiny numbers, are irrefutable. One plus one equals two, no matter what arguments we proffer in response.

Except Cathy O’Neil, armed with a Harvard Ph.D. in mathematics, contends that numbers can and do lie.

From targeted advertising and insurance to education and policing, O’Neil looks at how algorithms and big data are targeting the poor, reinforcing racism and amplifying inequality.

These “WMDs,” as she calls them, have three key features: They are opaque, scalable and unfair.

Denied a job because of a personality test? Too bad — the algorithm said you wouldn’t be a good fit. Charged a higher rate for a loan? Well, people in your zip code tend to be riskier borrowers. Received a harsher prison sentence? Here’s the thing: Your friends and family have criminal records too, so you’re likely to be a repeat offender. (Spoiler: The people on the receiving end of these messages don’t actually get an explanation.)

The problem isn’t that the numbers don’t add up, but that they’re used to conceal the underlying assumptions upon which they’re based.

The models O’Neil writes about all use proxies for what they’re actually trying to measure. The police analyze zip codes to deploy officers, employers use credit scores to gauge responsibility, payday lenders assess grammar to determine creditworthiness. But zip codes are also a stand-in for race, credit scores for wealth, and poor grammar for immigrants.
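How does a model that never sees race end up sorting by it anyway? A minimal sketch, with synthetic data and made-up proportions, of a proxy doing the work:

```python
# Synthetic illustration: group membership is never shown to the rule,
# but the zip code statistically encodes it.

import random

random.seed(1)

population = []
for _ in range(10_000):
    group = "X" if random.random() < 0.5 else "Y"        # the hidden attribute
    p_zip_a = 0.8 if group == "X" else 0.2               # residential skew
    zip_code = "10001" if random.random() < p_zip_a else "10002"
    population.append((group, zip_code))

# A "group-blind" rule: treat everyone in zip 10001 as higher risk.
flagged = [group for group, zc in population if zc == "10001"]
share_x = flagged.count("X") / len(flagged)
print(f"{share_x:.0%} of those flagged are group X")     # ~80%
```

The rule never mentions group X. It doesn’t have to; the zip code carries it in the door.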

Putting aside the unfortunate whiff of social justice motivating O’Neil, who writes at mathbabe.org, which tends to taint her contentions,* she’s not without a point:

One of the book’s most compelling sections is on “recidivism models.” For years, criminal sentencing was inconsistent and biased against minorities. So some states started using recidivism models to guide sentencing. These take into account things like prior convictions, where you live, drug and alcohol use, previous police encounters, and criminal records of friends and family.

These scores are then used to determine sentencing.
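What might such a model look like under the hood? A toy sketch, using the factors listed above with weights invented for illustration (the real instruments keep theirs proprietary, which is rather the point about opacity):

```python
# Toy recidivism score; the weights are made up, not taken from any
# actual instrument. Higher score = "higher risk" = longer sentence.

def recidivism_score(prior_convictions: int,
                     neighborhood_crime_rate: float,  # crimes per 1,000 residents
                     family_with_records: int,
                     police_encounters: int) -> float:
    return (2.00 * prior_convictions
            + 0.05 * neighborhood_crime_rate
            + 1.50 * family_with_records
            + 1.00 * police_encounters)

# Two defendants, identical conduct, different circumstances:
print(recidivism_score(1, 5.0, 0, 0))   # 2.25 -- one prior, quiet suburb
print(recidivism_score(1, 60.0, 2, 4))  # 12.0 -- same prior, poor neighborhood
```

Only the first input reflects anything the second defendant actually did; the rest is about where, and to whom, he was born.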

“This is unjust,” O’Neil writes. “Indeed, if a prosecutor attempted to tar a defendant by mentioning his brother’s criminal record or the high crime rate in his neighborhood, a decent defense attorney would roar, ‘Objection, Your Honor!'”

Well, no, that’s not at all what a “decent defense attorney” would say, much less roar, but then, she’s not a lawyer, so she can be forgiven her melodramatic grasp of law.

But in this case, the person is unlikely to know the mix of factors that influenced his or her sentencing — and has absolutely no recourse to contest them.

Or consider the fact that nearly half of U.S. employers ask potential hires for their credit report, equating a good credit score with responsibility or trustworthiness.

This “creates a dangerous poverty cycle,” O’Neil writes. “If you can’t get a job because of your credit record, that record will likely get worse, making it even harder to work.”

Someone, somewhere, created an algorithm that, after peer review and vetting, demonstrates some statistically valid correlation, even if it’s woefully unhelpful in explaining causation so that we can improve upon the situation rather than perpetuate it. But the bases for these calculations are hidden behind the math, and the correlation gives rise to proxies, seemingly benign factors that correlate to race and poverty.** And by wrapping up benign factors in pretty pink bows, we can use race as a sentencing factor while pretending we’re not.

So is math racist? No. Of course not. But O’Neil calls out the reality that our beloved algorithms serve to conceal their true underlying bases, which may well be that poor blacks are gonna get burned, mired in a “dangerous poverty cycle,” by the use of empiricism.

Even if the numbers don’t lie, even if the numbers tell the truth no matter how unpleasant, there remains an unanswered question. Do we want to perpetuate a correlation that, while mathematically accurate at the moment, will serve to make it difficult, if not impossible, to break this cycle of poverty?

While numbers may not change, people can.

“Big Data processes codify the past,” O’Neil writes. “They do not invent the future. Doing that requires moral imagination, and that’s something only humans can provide.”

Tell that to the Sentence-o-Matic 1000.

*O’Neil justifies her views on the basis of moral imperative, which inherently assumes her flavor of morality is right and any other flavor is wrong. Just because she suffers from a simplistic, self-righteous and myopic grasp of “justice” doesn’t mean her conclusion isn’t consistent with sound policy and legal doctrine.

**An entirely separate question is the propriety of applying empiricism to individuals, who may or may not fit neatly within the statistical framework. That’s a separate issue from the one raised by O’Neil.

H/T Rick Horowitz

19 thoughts on “Empirical Sentencing: Weapons of Math Destruction?”

  1. Richard G. Kopf

    SHG,

    Let’s take the portion of 3553(a) that says judges are supposed to protect the public from further crimes of the offender. I assume you, and the Math Babe, prefer a crystal ball.

    All the best.

    RGK

    1. SHG Post author

      Isn’t that what Congress is asking judges to use as well? The problem is that the alternative to the crystal ball may be worse.

      1. paul

        And it may not be. If you start with the position that “crystal ball” sentencing is bad™, then how do you improve it? Improving upon the Sentence-O-Matic 1000 can be done in a scientific, controlled, transparent manner (and then you get the 2000, and we all know bigger numbers mean better, right?). Over time, the data gets better (or worse) in a measurable way: fix the data, get better results. And the solution to Judge X sentencing too lightly or too harshly is the guidelines you love, right?

        1. OEH

          It sounds like what you’re suggesting is that you could train your model for race or income neutrality?

          That’s an interesting idea… I wonder how well it would work in practice. I’m worried that what you’d get is a model that uses some proxies for race to “balance out” other proxies for race, i.e., a model with some racist factors balanced out by some reverse-racist factors.

          1. paul

            These models are only as good as the data they are trained on, and it is likely that data is biased. The methodology for collection, selection, and training on this data is all very important, and the first place to look when the model underperforms. The point I was trying to make is that this is a concrete area for improvement that could, theoretically, be addressed if society conjured the will, time, and money. A judge’s intuition or crystal ball is much harder to debug.
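            To make “debuggable” concrete, here’s a minimal sketch (the records and groups are invented) of the kind of audit you can run on a model but never on a gut: compare false-positive rates across groups.

            ```python
            # Hypothetical audit: given model flags and later outcomes, compare
            # false-positive rates (flagged "high risk" but never reoffended)
            # across groups. Invented records, for illustration only.

            from collections import defaultdict

            # (group, flagged_high_risk, actually_reoffended)
            records = [
                ("A", True, False), ("A", True, True), ("A", False, False),
                ("B", True, False), ("B", True, False), ("B", False, True),
            ]

            false_pos = defaultdict(int)       # flagged, but did not reoffend
            non_reoffenders = defaultdict(int)
            for group, flagged, reoffended in records:
                if not reoffended:
                    non_reoffenders[group] += 1
                    if flagged:
                        false_pos[group] += 1

            for group in sorted(non_reoffenders):
                rate = false_pos[group] / non_reoffenders[group]
                print(f"group {group} false-positive rate: {rate:.0%}")
            # group A false-positive rate: 50%
            # group B false-positive rate: 100%  <- measurable, hence (in principle) fixable
            ```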

    2. Patrick Maupin

      Since everybody commits three felonies a day, it shouldn’t take too long to get all the bad guys locked up forever once we have the Minority Report precogs.

  2. Richard G. Kopf

    Dear Patrick,

    But before that happens, my friend, Max von Sydow’s character, Director Lamar Burgess, will kill himself. The precogs will go off to a peaceful island to live an idyllic life. And murder will again become commonplace. What a perfect ending for the socially aware.

    I hope you feelz better now. All the best.

    RGK

      1. Patrick Maupin

        Yeah! Some of us don’t know how to use google properly, and we hate to admit that we are still reliant on ancient technology because we can never get google to fully explain Life, the Universe, and Everything in small words (or numbers!) we can understand.

        “You’ll still be answering my questions when google’s gone to digital heaven, right?”

        “Reply hazy try again”

  3. D-Poll

    It’s fair to point out that the connection to algorithms here is a red herring. The Sentence-O-Matic is perfectly capable of just as much lenity, specificity, and social-justicity as any human judge; it has to be built that way from the start, of course, but the same is true of the human, and it’s a lot harder to get a hold of a copy of his source code. Put another way, O’Neil’s bald assertion that “moral imagination [is] something only humans can provide” is just an article of faith – anyone with experience in machine learning (which I have, but I respect that nobody will take my word for it since I can’t back that up) will tell you that “moral imagination” is one of the many squishy feelings we could very easily program the machine to care about as soon as everyone else decides what it means. Judge Persky has “moral imagination”, and look how that worked out for him.

    The real question seems to be “is it right to give stricter sentences to criminals you honestly and reasonably expect are more likely to reoffend”, on which the people have spoken (and continue to speak), and their answer is “yes”. It may be shallow and shortsighted, and it may only serve to perpetuate the problem, but this is already the world we live in and you are probably going to have to take it up with the public, not with math.

      1. D-Poll

        I promise there’s at least one cogent thought there, but it might be well-hidden.
        Let me try this again.

        O’Neil’s piece is silly and wrong, because the problem isn’t algorithms. Anything a human judge does or thinks, including “moral imagination”, can be expressed in an algorithm. This post isn’t really about algorithms, but about the sentencing philosophy that goes into them – and O’Neil is trying to trick you into blaming the “Sentence-O-Matic”, because it distracts from the real problem, which is that most people really do want “likely reoffenders” to be sentenced more harshly, even if they won’t face the logical consequences of this belief. It’s not for nothing that Minority Report was remade as a cop drama just last year. This will be true regardless of whether it is up to the Sentence-O-Matic or Judge Kopf with a crystal ball, but we can choose what goes on in the Sentence-O-Matic’s head. Framing the debate as algorithms-versus-humans is misleading.

        1. SHG Post author

          Anything a human judge does or thinks, including “moral imagination”, can be expressed in an algorithm.

          This is an article of faith in devs. It’s ignorant bullshit, as has been discussed at length in the past (which means, we’re not going to discuss it again because you just showed up and you are entitled to revisit every discussion ever because you are the most special person ever!!!). Everything else fails because it’s built on a foundation of bullshit.

          Find a nice place to put the Billy Madison Award and be proud.

        2. CAB

          “Anything a human judge does or thinks, including ‘moral imagination’, can be expressed in an algorithm.”

          You work in information science, with an emphasis on Big Data, don’t you? There’s a reason the rest of us social researchers point and laugh when people in your field walk by.
