Three Times Longer For Ladies?

When Seattle criminal defense lawyer Stephan Illa saw the headline at Doug Keene’s Jury Room, it was a smack upside the head. Keene posted about a new study showing that women convicted of “white collar crimes” were sentenced more harshly than men. What?

How much longer? 300% longer if you are a White woman and 450% longer if you are a Black woman. Seriously. Wow. A new report has been released comparing the federal white-collar crime sentences of men and women.

Stephan was astounded. After 30 years in the trenches, it defied every experience he’s ever had. Me too. But then, how many times have we learned that “common wisdom” turns out to be total malarkey and reality is just the opposite? So Stephan read through the report. He was unimpressed:

I believe the study’s conclusion is the result of a seriously flawed methodology:

1. Only female inmates currently serving sentences in federal prisons were included. Since only inmates sentenced to terms of a year or more are housed in federal prisons, any defendants who receive sentences of probation or less than a year and a day were excluded from consideration. As a result, any gender disparities manifested in low-end dispositions disappear.

2. The study examined only one variable (“loss amount”) in comparing the sentences imposed on the defendants. While the amount of loss is a significant factor in calculating a defendant’s advisory guideline range, many other considerations can drive the range up or down. The study ignores a myriad of offense-related adjustments for things like a defendant’s role in the offense, the number of victims, and the extent and sophistication of the fraudulent conduct. Likewise, the study makes no attempt to account for each defendant’s criminal history. Nor does the study consider the effect of the defendants’ plea agreements (if any) or decisions to go to trial and/or testify at trial. The study also fails to consider the effect of convictions on multiple counts (which, if not “grouped,” may add further points to a defendant’s offense level).

3. The study’s claim to have considered the “loss amount” is itself questionable. It is no easy task to determine the precise loss amount that was used by the court at sentencing in a federal criminal case. In some cases, the parties stipulate to a loss amount in the plea agreement, but that stipulation need not be accepted by the court. The Probation Department, operating in its role as a sock puppet for federal law enforcement, often advocates for a different (i.e., higher) amount. And the judge may or may not identify a specific loss amount on the record at sentencing. The judgment itself does not identify the loss amount, just the sentence imposed. How did the study determine the loss amount in each case? Did it use the amount calculated in the discovery? Or did it use the amount identified in a complaint or an indictment? Did it accept the amount listed in a plea agreement? Or did it rely on statements in a sentencing memorandum filed by one of the parties? Did the authors examine the amount alleged in a presentence report? Or were the transcripts of a sentencing hearing examined? The study does not say.

4. At least one of the 29 female defendants is said to have been convicted of aggravated identity theft in addition to a different “white collar” offense. The federal identity theft statute carries a 24-month mandatory consecutive term of imprisonment irrespective of the loss amount. The study considered only the total term of imprisonment imposed on that defendant and compared her sentence to those imposed on all of the other inmates (none of whom were tagged with identity theft convictions).

5. The data were collected in a nonrandom fashion. The 29 female defendants studied were selected because they all were serving their sentences at Danbury Federal Prison Camp. They were convicted in federal district courts located in 13 different states. All but five of the 29 female defendants were convicted in the Northeastern region (NY, NJ, MA, NH, PA, CT, DC). Two each were convicted in the South (GA, VA) and the Midwest (OH, WI), and one in the West (CA). The study says it selected 31 male defendants for comparison, “compiled from court records limited primarily to the same states (or regions) from which the women came.” Id. at 2. The men were convicted in federal courts in 11 different states. How these particular male defendants were selected for comparison is not specified, although the study assures us that “[t]here is no evidence that the selection of the individuals comprising the database was done with the intent of influencing the results of the study.” Id. at 2. A little over half (18) of the 31 male defendants were convicted in the Northeast region. Of the remaining 13 convictions, ten were from the South (4 in FL, 4 in VA, and 1 each in TX & NC), two from the Midwest (MO, IL), and one from the West (CA). The study does not bother to consider whether regional differences might affect sentencing decisions. Finally, the propriety of comparing sentences imposed by different judges with different sentencing philosophies and attitudes toward the guidelines remains unaddressed. Federal judges are not fungible. Regardless of a defendant’s gender, it makes a great deal of difference whether the sentencing judge wears a skirt, a beard, a scowl, or a smirk.

The study has many other inane features (the graphs are delightfully unintelligible) and includes a series of particularly maladroit mixed metaphors (involving canaries and smoke).

The company that did the study, Culture QuantiX, doesn’t facially appear to have an agenda, though it’s unclear whether they take on work for the purpose of demonstrating what their patrons want demonstrated. But Stephan’s points raise some very significant issues.

If you’re going to do a study to challenge common experience and the results show that our experience is significantly wrong (who you gonna believe, me or your lying eyes?), then you’d better make sure your methodology is sound. This study appears exposed on every flank.

So why would Doug Keene promote this study without any apparent scrutiny as to its validity? Among opponents of the prison-industrial complex, there are competing interests over who gets screwed worse, and therefore who needs reform more. It’s a shame that this happens, as it pits over-incarcerated women against over-incarcerated men against over-incarceration for everybody.

It also creates an undesirable secondary effect, where questioning a study like this creates the impression of denigrating the problem of incarcerated women. No one wants to suggest that the over-incarceration of women isn’t an issue, any more than that over-incarceration itself isn’t an issue. But it takes the fight away from the core problem to focus on a subset and creates an internal competition for concern, resources, and potentially reform.

It’s a bad thing. Rather than fight among ourselves, we need to keep the focus on the problem of over-incarceration for everyone, not divide it by gender by contending that women suffer disproportionately. Even if my experience, that women are sentenced less severely than men, is correct, it does not mean they are not over-incarcerated or that they do not suffer far beyond any societal value in their imprisonment.

But when someone promotes a study that flies in the face of experience, it’s irresponsible to promote its results without scrutiny. Just as junk science in the courtroom leads to wrongful convictions, junk studies lead to wrongful beliefs. Neither is tolerable. Both make us worse for it. Don’t do it.

14 thoughts on “Three Times Longer For Ladies?”

  1. Jeff Gamso

    I didn’t go beyond Keene’s summary because at least some of the study’s problems seemed evident just from that.

    Regardless, there’s another issue, too, since bad data tends to drive out good. When the public and the politicians and the ones who decide can’t take this study seriously, they learn to distrust all data and conclude that whatever their tweetpeeps tell them must be true. Besides, it was chilly yesterday, so global warming can’t be true.

    1. SHG Post author

      Excellent point. A bad study gives permission to discount all studies, all complaints, and dismiss the issue altogether.

  2. Mark Draughn

    It’s odd that the statistical significance of the conclusions is not given. Statistical significance indicates the likelihood that the differences discovered are the result of random chance rather than a real difference between the subgroups, and usually looks like “p=0.37”. Smaller is better, and p=0.05 would be acceptable in most social science studies. Just eyeballing the graphs, I’d guess the results are probably somewhat significant, but it’s odd to see papers that don’t show these numbers, especially from a company with “quant” in the name. (Since they do publish their dataset — which I must point out is a highly admirable practice — I could reproduce their work and calculate the significance, but I’m too lazy.)

    A significant result would only mean it was probably not random noise, but it wouldn’t mean the result happened for the reason the study was examining, which is where Illa’s points come in. His point about how loss amount is determined is particularly interesting and potentially devastating.
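    As a rough illustration of how such a p-value could be computed from two lists of sentence lengths, here is a minimal sketch of a two-sided permutation test on the difference in group means. The function name `permutation_p_value` and the sentence lengths below are hypothetical placeholders, not the study's data or method.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        # Shuffle the labels: if group membership did not matter,
        # a random split should often look as extreme as the real one.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical sentence lengths in months (assumed for illustration only).
women = [60, 72, 48, 96, 84, 120, 66, 54]
men = [24, 36, 18, 30, 42, 27, 33, 21]

if __name__ == "__main__":
    print(f"p = {permutation_p_value(women, men):.4f}")
```

    A small p here would only say the gap is unlikely to be shuffling noise. It says nothing about whether the two groups were comparable in the first place.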

    In addition to excluding people whose sentences were too light to land them in prison, selecting people currently in prison also selects for people with long sentences, since people with short sentences are naturally under-represented in later years. E.g., for people sentenced in 2008, only people who got at least 60 months are still in prison and included in the study. The male data is weird on this, however, such as including a guy sentenced to 46 months in 2004. He should have been out of prison by the time the study was conducted. In what is essentially a case-control study, the methodology used to choose the control group is extremely important. I wish it had been described in the paper.
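    The size of this selection effect can be sketched with a small simulation. Everything in it is an assumption for illustration: sentences are drawn from an exponential distribution with a 24-month mean, sentencing dates are spread evenly over the nine years before the snapshot, terms of 12 months or less never reach federal prison, and good-time credit is ignored. The function name `simulate` is hypothetical.

```python
import random
from statistics import mean


def simulate(n=20_000, seed=1, window_months=108, min_prison_months=12):
    """Compare the mean sentence of everyone with the mean of a prison snapshot."""
    rng = random.Random(seed)
    all_sentences, snapshot = [], []
    for _ in range(n):
        # How long before the snapshot date this person was sentenced.
        months_ago = rng.uniform(0, window_months)
        # Assumed sentence distribution: exponential, mean 24 months.
        sentence = rng.expovariate(1 / 24)
        all_sentences.append(sentence)
        # Counted in the snapshot only if the term was long enough for
        # federal prison and has not yet run out.
        if sentence > min_prison_months and sentence > months_ago:
            snapshot.append(sentence)
    return mean(all_sentences), mean(snapshot)


if __name__ == "__main__":
    everyone, in_prison_now = simulate()
    print(f"mean sentence, everyone sentenced: {everyone:.1f} months")
    print(f"mean sentence, prison snapshot:    {in_prison_now:.1f} months")
```

    Under these assumptions the snapshot mean comes out well above the mean for everyone sentenced, even though every simulated person is drawn from the same distribution.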

    Illa makes a good point about the study not considering regional variations. Some quick googling suggests that there are known disparities in sentencing between districts. So if the men were chosen from more lenient districts, that would skew the results.

    The study also doesn’t include references to prior studies of the same subject, such as David Mustard’s 2001 study Racial, Ethnic, and Gender Disparities in Sentencing: Evidence from the U.S. Federal Courts which studied 77,236 federal offenders, explains its methodology, and controls for offense type, offense level, criminal history, and district. It doesn’t present statistics for white collar crime separately, but finds in general that women receive sentences about 12% shorter than men (p=0.01), and that women are more likely to receive no prison term (p=0.01).

  3. DerekT

    The Economist is running a section about problems with research just now, there’s some interesting reading there.

  4. Rita Handrich

    I am in the midst of writing a report and so a little behind on this one! There are multiple questions as to why in the world we would publicize this non-random and very small sample study initiated by women in jail for white collar crimes.

    We wrote the post on this study because it is the precursor to a national study that is being run to see if this one (which they explain repeatedly is NOT random) is reflective of national trends. If it is not, it is not and we will report that as well. If it is, it is a real issue and needs to be addressed.

    If CultureQuantix was NOT doing a national follow-up, we would not have written about such a small sample study based on the sentences of women in a single facility and using a male comparison group that was hand selected. Regardless, it is an issue that needs to be addressed and we are pleased someone is doing the work to see if the statistical analysis supports the anecdotal evidence.


    1. SHG Post author

      A curious explanation, Rita. If that’s the case, wouldn’t it have been a good idea to state all of this in the post? Yet, this study was promoted without any questions raised.

  5. Doug Keene

    I’m glad to see the care that our readers apply not only to our blog but to the source material we link to. Ultimately, we describe striking research that’s in the public domain, and let people come to their own conclusions. We often share our views and impressions, but usually resist passing judgment on it.

    In this case, I provided a citation to the study, as well as the larger study that is under development, and made it clear that the project was being conducted by the women inmates themselves, not a disinterested academic group. I further noted “Recently, we wrote up a study on corporate fraud in the U.S. and looked at the roles of women and men in white-collar crime. The findings of that study make us especially concerned as to whether this small-scale report really does mirror the overall dynamic of the sentencing practices by race and gender for federal white-collar crime… We look forward to the results of the national study now being completed by CultureQuantiX and examining sentences by gender and race.” We observed the limitations of the study, and its inconsistency with other research we have reported.

    I am happy to note that we have a lot of very smart and well-educated readers. I see our blog as an information portal, not a surrogate brain, and it’s good to see the brains of others being engaged by what we identify as noteworthy (even when the research is flawed). We pointed out the limitations without getting too far into the weeds on statistics and research design, but as someone who conducts research for a living and has taught statistics and research design, I thoroughly understand the importance of Mark Draughn’s comments. I am not endorsing any of the research on our blog, I am sharing it, often with caveats as I did here. And while I point out obvious issues, I am not performing peer review, but rather offering those interested an opportunity to learn more if they are inclined. Everyone should feel free to examine the study more closely, disagree with it, reject it, or remain curious, as they wish. As Rita noted, we will be tracking the findings of the larger CultureQuantix study, and will keep our readers up to date on what it has to say.

      1. Doug Keene

        Scott, I don’t know what bullshit you’re referring to, or what I’m supposedly trying that I’m not supposed to do on your blog. I think your criticisms are off base, but I’ll leave it to others to reach their own conclusions. I stand by what I wrote, whether you think it’s bs or not.

    1. Sgt. Schultz Post author

      Let me try, Doug:

      I’m glad to see the care that our readers apply not only to our blog but to the source material we link to. Ultimately, we describe striking research that’s in the public domain, and let people come to their own conclusions. We often share our views and impressions, but usually resist passing judgment on it.

      Your readers? We’re not your readers. We’re SHG’s readers. If it wasn’t for his posting it here, no one would have read your post. Bullshit. “Let people come to their own conclusions” and “resist passing judgment on it”? Did you read your own headline? You endorsed it. Bullshit.

      Since we didn’t get past your first paragraph, it’s not worth wading through the rest. Hope this makes it clearer. Don’t bullshit.

      [Ed. Note: Moved to correct reply.]

      1. SHG Post author

        Do you really think it required explanation? He may have been referring to Stephan Illa, though then he should have written “reader” or perhaps named him, but that too wouldn’t have been particularly persuasive since the only reason Stephan read it and checked the source material was because the headline was so outrageous and contrary to experience. Still, it’s always nice to see someone try to spin a screw-up into a feigned win.
