Halkides: Compositional Bullet Lead Analysis, Minute Numbers with Infinitesimal Meaning

Ed. Note: Chris Halkides has been kind enough to try to make us lawyers smarter by dumbing down science enough that we have a small chance of understanding how it’s being used to wrongfully convict and, in some cases, execute defendants. Chris graduated from the University of Wisconsin-Madison with a Ph.D. in biochemistry, and teaches biochemistry, organic chemistry, and forensic chemistry at the University of North Carolina, Wilmington.

At James Otto Earhart’s trial for the murder of Kandy Kirtland, compositional (comparative) bullet lead analysis (CBLA) was used to conclude that the bullets seized from his home and car were “analytically indistinguishable” from one found with her body. FBI agent John Riley testified that he could determine whether or not bullets are from the same box of ammunition.

With respect to all of the 0.22 caliber bullets manufactured in one year, he testified the probability that two 0.22 caliber bullets came from the same batch is approximately 0.000025%, give or take a zero.  Upon cross examination, he conceded that he had not taken into account the fact that there were three kinds of 0.22 caliber bullets. Despite having little other probative evidence, the jury convicted Mr. Earhart of Kandy’s murder and he was sentenced to death. Even during Mr. Earhart’s appeals, it would have been difficult to find an expert to question CBLA. Justice, however, was not served.

CBLA can be divided into three phases: analysis, grouping, and inference. The analysis phase uses neutron activation analysis (NAA) or inductively coupled plasma atomic emission spectroscopy (ICPAES) to determine the amounts of up to seven elements found in trace amounts in lead bullets. If the seven elements were found in the same concentrations within measurement error, then the reference and questioned samples were declared analytically indistinguishable. Both NAA and ICPAES are used in various branches of chemistry and are noncontroversial.

The grouping phase, in which bullets were declared analytically indistinguishable or distinguishable, was more questionable but could have been fixed. The FBI used three different statistical procedures: a two-standard-deviation test, a range test, and data chaining, the first and third of which probably overstated the strength of the evidence. Data chaining, at the very least, misleads the unwary, but it also increases the chances of false matching. Suppose that bullet A matched bullet B; bullet B matched bullet C; and bullet C did not match bullet A. Yet bullet C could be said to match the group made up of A and B and thus match A. The National Research Council recommended dropping these methods in favor of the successive t-test approach or Hotelling’s T² test statistic.
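The A, B, C example can be made concrete with a toy sketch. It is not the FBI's actual procedure: it assumes a single element, a fixed matching tolerance, and invented concentration values, and the function names `matches` and `chain` are hypothetical.

```python
def matches(x, y, tol):
    """Pairwise test: two measurements 'match' if they differ by at most tol."""
    return abs(x - y) <= tol


def chain(samples, tol):
    """Toy data chaining: a bullet joins a group if it matches ANY member of it.

    Bullets are processed in order of concentration, so a group can keep
    growing even though its extremes no longer match each other.
    """
    groups = []
    for name, value in sorted(samples.items(), key=lambda kv: kv[1]):
        for group in groups:
            if any(matches(value, samples[member], tol) for member in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical concentrations of one element (arbitrary units) and tolerance.
samples = {"A": 100, "B": 108, "C": 116}
TOL = 10

print(matches(samples["A"], samples["B"], TOL))  # True
print(matches(samples["B"], samples["C"], TOL))  # True
print(matches(samples["A"], samples["C"], TOL))  # False
print(chain(samples, TOL))                       # [['A', 'B', 'C']]
```

A and C fail the pairwise test, yet chaining places them in the same group because each matches B. The more bullets sit between two dissimilar ones, the further a chained group can stretch.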

However, what prevents CBLA from being a salvageable forensic technique are the four assumptions that underlie the inference phase:

(1) The forensic specimen (about 50 milligrams) is a representative sample.
(2) The hundred-ton batches of lead are homogeneous.
(3) No two molten sources are ever produced with the same composition.
(4) Bullets are distributed in retail markets in such a way that bullets of one chemical composition are unlikely to be common in any one geographical area.

All four assumptions are either dubious or wrong.

Lead is heterogeneous at the microscopic level, suggesting that (1) might not be correct. When the beginning, middle, and end of one pour were examined, the amount of copper was found to vary by 142%, indicating that (2) is incorrect. One problem with (3) is that tin is not generally present in bullet lead, and cadmium almost never is. The range of silver concentrations is modest, and the range for bismuth is even smaller. Therefore, most of the discrimination depends on just three elements: antimony, arsenic, and copper. Moreover, the concentrations of the seven elements are not independent, meaning that there are some statistical correlations. The assumption of independence probably increases the possibility of a false match. There is not enough data to be certain about (4).

Empirical testing also challenged the basis of CBLA.  According to the California Innocence Project (https://californiainnocenceproject.org/issues-we-face/firearms-analysis/lead-bullet-analysis/), an internal FBI study showed that two batches manufactured months apart had the same composition and that bullets in the same box often had different compositions. The National Research Council noted that “the FBI’s own research has shown that a single box of ammunition can contain bullets from as many as 14 distinct compositional groups.”

And the problems did not end there. The FBI was not forthcoming with people who wished to study its database. Moreover, it may have deleted some of its own data. The language the agents used gave widely differing impressions of the value of the evidence. It ranged from “Could have come from the same box” to “Must have come from the same box or from another box that would have been made by the same company on the same day.” The National Research Council concluded that “The available data do not support any statement that a crime bullet came from a particular box of ammunition. In particular, references to ‘boxes’ of ammunition in any form should be avoided as misleading under Federal Rule of Evidence 403.”

CBLA evidence was generated in 2500 cases before the FBI discontinued this type of analysis in 2005. About 20% of those cases went to trial, and the evidence presumably factored into the thinking of some of the defendants who took plea bargains. After the NRC report and subsequent media investigations, the FBI sent pro forma letters to prosecution and defense organizations. Paul Giannelli was unimpressed: “Yet the letters neither highlighted the problem, nor its significance, and therefore were grossly inadequate means of communication.”

Wendy J Koen concluded that Agent Riley’s estimate of 0.000025% probability of two 0.22 caliber bullets coming from the same batch was “incorrect and not based on reality.”

Mr. Earhart was executed before several state courts stopped accepting CBLA as evidence. Was this a miscarriage of justice? In addition to Ms. Koen, another scholar of wrongful convictions, Tucker Carrington, indicated that Mr. Earhart might have been guilty or innocent and that we will never be certain.

For further reading

Giannelli P, “Comparative Bullet Lead Analysis: A Retrospective,” 2010 Criminal Law Bulletin 47(2) 306-315.

Koen W and Houck MM, “Compositional Bullet Lead Analysis,” in Forensic Science Reform: Protecting the Innocent, ed. Koen W, Bowers CM. 2017 Academic Press. ISBN 978-0-12-802719-6.

Tobin WA et al., “Absence of Statistical and Scientific Ethos: The Common Denominator in Deficient Forensic Practices” 2017 Statistics and Public Policy, 4:1, 1-11.

National Research Council, “Forensic Analysis: Weighing Bullet Lead Evidence,” 2004. ISBN 0-309-52756-2.

24 thoughts on “Halkides: Compositional Bullet Lead Analysis, Minute Numbers with Infinitesimal Meaning”

  1. Richard Kopf

    Dr. Halkides,

    Thank you for your post.

    It makes a very important point in a clear way. So clear was your post, that even a judge was able to understand it when that same judge once believed molecules with unpaired valence electrons were terribly lonely.

    All the best.


    1. SHG Post author

      Chris is smart enough to explain this in a way that even I can understand. It must have been painful for him to dumb it that far down, but I deeply appreciate it.

      Unpaired valence electrons aren’t lonely?

      1. Chris Halkides

        A chemical species with an odd number of electrons (a radical) is lonely in the sense that it is likely to find another radical and form a new bond.

        1. SHG Post author

          A full explanation of current politics in a sentence. You’re a miracle worker! And thanks again, Chris, for this great post.

          1. SHG Post author

            Damn, I love the McGarrigle sisters. They did a song called “Sex in the Morning,” and I wish I could find it.

            Sex in the morning
            Sex in the morning
            Green stuff in my eyes

  2. Beth Clarkson

    Another problem with this analysis is that statistical tests tell us when two things are different. The null hypothesis assumes they are the same. The computational method used to get to the 0.000025% number used p-values (the probability of incorrectly concluding they are different) while what they should have used to make that computation was the 1 minus the power of the test – i.e. the probability of incorrectly concluding they were the same. This would have shown that they can’t conclude the same box with certainty.
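The distinction between the p-value and one minus the power can be sketched numerically. This is an illustration only, not the test the FBI used: it assumes a simple two-sided z-test on one element, and the function name `false_same_rate` and the chosen numbers are hypothetical.

```python
from statistics import NormalDist


def false_same_rate(true_diff, std_err, alpha=0.05):
    """Probability that a two-sided z-test fails to reject 'same source'
    when the two sources really differ by true_diff (i.e. 1 - power)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)      # rejection threshold, about 1.96
    shift = true_diff / std_err            # true difference in standard errors
    power = (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)
    return 1 - power


# Assumed scenario: the true difference equals one standard error.
rate = false_same_rate(true_diff=1.0, std_err=1.0)
print(f"{rate:.2f}")  # 0.83
```

Under these assumptions the test has a 5% chance of wrongly calling identical sources different, but roughly an 83% chance of calling genuinely different sources "indistinguishable." The second number is the one that bears on a claimed match.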

    1. rojas

      The NRC study touched on that concern but the deficiencies were beyond the scope of a single statistical silver bullet.

      FBI protocol and the fallacies that some of their experts employed when testifying boil down to: “We have this novel theory so we weighed a bunch of ducks.”
      What things are “analytically indistinguishable” from a duck?

        1. rojas

          On a lighter note:

          “He who is accused of sorcery should never be acquitted, unless the malice of the prosecutor be clearer than the sun; for it is so difficult to bring full proof of this secret crime, that out of a million witches not one would be convicted if the usual course were followed!”

  3. JD

    “At James Otto Earhart’s trial for the murder of Kandy Kirtland, compositional (comparative) bullet lead analysis (CBLA) was used to conclude that the bullets seized from his home and car were “analytically indistinguishable” from one found with her body. ”

    Consistency is one of the weakest methods of proving an affirmative fact. Inconsistency can help exclude, but consistency alone is grossly insufficient to prove.

    The bullets seized at home were analytically indistinguishable from the one found with her body. They were also indistinguishable from countless other bullets that were not found with her body.

    Consistency alone tells us nearly nothing. Being a passenger on an airplane on 9-11 is consistent with being a terrorist. But not flying that day is inconsistent, and can be used to disprove someone was a terrorist that day.

    Anytime consistency is used to prove a point, the alarm should go off.

    1. B. McLeod

      Any time a field of “science” is created specifically to prove a range of contentions in litigation, courts should be wary of it.

  4. DaveL

    I’m not an expert in analytical chemistry, but I can do long division. At about 3g per .22LR bullet, a hundred-ton pour would produce some 33 million bullets, some 33 thousand “bricks”. I have a hard time believing that bulk orders to retailers aren’t going to include multiple bricks from the same pour.

    1. Rojas

      From the NRC study Chris cites:

      Finding: Variations among and within lead bullet manufacturers make any modeling of the general manufacturing process unreliable and potentially misleading in CABL comparisons.

      Finding: The committee’s review of the literature and discussions with manufacturers indicates that the size of a CIVL ranges from 70 lbs in a billet to 200,000 lbs in a melt. That is equivalent to 12,000 to 35 million 40-grain, .22 caliber longrifle bullets from a CIVL compared with a total of 9 billion bullets produced each year.

      1. DaveL

        35 million out of 9 billion. That’s kind of like saying “the perp was driving a Chevy Traverse, you drive a Chevy Traverse, we consider that model to be sufficiently rare that it’s statistically implausible that anybody else could be the perpetrator.”
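The NRC figures quoted above, and the 35-million-of-9-billion comparison, can be checked with a few lines of arithmetic. The only number not taken from the thread is the standard conversion of 7,000 grains per avoirdupois pound; the function name `bullets_from_lead` is hypothetical.

```python
GRAINS_PER_POUND = 7000        # avoirdupois pound (standard conversion)
BULLET_GRAINS = 40             # 40-grain .22 bullet, per the NRC finding
ANNUAL_BULLETS = 9_000_000_000 # yearly production, per the NRC finding


def bullets_from_lead(pounds, bullet_grains=BULLET_GRAINS):
    """Whole bullets obtainable from a given weight of lead, ignoring waste."""
    return pounds * GRAINS_PER_POUND // bullet_grains


billet = bullets_from_lead(70)       # smallest CIVL cited
melt = bullets_from_lead(200_000)    # largest CIVL cited
share = melt / ANNUAL_BULLETS        # fraction of a year's output

print(billet)            # 12250
print(melt)              # 35000000
print(f"{share:.2%}")    # 0.39%
```

So a single large melt can account for about 1 in every 257 bullets made that year, before any allowance for repeated compositions or regional distribution.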

    2. Chris Halkides

      According to Max Houck there was a high degree of geographic concentration of bullet packing codes. In one Alaskan outlet, 87 to 100% of the bullets were calculated to have the same packing codes, indicating that they had been made at about the same time. Assuming that they came from the same compositionally indistinguishable volume of lead, the value of a match would have to be adjusted.

      1. rojas

        But the manufacture of ammunition is not a one piece flow starting at the smelter.
        A lot more information is required to obtain the relationship of the packing codes or manufacturing date to the lead source. As you noted originally, according to the FBI’s own metrics they found one box with 14 different CIVL.

        The initial processing of the lead is typically done in a batch process fed from a single ingot. And if that’s all there was to it there could be a high degree of correlation to the manufacturing date. But even a single line process will typically feed into a bin or silo in which product is mixed. Of course there can be several of these lines operating in parallel feeding the same bin from several different ingots. To what degree is first in first out used? Lead does not have a shelf life. If there is not a significant economic or quality advantage very little attention to lot mixing is probable.

        The NRC group looked into this aspect at least in a cursory way. They concluded that variability in the process was too high to make assumptions about lot codes.

  5. Steve White

    I think there is a typo in the article, the part about the probability the crime scene bullet and the bullets taken as evidence were from the same box, instead of saying .00000025% they were from the same box, must be meant to say they were NOT from the same box. The number given is vanishingly small, and the FBI guy was claiming it was a certainty the bullets came from the same box, right?

    1. Chris Halkides

      Steve, I used the quote in footnote 26 of Professor Giannelli’s article. My interpretation is that Agent Riley claimed that the two bullets were indistinguishable and therefore most likely came from the same box. The figure of 0.000025% may mean the chances of a random bullet’s matching the composition of the bullet from the crime. I presume that Agent Riley calculated the total number of 0.22 caliber bullets manufactured over a certain period. By failing to take into account the existence of three different kinds of 0.22 bullets, his total number of bullets was inflated, a point that was made in cross-examination. Obviously, that was the least of the problems with this number.

      1. Steve White

        Thanks for the reply and thanks for debunking junk forensic science. I followed many of your postings about the Amanda Knox case.
