Larry sends us the following rumination.
When the Landis case is discussed, one popular topic is the applicable margin of error for the lab's testing that led to Landis' suspension for doping. Thanks to the CAS proceeding and the documents released by the Landis team, we have new information regarding this margin of error. In this article, I will consider this new information to see if it affects our analysis of the work performed by the lab in the Landis case.
Let's start with the basics. In the Landis case, the French lab (LNDD) reported an "adverse analytical finding" (AAF) that Landis doped with exogenous (artificial) testosterone. The AAF was based on the lab's
performance of a "carbon isotope ratio" (CIR) test. The CIR test measures the "isotopic value" of two substances in an athlete's urine, and subtracts one value from the other. The result of the subtraction is called the "delta - delta" value. (This result is expressed in parts per thousand, or "0/00", but to keep things simple, I've dropped the "0/00" from our discussion.) If the delta - delta is less than -3.0, then the athlete is
presumed to have doped with exogenous testosterone.
Sounds complicated? Then let's make it simple. To determine if Landis doped, the lab measured "A" and "B", then subtracted B from A. In other words, A - B = C. If C was less than -3.0, then Landis failed the test. In
one of the measurements performed by LNDD on a Landis sample, C WAS less than -3.0.
How does margin of error come into this discussion? Well, every measurement has a margin of error -- no measurement is perfect. The key questions are: how large is the margin of error for a given measurement, and is the measurement accurate enough to be "fit for purpose"? If you measure the size of your foot and you're off by 1%, that's probably "fit for purpose" -- you'll probably end up with a shoe that fits. If you work for NASA and your measurements are off by 1%, that's not "fit for purpose" -- your rockets are not going to go where you want them to go.
Correct determination of a lab's margin of error is key to the lab's operations. I discussed measures of lab accuracy in my "Curb Your Anticipation" series, and if you'd like a fuller picture of how the lab rules address margin of error, I'd point you in particular to parts 7 and 8 of this series.
Let's get more specific. LNDD has indicated that its margin of error for its CIR testing is ± 0.8. In other words, the result of its delta - delta calculation might be off by as much as 0.8 in either direction. So, if LNDD measures a delta - delta of 2, then the true result might be as great as 2.8, or as little as 1.2.
One measure of the delta - delta for Landis was -6.14. If we apply the stated margin of error to this measurement, we end up with a delta - delta that can be no greater than -5.34. Remember, any measure of delta - delta less than -3.0 is supposed to be a violation. -5.34 is smaller than -3.0, so on this measurement Landis flunked the delta - delta test even taking the 0.8 margin of error into account.
(For those not so mathematically inclined, please understand that the isotopic values for compounds in CIR tests are typically negative numbers. A negative number is a number less than zero. With negative numbers, our sense for which numbers are larger and which are smaller can get confused. A negative number like -5 is "less negative" than the number -6, so -5 is larger than -6. This is confusing for those of us who don't deal with negative numbers in our day-to-day lives!)
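For readers who want to check the arithmetic, the application of the stated ± 0.8 margin to the -6.14 measurement can be sketched in a few lines (a hypothetical illustration; the variable names are mine, not the lab's):

```python
# Apply LNDD's stated margin of error (+/- 0.8) to the measured
# delta-delta of -6.14. The end of the error range favoring the
# athlete is the "least negative" end of the interval.
STATED_MARGIN = 0.8   # LNDD's stated margin of error
THRESHOLD = -3.0      # a delta-delta below this presumes doping

measured = -6.14
best_case = measured + STATED_MARGIN   # end of range favoring Landis

print(round(best_case, 2))    # -5.34
print(best_case < THRESHOLD)  # True: still an AAF with the margin applied
```

Even at the most favorable end of the stated error range, the result stays below the -3.0 threshold.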
Of course, if the correct margin of error at LNDD was a lot higher than 0.8, then LNDD might not have been able to prove that Landis doped.
We've long suspected that the LNDD's stated margin of error is too small. Our suspicions are based on measurements reported by LNDD that vary by a lot more than ± 0.8. For example, if you look at the latest version of the Arnie Baker wiki defense, at p. 203, you find a chart (figure 141) showing a peak with a measured "isotopic value" of -31.64. The measurement here is for the isotopic value of a substance called 5a Androstanol AC (5aA AC), which is added by LNDD to all urine samples. 5aA AC is a reference material that LNDD buys from a lab supply store, and we know that it has an "isotopic value" of -30.46. So, in this particular case we know that LNDD's measurement of this single isotopic value was off by 1.18. And remember, the delta - delta measurement requires the lab to measure the isotopic values for TWO substances. If the measure of the isotopic value of both of these substances is off by 1.18, then we could conclude that the correct margin of error at LNDD might be ± 2.36, not ± 0.8.
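The 5aA AC arithmetic above is simple enough to verify directly (a sketch using the figures quoted from the Wiki Defense):

```python
# Difference between LNDD's measured isotopic value for the 5aA AC
# reference material (-31.64) and its known value (-30.46).
measured = -31.64
known = -30.46
single_error = abs(measured - known)
print(round(single_error, 2))       # 1.18

# The delta-delta requires TWO such measurements; if both were off
# by this much in opposite directions, the errors could add.
print(round(2 * single_error, 2))   # 2.36
```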
Let's summarize. The determination of the LNDD's margin of error for CIR testing is CRITICAL to the determination of whether the lab correctly found an AAF in the Landis case. The lab's stated margin of error is ± 0.8. Some of us think that the real margin of error at LNDD is a lot higher.
With the release of the CAS decision and a host of new documents here on TBV, we now have a critical new piece of information relating to the LNDD margin of error.
It turns out that the LNDD's accreditation to perform the delta - delta test did NOT initially indicate a margin of error of ± 0.8. It indicated a margin of error of 20%. See The Wiki Defense p. 87.
Before we look at what a 20% margin of error might mean in the Landis case, let's take a look at how USADA reacted to the revelation that the LNDD accreditation documents referred to a 20% margin of error. USADA argued that (1) the reference to a 20% margin of error was ITSELF an error, (2) this error was corrected in the accreditation documents -- the correction being made AFTER the analysis of the Landis samples, but in a retroactive manner effective PRIOR to the analysis of the Landis samples, and (3) even if the lab actually had to apply a 20% margin of error to its CIR measurements, the Landis samples would STILL have had a delta - delta of less than -3.0 and would have flunked the CIR test.
It is CRITICAL that we understand USADA's position here. USADA has stated that the reference to a 20% error rate was a mistake, and that this mistake was timely corrected. In USADA's post-hearing brief, p. 8, USADA states that the applicable margin of error was ±0.8, not 20%. In his opening statement for USADA (pp. 183, 187), Richard Young stated that the 20% error rate initially reported for the CIR test was in reality the margin of error for a DIFFERENT test, and that the stated margin of error for the CIR test was corrected in December of 2006 effective as of May 1, 2006.
But before we address whether the reference to a 20% margin of error was a mistake, we need to first try to understand how a 20% margin of error would work. This requires us to ask the question: 20% of WHAT? Remember, a delta - delta calculation is, ultimately, a subtraction problem: isotopic value A minus isotopic value B equals delta - delta C, or just A - B = C. If the margin of error is 20%, is that margin of error applicable to C, or to A and B?
If you play around with the numbers a bit, you'll find that if you apply a 20% margin of error to C, you are claiming MUCH greater accuracy than if you apply the 20% to A and B. Let's say that we're looking at a calculation of A - B = C, where A is 100 and B is 50. If we apply the 20% margin of error to C, then we're saying that C might be anywhere between 40 and 60. That's not too bad, as accuracy goes. But if you apply the 20% to A and B, then you could end up with a C that might fall anywhere between 20 and 80. That's not nearly as accurate.
I'm not a math guy, but as it turns out, the difference between these two applications of percentage error seems to become greater the closer that A gets to B. For example, let's use values for A and B that might be typical of a real-world delta - delta test, where A might be 30 and B might be 25. In that case, if you apply the 20% margin of error to C, you get a range for C between 4 and 6, or ± 1. If you apply the margin of error to A and B, you get a range for C between -6 and 16, or ± 11! Clearly, there's a big difference between applying the margin of error to A and B and applying it to C.
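The two ways of applying a percentage error can be compared side by side (a sketch of the arithmetic above; the function names are mine):

```python
def range_on_C(A, B, pct):
    """Apply the percentage error to the result C = A - B."""
    C = A - B
    err = abs(C) * pct
    return (C - err, C + err)

def range_on_A_and_B(A, B, pct):
    """Apply the percentage error to A and B separately, then take
    the widest possible spread of C = A - B."""
    errA, errB = abs(A) * pct, abs(B) * pct
    low = (A - errA) - (B + errB)    # smallest possible C
    high = (A + errA) - (B - errB)   # largest possible C
    return (low, high)

# First example: A = 100, B = 50, so C = 50
print(range_on_C(100, 50, 0.20))        # (40.0, 60.0)
print(range_on_A_and_B(100, 50, 0.20))  # (20.0, 80.0)

# More typical values: A = 30, B = 25, so C = 5
print(range_on_C(30, 25, 0.20))         # (4.0, 6.0)
print(range_on_A_and_B(30, 25, 0.20))   # (-6.0, 16.0)
```

As A and B get closer together, the spread from applying the error to A and B separately dwarfs the spread from applying it to C.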
Both sides have BRIEFLY considered how the 20% margin of error might be applied to CIR testing, and predictably, the two sides disagreed on how this would work. In the Wiki Defense, Arnie Baker applied the 20% to the calculation of A and B, with the result that all of Landis' CIR test results were normal. See Wiki Defense p. 87. USADA applied the 20% to the calculation of C, effectively increasing the applicable margin of error to about ± 1.2, but not affecting the finding against Landis. See USADA Post-Hearing Brief p. 8 footnote 7.
Assuming that the 20% margin of error was applicable, which side is right - was there an AAF, or wasn't there? It's a close question, requiring us to dive into the convoluted logic of the Landis case. The answer probably is, it's impossible to say that either side was right.
Why do I say that neither side was right? Well, without more information, a stated 20% margin of error is close to meaningless. As we've already pointed out, a 20% margin of error can be applied in different ways to imply different levels of accuracy. In addition, a 20% margin of error produces different ABSOLUTE ranges of error, depending on the size of what we want to measure. If we apply our 20% figure to a delta - delta of 10, we get a margin of error of ± 2. If we apply it to a delta - delta of 2, we get a much smaller margin of error, ± 0.4. And for a delta - delta of 0, our 20% rate would indicate that the calculation is exactly on the nose, with NO error (20% of zero is still zero). That can't be right.
What makes sense is to apply the 20% margin of error to the kinds of delta - delta calculations that we would expect the lab to encounter in the real world testing of athletes. In particular, we care most about the lab's margin of error when the lab is measuring a sample with a delta-delta close to -3.0 (which, as we noted above, is the standard for determining whether an athlete doped with artificial testosterone). If a lab measures a delta - delta of +30, or -30, we're not so concerned about margin of error. But if the delta - delta computes to something close to -3.0, say -3.1, then we're VERY concerned about margin of error. It makes sense to measure a test's margin of error for results that are CLOSE to the level required to prove an AAF.
(An aside for the more scientifically inclined: USADA also appears to have claimed that the LNDD's margin of error was better for peaks with longer retention times, and that the ± 0.8 margin of error was specifically applicable to peaks with retention times roughly corresponding to the peaks being measured for the CIR tests. This is a topic for another day's exploration.)
So, perhaps the original lab accreditation meant to say that, when the lab is measuring a delta - delta close to -3.0, then the lab's margin of error is 20%. THAT would make more sense, and would largely eliminate the concern we expressed about widely different delta - delta calculations having widely different absolute margins of error. Of course, the LNDD CIR accreditation did not SAY this - it did not say "20% margin of error at a delta - delta near -3.0", it just said "20%". But let's try this out as an assumption. What would happen if we tried to work out a margin of error of 20% near a delta - delta calculation of -3.0?
Well, let's do some math. 20% of 3.0 is 0.6. You might then assume that a 20% margin of error should translate into an error rate of ± 0.6. But the math doesn't exactly work out that way. If the lab measures a delta - delta of -3.6 with a 20% rate of error, then the TRUE delta - delta might fall anywhere in the 20% range above or below -3.6. This 20% range (calculated with a round-off favoring the athlete) is from -2.8 to -4.4 - and remember, -2.8 is too large a number to allow the lab to find an AAF. A delta - delta of -3.7 has the same problem - the high end of the 20% range is too high for an AAF. But a delta - delta reading of -3.8 allows for a finding of an AAF even with a 20% range - in fact, it's the highest possible delta - delta (expressed to the nearest tenth of a delta - delta unit, with rounding off favoring the athlete) that could result in an AAF given a 20% margin of error. Compare this -3.8 figure to the -3.0 figure required under the CIR rules, and you get an absolute margin of error of ± 0.8. The exact same margin of error used by LNDD in the Landis case.
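The search for that -3.8 figure can be sketched as a short loop (an illustration of the reasoning above, under my assumption that the 20% is applied to the measured delta - delta and the athlete gets the benefit of the favorable end of the range):

```python
THRESHOLD = -3.0   # a delta-delta below this presumes doping

def still_an_aaf(measured, pct=0.20):
    # The end of the 20% range favoring the athlete is the least
    # negative value the true delta-delta might take.
    best_case = measured + abs(measured) * pct
    return best_case < THRESHOLD

# Walk down from -3.0 in steps of 0.1 until the AAF sticks
value = -3.0
while not still_an_aaf(value):
    value = round(value - 0.1, 1)

print(value)  # -3.8: compare to the -3.0 threshold and you get +/- 0.8
```

At -3.7 the favorable end of the range (-2.96) is still above -3.0, so no AAF; at -3.8 the favorable end (-3.04) finally stays below the threshold.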
In other words, if we assume that the accreditation for the LNDD CIR method should be understood as specifying a 20% margin of error near a delta - delta calculation of -3.0, then the inclusion of the 20% error rate was NOT a mistake. With this assumption, the 20% margin of error was DEAD ON accurate. Or more precisely, given our assumption, a 20% margin of error is the SAME THING as an error rate of ± 0.8. And most importantly, the 20% margin of error would change nothing in the Landis case: the critical delta - delta calculation used to find the Landis AAF would not be affected one iota by applying a 20% margin of error - again assuming the correctness of our assumption on how the 20% margin should be understood.
Is that the end of our discussion? Not exactly! Yes, *I* think that the 20% margin of error can be most accurately translated into an absolute margin of error of ± 0.8. But there's a powerful and authoritative witness in this case who ABSOLUTELY disagrees with me! There's a witness in this case who has testified that a 20% margin of error is NOT applicable to the CIR test at LNDD and does NOT equate to ± 0.8. And while all of the other witnesses for Landis were largely ignored by the two arbitration panels, the witness on this point CANNOT be ignored by USADA or the arbitrators.
On this point, the witness in favor of Landis is USADA itself.
Go back to our earlier discussion. In this case, USADA has argued that the 20% margin of error was a MISTAKE. Not just a mistake, but effectively a LARGE mistake, a mistake that required retroactive correction. USADA argued that the correct margin of error was not 20%, it was ± 0.8. Apply some simple logic: if 20% is a mistake, and ± 0.8 is not a mistake, then 20% does not equal ± 0.8. Ergo, I must have been WRONG to conclude that a 20% margin of error is best understood as a margin of error of ± 0.8.
OK, I can hear some of you raising objections. Sure, USADA claimed that the 20% margin of error was a mistake, but that doesn't necessarily mean that 20% does not equate to ± 0.8. Perhaps when USADA argued that 20% was a mistake, it meant that the accreditation body should have explicitly stated the assumption that the 20% was applicable to delta - delta calculations of around -3.0. Or perhaps USADA meant that 20% was a mistake in that it was a confusing way to express the margin of error, and that a stated margin of error of ± 0.8 would be easier to understand. Perhaps USADA was REALLY arguing that the 20% margin of error was essentially correct, but was EXPRESSED in a mistaken way.
But that's not what USADA argued.
As we pointed out above, USADA argued that the 20% margin of error was a mistake caused by the accreditation body confusing margins of error for DIFFERENT tests. According to USADA, the accreditation body took the margin of error for the "T/E" test, and accidentally reported it as the margin of error for the CIR test. In other words, the reported 20% margin of error had NOTHING TO DO with CIR testing. It's not that the 20% margin of error should have been expressed differently - USADA's position is that the CIR test did not have a 20% margin of error in any way we could imagine.
And just to make this point crystal-clear, USADA provided us with an example of how a 20% margin of error would have been applied to CIR testing, if the 20% margin of error HAD been applicable:
"Even using 20% uncertainty, Appellant's sample would still have been declared positive (6.14 delta-delta units ± 20%=4.91 delta-delta units.)"
In other words, USADA argued that a 20% uncertainty did not translate into a margin of error of ± 0.8. It argued that a 20% uncertainty would translate into a 50% larger margin of error -- roughly ± 1.2. And for whatever it's worth, it appears that the CAS ruling agrees with USADA on this point. See CAS Opinion paragraph 48.
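USADA's arithmetic in that footnote is easy to reproduce (a sketch using the figures quoted above, with the delta - delta taken in absolute terms as USADA did):

```python
# USADA's application of a 20% uncertainty to the Landis delta-delta
measured = 6.14
uncertainty = measured * 0.20

print(round(uncertainty, 2))             # 1.23 -- the "roughly +/- 1.2" margin
print(round(measured - uncertainty, 2))  # 4.91 -- USADA's figure
```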
So, given that USADA categorically disagrees with my interpretation that a 20% margin of error translates into a margin of error of ± 0.8, I'm forced to consider how else we might understand a 20% margin of error.
Can I adopt the USADA interpretation quoted above, which effectively translates the 20% margin of error into an absolute margin of ± 1.2 delta - delta units? No. The USADA method will produce a different absolute margin of error depending on the size of the delta - delta measurement: there will be a large margin of error for a large calculation of delta - delta units, and no margin of error whatsoever where the delta - delta equals zero. We rejected this approach before as making no sense. The only approach that makes sense is to apply the margin of error to calculations that we expect to see in the real world; in particular, those calculations close to the -3.0 delta - delta used to determine an AAF.
Can we apply the 20% to delta - delta calculations close to -3.0? No, that's the approach we discussed in detail above, the one we tried to use but that USADA effectively rejected.
Let's go back to our explanation of the delta - delta calculation: the calculation boils down to A - B = C. We can't seem to find a way to apply the 20% error rate directly to C. What about trying the Arnie Baker approach, and applying the margin of error to the isotopic values A and B? Based on what we know, I'll pick a set of isotopic values that could produce a delta - delta value of -3.0: I'll pick a value for A of -28 and a value for B of -25. If we apply the 20% margin of error to each of these values, we get an absolute margin of error of ± 10.6.
Note that we'll get a slightly different absolute margin of error if we apply the 20% figure to different possible real-world values of A and B. For example, we noted above that where A is -30 and B is -25, application of the 20% figure will give us an absolute margin of error of ± 11. I won't try to go through any more of these calculations. For the moment, let's say that applying a 20% error rate to real world values of A and B produces an absolute margin of error of about ± 10.
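The ± 10.6 figure can be checked with the same widest-spread reasoning as before (a sketch, using my illustrative isotopic values of -28 and -25):

```python
def delta_delta_range(A, B, pct):
    """Apply the percentage error to the isotopic values A and B
    separately, then take the widest possible spread of C = A - B."""
    errA, errB = abs(A) * pct, abs(B) * pct
    low = (A - errA) - (B + errB)    # most negative possible C
    high = (A + errA) - (B - errB)   # least negative possible C
    return (low, high)

# Isotopic values chosen to produce a delta-delta of -3.0
low, high = delta_delta_range(-28.0, -25.0, 0.20)
print(round(low, 1), round(high, 1))  # -13.6 7.6
print(round((high - low) / 2, 1))     # 10.6 -- the +/- 10.6 absolute margin
```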
There are three things we can say about a margin of error of ± 10. First, it completely exonerates Floyd Landis. Second, it's about 12 times larger than the LNDD's reported margin of error of ± 0.8. Third, it's an absurd margin of error - not absurd from the standpoint that it could not conceivably be true, but absurd from the standpoint that no decent accreditation body could possibly approve a CIR method with a margin of error of ± 10. A method with this kind of margin of error could not be fit for any purpose.
Where does this leave us? What can we possibly conclude from all this?
I'm not sure.
Obviously, we can't prove much from a stated percentage margin of error that might mean either ± 0.8 or ± 10. We might argue that if the margin of error is ambiguous, it should in fairness be interpreted in a manner favorable to the athlete. But this is the sort of legal "technical" argument that seems to appeal to me and to no one else, not even here on TBV.
We might argue that USADA got so caught up in lying and covering up, that it became more convenient for USADA to simply cover up the 20% margin of error ("it was a mistake!") than to take the time to understand that it might well be the real margin of error at LNDD.
Or perhaps we should take USADA at face value, and assume that the stated 20% margin of error WAS a mistake. But since the ± 0.8 margin of error seems to be based on this 20% calculation, we might also conclude that the LNDD's stated margin of error of ± 0.8 is ALSO a mistake. We would then ask what the real margin of error might be at LNDD, and whether this margin of error was ever determined and accredited.
We might also ask questions about the accreditation process. The margin of error for a particular test is critical to understanding the test. If the 20% margin of error was correct all along, how was it that LNDD persuaded the accrediting body to change this margin (and to testify before the CAS that the 20% figure was a mistake)? If the 20% margin of error was itself an error, how can we explain how the accrediting body made such a critical mistake, and how is it that LNDD did not notice the mistake until the Landis team pointed it out? Can we really take the accrediting process seriously as an assurance of lab quality, as the CAS did in its decision, when both the lab and the accrediting body have acted so casually about a piece of data this critical to the overall process?
Personally, I conclude that the ADA system is so flawed that it's difficult to draw conclusions from anything that happens there. But that's my subjective conclusion, and I encourage you to reach your own conclusions.
Saturday, July 05, 2008