2. The Arbitration Decision
3. Burden of Proof
4. T/E Ratio Test
5. IRMS Test
6. Amory testimony
7. Concluding Remarks
I was invited by Bill Hue to present the case supporting the majority decision. Unfortunately, I can only go into a few of the issues raised by the parties. As TBV has observed, it appears that the key legal issue is whether the IRMS retention times violated the applicable technical standards.
I was trained as a lawyer, although I haven’t practiced for a long time. I also have training in statistics and economics, but no biological sciences training whatsoever. So understanding the science here was difficult and possibly imperfect. I read large portions of the transcript (but not all), some of the legal briefs that were available, and tried to understand the exhibits. But I certainly might have missed something.
2. The Arbitration Panel’s Finding of Doping
USADA charged Landis with using a prohibited substance, artificial testosterone, based on 2 test results from his stage 17 urine samples: 1) elevated testosterone to epitestosterone (T/E) ratios, and 2) IRMS tests showing artificial testosterone in his urine. Additionally, IRMS retests of Landis’ B samples from other stages were positive for artificial testosterone as well, even though no elevated T/E ratios had been found in the initial A samples. The IRMS test is a direct test; it can determine whether the testosterone is artificial. The T/E ratio test is indirect; elevated T/E ratios most likely result from doping, but could occur from other causes. The majority decision sustained the doping charge on the basis of the IRMS test results on stage 17, but found that the T/E test results were defective because the lab failed to “monitor” 3 ions as was required by one of the technical standards (IS). The majority found some other lab errors, including at least one IS violation, but concluded that none of these impacted the IRMS test results.
The dissent agreed that the T/E tests violated the IS, but also found that the IRMS test was defective because of testimony that the T/E ratios and certain metabolites did not behave as predicted by scientific studies, and because some labs used more stringent criteria for finding exogenous testosterone. These defects, however, did not violate any technical standard. The dissent also argued that various and cumulative lab record keeping and other errors discredited all the test results, including the IRMS test results.
3. Burden of Proof and Requirements for Finding a Doping Violation
A doping violation may be proved by any credible means including confessions or admissions, testimony, or, more relevantly here, lab testing results. The anti-doping organization (here the US Cycling Federation and USADA, or “USADA” for short) has the overall burden of proof to show a doping violation to the “comfortable satisfaction” of the arbitrators. This is more than a “balance of the probabilities” (analogous to preponderance of the evidence in US civil trials), but less than beyond a reasonable doubt as in US criminal trials.
What do these differing burdens of proof mean? Some people have claimed that they cannot feel comfortable convicting Landis unless they are 99% to 100% sure. This is clearly not realistic or appropriate. We know that scientific tests have error; they give false positives and false negatives. I’ve read that doping tests give many more false negatives than false positives. That’s why so many confessed dopers have never tested positive. In court cases, proof by a preponderance of the evidence, the standard used in civil trials, technically means only a 51% probability, although studies have shown that judges usually require something more like 55%, and jurors require even more. Proof beyond a reasonable doubt for conviction of a crime usually requires around 80% assurance by jurors, up to a high of 95% for murder. So proof to a “comfortable satisfaction” of a doping violation should probably only require 70% to 80% assurance, if that.
(See, Handbook of Jury Research)
3A. Presumption that scientific test results are valid
Where a finding of doping is based on lab tests, both the A sample and B sample must test positive. In such cases, USADA is aided by the legal presumption that the WADA labs have conducted their scientific analyses in compliance with scientific standards, as laid out in the WADA International Standard for Laboratories (ISL), that is, the scientific results are valid. The athlete may rebut this presumption by showing a departure from a WADA international technical standard (IS). The athlete need only show such a departure by a “balance of the probabilities” (preponderance of the evidence). If such a departure is shown then the burden shifts back to USADA to prove that the departure did not “cause” the doping finding. It is unclear to me whether the burden of proof on USADA in this regard is the “balance of probabilities” or a “comfortable satisfaction”, but the majority decision seems to have adopted the “balance of probabilities” standard. If USADA can’t prove that the departure did not cause the doping finding, they lose.
The key point here is that the departure from an International Standard should be causally connected to the finding of doping; otherwise the doping finding should stand. As a general proposition the violation of technical rules or sloppy lab procedures should not invalidate the test results unless causally connected to those test results. I think this is a reasonable rule. We should not let off an otherwise guilty athlete as a prod to improve unrelated sloppy lab work. The remedy for sloppy lab work is the yearly testing during the accreditation process or other administrative penalties, not letting off guilty athletes. The Landis majority decision seems to adopt this view in stating that the arbitration panel must “weigh the evidence” to determine whether an IS violation “affected” the finding of doping:
“Therefore, violations of the ISO 17025 or of WADA Technical Documents can be violations of the ISL for purposes of rebutting the initial presumption favouring the Lab that an AAF has been established. However, that of itself does not mean that the AAF does not amount to an anti-doping rule violation. The Panel must weigh the evidence to determine if the violation affected the AAF. If that is the case then the anti-doping rule violation may not have been made out at law.” (emphasis added)
If an IS violation is established it is not clear what sort of proof is required to show that the violation did not cause the finding of doping. In the Landaluce case before CAS, the majority decision found a violation of the IS mandating that the same person cannot perform the A sample test and the confirmatory B test. In that case the person performing the IRMS confirmation test had done some minor work on the A test. This was said to be due to workload and understaffing. On its face this violation would seem to have little factual causal connection with the IRMS doping finding. It has more to do with the appearance of impartiality. The majority also found that UCI failed to show that this violation did not cause the doping finding. The UCI presented no evidence to show that this violation did not cause the doping finding, but just argued that it didn’t. So it’s unclear what showing would be required. Is it sufficient to show that the B tests were performed up to standard and that there was no evidence of bias or impropriety? Or must one completely throw out the B and even the A tests because the same person worked on both tests, and prove the doping by other means? In nearly all cases this would be impossible. I would argue that the former is appropriate. The latter would let off an otherwise proven doper on a technicality and would undermine the confidence of non-doping athletes in the fairness of the system. It would be an incentive for others to dope to keep up with the dopers.
Assuming no IS violation is shown, what weight is to be given to other scientific evidence aimed at discrediting the doping finding? Landis presented evidence of other lab errors, evidence of alternative laboratory practices, alternative more stringent positivity criteria, and general scientific evidence suggesting that the test results were inconsistent with doping (the Amory testimony). The WADA rules appear to bar the use of such evidence to rebut the presumption and the majority agreed. Article 18 reads:
“Compliance with the International Standards (as opposed to other alternative standards, practice or procedure) shall be sufficient to conclude that the procedures addressed by the International Standards were performed properly.”
The majority decision stated that an IS violation “is the only relevant evidence to determine if the Athlete’s attempt to rebut the presumption of Article 18 may be successful. Proving some other procedure, practice or alternative standard is of no consequence in rebutting the presumption favouring the Lab.”
On the other hand, assuming an IS violation is shown by the athlete, is other general scientific evidence not constituting an IS violation now generally admissible and relevant for the general purpose of discrediting the test methodology and results? The rules on this don’t seem to be clear. If USADA only presents evidence to show that the IS violation was not causally connected with the doping finding (as it might have in the Landaluce case), then I would argue that the scientific evidence should be limited to that issue. If USADA tries to show the doping violation by rehabilitating the test results or by other test results, then the permitted scientific evidence should be expanded to address those issues, and this might include general scientific challenges like the Amory testimony. So whether, and for what purpose, this additional evidence may be relevant will depend on the facts of the case. This appears to be an open question, and it’s possible a panel might permit general scientific evidence to attack the overall validity of the testing results. Campbell in his dissent seems to treat this type of evidence as relevant to discrediting the IRMS results even though he found no specific IS violation with respect to those results. I think he was wrong to do so.
3B. Are the WADA procedures unfair?
The arbitration rules and procedures reflect a compromise between cost, accuracy and speedy resolution on the one hand, and an athlete’s ability to contest a finding on the other. I don’t think that line was drawn unreasonably. WADA scientists have developed scientific tests for the presence of doping compounds that take into account cost and the probability balance between false negatives and false positives. The WADA detailed technical standards reportedly resulted from the legalistic advocacy of USADA, which sought to impose uniformity. I wonder whether this is an overly legalistic approach. It’s possible that science involves too much variety and moves too fast for codified standards to give the answer in all cases.
The presumption of scientific validity is reasonable and removes the need to litigate the scientific validity of a test in every arbitration, which would be expensive and wasteful. There are limitations on the right to discovery in this arbitration, but the arbitrators seem to have the discretion to order additional discovery. This is typical of many if not most arbitrations. Again I think this is a reasonable limitation in the interest of expediting the hearing process. Discovery as a matter of right in civil court proceedings can be incredibly time consuming and costly. It is also possible that revealing too much scientific information publicly, e.g. the background testing and validation behind the tests, could aid dopers in devising ways around the test. Most employees fired for violating a drug policy don’t enjoy anywhere near the contractual due process rights that the athletes do here. I do find the rigidity of the WADA strict liability rules at times unreasonable, and I think more discretion should be given in this respect. I’ve read that the WADA code was heavily influenced by USADA’s legalistic, rule-bound, and inflexible approach to achieving uniformity.
4. The T/E Test Results
The T/E tests were thrown out because of an IS violation, i.e. three ions were not monitored in the confirmation tests as required by TD2003IDCR. I don’t really disagree with this, so the crucial issue will come down to the validity of the IRMS results discussed below. I did go through the T/E arguments and summarize them here for my own understanding. You can skip this if not interested.
The T/E ratios on the B samples were found to be 11 to 1, and those on the A sample 6 to 1. Both exceeded the 4 to 1 threshold for positivity; however, only one ion was monitored. The majority decision (and dissent) found an IS violation of TD2003IDCR because the lab did not monitor 3 ions for identification purposes in its confirmation tests of the T/E ratios. The lab monitored only 1 ion, the m/z 432 ion, in both its screening tests and confirmation tests. USADA failed to show that this violation did not result in the doping finding, and accordingly the majority threw out the T/E results.
TD2003IDCR governs the identification of compounds and reads:
“The laboratory must establish criteria for identification of a compound. Examples of acceptable criteria are:
“In some cases it may be necessary to monitor selected ions to detect the substance at the Minimum Required Performance Limits. When selected ions are monitored, at least three diagnostic ions must be acquired.”
On the other hand TD2004EAAS governs testosterone testing and defines a threshold T/E ratio as follows:
“The T/E value is given by the peak area or peak height ratio of testosterone and epitestosterone ......obtained by measuring the ion at m/z 432 by GC/MS analysis in a Single Ion Monitoring mode (SIM)..... The confirmation of the identity of any steroid reported with abnormal properties must be made (refer to technical document TD2003IDCR).”
TD2004EAAS specifies a specific ion, m/z 432, for testosterone testing, but only requires that 1 ion be tested. Since this standard refers to TD2003IDCR for confirmation tests, the majority found that 3 ions were required for those tests. Was it “necessary” to monitor multiple “selected ions” in the confirmation tests? The majority “interprets” TD2003IDCR to mean this, but it’s not apparent to me on its face that that is the only reading. TD2003IDCR states that “in some cases it may be necessary” to use multiple “selected ions”. What was the evidence that this was such a case? The majority stated that if only 1 ion was required, the technical standards should have specified this with more “precision”. In requiring greater precision for the confirmation test, the majority decision was persuaded by Dr. Goldberger’s testimony that ion m/z 432 was contained in at least 10 substances other than testosterone and that it was “normal” practice to monitor 3 ions. I didn’t read any contrary testimony. USADA apparently argued that the lab’s single ion positivity criteria were documented for ISO inspection and implicitly approved by ISO, and since the ISL incorporated ISO standards, this approval meant that the lab’s positivity criteria met ISL standards. The majority’s reading took the position that the specific (IS) controlled over the general ISO certification.
Three ions were reportedly “acquired” in the tests but only 1 was “monitored” or analyzed. I wasn’t able to find any explanation from the testimony about why only 1 ion was monitored. Did the testers fail to follow known lab procedures, or did the lab and/or testers interpret the standards to only require 1 ion? I wasn’t able to find this out, but maybe I missed something. Campbell in his dissent rhetorically implies that if the lab couldn’t get the T/E test right, this is evidence that they couldn’t get the IRMS tests right. I think this is pretty weak evidence entitled to little or no weight, since there is no causal nexus between the missing ion results and the validity of the IRMS tests. The IRMS results must be judged on their own.
5. The IRMS Test Results
The majority’s decision is based on the IRMS results showing artificial (“exogenous”) testosterone in both the A and B samples of Landis’ stage 17 urine sample. In addition, IRMS results showing exogenous testosterone in Landis’ B samples from other stages are offered as corroborating evidence. GC-IRMS testing isolates certain metabolites of testosterone, and then tests their carbon 13 to carbon 12 ratio to determine whether the source of the testosterone is artificial or natural (“endogenous”). If the measured ratio is sufficiently different, here “3 deltas”, from the ratio for natural testosterone, then a doping violation is found.
There seem to be 3 major challenges to these results: A) the metabolites were not correctly identified because retention times did not meet standards, B) the carbon ratios of those peaks were not correctly measured, C) other labs require the carbon ratios of at least 2 metabolites to exceed the 3 delta threshold.
5A. Retention times violated TD2003IDCR
Landis argued an IS violation in that the retention times (RTs) and relative retention times (RRTs) of the metabolites as measured by the GC-IRMS didn’t match closely enough to the respective RTs and RRTs measured by the GCMS as specified by the international standard TD2003IDCR, e.g. within .2 minutes or 1%. Consequently they claim that the metabolites were not adequately identified and any IRMS calculated carbon ratios were invalid.
TD2003IDCR reads in relevant part:
“The Laboratory must establish criteria for identification of a compound. Examples of acceptable criteria are:
For capillary gas chromatography, the retention time (RT) of the analyte shall not differ by more than one percent or +/- 0.2 minutes (whichever is smaller) from that of the same substance in a spiked urine sample, Reference Collection sample, or Reference Material analyzed contemporaneously. In those cases where shifts in retention can be explained, for example by sample overload, the retention time criteria may be relaxed.”
Retention time (RT) is the time it takes for a specific analyte/metabolite to travel through the gas chromatographic (GC) apparatus until it is detected. RT can be measured from various starting points including the retention time of another known substance (“internal standard”, “chromatographic reference”). Relative retention time (RRT) is the ratio of the retention time of the analyte to the retention time of some other substance such as the internal standard. Different metabolites will have different RTs and different RRTs, so in principle they can be distinguished and identified based on their RTs, although some different substances may have the same RTs, making identification more difficult.
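The quoted criterion and the RRT definition can be read as simple arithmetic. The Python sketch below is only an illustration of one reading, not the lab's or WADA's procedure: it assumes “one percent” is taken relative to the reference retention time, and the function names and sample retention times are hypothetical.

```python
def rt_tolerance_min(reference_rt_min: float) -> float:
    """Allowed deviation in minutes: 1% of the reference RT or 0.2 min, whichever is smaller."""
    return min(0.01 * reference_rt_min, 0.2)


def rt_matches(analyte_rt_min: float, reference_rt_min: float) -> bool:
    """True if the analyte RT falls within the tolerance around the reference RT."""
    return abs(analyte_rt_min - reference_rt_min) <= rt_tolerance_min(reference_rt_min)


def relative_retention_time(analyte_rt: float, internal_standard_rt: float) -> float:
    """RRT: analyte RT divided by the internal standard RT (same units for both)."""
    return analyte_rt / internal_standard_rt


# Hypothetical retention times in minutes, for illustration only.
print(rt_matches(15.1, 15.0))    # 0.1 min off; tolerance is 1% of 15 min = 0.15 min
print(rt_matches(30.25, 30.0))   # 0.25 min off; tolerance is capped at 0.2 min
print(relative_retention_time(18.0, 12.0))
```

Under this reading the 1% limb is the binding one for reference RTs under 20 minutes, and the 0.2 minute limb is the binding one above that.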
In general retention times for the same analyte performed on the same machine will vary depending on various GC factors, including the column, flow rate, column pressure, carrier gas, temperature, and dead volume. (Shimadzu website) So these factors need to be made and remain constant on the same machine for the retention times to always match within the limits. Retention times may also vary on different machines of the same type because the GC factors cannot always be made identical. Clearly retention times will also vary for different types of machines.
Landis’ expert Meier-Augenstein (“Meier”) testified that the retention times (RTs) of metabolites measured in the GCMS and the RTs of the same metabolites measured in the GC-IRMS varied by up to 8 minutes, and the RRTs varied by up to 7%. Meier appeared to measure the metabolite RTs from the 5a Androstanol, although I’m not absolutely sure about this. It’s not clear from the transcript how Meier calculated his RRTs. For example, did he adjust for the longer combustion time of the GC-IRMS? Landis argued that both the .2 minute standard and the 1% standard were violated; however, Meier did concede that as between two different machines only RRT should be used as a basis for comparison. Meier also testified that he might expect RTs for the same analyte to vary by 1 to 2 minutes between different GC machines in his lab.
Both the majority and minority decisions take the position, based on the testimony of Brenna, that the 1% standard doesn’t apply to GC identifications made on different types of machines as in this case, but only to GC identifications made on the same machine. So there was no IS violation. The majority states that the RTs and RRTs when measured on each machine individually met the .2 minute and 1% standard.
Brenna testified that the IRMS process involves additional combustion and drying processes, which add to the RTs of the analytes. He stated that accordingly neither the straight RTs nor RRTs could be compared between GCMS and GCIRMS. The majority decision characterized this additional time as “constant”. Some have argued that if that is the case, then in principle one can subtract out that constant time and thus make the RTs and RRTs comparable. However, there is little evidence on this issue and the majority decision seems to have mischaracterized it. Meier said that one must compare RRTs but said nothing about the time added by IRMS, or how such added time would affect the calculation of the RRT. He does not say whether he corrected for this additional time in calculating his RRTs. I wonder whether he wasn’t being deceptive by this omission. Brenna said that the added time would be “approximately” the same.
A. ... “Because in the GC, molecules move through the GC at a rate which is characteristic of each individual molecule, so a molecule that moves through the GC slowly will move through it at that rate, and compared to a molecule that moves through the GC more quickly. So, and then when it emerges into this region here, all the molecules move through this region at approximately the same rate. So there is what we call "differential retention" here, but not here, and that has implications for calculating retention times and also relative retention times.”
“Q. So, for example, in your laboratory, would you expect the retention times for your GC to correspond with -- your GC/MS to correspond with your GC IRMS?
A. No. And we run GC/MS every day, and GC/C-IRMS every day, and we match our peaks every day.
Q. And how about relative retention times?
A. No, for the reasons I've outlined.”
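The argument that a constant added time could simply be subtracted out can be illustrated with a short Python sketch. Every number in it is made up for illustration: nothing in the record establishes the actual added time, and Brenna described it only as approximately the same for all molecules. The sketch assumes an exactly constant offset, which is the most favorable case for the subtraction argument.

```python
def rrt(analyte_rt_s: float, reference_rt_s: float) -> float:
    """Relative retention time: analyte RT divided by reference RT."""
    return analyte_rt_s / reference_rt_s


# Hypothetical GCMS retention times in seconds.
gcms_reference = 800.0
gcms_analyte = 1200.0

# Hypothetical constant delay added by the combustion and drying stages.
offset = 30.0
irms_reference = gcms_reference + offset
irms_analyte = gcms_analyte + offset

rrt_gcms = rrt(gcms_analyte, gcms_reference)
rrt_irms_raw = rrt(irms_analyte, irms_reference)
rrt_irms_corrected = rrt(irms_analyte - offset, irms_reference - offset)

# Percent difference between the uncorrected IRMS RRT and the GCMS RRT.
raw_difference_pct = 100.0 * (rrt_irms_raw - rrt_gcms) / rrt_gcms

print(f"GCMS RRT:            {rrt_gcms:.4f}")
print(f"IRMS RRT, raw:       {rrt_irms_raw:.4f} ({raw_difference_pct:.2f}%)")
print(f"IRMS RRT, corrected: {rrt_irms_corrected:.4f}")
```

With these invented numbers the uncorrected RRTs differ by a little over 1% even though nothing changed in the chromatography itself, and the difference disappears once the offset is removed. If the added time is only approximately constant, the correction would be correspondingly approximate.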
According to the testimony of Brenna (and Ayotte if I recall), identification of the substances can be achieved by matching their peaks against the known chromatographic standard, which is identified by its retention time, as well as comparing those peaks to the peaks in the GCMS of the identical sample. Retention times may be used to aid in this matching, but don’t have to match within the 1% standard. So far, I have read no testimony or other commentary that explains why an accurate identification cannot be made on this basis. When I, with my untrained eye, look at the peaks of the respective chromatographs they match and map onto one another despite their RRTs being off by 4%. Violation of the 1% standard, even if applicable, doesn’t seem to prevent identification in this particular case. I suspect that the arbitrators were similarly persuaded. Seemingly knowledgeable scientific posters on Daily Peloton, especially “onemintjulep” and “rational head”, explain why the visual match is clear. So far nobody has been able to examine the peak correspondence and explain specifically why those peaks don’t match. Meier did testify that in the general case one could have visually identical chromatographs where completely different substances were being measured. But in this case the same sample, with the same substances, was run through the two machines. So all major peaks had to be accounted for. The only way to account for all major peaks was to map the analytes in the GC-IRMS onto those identified in the GCMS. There appear to be no alternative mappings possible.
I did want to examine some additional points about the retention time arguments since these may come up in the CAS trial.
Several things stand out in reading TD2003IDCR.
a. Arguably the lab can establish other criteria than those specified in TD2003IDCR.
The 1% retention time standard is an “example” of “acceptable criteria”. However, at least with respect to the three ion requirement, the majority read TD2003IDCR to be mandatory where it was found to apply, and to exclude alternative standards. I think this was a generally reasonable result in this case, since USADA did not really offer scientific evidence justifying an alternative “one ion” standard. However, I think it must be clearly shown by scientific evidence that the technical standard applies to the facts of the case. In this regard, I think the arbitrators must rely on scientific testimony as to the meaning of the standard, and should not rely on their own “interpretation” of the language. I would have liked to see testimony from those who drafted the technical standards on whether the three ion requirement applied here.
In the case of the 1% standard I would like to see testimony that it was even intended to apply to GC-IRMS testing, before claiming it was mandatory. IRMS testing on a separate machine seems to present special problems for using retention times. I found a 1998 Olympic technical committee draft of the 1% standard and it was limited to specific substances not including those tested here. This raises the question whether the 1% standard evolved before IRMS testing.
b. The 1% standard may not apply to relative retention times RRT.
The 1% standard applies by its terms only to “retention time” (RT), a direct measurement, but not to “relative retention time” (RRT), a ratio. The majority decision seems to assume, at least for the sake of discussion, that it applies to “relative retention times” also. Again, how can arbitrators decide whether the term “retention time” incorporates “relative retention time” without scientific testimony? How can we assume that the specific 1% standard, as applied to retention time, should also be applicable to relative retention times without scientific testimony? The Shimadzu website explanations of relative retention time that I’ve read state that the error in RRT increases for an analyte that elutes farther from the reference standard. This suggests that an accuracy standard applicable to straight retention times may not be applicable to relative retention time.
c. Even Meier in his own lab could not meet the 1% standard.
The 1% standard is not absolute; it may be relaxed where shifts in retention time can be explained. In this case not only were different types of GC procedures used, GCMS and GCIRMS, but some of the GC conditions were probably not identical, in particular temperature. So even if the 1% standard might in some cases apply to RRTs between machines, this is a case where the 1% criterion should be relaxed.
Meier’s testimony suggests that even using the exact same GC conditions it would be impossible to meet the 1% standard for relative retention times between different machines whether of the same type or completely different types as in the Landis case.
A. “....You use -- you use a retention comparison because that is usually a bit difficult because, at the best of times, no two GCs and no two identical columns, even if they're the same manufacturer, will give you identical retention times. You go for what's called relative retention times. You add an internal standard to which you relate the retention time of everything else.” (T – 1362)
“A. We've got Hewlett Packard's trace gas, Agilent, I think we even have an old Varian. Probably, four -- four or five different types of GCs.
Q. Okay. And if I took an internal reference compound like 5-alpha androsterone --did I get that right that time -- 5-alpha androstenol AC, and I ran it in two of your different GC instruments, would you expect me to get the exact same retention time?
A. Not the exact same retention time, no. But if you used the same temperature break and the same helium flow, the same column, or, at least, let's say, because you can't have the same column in two instruments at the same time, so you're using the same column time from the same manufacturer, even ideally from the same batch, you should, within reason, such as, for instance, the plus or minus of one or two minutes, you should get the same retention time, yes”. (T- 1503)
Clearly “one or two minutes” exceeds the .2 minute standard. Some people have claimed that labs routinely can achieve retention time precision in the thousandths of seconds. This is clearly not possible in the type of chromatography conducted here.
And if you add two minutes, or 120 seconds, to the RT of 866 seconds for the 5A Andro internal standard and to the RT of one of the metabolites, your RRT could be off by as much as approximately 2 to 3%, again exceeding the 1% standard. Thus, even Meier in his own lab probably could not meet the 1% standard. It really appears that the 1% standard is too stringent for comparison of retention times across different machines. Even Meier seems to concede that the standard can be and should be relaxed.
Q. It's supposed to be not more than plus or minus one percent?
A. It's not supposed to be -- I mean, in duplicate cases I think there's a bit of leeway. I have no idea what the leeway is, but I can't imagine it's 600 percent. Because from one percent to 6 percent, that's -- well, 500 percent difference. I can't -- I can't see that. So I have no confidence in the data in terms of peak identification whatsoever. (T – 1409-10)
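The arithmetic behind the estimate of a 2 to 3% RRT shift in this section can be sketched as follows. The 866-second RT for the internal standard and the 120-second shift are the figures used in the discussion; the metabolite RTs are hypothetical values chosen only for illustration, and the sketch assumes the same shift is added to both peaks.

```python
def rrt_change_percent(standard_rt_s: float, metabolite_rt_s: float, shift_s: float) -> float:
    """Percent change in RRT when the same shift is added to both retention times."""
    before = metabolite_rt_s / standard_rt_s
    after = (metabolite_rt_s + shift_s) / (standard_rt_s + shift_s)
    return 100.0 * (after - before) / before


STANDARD_RT_S = 866.0   # RT of the internal standard, from the discussion
SHIFT_S = 120.0         # "one or two minutes", taking the upper figure

# Hypothetical metabolite retention times in seconds.
for metabolite_rt in (1050.0, 1100.0, 1150.0):
    change = rrt_change_percent(STANDARD_RT_S, metabolite_rt, SHIFT_S)
    print(f"metabolite at {metabolite_rt:.0f} s: RRT changes by {change:.2f}%")
```

For metabolites eluting a few minutes after the internal standard, as assumed here, the change comes out at roughly 2 to 3% in magnitude. A metabolite eluting closer to the standard would show a smaller change, and one eluting farther away a larger one.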
5B. Carbon Ratios not measured correctly.
Landis claimed that the IRMS measurements were not accurate because of linearity and other problems, and because of manual adjustment of peaks. The majority decision discussed these in detail and rejected these arguments, and the dissent didn’t really discuss the scientific issues at all. I don’t have time now to discuss these arguments in detail.
5C. Other labs require at least 2 metabolites to test positive
Landis argued that the lab should require positivity for all four of the metabolites tested, and the dissent argued that the lab should require positivity for two metabolites as is done in the UCLA WADA lab. However, the language of TD2004EAAS does not appear to require positivity for multiple metabolites. “The results will be reported as consistent with the administration of a steroid when the 13C/12C value measured for the metabolite(s) differs significantly i.e. by 3 delta units or more from that of the urinary reference steroid chosen.” Landis argued that positivity for only one metabolite was shown, although this is disputed as explained below. The majority did not appear to rule on this question explicitly. It simply found that the IRMS results were valid. So its decision could be interpreted as requiring only 1 positive metabolite, or interpreted as finding that 2 metabolites had been positive. The dissent on the other hand claimed that the more stringent UCLA standard discredited the IRMS test results.
The IRMS test showed 6+ delta units for the 5Alpha diol-pdiol pair in both the A and B samples, and 3.51 delta units for the Andro 11 ketol pair in the B sample and 3.99 delta units in the A sample. The dissent accepted the argument that the 3.51 delta did not exceed the 3 delta threshold when one took into account the lab’s measurement error allowance of .8 units. While the lab as a matter of procedure appeared to use the error allowance, it does not appear that it was legally required to do so by the ISL. If it had not done so, the 3.51 delta would have met the threshold and two metabolites would have been positive in the B sample. Moreover, the 3.99 from the stage 17 A sample did pass the 3 delta threshold even if one took into account the .8 unit error. This, along with the multiple positive delta results on the B samples from other stages, leaves little doubt in my mind that positivity was shown, even if technically only one metabolite in the stage 17 B sample may have exceeded the delta threshold. Clearly requiring 2 metabolites, or all 4 metabolites as the Landis team argued, to test positive would result in fewer false positives. But it would also result in many more false negatives. No scientific testimony was presented to show why using only one metabolite was not an adequate standard. In any case, WADA rules state that compliance with a more stringent alternative standard is not required.
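The effect of the .8 unit allowance on the two Andro 11 ketol results can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes the allowance is applied by subtracting it from the measured difference before comparing against the threshold, which is how the dissent's argument is described here, and the function and constant names are hypothetical.

```python
THRESHOLD_DELTA = 3.0   # "3 delta units or more", per the TD2004EAAS language quoted
LAB_ALLOWANCE = 0.8     # the lab's measurement error allowance, as described


def is_positive(delta_difference: float, allowance: float = 0.0) -> bool:
    """True if the measured difference, less any allowance, meets the threshold."""
    return (delta_difference - allowance) >= THRESHOLD_DELTA


# Stage 17 Andro 11 ketol results as reported in the discussion.
for label, value in (("A sample", 3.99), ("B sample", 3.51)):
    without = is_positive(value)
    with_allowance = is_positive(value, LAB_ALLOWANCE)
    print(f"{label}: {value} -> no allowance: {without}, with allowance: {with_allowance}")
```

On these figures the A sample result clears the threshold either way, while the B sample result clears it only if the allowance is not applied, which is the point on which the majority and the dissent part ways.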
6. Amory Testimony
Campbell in his dissent argues that the testimony of Amory raises sufficient doubts in his mind that he cannot be sufficiently confident in the IRMS test results. Since the testimony of Amory shows no IS violation in connection with the IRMS test, this was not sufficient legal grounds for rebutting the presumption that the IRMS test results were valid. This is why I think Campbell’s reasoning was contrary to the law, and this is probably why the majority decision did not even discuss the Amory testimony. Nevertheless, I want to examine the persuasiveness of the Amory testimony since much has been made of it.
Amory testified, based on a review of the literature, that the IRMS testosterone components, in particular the 5a-diol and 5b-diol, always moved in tandem, and further that above normal T/E ratios persisted over time, especially if there was repeated doping. According to Amory, on stage 17 the 5a-diol tested positive but the 5b-diol tested negative, and thus the two were not as close to one another in value as predicted by the studies he had reviewed. The Landis T/E ratios did not test high in the stages other than stage 17. Amory argued that these facts were inconsistent with Landis having ingested artificial testosterone. Based on this testimony, Campbell concluded: “When you consider all the errors and ISL violations in this case, the fact that the results also do not comport with known science is dispositive. I cannot be comfortable satisfied that LNDD’s results are correct.”
However, there was other testimony that was inconsistent with Amory. There was testimony that micro-dosing and oral doping as opposed to injections would not lead to the persistence of above normal T/E ratios. Amory conceded these points but then commented that if the dosages were that low there would be no benefit to the athlete. This was humorous, since he had previously testified that testosterone couldn’t help a cyclist during the race even if administered at high doses. There was testimony that T/E levels fell during the course of a long race, so taking low doses of testosterone might just maintain one’s T/E levels.
Shackleton (the leading expert on testosterone metabolism, but a bumbling witness) testified that theoretically it would not always be the case that the 5a and 5b diols would move in the same direction, as claimed by Amory, when someone had ingested testosterone. Several case study examples were introduced showing 5a and 5b diol values similar to Landis’s in subjects who had ingested testosterone. Landis claimed that these studies had not been refereed and thus should not be admitted, but this fact did not mean that the data was unreliable, only that the case study may not have been of sufficiently wide interest to justify publication in a journal. Moreover, the refereed articles quoted by Amory appeared themselves to be in the nature of case studies and thus anecdotal, although involving many more subjects over time. Amory offered little or no theory to explain why the components must always move together over time. Both Shackleton and Amory stated that the metabolic breakdown of steroid components by the body was an incredibly complex process. Given this, Amory’s claim was weak, overstated, and not at all conclusive.
7. Concluding Notes
From the evidence in the record, I believe the majority’s finding that the 1% retention time standard set out in TD2003IDCR was not applicable to the GC-IRMS was correct in this case. Accordingly, the GC-IRMS tests should legally be presumed to be valid. If Landis is able to present specific evidence that other labs use something like the 1% standard or even some relaxed standard in their GC-IRMS analyses, or specific evidence explaining why the visual peaks don’t match, then I might revise my opinion. That would be something he might want to attempt in the CAS appeal.
Whether or not any of the test results should have been thrown out on technical legal grounds, I came away from reading the evidence with something like an 80% subjective assurance that Landis had doped. The IRMS tests for the other stages were persuasive in this regard. Given that, I would not want Landis to get off on a technical violation of some legal rule.
The majority decision discussed in great detail many of the key scientific and legal issues. I wasn’t able to go through each of those issues in detail or to determine whether all of the legal questions were decided correctly, but it does appear that the majority acted diligently and conscientiously in going over the evidence. I was less impressed with the dissent. Campbell really did not address the scientific evidence in any detail, and much of his argument was rhetorical and ignored the legal rules he was supposedly interpreting. Perhaps this reflected his difficulty in understanding the scientific evidence. He claimed various procedural and rule violations but did a very poor job of showing any causal nexus with the testing results. The very first argument in his dissent claims bad faith cherry picking of evidence by the lab and documentation violations, but as to the cherry picking he doesn’t cite to the transcript, so I couldn’t figure out what he was talking about. To my mind, the strongest arguments questioning the lab results were the doubts expressed in the majority decision, not the arguments raised by the dissent. Campbell was clearly biased in favor of the athlete, and I will take it as a given that at least one of the other arbitrators was biased in favor of USADA. In this regard, I can understand an arbitrator not wanting Landis to get off on some legal technicality when he might have the subjective conviction that Landis doped. But my analysis above has been based on what was expressed in the written decisions and my attempt to understand the evidence.