Tuesday, January 01, 2008

Mr. Idiot figures something out

In a comment to "Getting around to Specificity", Mr. Idiot reaches some important understandings, and we post it for broader visibility. It's been a really slow couple of weeks, and what comments there have been are attached to topics lost down the page.


My basic understanding is now that LNDD's method of detecting the presence of exogenous testosterone is not "fit for purpose." This failure of the assay to be fit for purpose has come about because LNDD failed to take account of a key difference between the pre-IRMS testosterone test, and the GC/C/IRMS testosterone test.

Before IRMS (about 1997 at UCLA - I don't know exactly when LNDD got it, but certainly before 2003), testosterone was clearly a threshold substance. The T/E ratio had to cross the quantitative threshold of 6/1 (now 4/1). The key measurement was the AMOUNT of testosterone compared to the amount of epitestosterone.
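To make that threshold logic concrete, here is a minimal sketch in Python (my own illustration; the 6:1 and 4:1 figures are the WADA numbers, everything else is invented):

# Minimal sketch of the pre-IRMS T/E threshold decision (illustrative only).
def te_flag(testosterone, epitestosterone, threshold=4.0):
    """Return True if the T/E ratio crosses the quantitative threshold."""
    return (testosterone / epitestosterone) > threshold

# Example: T = 45 ng/mL and E = 9 ng/mL gives T/E = 5.0, over the current 4:1 limit.
print(te_flag(45.0, 9.0))  # True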

In that context, TD2003IDCR makes perfect sense. To detect exogenous testosterone with GC/MS, you can identify and measure the amounts of testosterone and epitestosterone in Selected Ion Monitoring (SIM) mode. By monitoring three ions of each substance you are able both to clearly identify and to quantify T and epi-T. First you separate the substances with the GC, and then monitor the three most abundant ions (i.e. the "diagnostic ions") of the proper GC peak with the MS.

The key thing to note here is that even if your GC separation is not perfect, or even if there is an indistinguishable, perfectly co-eluting peak putting other substances in your GC peak, it does not invalidate your measurement. If the ratio of the three diagnostic ions is right (those are known ratios), then you know you have the right substance with no interference AT THOSE IONS. There may be other things in that peak, but, when you are not using IRMS, it doesn't matter, because you have identified and quantified your testosterone properly with the three diagnostic ions. Same for epi-T.
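As a rough sketch of how that three-ion check works (the ion masses and abundances below are hypothetical; the real tolerance windows are spelled out in TD2003IDCR on a sliding scale, not the flat window used here):

# Rough sketch of the three-diagnostic-ion identity check. The m/z values
# and abundances are made up for illustration.
def ions_match(reference, observed, tolerance=0.10):
    """Each ion's relative abundance (normalized to the base ion) must
    agree with the reference within the tolerance window."""
    return all(abs(observed.get(mz, 0.0) - ab) <= tolerance
               for mz, ab in reference.items())

reference = {432: 1.00, 417: 0.45, 327: 0.30}  # m/z : relative abundance
observed  = {432: 1.00, 417: 0.48, 327: 0.27}
print(ions_match(reference, observed))  # True: right substance, no interference at these ions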



(This is not really relevant to my present topic, but as it turns out LNDD only analyzed ONE diagnostic ion in their T/E test, so the arbs ruled their process violated TD2003IDCR and that the evidence of that test was without value. If that was the only test, the case would have been dismissed.)

Now, when TD2003IDCR was approved in 2003, IRMS was still relatively new, and not all WADA labs had an IRMS machine. When TD2004EAAS became effective in August of 2004, IRMS was still officially an "add on" - that is, if the T/E confirmation test, described roughly above, was not conclusive, then an IRMS test was recommended. Thus TD2003IDCR was written so that administration of exogenous testosterone could still be confirmed without IRMS - and, in fact, as best I can tell, that is still the situation today. That said, even WADA understands that the IRMS test is much more conclusive when done properly.

So, historically speaking, you have the situation in which GC/MS works "fine" (at least by WADA standards), and then, at different times in different labs, the IRMS test is added as an additional piece of evidence.

Now, what LNDD apparently did when they added the IRMS test to their arsenal was to continue to do the GCMS test the same way they had before - with SIM - the way that had been perfectly effective before and, importantly, a way that is perfectly in line with TD2003IDCR.

HOWEVER, when the IRMS test is added to the GCMS test, the GCMS test must be done a little differently. Of course, you are analyzing metabolites of testosterone, not T itself, but that is not important here. The key thing is that SIM no longer does the job you need done. For GC/C/IRMS you not only need to know that there is substance x in a given GC peak, but you also need to know that there is nothing else in addition to substance x in the peak. SIM mode does not tell you this. Only doing a full ion scan of a given peak will tell you this. That is, you have to monitor not just three diagnostic ions, but all ions present in the peak because all the ions are going to the IRMS. Remember for the T/E test, it didn't matter if other stuff was present. The SIM/three diagnostic ions told you everything you needed to know. But doing the same thing as part of the IRMS test is not good enough. It does not give you the information you need.
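To see why, picture what the IRMS actually does: it burns the whole GC peak to CO2 and measures the carbon isotope ratio of everything in it. A toy calculation (all numbers invented):

# The measured delta-13C of a peak is roughly the carbon-weighted average
# of every substance in it, so a co-eluting contaminant invisible to a
# three-ion SIM check still skews the IRMS number.
def mixed_delta(fraction_analyte, delta_analyte, delta_contaminant):
    return (fraction_analyte * delta_analyte
            + (1.0 - fraction_analyte) * delta_contaminant)

# A metabolite at -27 per mil with just 10% co-eluting material at -15 per mil:
print(mixed_delta(0.90, -27.0, -15.0))  # -25.8: a 1.2 per mil shift, huge
# against a 3 per mil delta-delta criterion, yet invisible at the diagnostic ions.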

So, if the GCMS part of GC/C/IRMS is done in SIM mode then the whole assay is not fit for purpose, even though, for "historical" reasons, it is probably technically allowed according to TD2003IDCR.

The wise reader will ask why WADA has not yet recognized and fixed this problem. Well, I can only guess that it is because, prior to the Landis case, no lab had run the GC/C/IRMS assay in exactly this way. You see, while TD2003IDCR says that SIM mode is acceptable, it also says the "preferred method" is to use full scan mode. And if you run the assay with this full scan mode then you have the conclusive evidence you need that you have measured the right stuff in the IRMS. I imagine that other labs have either not done IRMS or have done IRMS with the GCMS part done in full scan mode. Dr. Goldberger testified at the hearing that he had seen lab documentation packages from the UCLA lab and they included the full scan data for the T/E test - and if they did it for that, they would surely do it for the GCMS part of the GC/C/IRMS test also.

So, why didn't Landis' legal team press this particular issue at the hearing - that the way LNDD does the GCMS part of the IRMS test makes the assay not fit for purpose? Well, one answer is that, as Larry said above, that is a very difficult legal argument to make given the nature of the controlling documents, especially given that the way LNDD does it appears to be okay according to TD2003IDCR. The truth is that TD2003IDCR has inadequately accounted for the nature of the IRMS test - leaving the "loophole" of running SIM mode open, even when doing IRMS.

Another answer as to why Landis' legal team didn't press this at the hearing is to say that "They did, sort of." This is exactly what they were trying to get at with all the arguments about good chromatography. If they could not use (the inadequate) TD2003IDCR to prove an ISL violation, they had to get at the issue another way. If they could show that there was some degree of likelihood of other material in the peaks of interest, then that should have raised enough questions to prove an ISL violation of the "matrix interference" bullet of ISL 5.4.4.2.1. But there are two problems with that approach: ISL 5.4.4.2.1 has weak language ("should" rather than "shall"), and there are no cut-and-dried criteria for what constitutes good or bad chromatography. The arbs, who did NOT understand the critical flaw in applying TD2003IDCR to IRMS, allowed the use of SIM to stand, with mediocre chromatography, because "should" does not mean "shall."

And this is where the "threshold substance" vs. "non-threshold substance" argument comes in. The way the IRMS test works, clearly testosterone SHOULD be a threshold substance, and the stronger language included in ISL 5.4.4.2.2 should apply. But because WADA has not yet admitted the fatal flaw in applying TD2003IDCR to the IRMS test, both Landis' legal team and the arbs assume that testosterone is a non-threshold substance and the weaker language of 5.4.4.2.1 applies.

There is more to talk about with regard to what exactly could be in those peaks of interest other than what should be there, but that's obviously enough for now.

Bottom line: LNDD's method of using SIM mode for the GCMS part of the GC/C/IRMS test makes the assay not "fit for purpose."

Oh, I just have to add one more thing - about ISO certification. This apparent technical adherence to TD2003IDCR is why LNDD could get ISO re-certification for its IRMS test just six months before Landis' tests. ISO certification does not assure that the test really does what it is supposed to do. It only assures that the test is in line with the controlling documents. If those documents are flawed, that's not ISO's fault.

SYI

We've noted before that the WADA system appears to have no body responsible for "fitness for purpose" review of test protocols. If a dunking stool were used for determination, and its execution were to the letter of the SOP, ISO would be OK with it.


47 comments:

wschart said...

An interesting argument here. But is this a question of "questioning the science", which, as we know, is not allowed by WADA rules? This could be why the Landis team tried to make this point in a rather roundabout way, as they might not have been allowed to attack this issue head on.

Mike Solberg said...

Perhaps it is, wschart. I don't know exactly how this would work out before the CAS.

ISL 5.4.4.2.1 says:

Confirmation methods for Non-threshold Substances must be validated. Examples of factors relevant to determining if the method is fit for the purpose are:

· Specificity. The ability of the assay to detect only the substance of interest must be determined and documented. The assay must be able to discriminate between compounds of closely related structures.


I think it is clear that LNDD's method for IRMS is not fit for the purpose, because it does not meet this specificity criterion. But, obviously, as Larry has made clear, the legal nuances make this a difficult argument. I think the main problem is that TD2003IDCR was supposed to provide the greater detail which would describe how to meet criteria like 5.4.4.2.1, and LNDD's method, in my view, technically meets the requirements of TD2003IDCR. So it is hard to argue for a violation of 5.4.4.2.1.

I don't know how to approach things legally when it is the application of TD2003IDCR to IRMS that is the problem.

I don't know what would qualify as "questioning the science." This is not so much questioning the fundamental science, as questioning the fitness for purpose of this particular assay. If LNDD had provided full mass spec analysis (rather than just SIM) of the peaks of interest, the science would be just fine (although the actual performance of LNDD could still be an issue).

Larry said in some post a long time ago that the law usually goes along with common sense. Once you understand the individual elements of this puzzle (the documents, SIM, full scan analysis, GCMS, IRMS, the history of the tests), common sense tells you that using SIM mode for the GCMS part of the IRMS test just doesn't hold up. So, hopefully, this can be worked out legally.

In my post, of course, I was trying to understand/explain how this unfortunate situation could come about, and I think the way the IRMS test was brought in as an "add on" to the GCMS test does explain that.

syi

Larry said...

wschart, you CAN question the science. My upcoming analysis will make this clear. The science gets the benefit of a presumption of validity, but the athlete can attack the science and overturn the presumption.

Mr. I (with your last post, you've lost the "Idiot" appellation, at least for the moment), I have put aside my opus to work on a response to your latest, terrific post.

TBV, what's with the "Mr. Idiot figures something out"? You make it sound like this is headline news! Like "Man Bites Dog". ;^)

Larry said...

Mike -

In your title post, you made the following statements:

what LNDD apparently did when they added the IRMS test to their arsenal, was to continue to do the GCMS test the same way they had before - with SIM

How did you reach this conclusion? LNDD's SOP requires this analysis to be performed in full scan mode. (The LNDD SOP for the MS portion of its CIR testing for exogenous testosterone is M-AN-52, shown for example at LNDD1427.) The MS acquisition parameters set forth on USADA 125-126 and 304-305 indicate that LNDD utilized the SOP in MAN_52.M (which I assume is the same as M-AN-52) and show full scan MS acquisition parameters of 50-550 m/z. Yes, as we've noted before, the graphs at USADA 145 and USADA 322 are single-ion chromatograms, but they could have been produced from full scan data. In fact, these single-ion chromatograms chart 16 different ion types (by m/z), ranging from 191 m/z to 344 m/z.

You may be right that LNDD did this analysis in SIM mode, but I'd like to know how you reached this conclusion, as there is the evidence I cited to the contrary.

Larry said...

Mike -

In the middle of writing a full reply to your post here, I realized that you may have uncovered a more significant ISL departure than you reported. I understand the point you're making is that you can't do IRMS peak analysis in a way that's "fit for purpose" by analyzing diagnostic ions, that instead a properly validated method for analyzing peaks here would require the lab to take full mass spectrum data. But in the process of making your argument, I think you went about 80% of the way required to show that LNDD failed to follow the procedures required under the ISL to identify peaks with diagnostic ions, because TD2003IDCR actually requires LNDD to acquire and analyze the full mass spectrum data.

I can't believe it's taken me 4+ months to figure this out. But I've spent all day going through this analysis, and I can't find any holes in it. (OK, actually I have identified a small potential hole in my analysis, but I'm not going to point it out just yet.)

The rules for identifying peaks with diagnostic ions are set forth in WADA Technical Document TD2003IDCR. The requirements vary, depending on whether the diagnostic ions are acquired in full scan mode or SIM mode. The requirements are:

FULL SCAN MODE

(1) All diagnostic ions with a relative abundance greater than 10% in the reference spectrum obtained from a positive control urine, a reference collection sample or a reference material must be present in the spectrum of the unknown peak (minimum of 3 ions; special rules are applied if three such ions are not available), and (2) the relative abundance of three diagnostic ions shall not differ by more than the amount shown in TD2003IDCR from the relative intensities of the same ions acquired from a spiked urine, a Reference Collection sample or a Reference Material. (I sketch this full scan rule in code after the SIM rules below.)

SIM MODE

(1) At least three diagnostic ions must be acquired, (2) the relative intensities of the three diagnostic ions shall not differ by more than the amount shown in TD2003IDCR from the relative intensities of the same ions acquired from a spiked urine, a Reference Collection sample or a Reference Material, and (3) the signal to noise ratio of the least intense diagnostic ion must be greater than 3:1. (There are a couple of other requirements for SIM Mode that I'm not mentioning here. Emphasis added.)
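Here is how I read the full scan rule above, as a sketch (the spectra are hypothetical, and I've flattened the TD's sliding-scale abundance tolerances into one flat window for brevity):

# Sketch of the TD2003IDCR full scan identification rules, as I read them.
def full_scan_id(reference_spectrum, unknown_spectrum, tol=0.10):
    # (1) every reference ion above 10% relative abundance must be present
    # in the unknown peak (minimum of three ions);
    # (2) the relative abundances must agree within tolerance.
    diagnostic = {mz: ab for mz, ab in reference_spectrum.items() if ab > 0.10}
    if len(diagnostic) < 3:
        return False  # the TD's special rules for <3 ions are not modeled here
    for mz, ref_ab in diagnostic.items():
        if mz not in unknown_spectrum:
            return False               # requirement (1) fails
        if abs(unknown_spectrum[mz] - ref_ab) > tol:
            return False               # requirement (2) fails
    return True

The point of the sketch is requirement (1): you cannot even run the check without the full spectrum of both the reference and the unknown peak, which is what makes the full mass spectrum data mandatory in full scan mode.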

I believe that LNDD acquired its diagnostic ions in full scan mode. I've set forth some of my reasons in my prior post here. Additional proof can be found in LNDD's report of its GC/MS qualitative analysis at USADA 149-150 and USADA 323-324, where LNDD summarized its analysis of the diagnostic ions for the 6 metabolites at issue in the CIR testing. Note that these reports expressly state that the analysis is "pour les abondances relatives" - for relative abundance. As highlighted above, the relative abundance analysis is only for full scan mode -- under SIM mode, the analysis would be based on relative intensities. (The definition of how a relative abundance analysis is supposed to work is set forth near the end of TD2003IDCR, and the analysis shown at USADA 149-150 and USADA 323-324 matches the relative abundance definition.)

(By the way, if I'm wrong and the ions WERE acquired in SIM mode, then where in the LDP is the required analysis of the signal to noise ratio? In fact, where is there ANY evidence of noise in the charts on USADA 145 and USADA 322?)

Mike, as I'm sure you've realized, the above analysis is mostly a rehash of your analysis. But there's one point that you and I have missed up until now. Given that LNDD's peak identification was performed in full scan mode, LNDD was required to acquire more than three ions. In other words, LNDD had to perform an analysis of the full mass spectrum for each of the six metabolites in order to comply with TD2003IDCR.

LNDD was required (a) to perform a full mass spectrum analysis on the peaks for each of the 6 metabolites contained in the mix cal acetate sample to identify "all diagnostic ions with a relative abundance greater than 10%" in each such peak. LNDD was then required (b) to perform a full mass spectrum analysis on the peaks for each of these 6 metabolites as they appeared in the FL S17 fractions, to determine that all of the relevant diagnostic ions identified in step (a) were present in these peaks. LNDD did not do this analysis. This is a departure from the ISL.

Do you agree?

I AM working on a full response to your post here, but I keep encountering significant side issues.

blackmingo said...

Larry/Mike/All:

First -hope your new year finds you well.

Second, Larry, I have little in-depth knowledge in this area - I have been following Mike's and your discussion but find it hard to keep up. However, when looking at TD2003IDCR page one, I noticed that in your comment you omitted from your Full Scan Mode passage these quotes: "A full or partial scan is the preferred approach to identification" and particularly, "When a full or partial scan is acquired," before the rest of the quotation from TD2003IDCR.

I think if those quotes are in there, it seems less like an unflinching requirement and more like a conditional requirement. I think this does not resolve Mike's dilemma that the rules are written so as not to require full scan mode? Admittedly I am lost here, but I wanted to find out why you left those modifying passages out of your quotation.

Best,

Dan

Larry said...

Dan/Blackmingo -

You may be pointing to the "small potential hole" in my argument! Which is that, arguably, TD2003IDCR did not require ANY kind of examination of diagnostic ions - not in SIM mode, and not in full scan mode. Arguably, LNDD's comparison of retention times between the mix cal acetate and the S17 fractions was alone sufficient to satisfy TD2003IDCR and identify the MS peaks in the first phase of the CIR testing.

The problem with this argument is that it goes too far! If this argument is valid, then there was no problem with LNDD's identification of testosterone and epitestosterone in its T/E testing, as these substances could ALSO have been identified under TD2003IDCR by using retention times.

Probably more analysis is needed here. And Blackmingo, I'm not sure I've answered your question.

bk said...

Hello,

These are clearly "Threshold" substance assays. From the ISL definitions:

Non-threshold Substance: A substance listed on the Prohibited List for which the documentable detection of any amount is considered an anti-doping rule violation.

Threshold Substance: A substance listed in the Prohibited List for which the detection of an amount in excess of a stated threshold is considered an Adverse Analytical Finding.

The threshold for the AAF is a delta-delta more negative than -3 o/oo compared to the reference metabolite, less an additional amount to account for uncertainties.
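A sketch of that decision rule (my illustration; the 0.8 per mil figure is the sort of uncertainty allowance discussed in this case, not an official WADA constant):

# Sketch of the CIR adverse-finding rule: flag when the metabolite is
# depleted by more than 3 per mil relative to the reference metabolite,
# beyond an uncertainty allowance (0.8 per mil assumed here).
def cir_aaf(delta_metabolite, delta_reference, uncertainty=0.8):
    delta_delta = delta_metabolite - delta_reference
    return delta_delta < -(3.0 + uncertainty)

print(cir_aaf(-30.2, -26.0))  # True: delta-delta of -4.2 is past the -3.8 cutoff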

A substance that has no natural pathway to be present in the body does not need a threshold. Simple detection would suffice.

Also, I would consider integration issues to be Matrix Interferences, which is defined as "interference in the measurement of the amount of Prohibited Substances or their Metabolites or Markers by components of the sample matrix."

Example: It is "easy" to determine the peak area to less than 0.8 parts per thousand when the peak is clean and without baseline. It is exceedingly difficult when extra chemicals in the matrix cause a baseline offset, a sloping baseline, nearby peaks, or unresolved peaks.

BK

Unknown said...

Off the subject, but who had the time to write all these rules??

And holy cow, just a simple testosterone test sure is complicated.

Should it be this hard? If you're going to destroy someone's reputation, shouldn't it be easier to prove their guilt?

Mike

Larry said...

BK, on threshold v. non-threshold substances, I've posted an analysis on 12/21 at 4:13 pm under "Getting Around to Specificity." If you haven't seen this analysis, you might want to take a look.

Michael, it will only get worse from here. The proposed test for human growth hormone is supposed to be pretty difficult. And the rules necessary to set up the "biological passport" to prove AAFs will be long and complicated.

ShepFan said...

About ISO certifications...

I have some experience there, having written my former employer's software development SOP for medical devices. And I've helped prepare for their ISO 9002 certification, which we passed.

The "dunking" comment was no doubt tongue-in-cheek, as there (at least in the USA) FDA guidelines which must be followed for the development and documentation of medical products. The key to ISO certification is having SOPs that adhere to the applicable "best practices," having a coherent system of documentation in place, and painstakingly using that system to document ongoing adherence to those SOPs and best practices.

This, in view of the arbitration board's public criticism of LNDD's failure to follow their own SOPs, suggests the lab is overdue for a visit from an ISO review committee.

velovortmax said...

C12 and C13 testosterone are diet-based steroids. All people have freely circulating C13 testosterone. C12 and C13 testosterone, once introduced into the body, have identical chemical compositions. A GC/MS testosterone/epitestosterone ratio cannot determine a difference between C12- and C13-based testosterone. It merely measures a testosterone "spike" and suggests doping. The only way to detect C13-based testosterone is by establishing a threshold of -3 per mil, determined by measuring C13-specific testosterone metabolites and comparing them with C12-specific testosterone metabolites. If WADA inferred a presence of C13 testosterone using only a T/E ratio, the exact locus of the presence of the C13 testosterone would never be determined. Thus false positive rates would increase. Since the AAA Majority dismissed the T/E ratio in the Landis case, the only factor left to consider is the GC/C/IRMS metabolite delta/delta evidence. The GC/MS evidence of the presence of testosterone is bogus and need not be considered.

velovortmax said...

One more thing. Before GC/C/IRMS WADA used longitudinal tests attempting to catch dopers because maintaining a testosterone "spike" proved to be impossible. Landis requested longitudinal tests and these tests were denied by USADA. Further evidence that Landis did not resort to evasive tactics to foil a correct T/E ratio.

Unknown said...

The fact that USADA withheld data from Landis, tried to do their own 'longitudinal' research by testing prior samples, and hired outside counsel proves to me that they didn't trust the results given to them by LNDD.

That they went ahead and prosecuted Landis proves that Tygart didn't have the balls to stand up to WADA.

And they took a good man down in the process without any repercussions to their careers - YET....

Larry,

Do you think the athletes will ever have a chance? I mean, Gatlin got busted for taking a prescribed drug while he was in college. How dumb is that?? Then they held it against him the 2nd time around.

Mike

Larry said...

Michael -

It is always going to be difficult to challenge a science case made by a policing lab (whether that lab is the local police lab, or the FBI, or a WADA lab). The lab is in possession of all of the facts; under most circumstances, the accused is not going to have access to the same information base available to the prosecution. Plus, judges, juries and arbitrators give great weight to scientific testimony.

All this is made more difficult by the limited rights available to athletes in WADA proceedings, and the prevailing opinion that the ADAs have to crack down on doping.

Do I think athletes will ever have a chance? Well, I expect that over time, the WADA labs will get better at their jobs. This is a relatively young field.

Also, I have hope that the kinds of anti-doping programs run by teams like Slipstream and CSC will give athletes a better ability to defend themselves. If a WADA lab's testing results run counter to the results of the team's testing, then at least the athlete can marshal a little bit of counter-science to support his case.

But otherwise, no, I don't see much hope for any athlete caught in this system.

Unknown said...

Nice post Larry. Kind of scary though.

I can see a scenario where Slipstream or Team CSC tries to dispute a positive test from a WADA lab. It'll be a he said/she said argument and that would be extremely ugly.

Too bad cycling thinks it has to be so transparent, because it's killing itself IMO.

Mike

Mike Solberg said...

Hey, it's good to see some new "names" above. Thanks for the contributions!

Above, I wrote: "what LNDD apparently did when they added the IRMS test to their arsenal, was to continue to do the GCMS test the same way they had before - with SIM."

And Larry asked: "How did you reach this conclusion?"

Sorry for the delay, no time the last couple of days. This is actually a complicated question. The quickest answer to Larry's question is that I actually thought that was common knowledge, but now that he asks the question and I think about it, I guess it isn't. I think I misread Goldberger's testimony during the hearing. He talks about them using SIM mode, but only wrt the T/E assay.

So the longer version is this:

I think they did do a full "SCAN" (50-550) all across the x axis of the 'gram, as represented by the TIC 'gram at the top of USADA 321. By definition, this had to be a low resolution scan (few scans per second).

The question we really care about is "What did they do when they looked up-close at the peaks of interest?" In other words, what are the 'grams on 322 really showing us?

I guess there are three possibilities. First, it is possible that they didn't do anything different. They just showed us the low resolution data, from the full (50-550) scan, for those three ions in each of those peaks on 322. This possibility has some weight because one noticeable feature of those scans is the low resolution - that is, they cover a full unit of ions (like 315.20 to 316.20), whereas in some methods you are able to scan much narrower ion windows (like 316.124 - some high resolution scans can differentiate to the thousandths place, to tell the difference between almost identical weight ions - or m/z ratios, I guess that would be). So, maybe that is what they did: just showed us the info from the full (50-550) scan for those peaks and those ions.

Against this first possibility is that it would then mean that LNDD made no special attempt to identify the peaks of interest. I think this would basically just be the equivalent of chromatographic separation (but I could be wrong about that).

If this first possibility is right, then the significant point would be that they still didn't show us any data about other ions in those peaks. We still wouldn't have any attempt at showing specificity (or revealing matrix interference - I still don't know exactly what the difference is). So, although this would change the details of my argument about explaining the lack of full mass spec data, the basic flow of my argument is the same.

The second possibility is that they did a SIM mode scan of the peaks of interest. That is consistent with the fact that the peaks on 322 show three diagnostic ions for each peak and nothing else (as in the first possibility). In this case, the low resolution could be accounted for if they scanned all 16 of those ions (on 321-322) at once, which seems likely to me, if for no other reason than that all this data seems to come from the same scan time wise. Also in support of this possibility is that then LNDD at least made some attempt at additional identification of the peaks, arguably meeting the requirements of (a flawed) TD2003IDCR.

Of course, the key point, again, is that this second possibility also tells us nothing about what is in those peaks other than the three diagnostic ions.

The third possibility is that they ran full mass scan on each of those peaks, but then only showed us the data from the three diagnostic ions. This seems to me the least likely. If they had done full mass scans on each of those peaks, they surely would have included the data in the LDP, as it would fully meet the "preferred" method of TD2003IDCR. And in any case, even if they did this full mass scan of each peak, they STILL didn't show us the information we would need to be assured of specificity.

I would say that overall I still think the data on 321-322 comes from SIM mode scans. To me that makes the most sense given that their standard GCMS assay for the T/E test (M-AN-27) calls for SIM mode. The data we have is consistent with them doing the same thing for the GCMS part of the IRMS assay.

I would rule out the third possibility because I can't imagine they did a full scan on each peak and then didn't give us the info. Larry, you make the point that that is what M-AN-52 calls for, but I am not convinced that the "SCAN" of M-AN-52 is a full mass scan of the peaks of interest. It may just be the full (50-550) low resolution scan of the whole x-axis.

And I would rule out the first possibility because that would mean that they did LESS to show the identity of the peaks in the GCMS part of the IRMS test than they are supposed to do for their regular GCMS T/E assay. That seems unlikely.

You know, I have a faint recollection of some discussion of this over at DPF, but I don't have time to find it right now. If anybody searches it out, let me know.

(And Larry, there is more to be learned about this whole question based on which assays listed on USADA 71-72 call for "SCAN" and which assays call for "SIM." It looks like all the ones which require quantification (because they are looking for exogenous use of endogenous substances) call for SIM, and all those which are just looking for presence/absence ("qualitative assays" - for simple banned substances - like cocaine), call for SCAN. I am not sure of the significance of that yet, but I think it does tell us something, maybe something important.)

And, Larry, I haven't absorbed your longer post above. Hopefully, I'll be able to get to it soon.

(Ugh, I'm tired. If I said anything stupid in this post, chalk it up to drowsiness!)

syi

Larry said...

Michael -

Actually ... we don't really know if a team like Slipstream or CSC will come to a rider's defense if the rider is accused by a WADA lab and the team thinks it has evidence of the rider's innocence. These programs are young, and I don't think they've faced this kind of test yet.

My guess is that any team dispute of a positive test will be handled quietly, and that we may never hear about it. It's not in anyone's interest for the WADA labs to battle with the team labs. WADA seems very much in favor of programs like those being used at Slipstream and CSC, and would seem to have no reason to attack these programs. And the teams have no reason to get on the wrong side of WADA. Both sides have an interest in getting along with each other.

My guess is that if a Slipstream or CSC rider were to test positive at a WADA lab, the WADA lab would carefully double and triple check their work (they don't want to be humiliated by having their testing "shown up" by the team's testing). Then I think the lab would contact the team privately, to say in effect, "hey, we're seeing what looks like an AAF on one of your riders, do you want to check your data and tell us what you think?"

We DO know that teams like Slipstream are planning to share at least some of their testing data with the powers that be (probably with UCI, but I'm not sure). There will probably be some kind of dialog between the teams and the ADAs about drug testing. There will be available lines of communication.

Ultimately, we don't know how this will play out. But you asked me if I had any hope for a fairer system, and this is where my hope lies.

Unknown said...

That makes sense. Too bad they didn't use that strategy with Floyd.

Cycling was at an all time high when Floyd won the Tour. Yet the powers that be decided to 'show they cared about doping'...without any consideration for the riders, sponsors, fans, etc.

I know that's my biased opinion, but it is what it is. I mean, if Floyd had been busted for blood doping or EPO or something more substantial, that would have been one thing. But testosterone? It doesn't make sense, especially with as many questions as have been brought up with regard to his tests. It should have been much more cut and dried from the prosecution point of view.

Mike

Mike Solberg said...

Okay, Larry, forget all that other stuff, I now think you are right. The 'grams on USADA 322 (et al) are not SIM mode, but result from the full scan mode used to make the TIC 'gram at the top of USADA 321.

Now I follow your argument of Jan 1 9:33. And the answer to your final question is yes, I now see what you mean that the full mass spec data for the peaks of interest is required by td2003idcr because they have to find ALL the diagnostic ions with a relative abundance of greater than 10%, in both the reference material (the cal mix) and the sample.

We see no evidence in the LDP that they looked beyond the three diagnostic ions shown for the peaks on USADA 321 and 322. There may well have been other diagnostic ions with a relative abundance of greater than 10%, and LNDD was required to find them.

So, it seems to me like you are right. You have finally found the reason that people like Duckstrap say the full mass spec data is REQUIRED by td2003idcr. And LNDD did not put it in the LDP, which (again) means they did not show us whether there are other diagnostic ions with a relative abundance of greater than 10%. As you said, "TD2003IDCR actually requires LNDD to acquire and analyze the full mass spectrum data." (My emphasis, because they did do a full scan, they just didn't show us what it revealed. I know a lot of people who would add here, "Maybe they didn't like what they found!")

So, if I understand right, my "brilliant" (so named by TbV himself) post which started this thread wasn't so brilliant after all. The whole SIM part of it was wrong.

But in my defense, I would like to point out that the underlying problem I was getting at remains. Even if LNDD had followed the letter of td2003idcr (by showing us all diagnostic ions with a relative abundance greater than 10%), I think td2003idcr is still flawed when it comes to IRMS. Even if they had shown us any other diagnostic ions, that would still not tell us whether any other substance went into the IRMS.

Hmmm... perhaps that desire - for td2003idcr to assure that nothing else would go into the IRMS - is beyond the scope of td2003idcr. Perhaps by identifying all the proper diagnostic ions (which they don't seem to have done), they would have identified the substance properly, which is all td2003idcr is supposed to guarantee. Remember, it is, after all, a document for "qualitative assays."

Perhaps the purity of the peaks is not supposed to be guaranteed by td2003idcr at all. I guess that job would fall back on the ISL then, and I think we get back then to the specificity bullet point of 5.4.4.2.1. (With specificity it doesn't matter whether it is 5.4.4.2.1 or 5.4.4.2.2, because they both have mandatory language.)

It sure seems to me that LNDD has done nothing, beyond separation by gas chromatography, to guarantee specificity, that is, that there is nothing else in those peaks.

And I guess that was good enough for the majority arbs.

syi

BTW, Larry, I think your "small potential hole" (if it is the conditional text that blackmingo mentioned) is no issue at all. In the arbs' majority decision they said that SIM mode was the method used by LNDD for the T/E analysis. And if you use SIM you HAVE to analyze the three diagnostic ions. LNDD didn't do that (thus establishing a history of not following the letter of the law of td2003idcr), so the arbs threw the T/E analysis out. In the same way, by doing a full scan, LNDD committed themselves to the requirements of the full scan mode section of the TD, and thus to finding all diagnostic ions with a relative abundance of greater than 10%.

Larry said...

syi -

I'm still analyzing your post from last night, and now there's a new post this morning, so it may be a bit before I can get back to you.

In the meantime, a few thoughts.

I wanted to give you a terrific link I found today to an "Everything You Wanted To Know About Mass Spectroscopy" article. I mean, pages and pages on different ionization techniques! It's written in a relatively friendly style. See Mass Spec History. I figure you'd like this, as it is an historical approach!

In your last post, you stated that a full scan has to be low resolution (a few scans a second). I agree that a full scan is going to give you lower resolution on a per-ion basis than a SIM. But low resolution does not have to mean low scan speed. If you take a look at the chart on page 2 of Bulletin on Scan Speed, you'll see that an Ion Trap MS can be set to scan from 50 - 550 m/z in a few milliseconds.

The next question is, what scan speed did LNDD use? There is no scan speed mentioned in M-AN-52, but the GC Method parameters shown at USADA 125 and USADA 304 show a "data rate" of 20 Hz. If that's the scan rate that LNDD used, then I think they were scanning the full spectrum 20 times a second (50 ms per scan).

I'm not sure if any of this scan speed stuff is important. Presumably, scan speed affects the accuracy of measured retention times. However, it wouldn't matter to LNDD's analysis if retention times were only accurate to +/- 50 ms (as opposed, say, to the +/- 3 ms that appears to be possible). Maybe there's something to this that someone smarter than me can figure out, but for the moment I don't think scan speed is an issue in this case.
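For what it's worth, the arithmetic is simple enough to sketch (the peak width is a made-up example):

# Back-of-envelope on the 20 Hz data rate (peak width assumed for illustration).
data_rate_hz = 20.0                  # from the GC Method parameters
ms_per_scan = 1000.0 / data_rate_hz  # 50 ms per full 50-550 m/z sweep
peak_width_s = 5.0                   # hypothetical GC peak width
print(ms_per_scan, peak_width_s * data_rate_hz)  # 50.0 ms, 100 scans across the peak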

Cheryl from Maryland said...

You guys are amazing, although reading these posts over the last few weeks makes my humanistic head spin.

But, final thought (not helpful for appeal) -- Didn't WADA and the LNDD go through amazing hoops to avoid what would have been straightforward, conclusive, possible with the machines, and what everyone would have accepted -- the entire mass spec data?

Why - arrogance, time, money, inconclusive answers? Cui bono? My mind boggles. They could have justified themselves in two seconds with that data.

Keep up the good work. I agree with Mike, with this kind of thinking, I'm not sanguine regarding blood passports and the team testing.

Best wishes to all for a good 2008.

Larry said...

Cheryl, great post!

I hope that whenever you find your head spinning, you'll ask questions. I can't speak for anybody else, but one of our purposes here at TBV should be to make the science as clear as we can. Especially with the CAS appeal coming up. I'd be very happy if anyone who cares about this case (including the press) could come here for answers and explanations.

As for your main question ... that's a great question. I think that question has baffled most of us -- even people like OMJ over at DPF, who supports the majority decision against FL, seem baffled by LNDD's apparent failure to consider the full mass spectrum data. By all accounts, LNDD acquired this data. Why do we have no evidence that the data was ever analyzed?

One answer, suggested by Mr. Idiot and others, is that the data WAS analyzed, but that LNDD did not like the data, so the analysis never saw the light of day. I tend to doubt this, as there's no evidence that LNDD's standard operating procedures required analysis of the full mass spectrum. Also, it has to be noted, the absence of full mass spectrum data was not an issue in the arbitration ... so maybe it's only in hindsight that the lack of this data is so puzzling to us.

I think that Mr. Idiot provided the best explanation I've seen: the full mass spectrum data was not analyzed as part of the IRMS testing because it wasn't required (and arguably was not needed) for the older test looking at the ratio of testosterone to epitestosterone (the T/E test). The IRMS test evolved from the T/E test, and wasn't properly modified to add an analysis of the full mass spectrum data. THAT sounds to me to be typical of the way people screw up. (A conspiracy involving the cover-up of the mass spec data is possible, but less typical.)

On the other hand, there are a number of OTHER things that LNDD did in this case that are inexplicable. Why did they use different mix cal acetate mixtures in the two portions of the CIR testing? Why did they use different temperature ramps? If LNDD used different GC columns in the two portions of the test, this would be VERY difficult to explain. Is there an "innocent" explanation for all of this? I'm anti-conspiracy theories by nature, so I'll always look for the explanation that does not require 3 a.m. meetings, document shredding and payments of hush money. But even I have to wonder.

Larry said...

syi -

We're pretty close to being in 100% agreement.

You clearly understand the points in favor of the argument I've made that under the circumstances, TD2003IDCR required LNDD to acquire and analyze full mass spectrum data. But you may not understand the strength of the argument on the other side. I'm going to try to get to the counter-argument in a separate post.

At the end of the day, whatever happens to the argument under TD2003IDCR, I think your original argument still stands strong: LNDD's SOP was not "fit for purpose" under the ISL because the nature of IRMS testing requires peaks to be identified with full scan mass spectrum data. Your argument contrasting IRMS testing for exogenous testosterone with GC/MS testing of the T/E ratio is, as best as I can tell, spot on. The presence of a small co-eluting peak in the testosterone peak is going to make only a small difference in the measurement of T/E ratio (and there's probably a bit of room for error in the 4:1 ratio selected by WADA). In contrast, even a small co-eluting peak could dramatically throw off the delta-delta reading in the CIR testing. The LNDD method for CIR testing could not have been properly validated under the ISL unless it included analysis of the full mass spectrum data.
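To put rough numbers on that contrast (all values invented for illustration):

# Same 5% co-eluting contaminant, very different consequences.
contamination = 0.05

# T/E ratio: inflating the T peak by 5% moves a true ratio of 3.9 to about 4.1,
# a small relative error against the margin built into the 4:1 threshold.
print(3.9 * (1 + contamination))            # about 4.095

# CIR: 5% foreign carbon at -15 per mil inside a -27 per mil peak:
print(0.95 * -27.0 + 0.05 * -15.0)          # about -26.4, a 0.6 per mil shift,
# most of a 0.8 per mil measurement allowance consumed by contamination alone.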

I think your argument will come into sharper focus once I finish my legal analysis of the ISL.

DBrower said...

So, if I understand right, my "brilliant" (so named by TbV himself) post which started this thread wasn't so brilliant after all. The whole SIM part of it was wrong.

First, he sarcastically dumps on me when I'm skeptical saying, "Merry Christmas to you too." Now he complains when I say he is brilliant.

Some people...

Yes, I stand by the original Brilliance. The key insight there was not the SIM stuff, but the historical Narrative, which explains a whole lot. The story there is exactly the kind of framing I think we'd want to use to build the presentation around. It's easy to understand, and makes it plausible that they are not conspiring conspiracists, but people who got where they ended up with good intentions, but who skipped some crucial steps that no one caught.

That's how I'd want to tell it.

It holds water.

Now, how it was all covered up after the fact, that I'm more willing to throw stones about.

TBV

bk said...

Good morning,

I think that the root of the problem is pure incompetence at every level in the organization. Both personal and organizational incompetence (training, oversight, adherence to procedures, adequate procedures, two sets of eyes on all important details....).

Look at everything: the white-out, the chain-of-custody, the T/E screw-up, the different ramp temperatures, the questionable use of internal standards, not catching high blanks or out-of-range standards, different columns (or at least not updating the computer), not even knowing how to back up a computer, differences between laboratory documentation packages on the same set of additional B's, and more.

It is purely a lack of education, training, and rigor. The science and basic organizational operations are not well understood by the technicians, scientists, nor managers.

Anyway, that's my 2 cents and why I don't trust anything from the lab.

BK

Mike Solberg said...

I think that the root of the problem is pure incompetence at every level in the organization.

I think that is pretty much what arb Campbell decided too. So you are in good company, bk.

Remember the epigram with which Campbell began his dissent:

“Whoever is dishonest with very little will also be dishonest with much. . . So if you have not been trustworthy in handling worldly wealth, who will trust you with true riches . .” Luke 16:10.

syi

Mike Solberg said...

Cheryl asked:

"Didn't WADA and the LNDD go through amazing hoops to avoid what would have been straightforward, conclusive, possible with the machines, and what everyone would have accepted -- the entire mass spec data?"

As Larry noted Cheryl, no one can make sense of this really. I think the "historical" explanation is most likely. They were used to not needing it because of the nature of the earlier (and sufficient) GCMS T/E ratio test.

I recently noticed one little bit of information in the LDP that I wonder about. On USADA 305, under the "Data Analysis Parameters" printout, under the "Qualitative Report Settings," it says the following:

Output destination:
Screen: No
Printer: Yes
File: No


In "mass spec language" the "Qualitative Report" would be the data identifying the substances (as opposed to quantifying) them. The output destination of the "Qualitative Report" only went to the printer, not to the file??? I assume that would be a computer file, which would leave record of the full mass spec data.

This may be a lot to read into a very little bit of information, but doesn't that suggest that the ONLY record of the full mass spec data was the printed copy?

Perhaps they had the software program set up so that they only printed the data that we see on USADA 321 and 322 (and all the equivalent pages for the other fractions)? That would fit what we seem to observe - that is, that they collected, but did not analyze, the full mass spec data.

Just above the "Qualitative Report Settings" are the "Percent Report Settings." This is really speculative, but the "percent reports" could be the 'grams we have on USADA 322 (and similar pages for other fractions). Those are the 'grams that give us the percentages, or ratios, shown on USADA 324. And again, that section says:

Output destination:
Screen: No
Printer: Yes
File: No


Again, this is pretty speculative, but it could be that they only sent this key information to the printer and didn't even save a computer file with the data.

One last thing: the next section on USADA 305 is "Quantitative Report Settings" (it spills on to 306), and there it says:

Output destination:
Screen: Yes
Printer: No
File: No


So it looks like they REALLY weren't interested in whatever data that was, as it only went to the screen. I'm not sure what data that would have been. Anybody?

syi

Larry said...

syi, regarding the "File: No" settings ... yes, I've noticed that too, but I don't know what it means. Do we have access to the operating manual for the MS used at LNDD? Do we even know the make and model for this MS?

Mike Solberg said...

It was the Isoprime, right? And even LNDD didn't have a copy of the manual.

Come on, no experts on the Isoprime out there?

syi

Larry said...

syi -

I know that the IRMS was an Isoprime, from GV Instruments. See the testimony p. 110 (pdf p. 18).

I think that "Isoprime" is GV Instruments' brand name for their IRMS line. See Isoprime web page.

So I think the MS machine was some other make and model.

My best guess at the moment is that the MS is an Agilent model. I say this because there are settings shown in the LDP that match Agilent protocols. For example, Agilent uses the "MSDCHEM" directory to store information, uses *.M file names for method files and *.u file names for tune files.

A "Getting Started" manual for one version of the Agilent GC/MS software is here: agilent manual.

Gotta sleep. Later.

Russ said...

Larry and SYI,
Glad to see you guys keeping up the good work!

I did a google search and some selected results are below. Primarily I was looking for machine identification but could not resist including a few relevant clips!

Two different sites are ref'd and quoted.

From:
http://www.cacnews.org/pdfs/4thq2007.pdf

Title: "The Floyd Landis Sports Doping Case:" Page 11 RH column 3rd paragraph down:

"This data from LNDD is easy to follow since they use the same Hewlett Packard/Agilent 6890GC with
Agilent 5972 Mass Selective Detector that I used for about 20 years."

Interesting from page 13 last par on RH side:
"In terms of both peak separation and peak size, the epitestosterone peak is unsatisfactory.
LNDD needs to review their protocol. A longer capillary column should produce better peak separation
and and added internal standard might provide better precision/reliability." Note the following
paragraph gets into details of SIM requirements for identification of a specific compound!

IRMS:
Page 15 LH column last par:
"However, even the IRMS test values obtained by LNDD for the four testosterone metabolites are
untrustworthy." followed by a discourse on fractionation required reading on through this on p16 LH par:
"Bottom line, if the chromatography of Landis stage 17 urine sample was unacceptable for obtaining
a reliable T/E ratio via GC/MSD, it would be even more unacceptable for IRMS!"

Page 20 RH column next to last par:
"In the Landis case the technicians at LNDD unthinkingly applied the lab's testing protocol to his
urine sample. Had they instead used critical thinking, they would have realized that not only was
the sample too degraded, the GC baseline far too noisy, and peak size and separation unacceptable
to provide a reliable T/E ratio, they would have realized the these same problems could only
exacerbate any attempt at IRMS.
Were LNDD's data presented at an actual criminal trial before a jury in the adversarial U.S.
court system, I wager the trial would never even reach the stage of closing arguments."

From:
http://blog.environmentalchemistry.com/2007/06/floyd-landis-wada-lndd-chain-of-custody_26.html


Regarding machine type:

Q. All right. I'd like to show you some documents, Dr. Davis, beginning with LNDD 313. Have you seen this before?

A. I have, yes.

Q. And what is it?

A. This is a printout from the JA 10 series IsoPrime, the one used to analyze the Stage 17 samples.

Q. And, Dr. Davis, just for -- for ease of sake, why don't we say that the Stage 17 IsoPrime is -- we'll call it the IsoPrime 1.

A. Okay.

And regarding linearity:

Q. Based upon what you've seen in the records and your visual -- and your inspection of LNDD and discussions with LNDD personnel, do you have an opinion as to whether or not the IsoPrime 1 instrument was linear?

A. I think it drifted in and out of linearity, and I think there was also a degree of uncertainty as to how unlinear it was, because they did not do the tests properly over the full range. And let me just -- to instruct you, let me just emphasize how important linearity is. If you have a peak -- if your system is nonlinear, the isotope number will change dependent solely on the peak heights. If you have a big peak and a small peak, there'll be a shift in that isotopic number reported, irrespective of the composition of that compound. And in the context of GC/C-IRMS in doping control, we have quite a big range in peak heights, so the system has to be linear. This is a very demanding isotope application. Your system has to be working properly.
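(A crude sketch, from me rather than the transcript, of the sort of linearity check Dr. Davis describes - hypothetical numbers: the same reference material measured at several peak heights; on a linear instrument the measured delta should not trend with height:

# Linearity sketch: regress measured delta-13C against peak height.
heights = [1.0, 2.0, 4.0, 8.0, 16.0]           # relative peak heights
deltas  = [-26.9, -27.0, -27.1, -27.4, -27.9]  # measured delta-13C values, drifting

n = len(heights)
mean_h, mean_d = sum(heights) / n, sum(deltas) / n
slope = (sum((h - mean_h) * (d - mean_d) for h, d in zip(heights, deltas))
         / sum((h - mean_h) ** 2 for h in heights))
print(round(slope, 3))  # about -0.066 per mil per height unit: not linear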

-------------------------

Hope this is helpful, gotta go squeeze in a ride on a beautiful afternoon!

Russ

Larry said...

Russ, that's terrific research. Thank you.

I think that the "File: No" settings probably do not mean anything. LNDD clearly acquired and saved a GC/MS data file. This is indicated on each page showing a chromatogram. For example, the file shown on USADA 144 (FL's "A" sample) is 17807474F3.D (in the D:\Msd22\Juil06\2307 directory).

My guess is that the "File: No" setting you're looking at reflects the settings required to send data to the printer. In other words, LNDD first saved the data to a file, and then entered the settings required to do a print-out. These settings say, in effect, send the saved data to the printer, don't send it to the computer screen, and don't save the data to another file.

This is my guess.

Russ said...

Larry,
Your link (copied next below) had some interesting information.
http://masspec.scripps.edu/mshistory/whatisms_details.php

Some of the terms that have been wrestled with here are clarified a bit and I was surprised to find them much more specific to the MS process than I would have guessed.

Selected copies to illustrate:
Note it looks like these are all newer technologies in the MS arena but still great help to understanding some of the issues.

Matrix and Matrix interference:

In MALDI analysis, the analyte is first co-crystallized with a large molar excess of a matrix compound, usually a UV-absorbing weak organic acid. Irradiation of this analyte-matrix mixture by a laser results in the vaporization of the matrix, which carries the analyte with it.

MALDI matrix -- A nonvolatile solid material facilitates the desorption and ionization process by absorbing the laser radiation. As a result, both the matrix and any sample embedded in the matrix are vaporized. The matrix also serves to minimize sample damage from laser radiation by absorbing most of the incident energy.

Matrix background, which can be a problem for compounds below a mass of 700 Da. This background interference is highly dependent on the matrix material. (Still MALDI.)

The acidic matrix used in MALDI may cause degradation of some compounds.

DIOS is a matrix-free method

DIOS enables desorption/ionization with little or no analyte degradation.

While DIOS is comparable to MALDI with respect to its sensitivity, it has several advantages due to the lack of interfering matrix: low background in the low mass range;

FAB typically has a liquid matrix. It is also important to note that FAB is about 1000 times less sensitive than MALDI.

It is common to detect matrix ions in the FAB spectrum as well as the protonated or cationized (i.e. M + Na+) molecular ion of the analyte of interest.

FAB matrix -- Facilitating the desorption and ionization process, the FAB matrix is a nonvolatile liquid material that serves to constantly replenish the surface with new sample as it is bombarded by the incident ion beam. By absorbing most of the incident energy, the matrix also minimizes sample degradation from the high-energy particle beam.



Mass Range and Sensitivity:
See Table 1.5. General Comparison of Ionization Sources.

Effusing the sample at very low flow rates allows for high sensitivity. (while nanoESI was the topic, I would guess this is probably typical)

The design of modern analyzers has changed significantly in the last five years, now offering much higher accuracy, increased sensitivity, broader mass range, and the ability to give structural information. Because ionization techniques have evolved, mass analyzers have been forced to change in order to meet the demands of analyzing a wide range of biomolecular ions with part per million mass accuracy and sub femtomole sensitivity.

It should be noted that the increased resolution (typically above 5000) and sensitivity on a TOF reflectron does decrease significantly at higher masses (typically above 5000 m/z).

Resolution:

Since resolution is defined by the mass of a peak divided by the width of a peak or m/Δm (or t/Δt since m is related to t), increasing t and decreasing Δt results in higher resolution.

Figure 2.1. The effect of resolution upon mass accuracy. (has a great graphic illustration of resolution)


Vacuum:
In general, maintaining a good vacuum is crucial to obtaining high quality spectra.

Performance Characteristics

The performance of a mass analyzer can typically be defined by the following characteristics: accuracy, resolution, mass range, tandem analysis capabilities, and scan speed.

Accuracy

This is the ability with which the analyzer can accurately provide m/z information and is largely a function of an instrument’s stability and resolution.

Resolution (Resolving Power)

Resolution is the ability of a mass spectrometer to distinguish between ions of different mass-to-charge ratios. Therefore, greater resolution corresponds directly to the increased ability to differentiate ions. The most common definition of resolution is given by the following equation:
Resolution = M/ΔM (Equation 2.1)


See Figure 2.2: the resolution is determined from the measurement of a peak's m/z and its FWHM (full width at half maximum).
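A quick worked example of Equation 2.1 (values invented for illustration):

# Resolution per Equation 2.1, using the FWHM peak-width definition.
mz_peak = 1000.0   # peak centroid, m/z (assumed)
fwhm = 0.2         # full width at half maximum, in m/z units (assumed)

resolution = mz_peak / fwhm
print(resolution)  # 5000.0 -- just enough to separate m/z 1000.0 from 1000.2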

Mass Range

This is the m/z range of the mass analyzer. For instance, quadrupole analyzers typically scan up to m/z 3000.

Scan Speed

This refers to the rate at which the analyzer scans over a particular mass range. Most instruments require seconds to perform a full scan.

Regards,
Russ

Russ said...

Did it again:
The link needs ".php" at the end.
It was called "Mass Spec History" in your embedded link.

Russ

Mike Solberg said...

My guess is that the "File: No" setting you're looking at reflects the settings required to send data to the printer. In other words, LNDD first saved the data to a file, and then entered the settings required to do a print-out. These settings say, in effect, send the saved data to the printer, don't send it to the computer screen, and don't save the data to another file.

This is my guess.


That's probably right, Larry.

How's that legal work on specificity going?

syi

Larry said...

syi -

The legal work is moving slowly.

I think I've mentioned that your concern about specificity seems to fall under the category of method validation. I think you are questioning whether LNDD's SOP was properly validated in the first place. But I'm also trying to figure out the circumstances that would require a lab to re-validate its methods. I don't know where this inquiry is going to go, but it's connected to the work I need to do to answer Ali's questions.

Ali has raised the point that LNDD should have had to do SOMETHING when its EDF retest numbers failed to jibe with its original test numbers. I don't think that Ali's point goes to method validation, at least not initially, since LNDD's failure to get the numbers to match up could have been caused by any number of factors. So I've been trying to examine whether the mismatched numbers should have triggered some kind of quality control initiative by LNDD to explain what went wrong, to correct any flaw in how the testing was done, or even to revalidate the SOP. It would seem logical to me that such an effort would be required by the SOP, or the ISL, or by good scientific quality control in general ... but I'm not finding anything so far to support my sense of what would be logical here.

It's strange ... but the concept of "quality control" in science does not seem to include the idea that the lab should review its work for anomalous data. The stuff I'm reading seems to indicate that "quality control" IS very much as USADA described it during the arbitration: it consists of things like the test runs, and the positive and negative controls.

I am assuming that there must be more than this to the concept of quality control, and I'm trying to do further research.

Mike Solberg said...

Maybe this is what your work will explain to me, but I still don't understand why you seem to think that ISL 5.4.4.2.1 and .2 don't require the lab to meet the standards of Specificity, etc., in actual practice, such that a failure to do so would be an ISL violation.

5.4.4.1 says:

Selection of Methods:
Standard methods are generally not available for Doping Control Analyses. The Laboratory shall develop, validate, and document in-house methods for compounds present on the Prohibited List and for related substances. The methods shall be selected and validated so they are fit for the purpose.


And 5.4.4.2.1 (under the title of 5.4.4.2 "Validation of Methods") says:

Confirmation methods [note: as opposed to "screening methods"] for Non-threshold Substances must be validated. Examples of factors relevant to determining if the method is fit for the purpose are:


Then comes our list of Specificity, etc.

So surely, when you read 5.4.4.2 in the context of 5.4.4.1, that's how you decide if the method is fit for purpose. And if your assay doesn't do what 5.4.4.2.1 or 5.4.4.2.2 says it needs to do in actual practice, then it can't be fit for purpose.

What would be the point of saying that the method had to do something if the actual performance of the method didn't have to do it? That turns the method into an abstraction, which isn't a very scientific thing to do.

I would repeat what I have said before - at the hearing, Suh got Ayotte to agree that those criteria had to apply to the actual tests, not just an abstracted method. And the arbs certainly wrote as if 5.4.4.2.1 provided real criteria for the real tests.

Are you still saying something different than this?

syi

Larry said...

syi -

Yes. I am saying something different than what you're saying.

I am saying that the rules you're citing go to the question of method validation. I'm saying that there's a difference between validating a method and performing that method. I'm saying that there's a difference between validating a method and "checking your work" to see that you followed the method correctly in a given case. I'm even saying that there's a difference between method validation and reviewing your work in an individual case to see if the method produced results that make sense to you (in the latter case, I'm hoping to prove both that the lab IS required to perform this kind of review in each case AND that this review can lead to a new round of method validation).

This makes sense to me. The effort you spend in planning a routine procedure is different from the effort you spend in performing the procedure once the plan is in place. In the planning phase, you're going to think more, and test more, and use different and additional criteria. When you use an approved method, your focus is primarily on following the method.

There's also the issue of quality control. Quality control should be performed each time the method is performed. Quality control should be part of the method as validated. But I read 5.4.4.2 to address method validation and not quality control.

IMO, when you cite 5.4.4.2, you are pointing to criteria for proper method validation. You are saying that LNDD ignored a key criterion for proper method validation, and consequently that the method as developed is an ISL departure. You are attacking their method, and not how the method was performed or the results it achieved (though, as I said above, I'd also like to be able to argue that odd results should lead to revalidation). I agree with this argument. LNDD's SOP was no good from day 1 because it had no way to assure specificity.

This IS your argument, isn't it?

But I take it that you'd prefer to interpret 5.4.4.2 to address how the method is performed and not how the method was designed. I understand why you'd want to interpret 5.4.4.2 in this way. It's a good deal easier to attack how the plan played out in an individual case, than it is to attack the plan itself. This may be why the FL team never appeared to address the specificity question. I think their plan was to attack the testing, but not the test.

You ask: What would be the point of saying that the method had to do something, if the actual performance of the method didn't? Agreed, to the extent that the lab should be required to follow their approved methods, and that any failure to do so should be treated as an ISL departure. (Even this is difficult to prove, but it makes logical sense, so I'm sure I'll be able to prove it somehow.)

BannaOj said...

The EU actually has some very specific things to say about laboratory practice, and I'm wondering if they can be used in any sort of legal sense, since the lab was in France.

I'm trying to find the text for 9/11/EC and 9/12/EC to see if they might apply, but all I've found are the amendments, not the originals...

BannaOj said...

This OECD material seems to have been signed on to by the EU, but I can't find the exact links.

Some seem to be about "good laboratory practice"
and others about "good clinical practice"

here's a beginning.

http://www.oecd.org/department/0,3355,en_2649_34381_1_1_1_1_1,00.html

bk said...

Hello,

I am making an argument that there is a fair probability of a false positive in the Landis 5aA peak (USADA 349). The compound responsible for this potential false positive is also in the Landis F2 chromatogram at approx. 1358 sec (USADA 343).

By eyeball, the F2 chromatogram has 5 of its 11 peaks in common with the F3 chromatogram. The F2 peaks (retention times in seconds) are:

880* (Included in F3)
910 (Included in F3)
1035
1125 (Included in F3)
1241
1267 (Included in F3)
1302
1358 (Included in F3, and the F3 5aA)
1430
1450
1580

* Note that the 880 peak listed here is the close doublet next to the 5aAndrostanolAC standard.

It is likely that the F2/F3 chemical separation was not specific to all of the compounds contained in the F2 fraction, because nearly half of the F2 peaks also have peaks at the same location in the F3 fraction. If the separation were specific to all the compounds in the F2 fraction, then only a small number of randomly occurring peaks would overlap between fractions.
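Here is a minimal sketch of that overlap check in Python. The F2 retention times are my eyeball list above; the F3 list is assumed from the "Included in F3" annotations, and the plus-or-minus 10 second matching tolerance is my own guess.

# Fraction of F2 peaks with a matching F3 peak, within a tolerance.
F2_PEAKS = [880, 910, 1035, 1125, 1241, 1267, 1302, 1358, 1430, 1450, 1580]
F3_PEAKS = [880, 910, 1125, 1267, 1358]   # assumed from the annotations
TOLERANCE_S = 10.0

def matches(rt, peaks, tol):
    """True if some peak in `peaks` falls within `tol` seconds of `rt`."""
    return any(abs(rt - p) <= tol for p in peaks)

shared = [rt for rt in F2_PEAKS if matches(rt, F3_PEAKS, TOLERANCE_S)]
print(f"{len(shared)} of {len(F2_PEAKS)} F2 peaks match F3: {shared}")
# -> 5 of 11 F2 peaks match F3: [880, 910, 1125, 1267, 1358]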

In addition, it is noted that the F2 peak at 1358 exists only in the Landis samples and is absent in the blank samples. This same peak has a low o/oo value in the additional B samples.

I concede that my eyeball estimation of 1358 peak location is debatable, and a better estimate of location could completely disprove my point.

I conclude that it is reasonably possible that the F2 compound is responsible for a false positive 5aA value, because the 1358 peak has a low o/oo value, it is not present in the blanks, and because there is general evidence of F2 compounds in the F3 chromatograms. It is my estimation that the probability of this condition causing a false positive is approximately 33% (potential cherry-picking here). At any rate, the probability of a false positive is likely greater than would typically be considered acceptable.
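For a back-of-envelope sense of how many matches pure chance would produce, here is a sketch assuming peaks land independently and uniformly across the run (a big assumption), using the same numbers as above:

# Expected number of chance F2/F3 matches under a uniform-peaks model.
window_s = 1580 - 880      # span of the listed F2 peaks, about 700 s
tol_s = 10.0               # same matching tolerance as in the sketch above
n_f2, n_f3 = 11, 5

p_single = 2 * tol_s / window_s            # one F3 peak lands within +/-tol
p_any = 1 - (1 - p_single) ** n_f3         # F2 peak matches any F3 peak
expected_matches = n_f2 * p_any
print(f"~{expected_matches:.1f} chance matches expected vs. 5 observed")
# -> roughly 1.5, so five shared peaks looks like more than coincidence

On those assumptions, chance accounts for only one or two shared peaks (and one of the five, the 880 region, is shared by design as the standard), which is what makes the overlap look systematic rather than random.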

How this simplistic observation could relate to specificity and ISLs, that’s up to you guys....

BK

Larry said...

bk -

I'm not seeing what you're seeing, at least not yet.

Your argument, as I follow it, is that (1) the F2 and F3 fractions share in common a number of minor peaks, meaning that we can expect the F2 minor peaks to be in F3 as well, and (2) there's a minor peak in F2 that, if present in F3, would co-elute with the 5aA peak, thus potentially throwing off the delta-delta measurement.

I think you've correctly identified the 11 significant peaks in F2 (and for convenience, I'll refer to them by number from left to right). We know that peak 1 (the internal standard) should be in both F2 and F3 by design, so your argument is really based on whether F2 peaks 2, 4 and 6 are also present in F3.

You seem to be basing your argument here solely on comparing the IRMS chromatograms on USADA 343 and 349. This is very difficult to do, given the low resolution of these graphs. The area around RT 910 seconds in USADA 349 is too much of a mess to identify a peak as small as F2 peak 2. There are a couple of small peaks at around USADA 349 RT 1125 that MIGHT correspond to F2 peak 4. I do not see a peak at USADA 349 RT 1267 to correspond to F2 peak 6 (the closest peak I see at USADA 349 is at about RT 1250).

To get better resolution, I'd look instead at the corresponding GC/MS chromatograms at USADA 342 (F2) and 348 (F3). At USADA 342, peak 2 is a small peak at an RT of about 11.1, peak 4 is a small peak at an RT of about 13.2, and peak 6 is a tall peak identified as Andro that comes in with an RT of about 14.6. Your peak 8, the one you think may have co-eluted with the 5aA in F3, is at about RT 15.7 - 15.8. At USADA 348, there ARE peaks that potentially match the RTs for F2 peaks 2, 4 and 6 (I see an F3 peak at USADA 348 that's more like 14.5 than 14.6, but I won't quibble). However, the 5aA peak at USADA 348 has an RT of about 15.5, which does not match up so well with the RT of F2 peak 8.

I'll also note for what it's worth that the GC/MS peak heights for F2 peaks 2, 4 and 6 (short, short, tall) do not match the peak heights for the peaks you're trying to match to at USADA 348 (tall, short, short). I'm not generally a proponent for peak height matching, but if you think that F2 peaks 2 and 6 are present in F3, you might want to guess why there's more of peak 2 and less of peak 6 in F3.

So ... I'll conclude that your "pattern match" between USADA 343 and 349 is interesting, but it does not lead me to conclude that F2 peak 8 is present in the F3 5aA peak. I think this is a good example of why Brenna's "pattern matching" technique is inadequate for identifying peaks. We humans are good at identifying patterns, but reasonable people can reasonably differ over what the patterns mean. Science demands that our peak identification criteria be more robust and less subjective.
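To illustrate the kind of more objective criterion I have in mind, here is a minimal sketch: accept an identification only if the peak's relative retention time (RRT, the analyte RT divided by the internal standard RT) falls within a preset tolerance of a same-run reference. The retention times and the 1 percent tolerance are made up for illustration; they are not WADA's figures.

# Accept a peak ID only if its relative retention time matches a
# same-run reference value within a preset tolerance.
def identified(analyte_rt, standard_rt, ref_rrt, tol=0.01):
    rrt = analyte_rt / standard_rt
    return abs(rrt - ref_rrt) / ref_rrt <= tol

# Hypothetical retention times in seconds:
print(identified(analyte_rt=1358.0, standard_rt=880.0, ref_rrt=1.543))  # True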

Oh, and by the way, I'm not a scientist etc.

bk said...

Larry,

A couple of quick comments back. Your opening logic does match my thinking.

For the #1 peak, what I'm considering are the compounds responsible for the shoulders next to the internal standard, not the standard itself.

Although I do see what you see in the GCMS chromatograms, I'm purposely looking at the IRMS plots, because there are too many variables with the higher-resolution GCMS plots: temperature, and maybe column length and type. We have seen peaks move slightly between the two instruments.

I also don't think peak heights are particularly relevant here. There was a chemical separation, and the degree of speciation between fractions would be different for each compound.

I also agree there is plenty of quibble room in my arguments.

And by the way, I spend a lot of time in a lab, but my background is not wet chemistry, GC nor mass spec.

kfaber said...

In several posts, WADA Technical Document TD2003IDCR is referenced.

It is perhaps not directly relevant to the outcome of the Landis case, but the mass spectrometry criteria put forth in that Technical Document are fundamentally flawed. This has been emphasized in several articles, e.g. in R.A. de Zeeuw, Substance identification: the weak link in analytical toxicology, Journal of Chromatography B, 811 (2004) 3-12. An illustrative quote from the Abstract: "Moreover, the criteria for establishing a “positive match” leave much to be desired. These observations are corroborated when comparing some recent guidelines for qualitative analysis (issued for various forensic areas by SOFT/AAFS, NCCLS, NLCP, WADA and EU). Apart from showing substantial differences between them on pivotal issues, the guidelines contain various elements that appear scientifically incorrect and/or legally untenable."

Briefly, a statistical foundation is entirely lacking. As a result, one has no idea what the risk of a false positive declaration might be.

The latter is a general problem with anti-doping practices, which has recently been brought to the attention of a wider audience in D.A. Berry, The science of doping, Nature, 454 (2008) 692-693.

Just check (WADA or whatever source) documents for the presence of 'technical' terms like "statistics", "statistical", "validation", "risk", "false positives". The result of such a search may be a real eye-opener, even for the non-technically oriented.
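A search like that is trivial to automate. Here is a sketch; the file name is hypothetical, and one would first extract the document's text (e.g. from the PDF):

# Count occurrences of key statistical terms in an extracted document text.
import re

TERMS = ["statistics", "statistical", "validation", "risk", "false positive"]

with open("td2003idcr.txt", encoding="utf-8") as f:   # hypothetical file
    text = f.read().lower()

for term in TERMS:
    hits = len(re.findall(re.escape(term), text))
    print(f"{term!r}: {hits} occurrence(s)")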

The statistics-based solution has (for this particular class of methods) been published and thoroughly tested a decade ago, see H. van der Voet, W.A. de Boer, W.G. de Ruig and J.A. van Rhijn, Detection of residues using multivariate modelling of low-level GC-MS measurements, Journal of Chemometrics, 12 (1998) 279-294 and W.J. de Boer, H. van der Voet, W.G. de Ruig, J.A. van Rhijn, K.M. Cooper, D.G. Kennedy, R.K.P. Patel, S. Porter, T. Reuvers, V. Marcos, P. Muñoz, J. Bosch, P. Rodríguez and J.M. Grases, Optimizing the balance between false positive and false negative error probabilities of confirmatory methods for the detection of veterinary drug residues, Analyst, 124 (1999) 109-114.

Pretty frustrating!

Best regards,

Klaas Faber

Larry said...

Klaas -

Before getting started, thank you for posting here. I don't believe you've posted on this site before, and it's helpful to have experts writing here on these technical topics. Unfortunately, I fear that few people will see your post. If I can entice you to speak further about the issue of MS peak identification, then maybe we can persuade TBV to give your discussion its own thread, and more people will see what you have to say.

Based on all we know, it would not surprise me at all if the TD2003IDCR standards were inconsistent with other similar standards in place elsewhere, or if there are no studies validating the identification criteria used in TD2003IDCR.

Unfortunately, the problems with peak identification in the Landis case go well beyond any inadequacy in the TD2003IDCR standards. In the Landis case, we have the problem of how to properly identify IRMS peaks, which is not really addressed by TD2003IDCR, even though TD2003IDCR arguably provides the governing standard. In the Landis case, the lab took the steps that we understand are standard to identify the MS peaks. However, the lab did not have a standard in place for IRMS peak identification, and it made this identification based on a casual visual comparison to the identified MS peaks. The lack of a written standard for IRMS peak identification strikes many of us here as a violation of TD2003IDCR, but the lack of such a standard did not trouble either arbitration panel in the Landis case or any of the ADA witnesses.

I have searched in vain for anything in the scientific literature on how to identify IRMS peaks.

I'd like to hear anything you have to say on any of these topics.