Friday, December 07, 2007

The 5bA Anchor argument

One of the arguments lying around is that the 5bA is sufficient to anchor the identification in the Landis F3, based on the following reasoning. The mix-cal acetate has an internal standard (IS), 11k-etio, 5bA, and andro, some of which appear in all fractions. Using the pattern matching method, the claim is that we can see in the Blank F3 which peaks match the IS and the 5bA from the mix-cal acetate. We know the blank also has 5aA, so the peak in the blank that matches the peak in the Landis F3 must be 5aA, and similarly for the 5bP. Even though we don't have 5aA or 5bP in the calibration standard, we can extrapolate by transitivity through the blank.

This is shown below:


Figure 1: Formerly Fig 5 of Retention Times II.


The yellow bars are the things we claim match up through all the samples, the mix-cal, the blank, and the Landis F3.


We should note that there are still some open issues about exactly which peak is claimed to be the IS in the Landis F3, because the area is quite cluttered. This is what led WMA to wonder if LNDD decided which one matched by looking at the CIR value of the peak, even though the measured value wasn't entirely within the spec for the CIR of the IS. This adds some lingering uncertainty, because the SOP calls for adjusting things so the IS comes out in the vicinity of 870 seconds, which requires you to know which peak is the IS. If you're off, then later peaks might also be off.

Having claimed to match the IS, and duplicated other conditions, then the aligning peaks in the blank and Landis F3 are taken to be the 5bA.

Figure 2: Claimed matches on the IS and 5bA on the mix-cal, blank and Landis F3.

Now things get transitive. We don't have a standard for the 5aA or 5bP, but we believe they are in the blank. So if we know what they are in the blank, we can similarly match. Let's look at the 5aA first.

Figure 3: The peak claimed to be 5aA in the blank is mapped onto the Landis F3


The identification is being made in the blank based on the belief that the 5aA follows the 5bA -- see Shackleton, who used the same column, for example. But what about that peak that is slightly after the one we're looking at?
Figure 4: There's a little peak just past the one claimed to be 5aA in both of the samples.

We have a small peak in both samples just beyond the one claimed to be 5aA. Could it be the real 5aA? If so, how would we know? We don't have certainty from a calibration mix of any kind. What we do have is a general idea from Shackleton that the 5aA follows closely, and that it often appears the 5bA is tailing into the start of the 5aA. The argument would be that since we have that kind of tailing from the claimed 5bA into the claimed 5aA, that's what they must be.

Leaving open the question: where does TD2003IDCR talk about proximity and tail shape as identification criteria?

It is also very interesting that LNDD took the CIR of the peak following in the blank, but did not in the Landis. Why would they take one measurement, and not the other, for a peak in the same location in both samples? We see in the blank that peak is less negative than the claimed 5aA, and closer to the value of the claimed 5bA. We know it is trivial to take such a measurement, so why wasn't it made for the Landis?

The same argument arises in the 5bP:
Figure 5: Claimed 5bP in the blank, then mapped down to the Landis F3.


There is much less of a peak following the claimed 5bP in the Landis F3:

Figure 6: Peak following claimed 5bP in the blank is minute in the Landis F3


So, there we are. Absent the actual analytes in the mix-cal, we have to extrapolate identities from the blank onto the athlete sample. This isn't a technique specified in TD2003IDCR. The truth of that claim is built on claims that the IS and 5bA were correctly identified in the blank, which leaves us with the previous discussion of the validity of retention times and general pattern matching.

Or, as DailBob notes, the 5bA anchor theory is trying to reach identification by logical argument, not science.

Comment away...


78 comments:

tbv@trustbut.com said...

Larry, remember the way you gave up on the identification discussion because you thought folks weren't interested?

Here we are again.

TBV

Larry said...

TBV, LOL about the "lack of interest" in the identification discussion! Yes, I guess these discussion points wax and wane in interest. I'm glad I never threw out my Nehru jacket, I bet that comes back into style too.

This is a very impressive analysis on your part. I want to make a few global points, then pause to allow M to comment. After all, M is the main proponent here of the 5bA Anchor theory. I'll reserve my comments on some of the finer points in your analysis until later.

I'll preface my points with the usual "I'm no scientist, but ...". In fact, I'm thinking about changing my name here to "Larry Who Is No Scientist", just to save time.

1. Let's give the theory some credit. It's the best theory I've seen to date supporting LNDD's identification of IRMS peaks. IMHO this is a better theory than the ones put forward by Brenna, USADA and the majority arbitration decision. IOW, if we can shoot down this theory, then (again IMHO) we can feel pretty confident that the critical peaks in the FL case were NOT properly identified.

2. The 5bA Anchor theory depends on the scientific validity of using a naturally occurring peak such as 5bA to "anchor" a reference between two graphs. From what we've learned, it seems to be OK to inject an artificial peak as an internal standard into a sample, and then use the artificial peak as a reference point for comparing two chromatograms for that sample. But I've seen nothing to indicate that we can use a naturally occurring peak such as 5bA as an internal standard. Brenna, Botre, USADA, the majority arbitrators -- none of them said it was scientifically OK to do this.

3. If the 5bA anchor theory is scientifically valid, then why include the GC/MS testing as part of the CIR test? Maybe to check the complete mass spectrum of the peaks ... only the LNDD didn't do this.

4. The main problem I have with the 5bA Anchor theory is that it doesn't do what it claims to do. Strictly speaking, it matches peaks but does not identify them. It cannot identify peaks because it relies exclusively on the IRMS, and by its nature, the IRMS cannot identify a peak. Only a GC-MS can identify a peak.

(a) Start with the IRMS chromatograms for the mix cal acetate. Even if we limit our look to these chromatograms, we do not have identification of peaks. We see 4 peaks, and we know the identity of the 4 substances in the mix, but we don't know which peak is which substance. We can make an educated guess that peaks appeared in the order you've shown in your graphs, but it's an educated guess and not an identification. Identification of these peaks is possible only by comparing the mix cal acetate IRMS chromatogram to a mix cal acetate GC-MS chromatogram, using "criteria" meeting scientific standards for peak identification (such as the criteria set forth in TD2003IDCR). And, FWIW, I can't even find a GC-MS chromatogram for the mix cal acetate in the LDP.

(b) Even if you CAN make an accurate educated guess in (a) above, and even if it IS scientifically valid to use a naturally occurring peak like 5bA as an internal standard, we still have not identified the peaks that were used to convict FL. We have not identified the 5aA peak or the 5bP peak. At best, we've identified the 5bA peak. The 5bA peak may be a more convenient peak than the internal standard for doing the "pattern matching" advocated by Brenna, but at the end of the day, we're still stuck with pattern matching. If we have doubts about "pattern matching" as a scientifically valid criteria for identifying IRMS peaks, then the 5bA Anchor theory is not going to do anything to assuage our doubts.

5. Maybe the most damning thing I can say about the 5bA Anchor theory is that it uses the IRMS mix cal acetate runs for an unintended purpose. The reason for the mix cal acetate runs is to check to see that the IRMS is running consistently throughout the course of the IRMS testing, and in particular to check that the IRMS is accurately measuring isotope ratios over a wide range. This is why the mix cal acetate runs are set up as "bookends", one before the first blank urine run, and one after the last athlete sample fraction run. The mix cal acetate runs are not designed to identify peaks. The 5bA Anchor theory proposes to use the mix cal acetate test for a purpose other than the purpose for which this test was designed.

I think we should see the 5bA Anchor theory for what it is: (i) an attempt to use a portion of the CIR testing (the mix cal acetate runs) for a purpose for which it was NOT designed, which has been put forward because (ii) the normal methods that WERE specifically designed for IRMS peak identification (use of RTs and RRTs) do not work in this case. Unfortunately for the theory, it fails in the way the Brenna theory fails, and in the way the majority opinion fails: the theory does not identify the 5aA and 5bP peaks that were used to convict FL.

(IOW, I think I agree with you)

Mike Solberg said...

"And, FWIW, I can't even find a GC-MS chromatogram for the mix cal acetate in the LDP.

I think USADA 309 is the right thing.

syi

Larry said...

Mr. Idiot, I think USADA 309 is a chromatogram for something else. There are too many peaks in this chromatogram for it to be the same mix cal acetate used in the IRMS. Also, there's 5aA and 5bP in this mix.

m said...

Larry,

For the IRMS:

The 5B is properly "identified" in the IRMS because it was injected into the mix cal.

The 5B is contained in the mix cal acetate (USADA 360 and 361, 362 and 363), and in the Blank Urine (USADA 351), and in the F3 sample. The retention times match within the 1% standard, 1316.7 sec., 1323, and 1318 respectively.
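As a rough illustration of the 1% comparison m describes, here is a minimal Python sketch. The three retention times are the ones quoted above; the function names and the choice of the mix cal acetate as the reference run are assumptions made for this example, not anything taken from LNDD's software or from TD2003IDCR itself.

```python
# Retention times (seconds) quoted in the comment above for the peak assigned to 5bA.
REFERENCE_RT = 1316.7  # mix cal acetate (assumed here to be the reference run)
SAMPLE_RTS = {"blank urine": 1323.0, "Landis F3": 1318.0}


def relative_difference(rt: float, reference: float) -> float:
    """Absolute RT difference expressed as a fraction of the reference RT."""
    return abs(rt - reference) / reference


def within_tolerance(rt: float, reference: float, tolerance: float = 0.01) -> bool:
    """True when the RT differs from the reference by no more than `tolerance` (1% by default)."""
    return relative_difference(rt, reference) <= tolerance


for name, rt in SAMPLE_RTS.items():
    diff = relative_difference(rt, REFERENCE_RT)
    print(f"{name}: {diff:.2%} from mix cal, within 1%: {within_tolerance(rt, REFERENCE_RT)}")
```

Both differences come out under half a percent, which is m's point. The check only says the peaks line up in time; it does not by itself say what is in them.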

Moreover, I bet with additional testimony we can probably validate the blank urine also as a reliable reference material. The retention times of all of the metabolites matched between the F3 and blank urine. There is just too much redundancy here, that refutes any claim that the 5A is not the 5A.

You seem to be making legalistic arguments. "We can't look at the mix cal for retention times because it was only intended for quality control." This from the same people who claim that the only proper way to match sample retention times was with a reference sample like the mix cal acetate.

I'm talking about the science. There is no scientific reason why the retention times of the 5B in the mix cal cannot be used. All of you are looking for excuses not to believe your own eyes and the associated retention time data.

TBV,

The 5A trails the 5B in the GCMS by about 30 seconds, and about 30+ seconds in the GC-IRMS.

You can't assign the 5A peak in the F3 IRMS sample to the small trailing peak as you have done unless it switched places with that small trailing peak (identified as d in my graphs below). Are you claiming that? There is no evidence that that occurred.

It seems we are rehashing old identification arguments. Maybe I will go back and rework some of my statements. I know most of the references are contained in some earlier posts.

GRAPH OF RRT RESULTS GCMS and GC-IRMS


|||||||||||||||||||||||||||||||||||||||||||||||||||||||||scale

1) GCMS Mix Cal
................I........a.A.B.d......P


2) GCMS F3
................I........a.A.B.d......P


3) GC-IRMS F3
......................I..........a.A.B.d.......P


4) GC-IRMS F3 Blank Urine
......................I..........a.A.B.d.......P


5) GC-IRMS Mix Cal
......................I............A............

Again, look at my graphs.

tbv@trustbut.com said...



You can't assign the 5A peak in the F3 IRMS sample to the small trailing peak as you have done unless it switched places with that small trailing peak (identified as d in my graphs below). Are you claiming that? There is no evidence that that occurred.


This takes us back to burden of proof. There is equally no evidence there was no switch, because we don't have solid identifications, and we changed the chromatographic conditions significantly between the MS and the IRMS: we have different columns and different ramps. What we have that stayed the same are the IS and the 5bA. It's not clear we IDed the IS correctly in the forest at the low end, since its corroboration can only be the CIR value that is out of spec relative to the cal mix. That seems to leave us hanging on the thread of the 5bA identity, which may or may not be an "anchor".

Note that we are talking about positive ID as required by td2003idcr, not proof by negation, "what else could it be?"

(We're also not talking about specificity at all in this conversation, so let's try to keep that out, OK?)

TBV

m said...

TBV,

.....a.B.A.d......


I'm talking about the science, not td2003idcr. I've made the case for the legality under td2003idcr; those are different arguments. We are talking about the science here.

So you are claiming that the 5A switched with the d. That is the only way that the 5A is not identified, because the 5B is identified by the mix cal acetate.

So what evidence do you propose that tends to show that the 5A peak switched with the "d" trailing peak in my graph?

We have 2 propositions:

1. The 5A peak did not switch and follows directly the 5B.

2. The peak that follows the 5B is not the 5A, because it switched with the d.

What probability do you assign to each proposition? I assign 95% to proposition 1, and 5% to proposition 2.

Larry said...

M, what you are doing is looking hard at the data and trying to come up with an explanation for what you are seeing. Which explains for me your "believe your own eyes" statement. I admire the effort, and I've said that I think your work on this score is better than the work of Brenna, LNDD, USADA and the majority arbitrators. I believe that I understand your argument completely. I'm giving you credit for having made it.

But my arguments back to you are not intended to be "legalistic". I'm happy to make legalistic arguments, if you'd like to hear them. I'm trying to make SCIENTIFIC arguments. Yes, I'm no scientist. But I'm listening to the scientists, like D-Bob, Duck and Ali, and I haven't lived 50-odd years on this planet without learning a little science in the process. If you don't want to regard me as an authority on science, I'm OK with that, but at least hear me out.

Let me try to list the SCIENCE reasons why I can't buy into your theory. At least not yet. I'll try to make this a different list than the ones I've made before.

1. Your theory is unique. Brenna did not advance it, Shackleton did not endorse it, none of the WADA labs are following it, Tygart did not argue it, Botre did not whisper it into the ears of the arbitrators, and the arbitrators did not include it in their final opinion. It is new stuff. It has not been argued in the journals, it has not been peer reviewed, it has withstood no scientific scrutiny whatsoever. At best, your theory is a new and untested hypothesis. I will give your new and untested hypothesis a serious hearing, in part because of the respect I have for you personally, and in part because I think IMHO it's not at all bad. But regardless of what I think of your hypothesis, it is just that: a possible explanation for how one might identify IRMS peaks, an explanation that has never been reviewed or tested by a scientist. And while it is terrific science to put forward hypotheses, it is lousy science to reach scientific conclusions on the basis of untested hypotheses. (I mean no offense here, I'm just trying to explain my position.)

2. Your theory is based on observations you've made in one single case - the FL case. Again with all due respect, how can this be a respectable scientific theory? Any scientific theory I've ever heard about is based on experience with many cases.

3. You are applying your theory to the same case you used to derive your theory! This cannot be good science. Science requires that the predictive power of a scientific theory be tested against a case that was not used to derive the theory itself. For example, I might have experienced a solar eclipse last Tuesday, and propound a scientific theory that from here until eternity, there will be solar eclipses on Tuesdays. Then we wait until next Tuesday to see if my theory will be disproven. I cannot argue that my theory is proven because there was a solar eclipse last Tuesday. Last Tuesday's solar eclipse is the data I used to DERIVE my theory; this same data cannot be used to PROVE my theory. Ditto for your 5bA anchor theory: you used the FL case to DERIVE this theory; you cannot then turn around and say that the same FL case proves that your theory is right.

4. Your theory flies in the face of what I understand to be a red-letter SCIENTIFIC rule of chromatography, namely that IRMS testing cannot be used to identify peaks. As you know, the substances in the GC peaks are incinerated in the combustion phase of the IRMS. Their identity is lost in this process. The resulting peaks contain nothing but CO2. (If this is not red letter SCIENCE stuff, then I'd ask one of the science types to step in and correct me.) For this reason, every science explanation I've ever seen for identifying IRMS peaks (and this includes Brenna's infamous "pattern matching") requires that IRMS peaks be identified by reference to GC-MS peaks. Your theory would effectively allow us to abandon the GC-MS phase of the CIR testing, a phase that seems to be accepted as scientifically necessary by the scientists. So, not only is your theory new and untested, it's also a radical departure from the existing science.

5. Finally, I have one main problem with your theory. I think this final problem is the ultimate in scientific problems. Let's assume for the moment that we can somehow get past the problems I've raised above. We get your theory peer reviewed, and tested on multiple cases, and we somehow explain our way around the fact that (for good scientific reasons) IRMS has never been regarded as a way to identify peaks, and we've all had time to adjust to the radical nature of this theory, and we address all of the other problems. We manage to do the science things necessary so that the Theory of the 5bA anchor is solid science, the kind that scientists can rely upon in matters as serious as the FL case. Then, where are we? We've managed to scientifically identify two peaks in the FL IRMS. They are two peaks that we don't actually need to identify in the FL case. There are two other peaks that we DO need to identify, and they're ... well, they're somewhere else. You still have to pattern-match to identify those other peaks.

If pattern-matching is a valid scientific technique (and I do not think that it is), then we don't really need the Theory of the 5bA Anchor. We can match patterns from the 5aA AC internal standard, just like Brenna did. If pattern matching is not a valid scientific technique, then it remains invalid notwithstanding the 5bA anchor.

M, I humbly advance the preceding arguments as SCIENTIFIC arguments.

Now, onto the points you made above. You point out that the RTs match for the peaks in the various IRMS chromatograms. I grant you that. You've matched peak to peak. Putting the specificity arguments to one side, I grant you that the stuff in the peaks you're matching up is probably the same from sample to sample. I also know that the stuff in any peak you're matching up has to be one of the four substances in the mix cal acetate. I also suspect that these peaks will appear in the order of 5aA-AC, etio, 5bA and 11 keto. But I don't know this for sure. Unless I can identify the peaks in the mix cal acetate IRMS chromatograms with scientific certainty, I'm nowhere. Which leads us back to points 1-5 above.

I pointed out that the mix cal acetate portion of the IRMS test was NOT designed for peak identification. You think this is a "legalistic" argument. I think it is a science argument. To be certain, methods designed for purpose "A" often prove to be useful for other purposes - I'm thinking about how baby aspirin can reduce the risk of heart attacks, and certainly no one designed baby aspirin so that middle aged men could reduce the risk of a coronary! So granted, nothing prevents the mix cal acetate from being used for peak identification, once it's proven to work (the way baby aspirin was tested in groups of hundreds of men over the course of many years ...).

You make a reference to "people who claim that the only proper way to match sample retention times was with a reference sample like the mix cal acetate". Am I one of these people? Hmm. I think I did express some regret that LNDD did not do IRMS mix cal acetate runs using a mix that contained 5aA and 5bP. That would overcome some of the objections I'm raising here, but not all of them. I would still have the objection that every method for IRMS peak identification I've ever seen propounded by a scientist (including Brenna) requires that IRMS peaks be identified by reference to GC-MS peaks.

You say that there is no scientific reason why the retention times for 5bA in the mix cal acetate cannot be used for identification. I agree. But that's kind of a legalistic statement, isn't it? You're using a double negative. The better question is phrased by removing the two negatives: is it scientifically valid to identify IRMS peaks by the use of the 5bA anchor method? The answer is no. IMHO. Not until you jump through the hoops listed above.

Mike Solberg said...

m, I haven't commented on this "5bA anchor" argument much yet, partly because I don't think I have processed all these arguments yet, and partly because it "scares" me. Eeeek, it might prove me wrong. I have to say that your "5bA anchor" argument has some obvious merit and plausibility. (I guess in saying that I am leaving aside the specificity argument for now.)

In a way it is a more detailed variation of the "peak matching" argument, so I wonder if there is any literature/studies you can direct me to that talk further about the peak matching method of identifying compounds in an IRMS. I'd like to read about the scientific foundation of it before commenting further.

syi

Larry said...

Mike, for the record, I agree with your reaction to the 5bA Anchor theory, except for the part about it being "scary"! It's the best explanation I've seen supporting LNDD's identification of the S17 5aA and 5bP IRMS peaks. M deserves credit for bringing it to our attention. But it's not scary. Not IMHO.

For all of the reasons I've argued above, I don't think this theory identifies 5aA or 5bP in a "scientific" way. As D-Bob wrote elsewhere, this theory is based on "logic", not science. As you wrote, it is a variation on "peak matching".

From both a scientific and a legal perspective, I'm troubled by the approach taken here by M, which (as you said) is only an elaboration of the approach taken by Brenna and (ultimately) by the majority arbitrators. The approach is, if we can come up with an explanation for the FL test results that seems logical to us, then it's OK to convict FL based on this explanation. The problem I have here is with the after-the-fact nature of the explanation. To my knowledge, no one suggested BEFORE S17 that this was the right way (or even A right way) to identify IRMS peaks. To my knowledge, the 5bA anchor theory was not a part of LNDD's SOP prior to S17. To my way of thinking, both from a legal and a scientific perspective, there's not much truth value in seeing whether you can come up with a novel explanation of the test results after you've studied the results. The truth content comes from first developing and testing criteria for evaluation of test results, THEN doing the testing, and FINALLY applying your pre-existing criteria to the results. Otherwise, I think you're just guessing.

What disturbs me even more is the knowledge that science HAS developed criteria for IRMS peak identification, and this criteria existed prior to S17. This criteria requires GC-MS peaks to be identified with the complete mass spectra and by retention time analysis (as D-Bob explained), and then for the IRMS peaks to be identified by reference to the GC-MS peaks, again by use of retention times or relative retention times. This was the criteria that the experts had in hand when it came time to analyze the S17 results. The experts attempted to use this scientific criteria, and the criteria failed to identify the S17 IRMS peaks. We're reaching out to the 5bA anchor theory, not because we lack any scientifically accepted criteria for identifying IRMS peaks, but because the scientifically accepted criteria failed to justify the conclusions reached by LNDD.

If I were prepared to recognize the 5bA anchor theory as a scientifically valid way to identify IRMS peaks (and I'm not), then I'd at least have to ask the question: why don't our two alternative sets of scientific criteria (5bA anchor theory and RT-RRT comparison to GC-MS peaks) agree with each other? If the S17 IRMS 5aA peak is identified by means of the 5bA anchor, then why couldn't it be identified by the GC-MS RT-RRT comparison?

Up until recently, we've assumed that the RT-RRT comparison didn't work because LNDD was "sloppy" in the way it set up the chromatographic conditions for the GC-MS and IRMS portions of the CIR testing, which threw off the RTs for the IRMS so that they could not be compared to the RTs for the GC-MS. We now know that this assumption is wrong in at least two respects. First, LNDD was not sloppy in its set-up -- it appears that LNDD set up the chromatographic conditions precisely as described in the SOP. Second (assuming for the moment that LNDD used the same column for both its GC-MS and IRMS testing), the difference in the temperature ramp-up for the GC-IRMS test and GC-MS test should have resulted in shorter than expected RRTs for the IRMS, but instead we see longer than expected RRTs.

Given that the RRTs have moved in a different direction than what we'd expect, I would suggest that there's something going on here that we don't understand. Your suggestion that the LNDD used different columns for the GC-MS and GC-IRMS test (a suggestion supported by the reports in the LDP) is the best one I've heard to date. If the LNDD used different columns in the GC-MS and GC-IRMS portions of the CIR test, then all bets are off. You can toss the 5bA anchor theory (and for that matter, the RT-RRT criteria) right out the window. As OMJ has admitted, there's no way to identify IRMS peaks if you screw around too much with the chromatographic conditions.

So ... for all the reasons I've stated here and elsewhere, I don't buy the 5bA anchor theory.
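For readers following the RT versus RRT discussion, here is a minimal Python sketch of a relative retention time calculation: the RT of a peak divided by the RT of the internal standard in the same run. The numbers are IRMS values quoted in this thread (5bA RTs from m's comment, IS RTs from Larry's comment on USADA 347, 350 and 360). Pairing them run by run is an assumption, since the two comments may not refer to the same injections. No GC-MS retention times are quoted here, so the GC-MS to IRMS comparison itself is not reproduced.

```python
def relative_retention_time(peak_rt: float, internal_standard_rt: float) -> float:
    """RRT: the peak's retention time divided by the internal standard's retention time."""
    return peak_rt / internal_standard_rt


# (IS RT, claimed 5bA RT) in seconds. The pairing of values per run is assumed.
RUNS = {
    "mix cal acetate": (870.6, 1316.7),
    "blank urine": (872.4, 1323.0),
    "Landis F3": (871.9, 1318.0),
}

# Use the mix cal acetate as the reference RRT for the comparison.
reference_rrt = relative_retention_time(1316.7, 870.6)

for name, (is_rt, peak_rt) in RUNS.items():
    rrt = relative_retention_time(peak_rt, is_rt)
    shift = (rrt - reference_rrt) / reference_rrt
    print(f"{name}: RRT = {rrt:.4f} ({shift:+.2%} vs mix cal)")
```

Dividing by the IS retention time is what lets two runs be compared when the absolute times drift together. It does not help if the elution order or spacing changes, which is the worry raised above about different columns and ramps.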

Mike Solberg said...

As I said, I'm withholding judgment until I see what m has got on this. He must be able to refer to something to back up his argument. I'd like to see it. If this is a more detailed version of Brenna's "eyeballing" method of identification, then Brenna must rely on something to back him up. I'll wait for m to direct me.

syi

Larry said...

TBV, want to give you some more specific feedback on your analysis here, and raise a couple of questions.

I think this is a terrific piece of work, by the way.

You said "We should note that there are still some open issues about exactly which peak is claimed to be the IS in the Landis F3, because the area is quite cluttered." To be certain, the area IS quite cluttered on the LNDD graph (see USADA 349), and Dr. M-A made a big point that he could not identify the IS on this graph. However, M's argument here appears to me to be pretty good. For the moment, I'm looking at the testing of the "B" sample. LNDD identified the IS in the chromatogram for FL's F3 sample as a peak with an RT of 872, and a delta 13C 0/00 of -30.11. (USADA 351) From the data processing results - Analysis of Sample Peaks on USADA 350, we can see that this peak is the third peak mentioned, with an RT shown more precisely to be 871.9. If I'm following M correctly, he's matching this peak to a peak in the blank urine with an RT of 872.4 (USADA 347) and peaks in the mix cal acetate with RTs of 870.6 (USADA 360) and 870.5 (USADA 362). That's pretty good matching, well within the 1%/12 second criteria under TD2003IDCR.

Also, while the area around the IS appears cluttered in USADA 349, this is in part because the graph shown in USADA 349 is too compressed. The x-axis is too short to properly display the data. For example, there's a peak following the peak identified by LNDD as the IS, and this second peak appears to be so close to the IS on this graph that you can barely distinguish the two. These two peaks appear so close together, that the delta 13C 0/00 reading for the second peak is superimposed over the delta 13C 0/00 of the IS, so you can't even read the delta 13C 0/00 for the IS on the graph. But this crowding has more to do with poor graph drawing than it has with poor data. According to the data shown on USADA 350, the peak immediately following the IS has an RT of 880.6, about 9 seconds later than the RT for the IS. This later peak does not have an RT that matches up (within the 1% rule) to any peak on the mix cal acetate run. Again following M's argument, there's only one peak on USADA 349 that matches a peak in USADA 360 and 362 -- the peak identified by LNDD as the internal standard.
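The matching described in the last two paragraphs can be sketched in a few lines of Python. The retention times are the ones quoted above. The window rule used here (1% of the reference RT or 12 seconds, whichever is smaller) is an assumed reading of the "1%/12 second criteria", and the function names are made up for this example.

```python
from typing import List


def rt_window(reference_rt: float, percent: float = 1.0, max_seconds: float = 12.0) -> float:
    """Allowed RT deviation in seconds: the smaller of `percent` of the reference and `max_seconds`.

    Taking the smaller of the two limits is an assumption about how the criteria combine.
    """
    return min(reference_rt * percent / 100.0, max_seconds)


def matching_peaks(candidates: List[float], reference_rt: float) -> List[float]:
    """Return the candidate RTs that fall inside the window around the reference RT."""
    window = rt_window(reference_rt)
    return [rt for rt in candidates if abs(rt - reference_rt) <= window]


MIX_CAL_IS_RTS = [870.6, 870.5]  # USADA 360 and USADA 362
F3_PEAKS = [871.9, 880.6]        # claimed IS and the peak right after it (USADA 350)
BLANK_IS_RT = 872.4              # USADA 347

for ref in MIX_CAL_IS_RTS:
    print(ref, "-> F3 matches:", matching_peaks(F3_PEAKS, ref),
          "| blank matches:", matching_peaks([BLANK_IS_RT], ref))
```

Under this reading only the 871.9 peak matches the mix cal IS. The 880.6 peak sits about 10 seconds away, which is outside a 1% window (roughly 8.7 seconds at this RT) but inside 12 seconds, so the answer depends on how the two limits are combined.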

You stated that "the SOP calls for adjusting things so the IS comes out in the vicinity of 870 seconds." A-ha! So THAT's what the SOP means when it says "ajuster le SI a environ 870s"! Thanks for the lesson in French chemistry.

The remainder of your discussion focuses on trying to figure out which peaks to the right of the 5bA peak (as identified by LNDD, and by the 5bA anchor method) are the 5aA and 5bP peaks. You are looking both at IRMS peak RTs (placement on the x-axis) and IRMS peak heights as compared to RTs and peak heights on the GC-MS.

I am giving the matter of peak heights some careful consideration. We've been told by Duck that peak heights are not really comparable between GC-MS and IRMS runs, and I think I'm beginning to see why they're not. Here's my reasoning (and yes, I'm not a scientist, and the following needs to be reviewed by the science types here at TBV):

1. As Duck has explained, peak heights on an IRMS relate to the amount of carbon in the molecules contained in the peak. The amount of carbon in molecules of testosterone metabolites is pretty consistent from metabolite to metabolite - 19 carbon atoms per molecule in most cases, 21 carbon atoms per molecule for 5bP (that is, if I'm reading the source material correctly - chemistry is not exactly my strong suit. I'm relying on http://www.jbc.org/cgi/reprint/182/1/299.pdf p. 300 and following, you may want to review this to see if I have it right). So peak heights for our substances of interest should, relative to each other, be proportional to the amount of stuff in each peak. This may NOT be the case for other stuff in the mix, of course. Stuff that has heavy carbon content will have relatively tall peaks for a given amount of substance in the peak, and stuff with little or no carbon content may not even produce a peak. This may throw off any effort we make at pattern matching.

2. The IRMS tests appear to be SIM tests run to pick up ions 44, 45 and 46 only. See the IRMS SOP at USADA 329. This means that the IRMS tests measure only a percentage of the stuff shown in the peaks in the GC-MS total ion chromatograms. Moreover, if I understand this correctly, the percentage of the stuff in a GC-MS peak that's made up with ions 44, 45 and 46 is going to be a DIFFERENT percentage from substance to substance. That is how each substance has a different mass spectrum fingerprint - each substance ionizes in different ways. So once again, peak heights on the GC-MS are not going to translate directly into peak heights on the IRMS.

3. There's another factor that I think I've come to recognize, which is that our GC-MS graphs and IRMS graphs are set up quite differently, and measure different things on their respective y-axes. The IRMS measures total signal strength on its y-axis -- you can see this by comparing the measurement of the peak heights on USADA 349 to the signal intensity (Intensite (nA)) shown on the chart on the bottom of USADA 351. As you'll see, the height of each peak on USADA 349 is the same as the intensity shown on USADA 351. In contrast, the peak height on the total ion chromatogram on USADA 348 does not represent signal intensity. You can see this by comparing the peak heights on USADA 348 to the signal intensity for these peaks shown on USADA 321 - I think that the relevant column is "Target Response". As you can see, peak heights are NOT proportional to signal strength. This is easy to see if you compare the peak heights for 5aAC and 5bA on USADA 349. 5bA is about twice as tall as 5aAC, but it has about 4 times the signal strength. This is because the amount of stuff in a GC-MS peak is represented not by peak height, but by peak VOLUME (the area under the peak), and peak volume is a product of the peak height and the breadth of the peak base. So this is another way in which comparing peak heights from a GC-MS to a GC-IRMS is problematic: the GC-MS is a two-dimensional graph, while the IRMS is set up more like a bar chart.
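
To make the point about height versus height-times-width concrete, here is a minimal sketch. It assumes each peak can be approximated as a triangle (area = 1/2 x height x base width). The numbers are hypothetical, chosen only to illustrate the arithmetic, and are not taken from any of the USADA pages.

```python
def triangle_peak_area(height, base_width):
    """Approximate a chromatographic peak as a triangle (an assumption)."""
    return 0.5 * height * base_width

# Hypothetical peaks in arbitrary units, not values from the lab documents.
peak_a = {"height": 1.0, "base_width": 10.0}
peak_b = {"height": 2.0, "base_width": 20.0}  # twice as tall AND twice as wide

area_a = triangle_peak_area(**peak_a)
area_b = triangle_peak_area(**peak_b)

height_ratio = peak_b["height"] / peak_a["height"]
area_ratio = area_b / area_a
print(f"height ratio: {height_ratio:.1f}, area ratio: {area_ratio:.1f}")
```

Under these made-up numbers a peak that is only twice as tall carries four times the area, which is the kind of mismatch described in point 3.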

For all of these reasons, I don't think that the y-axis provides a valid point of comparison for the GC-MS and GC-IRMS graphs. A "tall" peak on the GC-MS may not end up as a "tall" peak on the IRMS, and vice versa. Any relationship between peak heights on these two graphs is tenuous at best, and (IMHO) any attempt to identify peaks based on peak height runs a risk of being seriously misleading. I think that only RTs and RRTs (i.e., the x-axis) offer a valid ground for comparison.

Now, if M can offer something scientific, something more than Brenna's testimony on pattern matching that would validate an identification approach based on peak heights, then maybe I'll reconsider.

I'm working on some numbers to try and make this point more clearly. More to follow.

tbv@trustbut.com said...


You stated that "the SOP calls for adjusting things so the IS comes out in the vicinity of 870 seconds." A-ha! So THAT's what the SOP means when it says "ajuster le SI a environ 870s"!


That's why they line up so nicely. You pick the one you think should be it, and then you adjust things so it's at 870s. Thus, I don't see any evidentiary value in their being near 870s.

How did they pick the one that was the IS, so they could adjust it to 870s?

We have no idea. This was why WMA guessed they needed some way of picking it out of the noise, perhaps using its known CIR.

I also note that the listings of peaks, as for example the USADA 350 Larry cites, seem to be incomplete. There are many more peaks present than listed, suggesting some deemed "uninteresting" have been deleted or suppressed in some fashion.

Compare USADA 350, for example, to LNDD 894, from EX 86, where there are 22 labelled peaks, of which 13 are in the initial region of the IS.

TBV

Larry said...

TBV, are you saying that the GC automatically adjusts the pressure to make certain that the internal standard (or some peak that the lab THINKS is the internal standard) comes out at 870s? How can it do that? The GC can't spit out peaks at predetermined times, can it? I don't think the GC has some kind of internal sensor that says, "the leading peak is moving too slowly, let's up the pressure and move it out faster." Or does it? I'm sorry, but this last statement of yours has me completely confused.

Your point about extra peaks in different graphs is duly noted. This is another problem I have with the 5bA anchor theory. It effectively says that we do not need to pattern match across the entire x-axes of our graphs, we can just focus on a little portion of the graph and pattern match there. If pattern matching is valid (and I don't think it is), then it should work across the entire graph, and not just on a piece of it.

tbv@trustbut.com said...

No, I'm saying they MANUALLY adjust the inlet pressure so the thing they think is the IS is around 870.

Do we know what the inlet pressure on each injection is? Might this account for some of the reinjections we suspect because results were "wrong"?

Of the things one might keep consistent across chromatographic analysis on different instruments, we have (1) column; (2) temperature ramp; (3) inlet pressure.

Now, what did LNDD keep the same so we'd have apples to apples comparisons?

TBV

Mike Solberg said...

Now, what did LNDD keep the same so we'd have apples to apples comparisons?

Is that a trick question?

In trying to find support for the 5bA anchor argument, I found this study:
http://www.rsc.org/ej/AN/1998/a805215h.pdf

It's old (1998) but it was interesting that they used two different columns for the GCMS and IRMS. It looks though, like they ran a complete known comparison run in the IRMS, and matched the retention times. But it doesn't say how they knew for sure what the knowns were in the IRMS. It sounds like they relied on the order of the GCMS, but I am not sure. They did use the same temperature ramp for GCMS and IRMS.

tbv@trustbut.com said...

That is Ferchaud et al., and the apparatus description is:


Gas chromatography-mass spectrometry. Identification of
the steroid structure present in bovine urine was performed on
a Hewlett-Packard Model 5972 quadrupole mass spectrometer
coupled to an HP 5890 gas chromatograph with an OV-1 fused
silica column (30 m × 0.25 mm id, film thickness 0.25 µm)
(Interchrom, Montluçon, France). The transfer line temperature
was set at 280 °C and the split/splitless injector was maintained
at 250 °C (1 min delay). The initial oven temperature was set at
120 °C (held for 2 min), increased at 15 °C min⁻¹ to 250 °C and
then at 5 °C min⁻¹ to 300 °C (held for 3 min). Helium (N55)
was used as the carrier gas at 1 mL min⁻¹. The electronic beam
energy was set at 70 eV in the electron ionization (EI) mode.
Compound structures were confirmed by comparison with
steroid standards when available.

Gas chromatography-combustion-isotope ratio mass spectrometry.
GC-C-IRMS analyses were performed on an HP
5890 gas chromatograph coupled to a combustion furnace
(Al2O3, 320 mm × 0.5 mm id × 1.55 mm od) and a Finnigan
MAT (Bremen, Germany) Delta S isotope ratio mass spectrometer.
Acetylated samples were injected splitless onto a DB1
capillary column (30 m × 0.32 mm id, film thickness 0.25 µm).
The GC temperature programme was identical with that used
for GC-MS, except that the final temperature was 290 °C, held
for 6 min. The separated analytes were converted into gases in
a combustion furnace filled with copper oxide (held at 940 °C).
Any H2O formed was removed by a water trap, consisting of a
Nafion membrane. The CO2 was determined with an isotope
ratio mass spectrometer. The instrument measured the CO2
isotopomers, i.e. m/z 44, 45 and 46 ions, and the m/z 45/44 ratio
after correction [9]. The 13C/12C isotope ratio measured was
denoted with the symbol δ and corresponds to the difference
between the 13C/12C isotope ratio of the sample and that of the
international carbonate standard. The international standard
(PDB) is a Cretaceous belemnite shell from the Pee Dee
formation in South Carolina. δ13C was calculated according to
the following equation:

[omitted]

For the results presented here, three measurements for each
sample were obtained and averaged. The standard deviation
accepted for urine samples was 0.5‰. Before and after each
injection sequence, a standard mixture of steroids was injected
to check the sensitivity and the specificity of the method and the
stability of the carbon isotope ratio measurements.
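
The equation omitted from the quote is presumably the standard delta notation, δ13C (‰) = (R_sample / R_standard - 1) × 1000, where R is the 13C/12C ratio. A minimal sketch of that calculation follows. The PDB ratio is the commonly cited value (an assumption here, since the paper's own figure is not quoted), and the sample ratio is made up purely for illustration.

```python
# Commonly cited 13C/12C ratio for the PDB standard (assumption, not from the paper).
R_PDB = 0.0112372

def delta13c_permil(r_sample, r_standard=R_PDB):
    """Return delta 13C in per mil relative to the standard."""
    return (r_sample / r_standard - 1.0) * 1000.0

# Hypothetical sample ratio, chosen only for illustration.
r_sample = 0.0109338
print(round(delta13c_permil(r_sample), 1))
```

A sample with less 13C than the standard comes out negative, and a sample identical to the standard comes out at zero.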


TBV

Mike Solberg said...

I know I am cherry picking (because this article says some things that contradict the 5bA anchor argument), but this old Catlin paper,

http://www.arniebakercycling.com/floyd/other_links/Catlin%201992.pdf

might give some legitimacy to using blank urine as a positive control. Catlin wrote (speaking of GCMS, not IRMS):

Unambiguous identification is accomplished by matching
the RRT and spectra of the identified substances with
those of authentic reference standards concurrently
extracted from spiked urines (positive quality control) or
certified positive cases. The reference spectra are contained
in a mass spectral library. Such a library should be developed in each laboratory by analyzing derivatized
compounds or their metabolites under comparable operating
conditions. If reference standards of metabolites are not available, clinical studies may be performed by administration of the parent drug to man followed by
timed urine collections, and the resulting urines used as
positive quality control.


syi

Larry said...

syi -

I don't see how the Catlin article supports the use of blank urine as a positive control. According to the testimony in the arbitration, blank urine is a real "live" urine that's supposed to resemble the urine of an athlete. At LNDD, the blank urine is "spiked", since the internal standard 5aA AC is added to the blank urine for the CIR tests. However, the blank urine is not "spiked" with the metabolites that the CIR is seeking to measure, such as 5aA and 5bP. So I don't think that Catlin is referring to blank urine when he speaks of "spiked urine".

I think what Catlin is describing in this article is the same procedure D-Bob describes in his "different columns" article here: the legitimate way to identify GC-MS peaks in an athlete's sample is to use RTs or RRTs to compare these peaks to an authentic reference sample (the mix cal would be best), and to verify this identification by looking at a full mass spectrum for the athlete's peaks.

It doesn't make sense to me to use blank urine as a reference sample. To do so, you'd have to first identify the peaks in the blank urine the same way described by Catlin and D-Bob, against a reference standard like a mix cal (and then with the full mass spectrum). But why go through this extra step? You're just introducing a little bit more potential error (add the +/- 1% for the identification of peaks in the blank urine to the +/- 1% for the identification of peaks in the athlete's sample), and some additional complication (because the blank urine is going to contain peaks in addition to the peaks you're seeking to identify). There's no reason I can see not to "eliminate the middleman" and just use the mix cal as your reference standard.
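
A rough sketch of how the two tolerances could stack up. It assumes each match is allowed a +/- 1% window and that the worst cases simply multiply. Whether errors really combine this way in practice is an assumption, and the function name is made up for illustration.

```python
TOLERANCE = 0.01  # the +/- 1% window allowed per match

def worst_case_window(n_steps, tol=TOLERANCE):
    """Worst-case relative window after chaining n_steps matches (assumed multiplicative)."""
    low = (1.0 - tol) ** n_steps
    high = (1.0 + tol) ** n_steps
    return low, high

direct = worst_case_window(1)   # mix cal -> athlete's sample
chained = worst_case_window(2)  # mix cal -> blank urine -> athlete's sample
print(direct, chained)
```

On these assumptions the chained route roughly doubles the window, to about +/- 2%.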

The only reason we're talking about the 5bA anchor theory is because 5aA and 5bP were not included in the mix cal used in the IRMS portion of the CIR testing, so we have to hunt for another source like the blank urine to find a reference standard. If we were DESIGNING a test procedure rather than trying to address arguable deficiencies in tests already run, I don't think we'd be discussing the use of blank urine as a reference standard.

Also IMHO, it's a huge leap from Catlin's discussion about GC-MS peak identification to the discussion of using an IRMS reference to identify IRMS peaks. Granted, I don't spend my evenings reading "Chromatography Today" or "Chromatography Illustrated". But I've never seen anyone with scientific credentials say that IRMS peaks for an athlete's sample can be identified by reference to an IRMS standard.

The only exception is, it must be possible to identify the internal standard in an athlete's sample by reference to the internal standard in a mix cal. Otherwise there would be no way to definitively measure the IRMS RRTs in the athlete's sample. However, once you've identified the IRMS internal standard in an athlete's sample, my understanding is that the identification of all other IRMS peaks is supposed to be conducted by comparing those peaks to identified peaks in the corresponding GC-MS chromatogram. Even Brenna does his pattern matching in this way. I have not seen anyone (outside of this blog!) try to identify IRMS peaks by reference to other IRMS peaks.

I'm prepared to stand corrected, not being a scientist and all that.

Larry said...

TBV, thanks for focusing on the issue of GC pressure.

I'm not an expert, etc., but IMHO you've uncovered "Hide In Plain Sight" part deux. Again IMHO, the issue of GC line pressure is worthy of its own article.

LNDD's SOP CALLED FOR DIFFERENT PRESSURE SETTINGS FOR THE GC-MS AND GC-IRMS PORTIONS OF THE CIR TEST.

The SOP for IRMS pressure (USADA 329) - Pression constante: Ajuster le SI a environ 870s.

The SOP for GC-MS pressure (LNDD 1427) - Pression constante: Ajuster le SI a 10.7 min (+/-0.5 min).

The issue here is not whether these two standards can be compared or reconciled. The issue is why there would be two standards! IMHO based on what I've read, the pressure set for the GC-MS portion of the test SHOULD NOT BE CHANGED when you move over to the IRMS portion of the test. IMHO the SOP for the IRMS testing should have indicated this - "on ne change pas la pression" ("do not change the pressure"), or something to that effect.

IMHO, based on what I've read, it would be CRITICAL to maintain the same GC pressure during the GC-MS and GC-IRMS portions of the test. Changing the pressure would have the same kind of effect as changing the temperature ramp. Changing the pressure screws up the ability to compare GC-MS and GC-IRMS peaks based on RTs or RRTs, and even introduces the possibility of peaks emerging in different orders (I need to provide support for this last statement).

Swim, the issue of GC pressure throws a wild card into our speculation that the IRMS RRTs must be caused by a column shift, since they cannot be explained solely by the difference in temperature ramps. We now have three factors to consider, as TBV has stated: column type, temperature ramp and GC pressure. We know that LNDD changed the temperature ramp from the GC-MS to the GC-IRMS. We're not sure whether they changed both the column type and the GC pressure, but I think we'd need a change in at least one of these things to account for the RRTs.

Mike Solberg said...

But why go through this extra step? You're just introducing a little bit more potential error (add the +/- 1% for the identification of peaks in the blank urine to the +/- 1% for the identification of peaks in the athlete's sample), and some additional complication (because the blank urine is going to contain peaks in addition to the peaks you're seeking to identify). There's no reason I can see not to "eliminate the middleman" and just use the mix cal as your reference standard.

Good point. But as you say, we are not dealing with the way things SHOULD be done here. We (or maybe just I) are trying to figure out if m's argument has any validity. We are dealing with what we've got, and seeing if it is reliable. That was Catlin's point, too, by the way. Using this urine sample as a positive control was a "plan B" for him too. I think Catlin shows there is nothing inherently unreliable about using another "naturally spiked" sample, or by extension a blank urine sample, as a positive control.

You still have many of the same identification problems, but again, this is plan B.

I'm just pushing this to see if it is worth anything. Obviously I could be wrong.

syi

Mike Solberg said...

By the way, Larry, you wrote:

But I've never seen anyone with scientific credentials say that IRMS peaks for an athlete's sample can be identified by reference to an IRMS standard.

Is that right? IF (and obviously that's a big IF in our case) the chromatographic conditions are the same, a reference standard in the IRMS would help a lot, wouldn't it?

syi

Larry said...

Mike, if your plan "A" is to use a mix cal as your GC-MS reference standard, and plan "A" fails, then using the blank urine may be an acceptable plan "B". You still have to have the blank urine peaks identified the way Catlin describes, against a different reference standard and with confirming complete mass spectrum, and you'd have to account for the additional margin for error (the +/- 1% squared problem), and I'm only acknowledging that this could work for the GC-MS. Of course, LNDD did not identify the blank urine peaks in the way Catlin describes (at least I don't THINK they did so), and there's nothing M can do to provide that kind of identification (which is why I think M is relying on the mix cal acetate as his reference standard), so I don't see how the blank urine in FL's case could serve as a plan "B".

As to whether an IRMS reference standard would be helpful ... obviously we need the IRMS internal standard, or we can't measure RRTs that can be compared to the RRTs for the GC-MS portion of the test. I'm not sure that any other IRMS reference standard would be helpful. First, as I continue to stress, I have not heard any of the scientists say that an IRMS reference standard would be helpful. Everyone seems to say that the only way to identify IRMS peaks is by comparing them to identified GC-MS peaks.

But there's something else at work here also, that seems to me to argue against the validity of using an IRMS reference standard. This has to do with the fact that we cannot seem to identify IRMS peaks by RT alone - it seems that only RRTs will identify IRMS peaks. Why should this be? If we subtract out the time for IRMS combustion (which is supposed to be the same time for all peaks), why don't we have matching RTs for the GC-MS and GC-IRMS portions of our test? I don't know the answer to this, but there seems to be some "constant" at work that causes IRMS peaks to have RTs that are more "spread out" than the corresponding GC-MS peaks. We might argue that it's the difference in the GC chromatographic conditions that caused the IRMS peaks to spread out, but I never heard anyone argue this at the arbitration (even Dr. M-A did not seem to say this), so I suspect that there are other factors at work. The idea behind the RRT is that the IRMS peaks should spread out consistently and predictably, by some multiple.

Let's say that for athlete "X", you had 4 peaks, with GC-MS RTs of 1, 2, 4 and 8, and IRMS RTs (after deduction of the combustion time) of 2, 4, 8 and 16. Obviously, the RTs do not match. However, if I understand correctly, the RRTs for these two tests DO match - you simply use a constant of 2. So the RT of each GC-MS peak, multiplied by the constant, equals the RT of the GC-IRMS peak. So you can use RRTs to identify the peaks for athlete "X"'s sample.

Now, let's go on to the testing for athlete "Y". Let's say that for this sample we also have 4 peaks, with GC-MS RTs of 1, 2, 4 and 8, and IRMS RTs (after deduction of the combustion time) of 1.5, 3, 6 and 12. Again, the RTs do not match and the RRTs DO match, this time using a constant of 1.5. Again, we have peak identification.

So we have these two samples in hand, for athlete "X" and athlete "Y". We've identified the IRMS peaks in both samples. Which one is the valid reference standard? Obviously, there's no way you can choose, since they're both equally valid, and since they have different RTs.

Again, my reasoning here may be faulty, so tell me if you think I'm wrong ... but unless the RRT "constant" is always shown to be the same for all IRMS tests, then there's no way to utilize an IRMS reference standard. And if the constant IS the same for all IRMS tests, then it ought to be part of the SOP, and wouldn't we have heard about the existence of such a constant by now?
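
The athlete "X" and athlete "Y" example can be sketched in a few lines. This assumes the first peak in each run plays the role of the internal standard that the RRTs are taken against. That choice, and the function name, are illustrative assumptions. The RT values are the ones from the example above.

```python
def relative_retention_times(rts, reference_index=0):
    """Divide each RT by the RT of the reference (internal standard) peak."""
    ref = rts[reference_index]
    return [rt / ref for rt in rts]

gcms = [1, 2, 4, 8]        # GC-MS RTs, the same for both athletes in the example
irms_x = [2, 4, 8, 16]     # athlete X, IRMS RTs after deducting combustion time
irms_y = [1.5, 3, 6, 12]   # athlete Y, IRMS RTs after deducting combustion time

rrt_gcms = relative_retention_times(gcms)
rrt_x = relative_retention_times(irms_x)
rrt_y = relative_retention_times(irms_y)

# RRTs agree across all three runs even though the raw RTs differ.
print(rrt_gcms == rrt_x == rrt_y)
# Per-run scale factors between GC-MS and IRMS RTs.
print(irms_x[0] / gcms[0], irms_y[0] / gcms[0])
```

The RRTs line up in both cases, but the raw IRMS RTs for "X" and "Y" differ by their own scale factors (2 and 1.5), which is why neither run's RTs could serve directly as a reference for the other.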

m said...

I don't have time to work up a full response right now.

I do want to repost my graphs with some further explanation and I will do so later.


Four brief comments:

1. My graphs go to the science of the identification not the legal requirements.

2. None of the experts including Meier thought or testified that the Lab should use the Cal Mix as a full chromatographic reference standard for identification in the GC-IRMS, only for the GCMS.

Mongongu's(sp?) testimony was that the Cal Mix contained the IS as a chromatographic aid to calculate the RT's in the IRMS. Since the Cal Mix also contained the 5B, I think it also can serve as a chromatographic aid to fix the 5B in the IRMS, especially since it didn't include the 5A which might have been too close in elution time to be sure which was which.

3. I am not introducing some new 5B anchor theory here. My basic argument is that the retention times and peak patterns match between the GCMS and the GC-IRMS if one relaxes the 1% standard as one must.

The fact that all the retention times match within the 1% standard between the mix cal, the F3, AND THE BLANK URINE in the GCMS, and similarly between the mix cal, F3 and blank urine in the GC-IRMS is supporting evidence for that pattern/retention time match.

E.g. the 5A directly follows the 5B which we know is properly identified by the cal mix because the 5B rts match in the GC-IRMS Cal mix and F3. The 5A rts matched in the GCMS blank urine, cal mix, and F3, and the 5A rts match in the GC-IRMS blank urine and F3. Too much redundancy.

4. TBV is dead wrong when he talks about the possibility of the real 5A possibly shifting to the small trailing peak in the IRMS. This can only happen if the 5A and the small trailing peak SWITCH POSITIONS. If the 5A moved to the small trailing peak, what happened to the small trailing peak? It can't disappear. That is why I keep asking him whether he claims the peaks switched and he keeps refusing to answer.

And I'm not even talking about the peak heights and sizes of the 5A. If the 5A shifted to the small trailing peak, why did it all of a sudden shrink down? The basic proposition is that peak size is directly proportional to the amount of the substance. And I'm going to go out on a limb and state that we would expect the 5A and 5B peak height in the IRMS to be similarly as large relative to surrounding small peaks as those in the GCMS because the amount of carbon atoms they contain is directly proportional to the amount of the ions in the 5A and 5B. This is a challenge to Duckstrap to prove me wrong.

In any case I hope to expand on the last point when I redo my graphs.

m said...

ps.

My challenge to duckstrap is not for some general substance but for the 5A and 5B in particular. A knowledgeable scientist knows their carbon content and can tell whether it is proportional to the ion content.

Larry said...

m, I'll wait for your full comments before responding.

Larry said...

Swim, M, TBV -

Part of M's argument for the 5bA anchor (and Brenna's argument for pattern matching) is that GC-MS peak heights can be compared to GC-IRMS peak heights. I've discussed at length the theory for why peak heights should NOT match up from MS to IRMS. I've started to study the data to see how this plays out in practice. I have more analysis to do, but it's pretty obvious from looking at the data that peak heights don't translate well from the GC-MS to the GC-IRMS.

Let's take the F3 "B" sample as an example. This is not a particularly egregious example, but it's the one we care about most. On the GC-MS, the 5bA peak is by far the tallest, maybe twice as tall as the 5aA. On the IRMS, the 5bA and 5aA are nearly the same height. On the MS, the 5aA is a little taller than the IS; on the IRMS, the 5aA is much shorter than the IS - the IS is nearly twice the height of the 5aA.

If we set up the IS as our "reference" for how MS peak heights should translate into IRMS peak heights, then the 5bA, 5aA and 5bP IRMS peaks are 205%, 186% and 221% taller, respectively, than what we'd expect. Not exactly a +/- 1% relationship!

On the IRMS, peak size order is 5bA, 5bP, 5aA and IS. On the MS, peak size order is 5bA, IS, 5bP and 5aA. Where's the pattern?

If you go by Brenna's "pattern matching", along the lines of "here's a large peak, here's a medium-sized peak, here's a small peak", then looking at our 4 peaks of interest for F3 "B", I'd characterize the GC-MS as "small, large, small, medium", and the IRMS as "large, large, small, small". Where's the pattern?

I think AT BEST you might try to distinguish between significant and insignificant peaks. I don't think you can match large peaks to large peaks, medium peaks to medium peaks, etc., not when relative peak sizes change as much as I'm seeing from the MS to the IRMS.

Another thing that crops up, for what it's worth, is that the size of the IS IRMS peak varies quite a bit from sample to sample: from about 2.0 to 5.3 on the y-axis for the various samples. I don't know if this factoid means anything, but why wouldn't LNDD have added the same amount of the IS to each sample? I'm seeing the same thing on the GC-MS side of the equation: if I look at the MS target response (which I equate to IRMS peak intensity, though I may be wrong to do so), the MS measures for the IS vary from 3.4 million to 20.3 million.

Larry said...

Here's a second example of "IRMS Peaks Gone Wild":

Look at the S17 F2 results, and compare the IS to the Andro. On the MS, the IS is tiny compared to the Andro - my rough calculations tell me that the Andro is 7 or 8 times higher than the IS. But on the corresponding IRMS graph, the IS is taller than the Andro.

You might say, maybe this is just how Andro translates from the MS to the IRMS. If so, take a look at the graphs for the blank urine F2. On the blank urine F2, the Andro is about twice as tall as the IS. And on the IRMS ... the Andro is ALSO taller than the IS - not by the same degree, maybe about 50% taller.

Not only is there no consistency between MS peak height and IRMS peak height, the relative differences in MS and IRMS peak heights are not consistent from test to test for the same metabolites!

Where's the pattern, Dr. Brenna?

m said...

Larry, TBV, Mike

Still short of time.

But one science question to you all.

Was the 5B in the IRMS F3 adequately identified since its retention time matched the 5B in the IRMS Mix Cal Acetate?

If not, why not. If yes, why.

tbv@trustbut.com said...

Not much time lately, so here's catching up on a few things that seemed open to me.

(1) I'm repeatedly asked if there were place switches. My first answer, the rhetorical "do we know they didn't?", was apparently not good enough. So the answer is, I don't know what may or may not have been switched. I'm claiming I don't know for a fact.

(2) None of the experts including Meier thought or testified that the Lab should use the Cal Mix as a full chromatographic reference standard for identification in the GC-IRMS, only for the GCMS.

I believe this is incorrect. WMA was quite confounded that LNDD did not use the same cal mix for the MS as the IRMS, and thought the identification issue would have been nailed down if they had done so. Since they clearly had "the right" mixture available, why didn't they use it?

That is an open question, actually. Let me offer some possibilities.

(a) Complete oversight. They'd admit they should have, it would save effort (less to prepare and inventory).

(b) Fundamental misunderstanding. They didn't (don't) understand the identification requirements well enough. This leads us into the morass we are now thrashing through.

(c) Intentional omission. Understanding the identification requirements, and having a complete cal mix available, they may still have chosen not to use it.

For (c), I can imagine a few possible reasons; (i) there are chromatographic reasons unknown to us (and WMA); (ii) a perceived simplification, also unknown to us; (iii) there is something they didn't like when they tried it, so they chose not to do it.

What could be a reason why they wouldn't like the results of using the MS cal mix in the IRMS?

The only one I can think of is that they maybe didn't like the CIR results they got on the cal mix when both compounds were in the same mix. That is, in the cal mix having 5bA and 5aA, they got isotopic results that were not in line with the known values, and "solved" the problem by not running the cal mix that had them both present.

(3) Assuming identical temperature ramps and inlet pressures between the cal mix used in the IRMS and the F3 sample, then yes, I think the 5bA has been identified.

If the pressures were adjusted between injections, the identification would be in doubt.

Is there documentation of the actual inlet pressures on each sample injection?

TBV

Larry said...

TBV, you wrote:

"WMA was quite confounded that LNDD did not use the same cal mix for the MS as the IRMS, and thought the identification issue would have been nailed down if they had done so."

I'll need support for that one, I've read the WMA testimony more than once and I don't recall his saying that. Yes, he possibly was dumbfounded that LNDD would have used one mix cal on the GC-MS side and a different mix cal on the IRMS side. But I don't recall his saying that you can identify IRMS peaks by reference to other IRMS peaks.

Besides, WMA did not think that LNDD had necessarily identified the internal standard in the F3 sample. If WMA was OK with using IRMS results as a standard for identification, then he could have identified the F3 IS by using the IS in the IRMS mix cal acetate.

Larry said...

M -

You asked me the science question:

"Was the 5B in the IRMS F3 adequately identified since its retention time matched the 5B in the IRMS Mix Cal Acetate?"

IMHO, no.

You asked: "If not, why not."

I've advanced many reasons already in the last week or so, and I don't want to repeat. If I had to list one reason, I'd say: I think that D-Bob has summed up the scientific consensus for you: you begin by running your sample through a GC-MS, you identify the peaks in the GC-MS by using a reference standard and by looking at the full mass spectrum, then (carefully maintaining the same chromatographic conditions you used with the GC-MS), you hook up the same GC to the IRMS, run your tests, and identify the IRMS peaks by reference to the MS peaks, using RTs or RRTs. LNDD did not do this to identify the 5bA in the S17 F3 sample, so I don't think the 5bA peak was ever identified on the F3 IRMS.

The method you're advocating for identification of the 5bA peak may be good in theory, but to my knowledge it's never been tested and I can't accept an untested theory as scientifically valid.

In the past, you've expressed some reluctance (I'd call it a healthy reluctance) to entertain the scientific opinions of non-scientists, so I'll stop here. (To be certain, I've posted quite a bit here over the past week or so, so if you want more details from me, I've probably already posted them.)

m said...

Larry,

You misunderstand Dailbob.

He claims the retention times of the analytes in the GC-IRMS must be matched DIRECTLY against the retention times of the analytes in a reference standard, that is the IRMS Mix Cal Acetate.

He does not claim that the IRMS retention times should be matched against the retention times of the reference standard in the GCMS which you now seem to be doing.

We know that the retention times between a GCMS machine and a GC-IRMS machine cannot be matched within the 1% standard because of the combustion period and the difficulty of making the conditions exactly the same when you use two different types of machines. For you to impliedly claim this can be done within the 1% standard is to ignore Meier's testimony.

TBV, Dailbob, Mike, and I believe you at one time, have all claimed that the IRMS retention time of the 5A should have been matched against the IRMS Cal Mix Acetate because it was the only definite reference standard, but ALAS there was no 5A in the IRMS Mix Cal.

Now when I ask you all whether the 5B, which was in the Mix Cal and whose retention times matched, was adequately identified I am greeted with backtracking or thunderous silence because you are all afraid of conceding a point that may come back to bite you.

tbv@trustbut.com said...

Larry, transcript PDF 1232:

"mix S50 16 which contains all the target compounds which they use in the GC/MS to anchor the retention times, why that is not being run. At least the chromatographic challenge in terms of peak resolution would be the same. You would still basically have the issue that the matrix -- that they won't be free of matrix interference. But at least it would actually give the lab a feedback on whether their chromatography and their IRMS is good enough. So, why they choose to run four and not -- and not a combination of all target compounds that have the androsterone in it, the 5-alpha in it, and the pdiol in it, I don't know. This is -- I mean, this is shooting fish in a barrel."

TBV

tbv@trustbut.com said...

M,

You must have missed my reply where I said,

(3) Assuming identical temperature ramps and inlet pressures between the cal mix used in the IRMS and the F3 sample, then yes, I think the 5bA has been identified.

If the pressures were adjusted between injections, the identification would be in doubt.

Is there documentation of the actual inlet pressures on each sample injection?


TBV

Larry said...

M, LOL! Thunderous silence from ME? Surely you jest. It's almost impossible to get me to shut up! Please name ANYTHING you've ever written that I have not responded to. And BTW, I've posted a bunch of stuff recently that you can respond to any time you like ... my recent posts about relative peak heights would be one place to start. (TBH, I don't expect you to respond to everything I write.)

I may indeed have expressed some surprise and regret in the past that LNDD did not run the mix cal acetate containing 5aA and 5bP in the IRMS. I was younger and more naive in those days! Besides, it would at least have been interesting to examine data from such a run.

I do not agree that it's impossible to match chromatographic conditions from the GC-MS to the GC-IRMS. Why not? Who says you can't? You're a fan of OMJ, so certainly you know that OMJ assumed for a long time that LNDD HAD matched chromatographic conditions from the GC-MS to the GC-IRMS, and that he had some very negative things to say about what happens if these conditions do not match. He pulled back a bit from some of his earlier statements once he realized that, indeed, LNDD used different temperature ramps for the MS and IRMS portions of the CIR testing. But to be certain, his earlier assumption that the conditions were the same for both tests is a powerful argument that it is at least POSSIBLE to set up matching chromatographic conditions.

The majority opinion says that the combustion period is the same time for all IRMS peaks, so if you know this combustion time, it's simple algebra to factor this time into the computation of RTs and RRTs.
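To make the "simple algebra" concrete, here is a minimal sketch. It assumes, as the majority did, that the combustion delay is one constant for every IRMS peak. The function names are made up for illustration, and the 20-second delay in the note below is a hypothetical number, not a value from the LNDD documents.

```python
def corrected_rt(irms_rt_s, combustion_delay_s):
    """Remove a constant combustion delay from an IRMS retention time (seconds)."""
    return irms_rt_s - combustion_delay_s


def relative_rt(peak_rt_s, internal_standard_rt_s):
    """Relative retention time: peak RT divided by the internal standard's RT."""
    return peak_rt_s / internal_standard_rt_s


def irms_rrt(peak_rt_s, is_rt_s, combustion_delay_s):
    """RRT of an IRMS peak after the delay is subtracted from both RTs."""
    return relative_rt(
        corrected_rt(peak_rt_s, combustion_delay_s),
        corrected_rt(is_rt_s, combustion_delay_s),
    )
```

With an IS at 870 seconds, a peak at 1320 seconds and a hypothetical 20-second delay, the corrected RRT is 1300/850 rather than 1320/870. The delay does not cancel out of the ratio, which is why it has to be known before RRTs from the two instruments can be compared.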

I'll also note, as I've noted many times, that Brenna did his pattern matching from the GC-MS to the GC-IRMS. Brenna did not identify IRMS peaks by reference to other IRMS peaks. Neither did the majority, or Botre, or anyone else on USADA's side of the hearing room. I still think you're advancing a novel theory here, one that USADA never thought to advance (and why didn't they think of it? Sorry, couldn't resist, you love to ask me that same question when the proverbial shoe is on the other proverbial foot!).

I will re-read D-Bob's post. But while I do this, you could answer a question for me: if it is scientifically valid to identify IRMS peaks by reference to other IRMS peaks, then why run the GC-MS as part of the CIR testing? Correct me if I'm wrong, but if you're not going to look at the full mass spectrum, which LNDD evidently does not do, what other purpose does the GC-MS serve EXCEPT for peak identification?

TBV, I will look at the good doktor's testimony in context. But I ask you the same question I asked M: why bother with all of that RRT analysis and all those expensive GC-MS pre-tests if we can just skip directly to the IRMS and identify peaks there?

Finally: I'm willing to admit when I've been proven wrong. I've done so before. If your theory about the 5bA anchor is proven to be scientifically correct, then I'll be the first to pat you on the back and buy you a virtual beer.

tbv@trustbut.com said...

Why bother with all of that RRT analysis and all those expensive GC-MS pre-tests if we can just skip directly to the IRMS and identify peaks there?


Two reasons.

(1) It gives you the identity, telling you that a peak in the IRMS of the cal-mix is in fact the thing you think it is. Otherwise you don't know what order they are in. Without the GCMS identification, you know that the peaks elute at the same time for the IRMS of the mix cal and the sample, but they are all CO2; and

(2) You need the full-scan mass-spec to make any kind of specificity claim for a peak. But that is not for this discussion thread.

TBV


For A,

m said...

Larry,

You are right, "thunderous silence" doesn't apply to you. Perhaps the "backtracking" does. :-)

I have never claimed that proper identification of the analytes in the IRMS required a direct matching of retention times with an IRMS reference sample. As I said, there was no testimony which suggested this as the proper method.

I did say that IF, as others argued, direct matching was required then the blank urine and mix cal acetate could largely (but not completely) fulfill that requirement.

I have taken the position that it appears that all that is required is a matching of retention times and peak pattern between the GCMS reference material and the GC-IRMS sample. But I have pointed out how matching of the IRMS mix cal backs up and validates that basic identification.

The reason one performs the GCMS is because one can look at the mass spectra to identify the analyte of interest.

TBV,

I did miss your admission, sorry. My eyes glazed over when you started discussing inlet pressure.

Michael said...

I'll say it again (my other post was deleted I think)

I think M is Rational Head from the DP forum.

Mike

Larry, if my theory is correct, you're having the same conversation with him that I had regarding some other stuff over at the DP forum.

Larry said...

TBV, I'm sorry for harping on a point that may be very small in the overall context of things. But my lawyer brain likes bright lines, as you've pointed out. One of the few bright lines I have here is that IRMS tests cannot be used to identify peaks. IRMS tests can be used to tell us something about the peaks, like a delta-delta measurement, but they can't identify what's in the peak, because the old peak contents have been incinerated and all that's in any of these peaks is CO2. Only a GC-MS can identify what's in a peak.

I'm going to hold onto this bright line rule in a canine fashion, until it is pried by force from my clenched canine teeth.

IMHO, the testimony you cited from Doktor M-A (pdf p. 1232) does not validate the use of an IRMS mix cal run as a reference standard for identification of IRMS peaks.

In this portion of the Doktor's testimony, we're in direct examination, and Suh and the Doktor are up to slide 60 (slide 58 in our package, see pdf p. 1230). From a review of the Doktor's slide show, slide 60 falls under his discussion of good chromatography -- his discussion of peak identification was way back in slides 22-28 (roughly pdf pp. 1185-1205).

Let's try to place the testimony on pdf p. 1232 in the proper context. The context requires us to go back to pdf p. 1226, where Suh brings up the topic of negative and positive controls. As I understand it, a "negative control" is a test you run on a sample you know is drug-free, to make sure that the results confirm it is drug-free. A positive control is a test you run on a sample you know should give a positive test result.

Suh asks if the mix cal acetate is a positive control for GC/C-IRMS, as claimed by LNDD. The Doktor responds on pdf p. 1227 that the blank urine is run as a negative control, and he explains the meaning of positive control and negative control.

Suh then moves to another slide, not identified in the transcript, and the Doktor explains this unidentified slide. Suh asks if the mix cal acetate IRMS values were within the measurement of error, and the Doktor says yes, "for those that are reported." (Since we can't see the slides, it's hard to know exactly what values they are talking about.) Suh asks why it isn't good enough that the IS is within the measurement of error, and the Doktor replies (we're on pdf p. 1229 now) that the IS is a clean matrix without any interference. On pdf p. 1230 the Doktor explains that a clean matrix tests your instrument, but not in the same way as a dirty matrix, which is closer to the conditions you'll encounter when you run your real samples.

Suh then puts up slide 60, which corresponds to our slide 58. This slide compares the chromatography for the mix cal acetate to the chromatography for one of the FL "B" fractions. On pdf p. 1231 the Doktor points out the peak overlap on the "B" fractions, in contrast to the nice clean chromatography for the mix cal. Suh asks if the difference in the quality of chromatography for the two graphs is because the mix cal is a clean matrix. This is the context for the material you quoted.

When Suh asks the Doktor if the difference in chromatography is because the mix cal is a clean matrix, the Doktor goes off on a tangent (as he does frequently in his testimony). He responds that it would have been "more useful" if LNDD had used its mix cal with all six metabolites (mix S50) as an IRMS control, as opposed to the mix cal it actually used, with only four metabolites. He then stated (as you quoted) that if LNDD had used the mix S50 "at least the chromatographic challenge in terms of peak resolution would be the same."

In other words, Doktor M-A expressed regret that LNDD did not use the mix S50 as its control run, because it would have been a tougher test for the ability of the setup to provide better peak resolution and achieve the best possible chromatography. This had nothing to do with the issue of peak resolution.

So, with teeth firmly clenched, I continue to maintain that Doktor M-A did not advocate the use of a mix cal IRMS run as a reference standard for identification of IRMS peaks.

Larry said...

Whoops. At 5:42, second to last paragraph, last sentence, I should have stated that "This had nothing to do with the issue of peak identification."

Mike, my own P.O.V. is that "M" is entitled to his secret identity. I mean, if everyone knows who Clark Kent is, then the fun's over.

m said...

LOL!

Is rational head a lawyer?

I'm a lawyer. Rational head is a scientist, isn't he?

I'm not rational head, although I'd like to have his science knowledge.

Mike, you can't tell the difference between a lawyer and a scientist. Not a good sign for your judgment.

Larry knows I'm a lawyer.

Larry said...

M, I just figured that "rational" head = lawyer.

I don't spend much time at DPF. It's bad for my disposition. If I want to experience name-calling, back-biting, finger-pointing and the like, I always have my day job.

Larry said...

M, I won't fight against your use of the word "backtracking". We're all learning on the job here, so to speak. If I'm not smarter now than I was a month ago, THEN I'd have reason to worry.

Thank you for clarifying your position. If you are primarily relying on Dr. Brenna's so-called "pattern matching" between GC-MS and GC-IRMS for peak identification ... and if you are arguing that the 5bA anchor theory provides some level of confirmation that the pattern matching is correct, then I'd agree with you that this kind of use of the 5bA anchor might very well be valid. I would NOT feel comfortable with using the 5bA anchor as a primary method to identify anything, but if you'd adequately identified peaks by another valid primary method, then I'd agree that the 5bA anchor provides a secondary method that on some level would confirm the primary identification.

Let me see if I can make this clear. Let's say, for example, that it was possible to identify IRMS peaks in this case, provided that we relaxed the +/- 1% RT standard to something like +/- 2%. Let's say that there was good justification for such a relaxed standard. THEN if you offered up the 5bA anchor identification method as a way to confirm the relaxed RT identification, I could probably go along with that.
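For what it's worth, the difference between a 1% and a 2% standard is easy to state as a check. This is only a sketch of the percentage test discussed in this thread; the function name is invented for illustration, and any other conditions the technical document may impose are left out.

```python
def within_rt_tolerance(sample_rt, reference_rt, tolerance_pct=1.0):
    """True if sample_rt is within tolerance_pct percent of reference_rt."""
    if reference_rt <= 0:
        raise ValueError("reference_rt must be positive")
    deviation_pct = abs(sample_rt - reference_rt) / reference_rt * 100.0
    return deviation_pct <= tolerance_pct
```

As a made-up example, a sample peak at 1340 seconds against a reference at 1320 seconds is about 1.5% off, so it fails the 1% test and passes a relaxed 2% test.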

As you know, I'm not comfortable with "pattern matching" as a primary means of peak identification, and the 5bA anchor theory does not provide enough confirmation to overcome my discomfort.

We can agree to disagree about pattern matching for the moment, and enjoy our temporary meeting of the minds!

tbv@trustbut.com said...

Larry, you spent way too much time looking to understand a point I wasn't making!

Having been in the room, I remember WMA's testimony as being kind of rambling too, making too many points with too little clear explanation. A symptom of too little time, I thought.

Anyway, I was NOT bringing up his puzzlement about using the 4 substance cal mix the way you seemed to have been taking it -- nor do I think he was arguing for using only the IRMS cal mix as identification.

He was saying to be sure, you needed everything to be the same everywhere -- you needed the same cal mix in the MS, and the IRMS, and the spiked blanks. Given identity with the MS (which you don't have without the MS), and conditions allowing RRT comparison, you can be sure what the substances are at times in the IRMS cal mix. Given those times, you can then look at the CIR readings you have on your spiked blanks, and tell if you are doing CIR measurements correctly and accurately. Then, and only then can you get reliable results from your sample measurements.

LNDD skipped a bunch of these steps. They didn't match chromatographic conditions in almost any meaningful way; didn't use the same spike cal mixes across instruments, and didn't spike and measure blanks. Yet they are sure their sample measurements are reliable.

TBV

tbv@trustbut.com said...

Larry,

One more thing. With the caveats of comparable chromatographic conditions (same ramp, column and pressure) between the MS and IRMS, I would admit that the substances lined up with the IRMS cal are the same as those in the sample. I know you are not willing to concede that at this point.

However, I don't believe the conditions were the same. I don't know they identified the IS correctly, and I don't know there was no peak swapping, and I don't know they aren't off-by-one peak somewhere around the 5aA.

I see no evidence the pressures were kept constant during the IRMS runs. This is a documentation issue.

I suspect the issue of peak swapping of the target metabolites could be resolved by experiment, and I'd be surprised if USADA wasn't going to do it somewhere.

I do not believe the issue of peak-swapping and co-elution of non-target analytes (dirty matrix issues) can be simply resolved by experiment. This leads to the specificity discussion, not this one about identification.

TBV

Mike Solberg said...

TBV wrote:

He was saying to be sure, you needed everything to be the same everywhere -- you needed the same cal mix in the MS, and the IRMS, and the spiked blanks. Given identity with the MS (which you don't have without the MS), and conditions allowing RRT comparison, you can be sure what the substances are at times in the IRMS cal mix. Given those times, you can then look at the CIR readings you have on your spiked blanks, and tell if you are doing CIR measurements correctly and accurately. Then, and only then can you get reliable results from your sample measurements.

FWIW (and maybe not much second hand), I have been trying to understand these issues more fully, and duckstrap has been kind enough to do some 'splainin' for me. I'd just like to add that duckstrap recently told me very nearly the same thing as TBV wrote above.

The point about looking at the CIR readings you have on your spiked blanks, to tell if you are doing CIR measurements correctly and accurately, is particularly important when you consider the variable nature of the isotope effect. Because the isotope effect can make a significant difference whenever you have less than perfectly resolved peaks, and because the isotope effect changes based on chromatographic conditions, you have to have that known CIR spiked blank to really confirm your process is accurate.

syi

Larry said...

Mike -

Your 8:42 pm point goes to the issue of the delta-delta shown for the IS in the various samples? Some of which were outside LNDD's stated tolerance for error?

Mike Solberg said...

Yes, it would include that, and any other sample peak that doesn't have clean separation and a horizontal baseline.

How they can measure the CIR of some of those IS peaks is beyond me. Of course, they claim they don't need that, because the IS peak CIR is irrelevant. But, as Herr Doktor points out, with a million peaks right next to each other, how do they know they have the right one? Even if they are adjusting the pressure to make it come out at 870 seconds, how do they know they got the right one to come out at 870?

syi

Mike Solberg said...

m wrote:

I have taken the position that it appears that all that is required is a matching of retention times and peak pattern between the GCMS reference material and the GC-IRMS sample.

m, I know you have been arguing about "the Truth," not about the legal requirements. But I wonder if you see the above stated position as consistent with TD2003IDCR? Do you see your argument as basically a "relaxed retention time" argument, with a little extra support from the matching of the pattern of peak heights between GCMS and IRMS, and a little further extra support from the 5bA anchor?

syi

m said...

TBV and Larry,

Re: the inlet pressure.

"LNDD's SOP CALLED FOR DIFFERENT PRESSURE SETTINGS FOR THE GC-MS AND GC-IRMS PORTIONS OF THE CIR TEST.

The SOP for IRMS pressure (USADA 329) - Pression constante: Ajuster le SI a environ 870s.

The SOP for GC-MS pressure (LNDD 1427) - Pression constante: Ajuster le SI a 10.7 min (+/-0.5 min)."

I finally focused on this issue. There is no problem here. The Internal Standard is still accurately identified by its retention time from the reference standards in both the GCMS and GC-IRMS.

1. They are the scientists, you are not so they had a reason for doing this. Word. They complied with their own SOP, and Landis's experts didn't challenge this.

2. What does this SOP mean? Adjust the pressure so that the Internal Standard will elute around a particular time, e.g. 870 seconds and 10.7 minutes. This indicates that they can and have done this in the past and know roughly what pressure to use.

3. In fact they were successful, because the IS eluted at the specified times in both the GCMS reference standard and the GC-IRMS reference standard. There is no doubt about the identity of the IS because it was contained in the reference standard (Mix Cal Acetate). And of course the retention time of the IS in the F3 sample and blank urine matched the retention time of the IS in the reference sample in both the GCMS and GC-IRMS so there is no doubt about identifying the IS in the F3 and blank urine.

We come back to the fact that the SAMPLE RETENTION TIMES MATCHED THE REFERENCE STANDARD RETENTION TIMES.


PS. TBV I agree with Larry, Meier never said one should use the IRMS mix cal as a reference sample for matching retention times.

tbv@trustbut.com said...

I still don't see how the IS is identified in the IRMS among all its neighboring peaks.

1. "They are the scientists, you are not so they had a reason for doing this."

Sorry, this borders on ad-hominem, and appeal to authority. They are scientists who left lifting rings on their MS magnet. They need more explanation to justify odd procedures.

Ignore me if you like, but WMA certainly has the same kind of questions.

As an SOP, "adjust to taste" is rather vague. I'd expect documentation of the actual pressure in use. It's a significant factor, and it is not recorded anywhere. Without it, we don't know if consistent values were used, or if the values are indicative of correct machine operation.

As I said before, I'm inclined to believe the identification of things in the reference standard if the chromatographic conditions are the same. We know of column differences, we know of ramp differences. We have no idea about inlet pressure.

As I noted earlier, some of the questions might be resolved by experiment -- and some others may not.

Also, I agree that WMA wasn't talking about 4 vs. 6 standard cal mixes for identification using the IRMS. But that rather argues against the 5bA anchor theory, doesn't it? Because he never advocated using IRMS-only for identification, which is what the 5bA theory seems to require to be valid.

TBV

m said...

TBV,

My jibe at you is because you non-scientists are digging around in the DETAILS of science and trying to find a mistake when there is something you do not understand. I'm far more ready to chalk it up to the fact that YOU DON'T UNDERSTAND, not that the scientists did something wrong. For example, you claim that there is "no documentation of the actual pressure" used. How would you know? Prove it. I seem to recall notations of gas pressure in the lab docs. I will assume they used the same pressure in all their runs since the IS eluted at 870 in all the samples. It is up to you to prove otherwise, and that it in fact made a difference.


"Also, I agree that WMA wasn't talking about 4 vs. 6 standard cal mixes for identification using the IRMS. But that rather argues against the 5bA anchor theory, doesn't it? Because he never advocated using IRMS-only for identification, which is what the 5bA theory seems to require to be valid."

I have not advocated using IRMS only for identification as I explained above to Larry.

WMA's testimony was framed within his understanding of the legal requirements of IDCR2003, so he was focused on matching retention times with the GCMS reference material.

But if the GC-IRMS Mix Cal Acetate contained just one substance, the 5A, and it eluted at exactly the same time as the 5A in the GC-IRMS F3 sample, I don't think even he would claim that the 5A in the sample was not the 5A in the Mix Cal. I don't think you can deny that either. Care to try?

Ali said...

m,

You said: "But if the GC-IRMS Mix Cal Acetate contained just one substance, the 5A, and it eluted at exactly the same time as the 5A in the GC-IRMS F3 sample, I don't think even he would claim that the 5A in the sample was not the 5A in the Mix Cal. I don't think you can deny that either. Care to try?"


Unless I'm missing something here (entirely possible), why would anyone deny that. You seem to have adopted their argument - i.e. to provide confidence that the correct peaks are being identified, the mix cal should contain the peaks of interest.

Although if they did what you suggest and dropped the 5B from the mix cal and just included the 5a in it, you'd be no better off. It would just mean that you could nail the 5a and be doubtful over the 5B?

Maybe you could use Larry's disclaimer: "I'm no scientist, but ..."

Ali

Larry said...

M, I will grant you that I have no business questioning the scientific consensus on anything. Unfortunately, we have no scientific consensus on most meaningful points here, we just have a battle of the experts. In such a battle, you have two choices: (1) you can decide that since there is no scientific consensus, there's no firm basis on which to make a decision, in which case the presumption of innocence takes over and FL is not sanctioned, or (2) people with a lower level of scientific expertise (arbitrators, jurors, bloggers) have to decide which expert is wrong and which is right. I'm perfectly happy to make choice (1).

Ali, as long as you are trying to open up a line of communication with Dr. Davis, I have another question. Assume for the moment that a lab does CIR testing and does the absolute best job possible of preserving all chromatographic conditions between the MS and the IRMS. Assume also that the IRMS combustion time is known and is the same time for all IRMS peaks. Does this eliminate the need for RRTs? In other words, will the MS RT for each peak always be equal to the IRMS RT minus the combustion time (with some small margin for error)?

m said...

Ali,

"Unless I'm missing something here (entirely possible), why would anyone deny that."

It's been like pulling teeth. It seems some people would deny this or hedge this.

Here's the caveat. I said that the IRMS reference material contained ONLY the 5A, nothing else. Therefore if there was a match of retention times in the IRMS we would have an identification, even if there was no mass spectra, because there was only 1 analyte in the reference material and there could be no doubt about which one it was.

Where the reference material contains multiple analytes as in this case, some folks who want to raise doubt (you know who you are) can claim that we don't know if the 5A in the IRMS reference material is really the 5A because its retention time doesn't match the 5A in the GCMS reference material (that also can be verified by its mass spectra).

My response is that the scientists know in which order the analytes are going to elute in the IRMS reference material and approximately the retention time from their reference libraries, previous experience etc. So they are confident that the 5A in the IRMS reference sample really is the 5A. To use one example, we know (from the GCMS and I assume prior experience) that the 5B elutes long after the IS which elutes at 870 seconds in the IRMS. So we can be confident that the analyte in the reference sample at 1320 seconds is the 5B. To flesh out the example, we know that the IS, etio, 5B, andro elute in that order. So we can match retention times of the IRMS sample with the retention times of the analyte in the IRMS reference material and be confident of the identification. I'm going to guess that the reason they didn't include the 5A was that it elutes only 20 - 30 seconds after the 5B, and they didn't want to worry about overlap or confusion.
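The matching step described above can be sketched roughly as follows. This is an illustration only, not LNDD's procedure: the function name and the percentage window are assumptions, and the only retention times taken from this thread are the 870 and 1320 second figures.

```python
def match_to_reference(reference, sample_rts, tolerance_pct=1.0):
    """Map each reference analyte to the sample peaks inside its RT window.

    reference: dict of analyte name -> reference RT in seconds
    sample_rts: list of sample peak RTs in seconds
    Returns a dict of analyte name -> candidate sample RTs, closest first.
    An empty list means no match; more than one entry means the match
    is ambiguous at this tolerance.
    """
    matches = {}
    for name, ref_rt in reference.items():
        window = ref_rt * tolerance_pct / 100.0
        candidates = [rt for rt in sample_rts if abs(rt - ref_rt) <= window]
        matches[name] = sorted(candidates, key=lambda rt: abs(rt - ref_rt))
    return matches
```

If the sample has a single peak near 1320 seconds, the 5B comes back with one candidate. If two sample peaks sit within the window around 870 seconds, the IS comes back with two candidates, which is the cluttered-neighborhood problem raised elsewhere in this thread.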

One way to get around this uncertainty would be to run a reference sample for each analyte in the IRMS. I assume the lab doesn't do this because it's too costly and cumbersome and thought not necessary since they are not basing their identification on that matching.

Nevertheless, one could base an identification on that match, as my question was designed to illustrate.

And as to the snide remark, at least I don't hide my qualifications or offer misleading statistical claims when I know better.

Larry,

What I object to is attempting to find ERROR in the DETAILS of the science, like the inlet pressure, when we don't even understand the basics. Moreover, there has been no expert testimony about this, much less any disagreement of experts. I know I'm not going to stop you guys from doing this, but I can still call attention to it.

Ali said...

Larry,

By coincidence, it was Dr Davis who responded (email) to my initial query for info from Mass Spec Solutions. I've explained what we are doing and asked whether he is able to help us with some general technical queries. I'm awaiting a subsequent response.

Ali

Ali said...

m,

Your suggestion of a piece-wise assessment of IRMS peak position by injection of known single metabolites, one after the other, would potentially remove any doubts over position. It's a pity they didn't do what you suggest. If they had, we wouldn't be having this conversation.

As for being snide, I apologise if that's how it appeared. I thought it was a reasonable extension to the approach that Larry takes. He qualifies his statements with an indication of his area of expertise in what he's talking about. I'll happily start adopting an "I'm not a lawyer, but .." approach when it comes to legal matters (if I don't already do that?)

Ali

Mike Solberg said...

m, your suggestion that we not tread in the detailed scientific waters we don't fully understand is all fair and good. I may be more guilty than most in that regard (and I don't even qualify my posts!).

However, for quite a while now we have been (intentionally, I believe) staying away from the legal dimension of these arguments. But it is exactly the legal framework of the case that is supposed to make it possible for non-scientific people (like the three arbs who have already ruled, and the three who are yet to rule) to fairly and justly decide the case. We (like the arbs) may not be able to fully comprehend all the arguments about whether the isotope effect could skew the results and lead to an AAF, or whether the 5bA in the mix cal can help with identification of the sample peaks when the chromatographic conditions are not the same. But we can, presumably, look at the specific, scientific requirements of the International Standard for Laboratories and the technical documents, like TD2003IDCR, and decide whether LNDD did things the way they were supposed to.

The legal aspects of the case are exactly what is supposed to bridge the gap between the scientists and the rest of us. For that matter, the legal aspects of the case are supposed to bridge the gap between different scientists too, for obviously, they disagree. It is not a matter of giving up the search for "the truth" and debating "technicalities." It is a matter of recognizing the limits of both the science itself, and of our understanding of the science.

So I ask again, do you see your "enhanced peak matching" argument as consistent with TD2003IDCR? Do you see your argument as basically a "relaxed retention time" argument, with a little extra support from the matching of the pattern of peak heights between GCMS and IRMS, and a little further extra support from the 5bA anchor?

syi

Larry said...

Swim, great post.

I've stayed away from the legal side of things, in part because even lawyers have to deal with the facts, but also because I sense a limited interest here in legal argumentation. I think we all hope that somehow, this case can be decided on the basis of the science, as a matter of Truth with a capital "T". If FL is exonerated because, say, LNDD used too much white-out on its lab documentation, my sense is that most of us would feel a sense of dissatisfaction with the result.

But you're making a good point, and M in his way is making the same point. The science here is extremely complicated, and we cannot hope to understand it at anything like the level of someone like the top people at LNDD, not to mention people like Brenna and M-A. Every time I try to read the GC operating manual, or one of the science papers published in the journals, I'm reminded that I'm not likely to teach myself chromatography in my spare time on the internet.

M, this is addressed to you, too. You're right, I have to have a bucket full of hubris to read the LNDD SOP and say that their specifications are wrong for this kind of testing! But you're doing the same kind of thing when you advance theories like the 5bA anchor theory, that have never been proposed by a scientist, that have never been peer reviewed, etc. Science is a PROCESS. It's not a matter of making logical deductions, particularly when you're making logical deductions from a layperson's knowledge of the science.

You ask, would I accept as a matter of science an identification of 5aA based on an IRMS-IRMS comparison of an athlete's sample to a mix cal that contained only 5aA? No, I would not. I cannot accept the scientific validity of a theory unless the theory emerges from the scientific process. I feel differently when you ask if any of these theories might have value to confirm results achieved via a primary means that HAS been established by the scientific process.

Swim, back to your post. You are making a great point when you say that it IS the responsibility of lay people to understand science well enough to apply rules designed to judge the adequacy of the science. On a certain level, this requires us to trust the rules, and not to regard them as "technicalities" that do not have "Truth" value. My own POV is that, if we can determine that LNDD violated the rules, even technical rules like chain of custody or forensic corrections, then we can feel reasonably certain that we've found "Truth" with greater certainty than any "Truth" we might think we've reached based on our imperfect understanding of the science.

And you are absolutely right. As citizens, voters, jurors, we are going to be asked to make decisions of consequence on science matters. Just the issue of global warming alone presents us with a number of difficult decisions we'll all need to make that require us to understand the science as best we can. Are we going to embrace nuclear power to combat global warming, notwithstanding the risks? Are we going to force Detroit to build cars that don't emit greenhouse gases, even if it means that cars get more expensive and some workers may lose their jobs? What about wind, and solar, and drilling in the Alaska Wildlife Preserve? We can't abdicate our responsibility to be part of the decision-making process here, even if we're not scientists.

So ... we may be reaching the point where it would be valuable to turn from the science back to the law. IMHO the legal case is even stronger than before that LNDD's peak identification failed to meet ISL standards, that there should have been a "burden flip" under the WADA rules, and that USADA did not (and could not) meet the burden imposed by the burden flip. I don't think it's a close question.

If there's interest, I'll stop playing wannabe scientist for a while and be a lawyer instead.

The alternative would be to focus on the rules addressing peak specificity. I'm happy to do that too. As I write this, I'm not certain (from a legal point of view) whether those rules were violated in the FL case.

If I'm going to return to my role here as a lawyer, I would do it in something like a wiki process. I'd post something, and let people comment and add things to it. I'd try to cajole and educate, I'd let you know which arguments I thought were strong and which were weak. The wiki process can be more collaborative on peak identification. The rules on peak specificity are so difficult to interpret, I might have to do most of the work there.

And I'm fine with keeping our focus on the science, which is (I think) a topic of greater interest here.

Bill Mc said...

Just as Georges Clemenceau said about war - That it is too important to be left to the Generals - so science is too important to be left to scientists and law is too important to be left to lawyers. In the FL case, it is important that "amateurs" understand the issues arising from these esoteric realms and to make comments and suggestions about how law and science can best serve justice. That is what is happening here and I believe that the efforts are laudable.

Mike Solberg said...

Larry, I for one would be most interested in seeing your legal argument about specificity. In my view, this peak identification issue is not Floyd's winner - specificity is. The gaping hole of the missing mass spec data, the different chromatographic conditions, the poor separation of the 5bA and 5aA peaks, and the overall baseline issues, all leave the door wide open to specificity problems.

There has to be a way in which LNDD has to prove that they measured only what they say they measured, whether it be ISL 5.4.4.2.1 or something else. It just doesn't make sense to say that they don't have to do that. So if you can show exactly how that is required, I think that would be very interesting.

Personally I think paragraph 233 and following of the majority decision is an important place to look. To my reading, they speak of ISL 5.4.4.2.1 as if it is binding to practice not just method, but they focus on the weaker language of "matrix interference" and ignore the stronger language of "specificity."

I thought this discussion of 5.4.4.2.1 was one of the weakest sections of the decision (second only to their claim that WMA didn't know what he was talking about).

syi

Mike Solberg said...

Oh yeah, and to my first paragraph above, add the likely presence of dexamethasone and methylprednisolone.

syi

Larry said...

syi -

OK, I'll see what I can do. But specificity is definitely a more challenging legal argument, especially if what you want to prove is that the lab was required to examine the complete mass spectrum for each peak.

Here's one challenge for you to think about (and when I describe this challenge, I'm skipping over a bunch of other challenges that we'd have to meet before we ever reach this challenge). Remember, ISL rule 5.4.4.2.1 talks about the ability of the lab's techniques to detect only the substance of interest. What techniques does a lab use to detect only the substances of interest? I think it comes down to sample preparation and setting up the right chromatographic conditions for optimal peak separation. From this perspective, the mass spectrum analysis is not a technique to ACHIEVE specificity, it's a CHECK to see whether you've achieved specificity in a given case.

Does 5.4.4.2.1 require the lab to CONFIRM that its techniques for achieving specificity actually work? Arguably yes, as part of the process of setting up these techniques in the first place and having them reviewed and approved by WADA in the accreditation process. But does rule 5.4.4.2.1 require the lab to REPEAT the confirmation process every time it runs a test?

We may end up having to argue that you can't rely on sample preparation and good chromatographic conditions to achieve specificity, that these techniques are unreliable to a certain extent, and for this reason we must require an analysis of the mass spectrum in every case. In other words, the lab's technique for achieving specificity would have to include repeating sample preparation and set up of chromatographic conditions (and if necessary, making changes to these techniques) as necessary to produce good mass spectrum results. You can probably see where any such legal argument is going to get messy.

Thoughts, comments and reactions?

Ali said...

Everyone,

Just for the record, I'd prefer it if people didn't have to qualify their opinions with their background. I wish I hadn't said that (I was reacting - a weakness). Regardless of the subject matter, I think everyone is capable of making potentially important observations. A lot of the stuff the "lawyers" have posted has already made me think twice about stuff I thought I understood. Most of the science stuff boils down to knowledge and common sense. I assume that it's the same for legal stuff. This isn't a competition, we're exploring theories so anything goes. In summary:

Larry, no more "I'm not a scientist ...".

M, no more "You're not a scientist ...".

Ali, no more "I'm an idiot, an ass and a time waster ..."

Let's just buckle down and see what happens.

Ali

Larry said...

Ali, I'm not a scientist, but OK. ;^)

Ali said...

Larry,

This is a "legal" one which I've raised before but maybe not highlighted. I was wondering whether you thought it was relevant:

ISL rule 5.4.6.3: When estimating the uncertainty of measurement, all uncertainty components which are of importance in the given situation shall be taken into account ...

LNDD quote an uncertainty of measurement of +/- 0.8 delta units. They apply that to all situations, even overlapping peak situations. Brenna's paper suggests that additional error is inevitable when peaks overlap.

In the presence of peak overlap, is it a violation that they don't account for these additional errors, so that "all uncertainty components which are of importance in the given situation shall be taken into account..."?

It strikes me that these are uncertainty components which are relevant and have not been taken into account?
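To make the question concrete, here is a minimal sketch of how uncertainty components are commonly combined when they are independent of each other: root-sum-of-squares. The 0.8 figure is the value LNDD quotes, as mentioned above. The overlap contribution is a purely hypothetical number picked for illustration; it is not taken from Brenna's paper or from the case record. The function name is made up, and the sketch assumes both figures are expressed on the same basis.

```python
import math


def combined_uncertainty(components):
    """Combine independent uncertainty components by root-sum-of-squares.

    Assumes the components are uncorrelated and expressed in the same
    units (here, delta units) on the same basis.
    """
    return math.sqrt(sum(u * u for u in components))


quoted = 0.8   # delta units: the figure LNDD quotes
overlap = 0.6  # delta units: HYPOTHETICAL extra component from peak overlap

total = combined_uncertainty([quoted, overlap])
print(f"quoted alone: {quoted:.2f}")
print(f"with hypothetical overlap term: {total:.2f}")
```

With these made-up inputs the combined figure comes out larger than 0.8. The only point is that any nonzero additional component pushes the total above the quoted value, so leaving it out understates the uncertainty.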

Ali

Larry said...

Ali, I may be experiencing brain freeze, but where is ISL 5.4.6.3? My copy of the ISL has a 5.4.6.2, but no 5.4.6.3.

Ali said...

Larry,

Apologies, this is probably my fault. SYI posted an extract from an international standard some time back (during the quantization error thread). I assumed that was the ISL (important note: ASSUME stands for making an ASS of yoU and ME, as I have so clearly demonstrated).

I looked back but couldn't find where it came from. Help, SYI !

Ali

daniel m (a/k/a Rant) said...

Ali,

I'm not 100% certain, but I think that SYI may have gotten that quote from one of the ISO standards. Most likely 17025, since that is more specific to medical lab work, but possibly 9001.

Mike Solberg said...

Yes, it was ISO 17025 5.4.6.3. The legal status of ISO 17025 with regard to the actions of LNDD is not clear to me.

But on that uncertainty issue, Ali, in case you haven't seen it, I've been meaning to point you to a document on Arnie Baker's website:

http://www.arniebakercycling.com/floyd/other_links/2nd%20USADA%20Symp%202003.pdf

It's a report of a symposium of people who all work with IRMS, and includes a whole section on uncertainty. The pagination is messed up and frustrating, but you can find what you need.

I have no idea how to answer your question about how to apply the ISO 17025 reference to LNDD's tests.

syi

Larry said...

Swim and Ali, the application of ISO 17025 is somewhat complicated. It's a little more complicated for me, because I've never read ISO 17025 and I don't have easy access to it. This is pretty far outside the area where I practice law, but in my copious spare time, I HAVE considered this issue.

The general rule governing application of ISO 17025 is ISL Section 5.1. ISL Section 5.1 states that Section 5 of the ISL "is intended as an application as described in Annex B.4 (Guidelines for establishing applications for specific fields) of ISO/IEC 17025 for the field of Doping Control." My understanding of ISO 17025 is that it sets forth general rules applicable to labs, but contemplates that specialty labs will need their own rules, and that Annex B.4 of ISO 17025 addresses the need for these specialty rules. So, at least ISO 17025 seems to contemplate the existence of standards like the ISL, and seems to provide that ISL-like standards would be read together with ISO 17025 standards to provide for comprehensive rules governing WADA labs.

ISL Section 5.1 goes on to provide that "[a]ny aspect of testing or management not specifically discussed in this document shall be governed by ISO/IEC 17025 and, where applicable, by ISO 9001." So you would think that ISO 17025 would be fully binding on WADA labs, except for those matters expressly addressed by the ISL.

Unfortunately, things are possibly more complicated than what is described in ISL 5.1. There is, for example, a very broad and unfortunate statement in ISL 7.1:

"References in the International Standard for Laboratories to ISO requirements are for general quality control purposes only and have no applicability to any adjudication of any specific Adverse Analytical Finding."

I might argue that this provision of ISL Section 7.1 needs to be read in the broader context of the entire Section, which addresses the contents of the LDP. But the breadth of the quoted language pushes the lawyer in me away from arguments based solely on ISO 17025.

Mike Solberg said...

Larry, go to ArnieBakerCycling.com, and click on "the Wiki Defense" link. The links at the bottom have a link to ISO 17025 or at least a long section of it.

syi

Larry said...

syi, thanks for pointing out the link.

I left you with a specificity question the other day, as to whether mass spectrum analysis is a means for a lab to achieve specificity, or is a way for the lab to confirm the other means they use to achieve specificity. I'll raise a second fundamental question for you here, which is: what do you mean by specificity?

Let's take the example of an egg. When have we achieved egg specificity? Clearly, if I separate the yolk from the white, and I put the yolk in one bowl and the white in another bowl, then I have egg specificity. This is like a chromatogram with two pure peaks, and good peak separation.

The opposite of a separated egg is a scrambled egg. Clearly, there's no specificity in a scrambled egg.

How about a hard boiled egg? There is a sense in which we've achieved specificity by hard boiling the egg - in a 3D mapping, we can picture the yolk and the whites as separate. But we live in a 2D world, and many slices of hard boiled egg contain both yolk and white. Also, looking at the egg from the outside, the yolk is hidden, and you don't know for sure it's there. For these reasons, I don't think we've achieved specificity with a hard boiled egg. I think a hard boiled egg is like a chromatogram with a co-eluting peak.

Now, the hard question. What about a fried egg? With a fried egg, the yolk is not hidden, you know it's there. However, there are going to be slices of fried egg that contain both yolk and white. With a fried egg, have we achieved egg specificity? I would argue that we have not.

The fried egg is like a chromatogram with overlapping peaks. Does ISL 5.4.4.2.1 reject all chromatograms with overlapping peaks, because the area of overlap is not "specific" to a single substance?

Mike Solberg said...

Larry, I really didn't mean to ignore your challenge/questions. But you ask so many of them at once! Here goes:

"Here's one challenge for you to think about (and when I describe this challenge, I'm skipping over a bunch of other challenges that we'd have to meet before we ever reach this challenge). Remember, ISL rule 5.4.4.2.1 talks about the ability of the lab's techniques to detect only the substance of interest. What techniques does a lab use to detect only the substances of interest? I think it comes down to sample preparation and setting up the right chromatographic conditions for optimal peak separation. From this perspective, the mass spectrum analysis is not a technique to ACHIEVE specificity, it's a CHECK to see whether you've achieved specificity in a given case."

Exactly right.

"Does 5.4.4.2.1 require the lab to CONFIRM that its techniques for achieving specificity actually work? Arguably yes, as part of the process of setting up these techniques in the first place and having them reviewed and approved by WADA in the accreditation process. But does rule 5.4.4.2.1 require the lab to REPEAT the confirmation process every time it runs a test?"

No, I don't think so.

But specificity is still an issue, as follows:

5.4.4.1 Selection of Methods - Standard methods are generally not available for Doping Control analyses. The Laboratory shall develop, validate, and document in-house methods for (I add for clarity - "identification of") compounds present on the Prohibited List and for related substances. The methods shall be selected and validated so they are fit for the purpose.

So the lab has to "develop, validate, and document" their own methods for identification of compounds on the Prohibited List. Of course, testosterone itself is not on the prohibited list, but rather only exogenous testosterone. So you have to have methods that are fit to identify that exogenous testosterone is present. That is done by measuring the CIR of the metabolites, and comparing that to the ERC. But you don't know that you have found the right CIR unless you know that you are only measuring the metabolite of interest. So your method can't possibly be fit for purpose if you have not guaranteed specificity in the GC/MS step of the GC/C/IRMS test, and clearly maintained that "purity" knowledge into the IRMS by maintaining consistent chromatographic conditions.

5.4.4.1.1 ... The Laboratory must develop as part of the method validation process acceptable standards for identification of Prohibited Substances. (See the Technical Document on identification Criteria for Qualitative Assays) That's our TD2003IDCR.

Again, the prohibited substance is not testosterone, but exogenous testosterone, so the lab is required to develop acceptable standards for the identification of exogenous testosterone, which they have not done until they have guaranteed the content of the peak for which they are finding the CIR. So certainty of specificity must be part of the method, and the validation of the method.

"We may end up having to argue that you can't rely on sample preparation and good chromatographic conditions to achieve specificity, that these techniques are unreliable to a certain extent, and for this reason we must require an analysis of the mass spectrum in every case."

You know, this may really be when we say "therein lies the rub." (And, this may be jumping ahead 15 steps, but this point may explain why Landis' legal team approached the hearing the way they did, rather than arguing about whether LNDD produced the mass spec data.) The question does become, "Is sample preparation and (good) chromatographic separation adequate to guarantee specificity?"

I think even WADA recognizes that the answer to this question is "Eh, maybe." TD2003IDCR, which is precisely intended to address the question of identification (and thus specificity) allows for identification by chromatographic separation (with a match of retention times to a standard run contemporaneously). But then it also says "A full or partial scan is the preferred approach to identification" !! The limitation of identification by GC separation is recognized in the document. But the "is the preferred approach" language is obviously not mandatory, so the mass spec data can't be required, and the lack of it can't be an ISL violation.

That's sad. They recognize the limitation, and acknowledge there is a better way, but don't require the better way. To me, that is just inexcusable. But it doesn't end the case, because the GC/MS is just the first part of the process of identifying exogenous testosterone.

If the lab kept the chromatographic conditions the same, then maybe it is case closed. But in our case, they didn't. That breaks the link between GC/MS and IRMS and you have lost your (pseudo) certainty of identification. In that case, there is no way to get back your certainty, other than to rely on the quality of the IRMS chromatography. And again, you have a fight about the quality of the chromatography, and whether it is poor enough to bring doubt as to specificity. And again (frustratingly) here we have a situation where the applicable document recognizes the problem, and declares the lower quality acceptable:

ISL 5.4.4.2.1 - Matrix interferences. The method should avoid interference in the detection of Prohibited Substances or their Metabolites or Markers by components of the sample matrix.

And, of course, the arbs let the problem slide because it says "should" rather than "shall" or "must." Again, that's sad.

You know, I think for the first time, I see the logic of the arbs majority decision. I think it sucks, but I think I do see it. Previously, I didn't even think their argument was logical. I guess it is, although I think it allows for Landis to be convicted with second tier science, and the sad thing is that WADA's own documents recognize it as such. LNDD had better ways available to them (complete mass spec data, and better chromatography), they could have used those better ways (and they may have intentionally erased the data associated with one of those better ways), and they didn't do it. That's disgusting.

Given all that, I have a lot more respect for Landis' legal team than I did previously. Given that the WADA documents allow results based on less than ideal science (which they even recognize as less than ideal science), they had to fight for, one, bad chromatography (which would show the GC separation was NOT a "good enough" method of assuring specificity - which they lost because of the permissive language of "matrix interference"), and, two, technical ISL violations which would have given them the burden flip, with USADA having no way to meet the burden given the limitations of the science they relied on.

So, Larry, yes, the challenge you posed was highly insightful and deserving of greater respect. The answer seems to suck, but let's keep working to find a way to hold their lax rules against them.

syi

Mike Solberg said...

As for your second, fried egg, challenge: given that I now understand that WADA says GC separation is good enough for identification and assuring specificity, I think the second challenge is a moot point. It largely doesn't matter what type of eggs you have, or how much mixing of yolk and white you have. If you have put the egg through the right process, then you get to say you have clearly identified what is yolk and what is white (even if your eyes tell you otherwise).

I suppose that there would be some limit to this even for WADA, as if you had the chromatographic equivalent of scrambled eggs. But I guess that's the question we are left with regarding LNDD's work. Do we have the equivalent of chromatographic scrambled eggs or not? Landis and his experts say yes. USADA and their lab directors /experts say no. Two arbs agreed with the latter, so here we are.

But two more points: First, it would seem that there is no basis in the WADA documents with which you could prove "scrambled eggs." There are no criteria set up to decide. According to the documents, if you have gone through the process, it is good enough. It is common sense, not WADA documents, which makes us think there has to be a limit to how much the eggs can look scrambled and everything still be "good enough."

Secondly, common sense, but not WADA documents, would make you think that if there was a question about whether the eggs were scrambled (or merely mixed up fried eggs), then you would require additional evidence to decide the issue. That would be the complete mass spec data. But, alas, that is again, common sense, not the requirement of WADA.

Please tell me if you disagree about the egg analogy now being sort of moot.

syi

Larry said...

Swim -

I'm not trying to make you give up the 5.4.4.2.1 argument. I'm trying to get you to see that it's a complicated argument. I personally have NOT given up on the argument, just to let you know.

Swim, as a non-lawyer you need to be more patient with the process of legal analysis. It IS a process, and you have to muck your way through it. I drew a distinction between techniques to achieve specificity and techniques to check specificity, to make a point that rule 5.4.4.2.1 is not as simple or straightforward as some here have claimed. But this distinction is not the right way to start the analysis. The right way to start the analysis is to think about what we mean by specificity, and see if we can define what we mean, and then to pose some "egg" cases to see how our definition works in practice to see if we're comfortable with the definition we have.

I've started this process over at the "Specificity" discussion, and I suggest we try to work our way through it there. I will move over there to respond to your new posts here -- but probably not tonight.

But one last thing: never, NEVER give up on a legal argument when you think you have common sense on your side. You seem to think that when a lab does a CIR analysis, common sense requires the lab to look at the complete mass spectrum. If you're right, and you may well be right, the chances are good that the law is on your side.