Wednesday, December 19, 2007

TIC...TIC...TIC...TIC

Is there something between the 5bA and 5aA? Do the squashed TIC graphs tell us anything? Click images for full-size.



Fig 1: A sample F3 Blank TIC, reported values 5bA -27.54, 5aA -28.40

Fig 2: A sample F3 TIC, reported -28.82, -32.12

Fig 3: B sample F3 blank TIC, reported values -27.54 and -28.31

Fig 4: B sample F3 TIC, reported values -28.79, -31.88



Fig 5 summary: A blank, A sample, B blank, B sample, identically scaled


By appearances, the A blank looks clean, the B blank has a minor signal, and both of the Landis samples have a noticeable and distinct peak between the 5bA and the 5aA. This does not appear in the graphs of the individual peaks as one of the three ions for the expected 5bA and 5aA, suggesting it is something else that has not been accounted for.

How big is the intervening peak?



Fig 6: scaled 0 to 7 million in 350 pixels. Minor peak is 11 pixels high, or 3.14% of the 5bA, or 6.28% of the 5aA.
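The pixel arithmetic in the Fig 6 caption can be checked with a short sketch. The full-scale values (7 million abundance over 350 pixels) and the 11-pixel minor peak come from the caption. The peak heights of 350 and 175 pixels for the 5bA and 5aA are only inferred from the stated percentages and are assumptions, not measurements.

```python
# Scale stated in the Fig 6 caption: 0 to 7 million abundance over 350 pixels.
FULL_SCALE_ABUNDANCE = 7_000_000
FULL_SCALE_PIXELS = 350

def pixels_to_abundance(pixels):
    """Convert a height in pixels to abundance units on the Fig 6 scale."""
    return pixels * FULL_SCALE_ABUNDANCE / FULL_SCALE_PIXELS

def percent_of(minor_px, major_px):
    """Height of the minor peak as a percentage of a larger peak."""
    return 100.0 * minor_px / major_px

MINOR_PEAK_PX = 11   # from the caption
ASSUMED_5BA_PX = 350  # assumption: 5bA reaches full scale
ASSUMED_5AA_PX = 175  # assumption: 5aA is about half of full scale

print(pixels_to_abundance(MINOR_PEAK_PX))                    # 220000.0
print(round(percent_of(MINOR_PEAK_PX, ASSUMED_5BA_PX), 2))   # 3.14
print(round(percent_of(MINOR_PEAK_PX, ASSUMED_5AA_PX), 2))   # 6.29
```

Under these assumptions the minor peak is roughly 220,000 abundance units. The second percentage rounds to 6.29 rather than 6.28; the caption's figure looks like a simple doubling of 3.14.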


(Initial observation by SYI in comments elsewhere).

UPDATE en route: we're told the graphs are better on other pages; we'll update when possible.

100 comments:

m said...

Why don't you post Shackleton's figure 3 while you're at it?

Haven't we discussed this already, along with the Brenna and Meier testimony that addressed this issue and the related Brenna journal article?

My view was that even if there was something between the peaks, Brenna made clear that it didn't improperly bias the 5A carbon values.

Are you raising a new point?

tbv@trustbut.com said...

Shackleton Fig 3 isn't relevant, since it is not a TIC trace. And, it has been shown before, along with the comparable traces from the Landis tests (and blanks).

Looking at the TICs here is new.

Brenna argued the presence of something wouldn't bias the results, but he did not repeat that in rebuttal after WMA showed a -6 difference, and we showed that the presence of a "benign" peak between two others can skew the values.

I would not agree that Brenna made the point you assert "clear".

TBV

Mike Solberg said...

There certainly is something present. It is even more clear in the version of the 'gram they use in the majority decision (p. 41).

m, he's usually leading somewhere...

syi

Ali said...

If Brenna claimed that the presence of another peak either beside or within the peak of interest cannot affect the result, then he was wrong.

Completely and utterly wrong and in denial of his own research.

These new graphs prove that there is something there capable of having an impact on the data processing such that a false positive could be generated.

'Prove that it didn't' would be my stance.

Ali

Mike Solberg said...

TbV, why do you say "looking at the TICs here is new"? That bump has been there all along. It is in the F3 samples from the other days as well, and more pronounced in some of them, like July 22 if I remember right.

I agree it is important. But why is it new?

syi

m said...

TBV,

If I recall correctly, Meier in his slide show pointed out the "mini peak" between the 5A and 5B in the GCMS sample F3, and raised the possibility that it might bias the following 5A results in the GC-IRMS because the "mini-peak" had "disappeared", and based this on the Brenna journal article which he quoted. Brenna testified as to the meaning of his article, and explained why there would be no improper bias. Whether Brenna's testimony was on redirect or rebuttal isn't important. I'm pretty sure it was rebuttal, in any case.

This is not new material.

tbv@trustbut.com said...

It's new here because I never did these pictures before.

Let us admit that the discussion is often circular, and what seems new to one party may have seemed exhausted by others in the last go around.

TBV

m said...

TBV,

OK.

I sent you another post.

m said...

TBV,

"and we showed that the presence of a "benign" peak between two others can skew the values."

I've never bothered to challenge this before. But I don't think you two are qualified to claim that.

I don't claim to have parsed the spreadsheet model. But, as I said when you first came out with it, for everyone on this board it's a black box, and could spew out garbage for all we know. I suggested you run it by the science folks on Daily Peloton, but you thought that crowd was too hostile.

floyd said...

I agree, after all, all Nobel Prize winners run their ideas past Rational Head before publishing them. I'm sure he'll look at it objectively.

Mike Solberg said...

Let us admit that the discussion is often circular, and what seems new to one party may have seemed exhausted by others in the last go around.

Indeed.

I think I have just now begun to understand the science of the various scans (full mass spec, SIM, TIC, etc.).

So, TbV, are you saying that people have sometimes thought that the bump between the 5bA and 5aA was part of one of those peaks (i.e. it was 5bA or 5aA), but now with understanding of the SIMs on USADA 322 is it certain that it is some other substance altogether?

syi

btw, love the new comment format

Larry said...

Swim -

Agreed on the new comment format. Very cool that we can collapse comments, or expand just the one we want to respond to.

I think the point is, without the mass spectrum data, we don't know what the bump is between the 5bA and the 5aA. We don't have any data on that little bump, other than our ability to see the bump on the TIC. If TBV's suspicions are borne out, then we'd know that the bump is made up of one or more of the ions included in the SIM scan, but this could include any of up to 9-16 ions (depending on the nature of the scan), so that wouldn't tell us much.

However, it's also possible that a small bump between the 5bA and the 5aA is characteristic of TICs for CIR testosterone testing. There looks like something similar at UCLA GDC 1362, and at the second Shackleton figure 3 graph. And I can't find any discussion of what this little peak might be!

Interesting to see how similar some of these substances appear to be. For example, notice that 5bA and 5aA both use the same three diagnostic ions, only they seem to appear in different amounts for 5bA and 5aA. Etio and Andro are also very close, they both share the same primary and secondary ion, it's only the tertiary ion that's different.

Ali said...

The other place where the false-positive-inducing peak is very evident is in many of the IRMS plots for the later B samples analysis. You can see it nestled between the 5bA and the 5aA. It really screws up the way they tackle background subtraction.

In fact, it is evident that they can't make their minds up as to the best way of dealing with it.

The spreadsheet predicts that false positives would be likely under these circumstances and, well ...

Ali

Mike Solberg said...

Larry wrote:

I think the point is, without the mass spectrum data, we don't know what the bump is between the 5bA and the 5aA. We don't have any data on that little bump, other than our ability to see the bump on the TIC.

That might not be right Larry - see below.

As for the graphs you link - is the first one messed up? That doesn't look like what we are talking about. And Shackleton has a different type of y-axis (percent, rather than abundance) so I am not sure what the significance of that is. Is that still a TIC 'gram?

Ali, you're right. On some of the other days, the peak between 5bA and 5aA is even more pronounced.

The most interesting day I think is July 22, because it is closest to Stage 17, and because very interesting things happen to this "in-between" peak.

First, back to July 20 - to state the obvious here: if you compare the July 20 GCMS F3 B sample TIC 'gram with the matching IRMS F3 'gram, it is obvious that the "in between" is unaccounted for in the IRMS.

There are at least three possibilities of what could have happened to it.

One, it is in the same position, but simply covered up in the "tail" of the 5bA peak. How that would affect the CIR of either peak is debated, and would certainly depend on where the limits of integration were placed.

Two, it might not have had any carbon in it and so disappeared altogether.

Three, because of different chromatographic conditions between the GCMS and IRMS it might have moved to the right or left and thus have imperceptibly co-eluted with either the 5bA or the 5aA. How that would affect the CIR of either peak would presumably depend on the CIR of the little peak, and exactly how much co-elution there is, and where the limits of integration were placed.

Now, although the following involves some unproven assumptions, we might actually know the CIR of that little peak. It was not measured in the S17 A or B sample because they only measure peaks that reach a certain level on the y-axis.

But when we look at July 22 we just might get some more information. First, on the July 22 GCMS TIC 'gram (USADA 0968) there is also a small peak/bump between the 5bA and 5aA. Note I said "small," - perhaps about the same as on the July 20 'gram.

But then look at the July 22 IRMS F3 'gram (USADA 0991). The "in between" peak has grown (which presumably meant it has a lot of carbon compared to other substances in the 'gram), and in fact had its CIR measured (peak 14).

Now, the CIR of that peak is measured as -23 (very high - i.e. less negative), which admittedly is not very supportive of an argument that it could have skewed the 5bA or the 5aA more negative.

But two things about that: First, when that peak is so close to the 5bA, I don't know how they could get an accurate measurement - maybe it's not really -23. Second, if the 5bA and 5aA can be skewed more negative because of interference from other peaks (regardless of their CIR) - as Ali and TbV show (or claim, depending on your perspective), then that sizable peak 14 could have a real impact.

Now, back to the July 20 'grams. It is certainly possible (perhaps even likely) that a very similar thing happened between the July 20 B sample GCMS TIC 'gram, and the matching IRMS 'gram. To wit, that the peak dramatically increased in size in the IRMS.

And because the chromatographic conditions were not the same between the GCMS and the IRMS, I would say it is highly likely the peak is entirely hidden in either the 5bA or 5aA peak - I don't think there is any way to know which.

So, if those unproven but likely assumptions regarding July 22 are right, then there is surely a serious peak hidden in the 5bA or 5aA of the July 20 F3 IRMS 'gram. At the very least, it is entirely unaccounted for, and there is no way to account for it, because the IRMS can't identify anything.

syi

Ali said...

syi,

Good post.

Yes, now we're getting back into the meat of things. For USADA to demonstrate that none of your scenarios happened, they need to go back to the drawing board and start again with the raw data. That at least may allow them to remove some of the possible screw-ups.

That's the only way this can possibly be resolved. Let's get that data and see if any of the traps for the unwary were actually sprung during analysis.

I'm confident that the LNDD technicians fell foul of a number of predictable traps during their analysis. I guess six months of on-the-job training (washing test tubes) doesn't qualify one as an expert in this field. I feel sorry for the individuals involved. They are clearly out of their depth.

The data. It all comes back to the data. USADA need it to confirm their findings which have now been successfully debunked.

Ali

Larry said...

Russ, this may be simpler than you're making it out to be.

Consider that the MS is a scanner with a single particle detector. If you set the MS to scan for one kind of ion, it's going to give you a very accurate reading of the presence of that ion at all retention times. If you set the MS to scan for a broad range of ions, then the scan is going to lose some accuracy, because the MS can only detect one particular kind of ion at any given moment. So, if an ion with a mass-charge (mz) of 100 arrives for detection at the moment when the MS is scanning for a mass-charge of 101, then that ion is not going to be counted.

Of course, since the MS scans rapidly across the spectrum for different types of ions, the MS is going to record a reasonably accurate sample for all ions even if it is set to record a wide range of ions. This will be good enough for many purposes.

The SIM scan is a scan of less than the total range of possible ions in a sample. A TIC is a scan intended to pick up ions over the widest range of mass-charge where you would reasonably expect to find ions.

Now, imagine that you've taken a total scan of a particular sample, and later you want to produce a SIM-style graph from your total scan to measure for the presence of a particular ion. You CAN do that -- you simply graph the data you have in your total scan for that particular ion, and exclude the data you have for other ions. Of course, that's not going to be as accurate as if you only scanned for the ion in question, but again, it may be good enough for many purposes.

(I've always assumed that the graphs at USADA 322 were SIM-style graphs produced from a full scan. TBV is causing me to question this assumption, but I believe that it is possible to produce graphs for a few ions from full scan data.)

Once you have collected data, there are two logical ways to present this data. One is to display the intensity of the ions you've collected based on the time they were collected, and this is the TIC (as TBV has explained it, the "T" in TIC refers to the fact that you're displaying all of the data you've collected, but not necessarily that the data you've collected represents a total scan). The other is to display the intensity of the ions you've collected by the mass-charge of each ion, and this is a mass spectrum graph.
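As a rough illustration of the two presentations described above, here is a minimal sketch using toy data. The scan values, the m/z numbers, and the function names are all invented for illustration; this is not the lab's software or data format. It only shows how a TIC, a SIM-style trace for one ion, and a mass spectrum could each be derived from the same set of scans.

```python
# Toy full-scan data: one dict per scan (one retention-time step),
# mapping mass-charge (m/z) -> intensity. All numbers are invented.
scans = [
    {100: 5, 101: 1, 102: 0},
    {100: 40, 101: 8, 102: 2},
    {100: 10, 101: 30, 102: 6},
    {100: 2, 101: 4, 102: 1},
]

def total_ion_chromatogram(scans):
    """Intensity versus time: sum every collected ion within each scan."""
    return [sum(scan.values()) for scan in scans]

def extracted_ion_chromatogram(scans, mz):
    """SIM-style trace built from full-scan data: keep one m/z, drop the rest."""
    return [scan.get(mz, 0) for scan in scans]

def mass_spectrum(scans):
    """Intensity versus m/z: sum each ion over all the scans supplied."""
    spectrum = {}
    for scan in scans:
        for mz, intensity in scan.items():
            spectrum[mz] = spectrum.get(mz, 0) + intensity
    return spectrum

print(total_ion_chromatogram(scans))           # [6, 50, 46, 7]
print(extracted_ion_chromatogram(scans, 101))  # [1, 8, 30, 4]
print(mass_spectrum(scans))                    # {100: 57, 101: 43, 102: 9}
```

In this toy case the ion at 101 peaks one scan later than the TIC does, which is the kind of detail a TIC alone would not reveal.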

Larry said...

Mike, brilliant post! I'll need to dive into it in depth later on.

But as to your questions: both the UCLA graph and the Shackleton graph I cited earlier were taken from TBV and Ali's series where they attempted to quantify the potential for error in various chromatograms. I'm not sure about that UCLA graph either, but TBV and Ali included it in their mix, so I thought it was fair game to mention it, and there does appear to be a blip there like the one you're talking about.

As for Shackleton fig 3, I know that TBV has said otherwise, but I don't see how that is not a TIC. In the article (GDC 1098 - 1106), Shackleton identifies this graph as "gas chromatographic separation (on GC/C/IRMS instrument)". It identifies Etio, 5bA, 5aA, pregnanediol and pregnanetriol. I don't understand why this would not be a TIC. The scale on the y-axis of a TIC is pretty much arbitrary, as you're only able to measure the relative sizes of peaks with a TIC. Or so I have read.

Finally, if we DO have a small co-eluting peak in the 5aA IRMS, then I don't think it matters what the delta-delta is for that small peak. I think that the little peak will act like the co-eluting peak shown at figure 15 of Idiot's Guide To Integration Part IV, and increase the apparent size of the larger peak. So if the little peak appears in the C12 portion of the IRMS 5aA peak, then (regardless of the delta-delta of the little peak) the little peak is going to raise the apparent delta-delta for the 5aA.

That is, if I've followed Ali and TBV's discussion correctly, and subject to confirmation by someone who knows the science better than I do.

m said...

Larry,

"So if the little peak appears in the C12 portion of the IRMS 5aA peak, then (regardless of the delta-delta of the little peak) the little peak is going to raise the apparent delta-delta for the 5aA."

I believe that is what Brenna refuted in his testimony. He said it would decrease the 5A delta, not increase it. But we've been around and around on this several times now. Ali even went over the Brenna article and said Brenna was correct in his testimony, and that Meier had been wrong.

The UCLA chromatogram was for the F2 fraction, not the F3. I pointed that out to TBV when he first used it as a cherry picked (my words) example of good chromatography.

I'm checking out of the discussion until the new year, so you all have Floyd to yourselves now. :-)

I did send one last post to TBV, but he hasn't posted it yet.

m said...

P.S.

Happy holidays to everyone, even the "idiots".

Ali said...

So sad that m has been pushed into a corner where he has to lie:

"Ali even went over the Brenna article and said Brenna was correct in his testimony, and that Meier had been wrong."

It doesn't surprise me. I've long held opinions about m which I couldn't publish.

How fortunate that he should publish them himself.

Ali

m said...

Happy holidays Ali

Larry said...

M, the UCLA is an F2? Thanks for the correction. Maybe that blip is not as typical as I'd thought.

The point I was trying to make does not really go to the Brenna versus Meier debate. It is simply that, based on my understanding, it's the mere location of the small co-eluting blip within a larger peak that's going to throw off the delta-delta of the larger peak, and that the delta-delta of the blip probably does not matter.

The Brenna versus Meier debate has to do with the direction in which the delta-delta value is likely to be thrown off, and (I believe) focused on the effect of peak interference and not on the effect of a co-eluting peak. But I've already been corrected once today, and I'm ready to be corrected some more.

Ali said...

In the UK, we still say "Merry Christmas and a Happy New Year".

To me, that has a better ring to it than "Have a nice holiday"

I have holidays throughout the year. Implying that Christmas is just another break from work detracts from its religious and cultural significance. Having said that, I'm not that religious myself, but the whole foundation for Christmas is religious.

I'd prefer to say "Everyone, have a Merry Christmas and a Happy New Year", if I were so inclined to say anything. Which I'm not.

Scrooge Ali

tbv@trustbut.com said...

(as TBV has explained it, the "T" in TIC refers to the fact that you're displaying all of the data you've collected, but not necessarily that the data you've collected represents a total scan).

I'd have used "suggested" rather than "explained" as a characterization of what I've put forward. I don't know, and don't presume to assert one way or another.

TBV

Mike Solberg said...

P.S.
Happy holidays to everyone, even the "idiots".


How kind of you to think of us, m. Whatever your holiday is, may it be a blessing.

I think many of us do well at this time of year to remember the words of the blessed virgin:

"He has scattered the proud in the thoughts of their hearts."

Peace,
Mr. Idiot

bk said...

A little different, but related topic.

Does anybody notice that the Landis F2 of USADA 343 has a peak at 1380 seconds, exactly where the Landis 5A F3 peak is in USADA 349? BUT, the Blank F2 in USADA 340 has no peak at all at 1380 seconds.

Would it be possible that whatever is in the F2 is also in the Landis F3? Anybody know the chemistry here?

bk

Russ said...

Larry,
You guys are on the trail, I think. Great work!

I am not attacking any of it, I am however trying to hone the fine points of understanding and expression of the MS technology here. I think it is helpful to get it as accurately as possible (given our resources :-).

So a few more comments (and off to bed - next chance to comment is Friday evening).

You said:
"Russ, this may be simpler than you're making it out to be.

Consider that the MS is a scanner with a single particle detector. If you set the MS to scan for one kind of ion, it's going to give you a very accurate reading of the presence of that ion at all retention times. If you set the MS to scan for a broad range of ions, then the scan is going to lose some accuracy, because the MS can only detect one particular kind of ion at any given moment. So, if an ion with a mass-charge (mz) of 100 arrives for detection at the moment when the MS is scanning for a mass-charge of 101, then that ion is not going to be counted.

Of course, since the MS scans rapidly across the spectrum for different types of ions, the MS is going to record a reasonably accurate sample for all ions even if it is set to record a wide range of ions. This will be good enough for many purposes.

The SIM scan is a scan of less than the total range of possible ions in a sample. A TIC is a scan intended to pick up ions over the widest range of mass-charge where you would reasonably expect to find ions.

Now, imagine that you've taken a total scan of a particular sample, and later you want to produce a SIM-style graph from your total scan to measure for the presence of a particular ion. You CAN do that -- you simply graph the data you have in your total scan for that particular ion, and exclude the data you have for other ions. Of course, that's not going to be as accurate as if you only scanned for the ion in question, but again, it may be good enough for many purposes.

(I've always assumed that the graphs at USADA 322 were SIM-style graphs produced from a full scan. "

I mostly agree, only two comments.. first: The setting for a "SIM scan" would be for a particular mass not a particular ion. The particular ion and molecular makeup (composed of ions) is to be decoded (that is the challenge for MS specific identification). This is a fine point of terminology but suggests to me a tiny bit of confusion.

Second:
Running select SIM scans (most likely several) means multiple runs and injections of samples. It appears to me that the use of software to analyze the run results, with LNDD lab techs trying different baselines, etc., requires that the full scan was run and all the data be present (within the ranges they targeted for the test).
So I suspect your initial idea of SIM plots from the total collection of the data is more likely correct.

I'll be surprised if these points make much difference, so hope I am not wasting your time.

Keep up the good work and m, if you are reading, have a Merry Christmas.

Thanks all,
Russ

daniel m (a/k/a Rant) said...

Scrooge Ali and M and Everyone Else,

Happy ChanuRamaKwanzaaMass, too.

Or as the English side of my family would say (as Ali pointed out), "Merry Christmas and a Happy New Year."

Larry said...

And a successfully concluded Festivus to us all!

woody said...

You forget that FSM has the ability to affect all results with HIS noodly appendage. Even if all requirements are met and proper procedure is followed, HE wiggles his appendage and voila, the peaks are changed. Do not underestimate HIS abilities. venganza.org

tbv@trustbut.com said...

For the uninformed, my 16 year old daughter explains the FSM is the "Flying Spaghetti Monster" that created the universe. It is offered as a theory as valid and provable as any for a theistic God. And should anything turn up that appears to contradict the FSM hypothesis, "His Noodly Appendage" works behind the visible world to fix things up.

One can easily come up with analogies using the FSM and the testing we've seen in this case, but I wouldn't dream of doing so myself.

TBV

wschart said...

Hey, let's not forget Saturnalia!

Ali said...

As we're kind of back on the poor chromatography track, I thought it relevant to highlight that this was the very subject of "RESPONDENT FLOYD LANDIS' SUPPLEMENTAL PRE-TRIAL BRIEF".

This put the case that poor chromatography and the subsequent peak overlap between 5bA and 5aA (and something else) resulted in unreliable results. This was rejected by the AAA panel.

In other words, they didn't accept that as being true. If they had, then presumably as a minimum, they would go back and examine the EDFs to see the extent of the problem and whether other processing algorithms would yield more accurate results.

I was wondering how valid that judgement was so I had a look on the internet to see what other people thought about it.

I didn't find a single case that claimed poor chromatography had no effect on accuracy. What I did find was:

1) "A general goal of isotope ratio analysis is to make measurements with a precision better than 1 part in a thousand of the naturally abundant value. Clearly, this requirement places stringent demands upon the accurate determination of the baseline, which can be estimated correctly only if there is true baseline separation between adjacent peaks."

-The application of a simple algorithm to isotope ratio measurements by gas chromatography/combustion/isotope ratio mass spectrometry
(L.J.C Bluck and W.A. Coward)


2) "The possible sources of the errors associated are considered ... on average, depleted in 13C by ca. 9‰ relative to the native woods and by ca. 7‰ relative to the Klason lignins. This large variability can be partially attributed to overlapping chromatographic peaks and to the low intensity of some of the peaks."

-Evaluation of errors associated with δ13C analysis of lignin-derived TMAH thermochemolysis products by gas chromatography–combustion–isotope ratio mass spectrometry
(L.E. Beramendi-Orosco, C.H. Vane, M. Cooper, C.G. Sun, D.J. Large and C.E. Snape)

3) "The conventional algorithm resulted in systematic bias related to degree of overlap..."

-Curve fitting for restoration of accuracy for overlapping peaks in gas chromatography/combustion isotope ratio mass spectrometry.
(K.J. Goodman, J.T. Brenna)

4) "Due to the fact that isotope ratios cannot be determined accurately from the partial examination of a GC peak, high resolution capillary gas chromatography (HRcGC) resulting in true baseline separation for adjacent peaks is of paramount importance for high-precision CSIA"

-Handbook of Stable Isotope Analytical Techniques, Volume-I, Chapter 8
(Wolfram Meier-Augenstein)

5) "The major analytical difficulty with this approach is accounting for peak overlap, i.e. when adjacent peaks are not baseline separated. Signal overlap can contribute to both inaccuracies and imprecision in measured isotope ratios"

-Inaccuracies in selected ion monitoring determination of isotope ratios obviated by profile acquisition
(Adam G. Cassano, Benlian Wang, David R. Anderson, Stephen Previs, Michael E. Harris and Vernon E. Anderson)

6) "It follows that when analysed within a complex mixture, this peak tailing would lead to peak overlap precluding accurate stable isotope ratio measurements."

-Biomedical and Forensic Applications of Combined Catalytic Hydrogenation-Stable Isotope Ratio Analysis
(Mark A. Sephton, Will Meredith, Cheng-Gong Sun and Colin E. Snape)

7) "Both peak overlap and peak distortion have a detrimental effect on accuracy as well as precision of isotope ratio"

-Stable isotope analysis of fatty acids by gas chromatography–isotope ratio mass spectrometry
(Wolfram Meier-Augenstein)

However, on the plus side I found a delta symbol "δ". Go crazy ...

Ali said...

... and a ‰ symbol.

That must be worth a high five. Man, I'm cooking today ...

Ali

Mike Solberg said...

That's great, Ali. Good stuff.

I am not sure, however, that you set the quotes in the right context, legally speaking.

[The pre-trial brief] put the case that poor chromatography and the subsequent peak overlap between 5bA and 5aA (and something else) resulted in unreliable results. This was rejected by the AAA panel.

In other words, they didn't accept that as being true.


But I would say it is closer to the truth that they "punted" on whether that was true, because they said it didn't matter anyway. In the majority decision, this all came down to their discussion of "matrix interference." And the problem is that the ISL says that the assay "should" avoid matrix interference, not that it "must" or "shall" avoid matrix interference. This "should" gave them the escape route to avoid a burden flip on this issue.

That is why some of us have been trying to develop the components of an argument about "specificity." Because when it comes to "specificity" the ISL uses the stronger "shall" and "must" language.

Larry thinks this specificity argument is very complex, and that it is not even clear what specificity means in the ISL, but I'm not on board with that yet.

I think that is what we really have to work out next, because, honestly, with the evidence of the July 22 'grams, I think it is abundantly clear that there is a hidden peak in either the 5bA or the 5aA, and there is no way to know which.

The more difficult thing is to tie that with a specific ISL violation to create a burden flip.

syi

Mike Solberg said...

High five, dude. Now you're cookin' with gas!

syi

Ali said...

OK, finally it sinks in. You have to prove a violation to trigger the burden flip. How about rule 5.4.4.2.1:

5.4.4.2.1 Robustness.
The method must be determined to produce the same results with respect to minor variations in analytical conditions. Those conditions that are critical to reproducible results must be controlled.

From the supplemental pre-trial brief:
"Each operator uses her "judgment and experience" to manually select background points as well as peak start and end. As such, LNDD was unable to reproduce the original results, despite 22 attempts to do so."

Hmmm ... Slight problem on the reproducibility front. 22 attempts ? That's a bad day at the office by anybody's standards.

Ali

Ali said...

Or how about:


5.4.4.3.2 Uncertainty.
The method, including selection of standards and controls, and report of uncertainty should be designed to fit the purpose.


So they assess the impact of overlapping peaks when they determine their uncertainty, do they ? If they don't then I'd say it's not fit for purpose.

This legal stuff is a breeze. Why the big bucks ? :-)

Ali

Larry said...

Ali, the big bucks are because we have a legal monopoly on the ability to practice law.

The ISL contains a confusing melange of rules, and not all of them are directly applicable to how a lab conducts testing in a given case. Some of the rules are criteria that WADA is supposed to use to determine whether a lab should be accredited. Some of the rules are criteria that a test is supposed to satisfy before the test can be approved for inclusion in the lab's SOP.

Consider this analogy. Say that you're representing a criminal defendant who drove the get-away car in a bank robbery case. The defendant was arrested by a cop who, at the time of the arrest, weighed 200 pounds. There's an internal police department rule that requires all cops to weigh 180 pounds or less. Can you use this rule to prove that your client's arrest was improper?

(This is not intended to be a perfect analogy! It's just to get the discussion started.)

tbv@trustbut.com said...

OK, Larry, on what grounds in the ISL might you be able to debate the fitness for purpose of a test as implemented in an SOP?

I think the answer is that you, someone charged, can't. This is the "you can't question the science" rule.

It appears to be the case there is no one outside the lab in question that reviews the fitness for purpose of a particular test as implemented in SOP.

While the fallback appears to be the ISO accreditation, it is not clear that ISO evaluates fitness for purpose, only whether the SOP was followed.

Is there anywhere to go to argue an "unfit for purpose" based on the SOP?

TBV

Mike Solberg said...

Ali, the big bucks are because we have a legal monopoly on the ability to practice law.

"Practice" law. Boy, that's for sure.

syi

Ali said...

Larry,

In the case of Robustness (5.4.4.2.1), the requirement is "The method must be determined to produce the same results with respect to minor variations in analytical conditions. Those conditions that are critical to reproducible results must be controlled".

If after 22 attempts, they fail to reproduce the original result, doesn't that mean they fail to meet that requirement ?

The method they use is clearly inadequate for the production of repeatable results. There weren't any differences in the data. It was exactly the same data and they simply couldn't reproduce the original results.

Now, if it were the case that all of their attempts were within +/- 0.8 δC13 of the original, that would be OK, I suppose, but what if any of them weren't ?

If any of those attempts were more than 0.8 away from the original, that would also blow their claimed accuracy out of the water as well as demonstrating that they have a non-robust process.

I guess what I don't understand is why you can't just point to the rule and say "sorry guys, but you're not compliant. Show me how this did not affect the results" ?

I'm confused.

Ali

Mike Solberg said...

Ali, my last post in the "Getting around to specificity" thread might help explain some of the confusion.

syi

tbv@trustbut.com said...

Ali,

You cite 22 tests that didn't reproduce, yet I don't think that is established fact. My recollection is that Young claimed all the reprocessing results were within the 0.8 window of tolerance. I suspect he may have made some qualifications that seem dubious, but he did make that argument.

GDC1350 (which we don't have directly) is cited in page 11 of the USADA closing argument.

There are 8 original values, and 24 reprocessed values. Of the reprocessed values using auto methods, the A sample e-11k, a-11k are beyond the claimed 0.8, and in the B sample, the e-11k, a-11k, and 5a-p are out of spec. That's 5 of 8 out of spec, with the ones using 11k further out.

For manual reprocessing, in the A sample, all match the original processing within 0.8, and in the B, the e-11k and a-11k are out, and the 5b-p and 5a-p are in spec. That's 2 of 8 out of spec, but those two are significant departures (1.4 and 1.9).

When processing full auto on the new machine, in the A sample, only the 5a-p is out of spec, by 1.1. In the B, all are in spec. That's 1 out of 8.

(I do not consider the zero subtraction to be relevant in this discussion.)

I don't think we can reasonably say "all" of the reprocessing shows the reproducibility is bad, given the 0.8 spread allowed by the specification.

We will grant it should have been possible to get exact reproduction using the "saved method" technique, but we need not dwell on that, because it wasn't used, and the tolerance is what is being used. We'd still be left with the variances in the "auto" methods.

In the A, the 5b-p and 5a-p auto methods are out of spec with each other, the 5b-p being worse.

In the B, the auto results for the e-11k, a-11k are bad, but not the 5b-p or 5a-p.

In terms of reproducibility of the data processing results, I'm inclined to trust the auto methods on the Isoprime2 more. However, they do not (and can't) correct for inadequate separation earlier. The new machine's auto thinks the B blank is a 'near positive' (but not quite) for the 5a-p at -3.66.

This same isoprime2 is what reported Landis "positives" on the alternate B's, with the 5a-p values of -4.62, -5.06, -4.80 and -4.96.

TBV

Ali said...

TBV,

I was referring to the following text:

"As such, LNDD was unable to reproduce the original results, despite 22 attempts to do so. Because of this situation, it was agreed that three outputs would be produced and printed for each sample:

1. automatic background subtraction; and
2. manual processing - using the same operator exercising her judgment as she did during the process of creating the original data printed in the LNDD Documentation Package; and
3. no background subtraction"

These 22 appear to have occurred before the reprocessing exercise. Is this what Young referred to when he said they were all within spec ?

Ali

Ali said...

TBV,

I had a look at the reprocessing results as well. I suppose only the original and manual methods are relevant (both being representative of their "method").

As you said, the B sample E-11K and A-11K are out of spec (I get differences of 1.67 and 1.9 respectively). It's important to realise that the numbers in the table are the mean of three tries, so we can safely assume that the largest difference was greater than that for both cases.

If you give them the widest possible margin of error (+/- 0.8 = 1.6), then it is clear that their method fails to meet the expected accuracy. In two cases out of eight, they have failed.
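
A rough sketch of the arithmetic behind the 1.6 figure, in Python. It assumes (this is an assumption, not something stated in the LNDD SOP) that +/- 0.8 is a hard bound on each result around a single true value, so two results from the same data can differ by at most 2 x 0.8. The two differences are the ones quoted above; the function name is made up for illustration.

```python
# Assumption: +/- 0.8 is treated as a hard bound on each individual result.
CLAIMED_UNCERTAINTY = 0.8


def max_allowed_spread(uncertainty):
    """Largest gap between two results that both sit within
    +/- uncertainty of the same true value."""
    return 2 * uncertainty


# Original-vs-manual differences quoted in this comment (B sample).
differences = {"E-11K": 1.67, "A-11K": 1.9}

limit = max_allowed_spread(CLAIMED_UNCERTAINTY)
out_of_spec = [name for name, diff in differences.items() if abs(diff) > limit]

print(f"limit = {limit:.1f}")
print("beyond limit:", out_of_spec)
```

Both differences land beyond the limit even on this most generous reading. A statistical reading of the 0.8 (say, as a standard deviation) would give a different limit, so treat this only as a check of the "widest possible margin" argument.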

That's not a particularly good record, by the way. 25% of Floyd's Stage 17 results couldn't be reproduced with the declared accuracy using the LNDD method.

Also, I'm being very generous here and assuming that the two sets of results happen to be the extremes of what is possible using their method. The probability of that being the case must be extremely low.

So where does that leave the claimed +/- 0.8 δC13 ?

It's clearly not true, so what is their actual accuracy ?

Ali

tbv@trustbut.com said...

Manual appears to be the operational method in use on the isoprime1, but when using the isoprime2 (rabbit ears!), they only use auto.

So the IP2 auto results are reasonable to compare for considering the validity of "the method", I think.

They might want to use manual on the IP2, but they don't know how to.

They evidently don't believe the auto on IP1 is good enough.

TBV

Jon said...

As to the Brenna testimony. Suh did ask Brenna under cross about the "mini-peak(s)" seen on the GC/MS, which disappeared on the GC/C/IRMS. Brenna did not deny the existence of the peaks. Brenna did not deny that the peaks might contain carbon. Brenna did not identify the peaks. Brenna also admitted that the peaks could be absorbed into 5A or 5B. Brenna did not deny that the unknown peaks may have co-eluted. Brenna merely stated that if the unknown peaks were absorbed the carbon content would not adversely skew the total carbon ion content result for 5A or 5B. Brenna did not provide any rationale for his assumptions, as I recall.

Larry said...

Ali -

I guess you don't want to spend any time discussing my fat cop analogy. I don't know why my analogies don't go over better. It's all I can do to get Mr. Idiot to accept that substances passing through the GC are like fried eggs, and now you won't accept that procedures that won't reproduce are like fat cops.

Ali, start with the idea that the ISL is a set of standards for labs. Some of these standards are the ones we're most interested in: these are the standards that govern whether the lab performed a test correctly, or whether the test as performed supports an adverse analytic finding (AAF). But not all of the standards function in this way. Consider, for example, ISL 5.3.6.2, which says that the lab's waste disposal must be in accord with national laws. You're not going to be able to overturn an AAF because the lab didn't put its Coke bottles in the recycling can. Or rule 5.3.7.3.1, which says that the lab director has to be familiar with the list of prohibited doping substances. You're not going to be able to force Dr. Ayotte to sit down at the arbitration hearing to take a pop quiz. You may think these are silly examples, and I DID pick them in part for their entertainment value, but if you read through the entire ISL, you'll see that most of the ISL rules fall into this category. Not everything in the ISL goes to the question of whether we can throw out the results produced by the lab in an individual case.

Next, consider that one of the purposes of the ISL is to provide rules for the labs to follow in developing their doping tests. The key rule here IMO is ISL 5.4.4.1, which states:

"Standard methods are generally not available for Doping Control analyses. The Laboratory shall develop, validate, and document in-house methods for compounds present on the Prohibited List and for related substances. The methods shall be selected and validated so they are fit for the purpose."


In other words, it is mostly left to each lab to develop its own doping tests. Many of the rules in the ISL are intended to guide this development process, to set forth rules governing this development, and to indicate what a test should be able to do in order to be a good test. (Note to Mr. Idiot: any time you see an ISL rule that uses the word "should", consider the likelihood that the rule contains a standard for the development of a lab protocol, as opposed to a rule that applies to the use of the protocol in a given case. We lawyers are not prone to writing rules containing optional pieces of advice, regardless of the idiotic statements of the majority arbitrators that might lead you to think otherwise.)

Ali, as a scientist, you know that there's a difference between the rules you follow when you develop a protocol, and the rules you follow (after the protocol is developed and approved) when you utilize a protocol in a given test. As a general matter, the rules you'll follow in developing a protocol are going to be more stringent. You're going to do more testing. You're going to poke around and see if the protocol gives you accurate results, and if the results are consistent from test to test, and if the results can be reliably achieved under different circumstances, etc., etc. You probably have a minimum standard in mind for the protocol to satisfy, but you'll optimize the protocol and exceed these standards if you can.

But once you've established the protocol, and written it up, and put it into practice, then things change. You're not evaluating the protocol in the same way any more, you're using the protocol in your day-to-day work. Yes, the protocol probably includes some amount of self-check, to make sure that the test results are accurate, but this self-check is not the same as the process you went through to validate the protocol in the first place.

This is the problem we have with the ISL. Some of the rules in the ISL are intended to govern the development of the lab's doping tests. They are rules that govern whether the protocol is valid. They're not rules intended to determine whether the protocol has been validly used in a given case.

Let's consider rule 5.4.4.3.2 as an example. The rule starts (under 5.4.4.3) by explaining the concept of uncertainty in relation to prohibited substances that are threshold substances and non-threshold substances. Rule 5.4.4.3.1 requires the lab to establish criteria for identification of compounds. Rule 5.4.4.3.2 then states that for threshold reporting, the doping test should be designed to fit the purpose (of determining that the substance is present in the athlete's system above the amount of the threshold). This sounds an AWFUL lot to me like a rule addressing the development of an anti-doping test.

The remaining question is, does an athlete have the right to challenge the validity of a test as developed by a WADA lab (as distinct from challenging whether the test was performed correctly or whether the test was interpreted correctly). The answer to this question seems to be "no" in most cases. For the most part, the presumption in the WADA rules that the lab followed the ISL means that the lab is presumed to have developed good protocols that satisfy ISL requirements. There do appear to be exceptions to this rule where a protocol flies in the face of a specific ISL requirement. For example, the ISL required LNDD to identify and measure testosterone and epitestosterone with three diagnostic ions. LNDD used only one diagnostic ion for this purpose. That was too much even for the hand-picked arbitrators to tolerate.

You've argued that the LNDD's use of manual methods to select background points and peak start and end violated ISL rule 5.4.4.2.1 in robustness and reproducibility. Clearly, the FL legal team felt the same way. But the majority ruled otherwise. The rule seems to be, if you want to challenge a test METHOD, you're going to need to point to a very specific ISL requirement (i.e., 3 ions), and arguments based on fuzzier standards probably will not fly.

I've pretty much been limiting my focus to the rules in the ISL governing how tests are to be performed and interpreted, though Mr. Idiot and Duck have forced me (kicking and screaming) to consider the issue of specificity.

Still confused?

blackmingo said...

Jon-

I agree with your characterization of Brenna's testimony. Shackleton, I think, went so far as to suppose that the size of the peak was too small to affect the major peaks, but also did not do much 'splainin' as the kids here like to say.

I looked at the hearing transcript (pages 1262 to 1270, the cross of WMA), where WMA is challenged to come up with an estimate of how much the smaller peak could affect the deltas of much larger adjacent peaks. With the assumptions he's given (small peak is 5% of the major peak, with a -70 delta in the small peak), this would move the major peak estimate by -2 deltas (on page 1270, from -28 to -30).

That's why I like the spreadsheet findings Ali and TBV have posted -they imply that everyone is right about the small peak's effect on the larger (at least that is my limited understanding / take home message).
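
The -2 figure can be checked with a simple area-weighted average, sketched here in Python. This assumes the integration swallows the whole small peak and that the reported delta is just the carbon-weighted mean of the two peaks; real integration and background subtraction are more complicated than that. The function name is made up for illustration; the 5%, -70 and -28 are the assumptions put to WMA in the cross.

```python
def combined_delta(major_delta, minor_delta, minor_fraction):
    """Area-weighted mean delta when a minor peak is integrated into a major one.

    minor_fraction is the minor peak's size relative to the major peak
    (0.05 means 5%). Assumes simple linear mixing, which is only an
    approximation of what the instrument software does.
    """
    total = 1.0 + minor_fraction
    return (major_delta + minor_fraction * minor_delta) / total


measured = combined_delta(major_delta=-28.0, minor_delta=-70.0, minor_fraction=0.05)
shift = measured - (-28.0)

print(f"combined delta = {measured:.1f}, shift = {shift:.1f}")
```

With those inputs the combined value works out to about -30, a shift of about -2, which agrees with the number on page 1270. A less negative small peak, or a smaller one, shifts the result correspondingly less.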

Ali said...

Larry,

I'd have been more than happy to hear about the fat cop analogy, I thought it was shaping up to be quite amusing.

OK, so you're saying that it is dubious as to whether you can challenge their methods, unless you can demonstrate they are not fit for purpose, which may be difficult.

If you can identify something which indicates that they didn't perform the test correctly, then that's an easier target ?

So what about the uncertainty question. LNDD developed a means for assessing the uncertainty in a particular measurement method. That method is laid down in their SOP and they claim an uncertainty of +/- 0.8 δC13. Now they used that method to process the same data twice (first in the original result and then in the reprocessing).

Observed differences between the two sets of data were in excess of 1.6 δC13. Now, the way I see it, there's no two ways about this. Either their method for establishing their uncertainty is flawed, or they didn't perform the test correctly.

In your language, either the development of the protocol or the implementation of the protocol must have been wrong because they have proven that the uncertainty is actually greater than +/- 0.8 δC13.

There's another thing which occurs to me here. TBV said that LNDD used the Isoprime2 for the subsequent B sample work and did not use their SOP. They operated in Auto mode which is not what they normally do. As far as I'm aware, they still claimed +/- 0.8 δC13 for this analysis, even though it was not performed using their SOP method.

So are they claiming that the level of uncertainty is the same between the Auto mode and their "method" ?

That's not what the testimony of Brenna was. He was quite specific that in his lab he insisted on manual confirmation of integration limits and background subtraction because the software could not be relied on to perform this accurately.

I put it to you :-), that the processing of the subsequent B samples was done with no knowledge of the actual degree of uncertainty in the measurement.

That in itself is a clear violation of the ISL.

Ali

bk said...

I just took another look at the Landis F2 peak that is eluting very close in time to the 5A in F3. Looking at the Alt-Bs, the F2 peak is missing in the blanks and Agilurilla controls. The Landis F2 peak is very negative in all those samples.

Does anybody else think this may be more than just coincidence? Is it possible that this pseudo-5A compound is not differentiated in the F2 and F3 separation process?

BK

Larry said...

Ali, I'm finally getting to your 12:21 AM post. Let's talk about uncertainty.

Under the ISL, the concept of uncertainty appears to apply only to the detection of threshold substances. (On this score, you might be interested in the conversation I've been having here with Mr. Idiot, which also turns on the distinction under the rules between threshold and non-threshold substances.) See for example ISL rules 5.2.4.3, 5.2.4.3.2.3, 5.2.6.8, 5.4.4.1.2 and 5.4.4.3.2. In particular, note the statement in ISL rule 5.4.4.3 that for non-threshold substances, quantitative uncertainty as defined in ISL 17025 does not apply. Since testosterone is a non-threshold substance, arguably the ISL rules on uncertainty do not apply to testosterone testing.

But if LNDD has uncertainty rules built into its SOP that are applicable to testosterone testing, then the violation of these rules could give rise to an ISL departure.

The question is, what exactly did LNDD do that violated their uncertainty rules?

I think, Ali, that you're pointing to the different delta-delta values obtained when the data was reprocessed under Dr. Botre's supervision - results summarized at paragraph 139, p. 44 of the decision of the majority arbitrators. Yes, these numbers are all over the map. However, these numbers were obtained by reprocessing the data using 4 different methods: (1) the original method employed by LNDD, involving manual integration; (2) the OS2 automatic integration feature without manual integration; (3) the OS2 automatic integration with the baseline subtraction feature turned off and no manual integration; and (4) the newer MassLynx software. The last 3 of these analyses were performed at the request of FL's team, and are not part of the LNDD SOP. So I'm not sure that the lab's uncertainty rules can be fairly applied to compare results obtained by using methods outside of the lab's SOP.

There's also the problem that FL's S17 sample flunked the CIR test regardless of the analysis method used during the data reprocessing. That suggests that, whatever the problems LNDD had with uncertainty, these problems did not cause the adverse analytical finding.

I get that these variations in results are disturbing. Maybe LNDD needed to use a higher uncertainty factor, something approaching 2 or 3 delta-delta units. But that would not have saved FL from a delta-delta above 6. I also get that LNDD's inability to properly quantify its CIR uncertainty is further evidence that this lab cannot be trusted. But so far, I don't see an ISL departure on this issue.

Ali said...

Larry,

If their uncertainty factor was about 3, then it may well have saved Floyd. He doesn't have to get from -6 down to 0. The threshold for positives is -3.
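
To put numbers on that, here is a rough Python sketch. It assumes the decision rule is "positive only if the delta-delta is more negative than the threshold minus the uncertainty"; that rule is an assumption about how the uncertainty gets applied, not a quote from the SOP. The 5a-p values are the ones TBV quoted earlier in this thread for the alternate B's and for the B blank on the Isoprime2.

```python
THRESHOLD = -3.0  # delta-delta threshold for a positive, as stated above


def is_positive(delta_delta, uncertainty):
    """Assumed rule: positive only if the value clears the threshold
    by the full uncertainty."""
    return delta_delta < THRESHOLD - uncertainty


alt_b_5a_p = [-4.62, -5.06, -4.80, -4.96]  # alternate B's, quoted earlier in the thread
blank_5a_p = -3.66                          # B blank, Isoprime2 auto, quoted earlier

for u in (0.8, 3.0):
    flagged = [v for v in alt_b_5a_p if is_positive(v, u)]
    print(
        f"uncertainty {u}: {len(flagged)} of {len(alt_b_5a_p)} alt-B values positive, "
        f"blank positive: {is_positive(blank_5a_p, u)}"
    )
```

Under that assumed rule, the blank at -3.66 falls just short of a positive with an uncertainty of 0.8, and none of the four alt-B values would count as positive with an uncertainty of 3.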

With regard to the reprocessing data, I was only looking at the original results and the manual results. These were both produced using the SOP, with differences greater than 1.6 from a SOP that allegedly has an uncertainty of +/- 0.8.

If there's a violation here, it's the lab that has to prove that their method for determining the uncertainty is fit for purpose.

They may make the point you made (whatever it is, it's not enough), but I'd say (as this can of worms is now open) prove that your method of determining uncertainty when peaks overlap is +/- 0.8. And if it's not, tell me what it is.

I'd put good money on the fact, and it is a fact, that the additional uncertainty you get when peaks overlap is not accounted for.

I thought all we had to do was get an "in", so that we open things up for investigation. I see their inability to meet their own defined uncertainty as a definite "in".

I can't understand why it isn't.

Ali

Ali said...

Actually, something else occurs to me.

This difference of 1.9 that they got was achieved reprocessing the same electronic data file. It's not like they reran the whole GC/C/IRMS test. Also remember that the 1.9 was the mean of three attempts, so at least one of their tries had a difference of 1.9 or more.

So where does that leave their claimed uncertainty ? Just the data processing aspect can induce differences greater than 1.9 (and that's with somebody looking over your shoulder) and we're not even accounting for the uncertainty of between run differences in the GC/C/IRMS process.

Come on Larry, how far do you have to go ?

Merry Christmas,

Ali

Russ said...

Ali, Larry,

Ali said:
"If there's a violation here, it's the lab that has to prove that their method for determining the uncertainty is fit for purpose."

Seems to me these observations also may feed into the lack of reproducible results?

Another (out of the blue) thought is: It is common to expect another view angle (such as in navigation) to confirm a result. A lack of consistency in different methods of analysis, be it manual or auto or os/2 or newer software, suggests that the data don't add up so to speak. From an accounting standpoint, are the books cooked?

Again great job all,

Merry Christmas to all

Russ

Larry said...

Ali -

Sometimes you DO have to draw me a picture.

I am now completely focused on the chart contained in paragraph 139, p. 44 of the majority opinion, showing the original CIR delta-delta figures obtained by LNDD on the S17 samples, and the CIR delta-delta figures obtained during the electronic data reprocessing. I'm confident that I'm looking at stuff here that's been looked at many times before, just not by me. There's some strange stuff going on here.

1. If I'm following your point here, you are comparing the Original Result column on this chart to the Manual column. The Original Result shows the values LNDD received from its SOP manual processing of the S17 data and the blanks data in July and August of 2006, and the Manual column shows the same SOP processing of the S17 data, this time as part of the reprocessing of the electronic data files (EDF) in April and May of 2007. The blanks data reprocesses fairly well, within an error of around +/- 0.3. The "A" sample also reprocessed reasonably well, roughly within the lab's actual uncertainty of +/- 0.8. The "B" sample did not process this well. The 5A-P and 5B-P were within the +/- 0.8 range, but the E-11K was off by about -1.7, and the A-11K was off by -1.9. And you're quite right, this is an apples to apples comparison of how accurately LNDD can interpret electronic data. This does not factor in any amount of uncertainty for the sample preparation and chromatography, which the lab would need to do in order to properly estimate uncertainty.

2. In paragraph 1 above, I looked only at the amount of uncertainty involved in the processing of electronic data. How much additional uncertainty do we need if we also want to cover sample preparation and chromatography? This is a harder question to answer than the question we addressed in paragraph 1. In paragraph 1, we were able to isolate the data processing component of the CIR analysis from all other components, in part because it's the last step in the process. I'm not a statistician, but there doesn't seem to me to be a similar way to isolate and measure the sample preparation and chromatography, without also factoring in the error introduced by the data processing. However, let's give it a try anyway. Let's compare the Original Result column for the "A" sample to the Manual column for the "B" sample, and the Manual column for the "A" sample to the Original Result column for the "B" sample. This comparison should take in all three sources of uncertainty: sample preparation, chromatography AND data processing. Again, the results for the blanks are good, I'm seeing an error range of around +/- 0.5. For the S17 sample, the original processing of the "B" sample compares very well to the manual reprocessing of the "A" sample - again at an error range of about +/- 0.5. However, the original processing of the "A" sample compares poorly to the manual reprocessing of the "B" sample: the error range here is around +/- 2.4.

3. Next, I'm going to focus on the blanks, reminding myself that blank urine is a "live", "dirty matrix" of what a typical non-cheating athlete's urine is supposed to look like. Skimming through these values, I can see that most of them are mildly negative under LNDD's SOP ... except for the 5A-P (the test that convicted FL). The 5A-P for the urine blanks is significantly more negative than the other three tests. Why should this be? (If we look at the Auto and MassLynx columns - representing data analysis methods not used by LNDD - the blank urine is negative enough to support an adverse analytical finding.) The 5A-P for the urine blanks per the LNDD SOP is between -1.59 and -1.89. That seems way too high to me, and (to my unscientific mind) would justify adding in some more uncertainty. Let's add in enough uncertainty to bring the 5A-P down below -1.0. So ... I'm at an uncertainty of +/- 2.4 for the CIR tests, with a special uncertainty of +/- 3.0 for the 5A-P.

With me so far?

Ali, two quick observations, then I'll pause for a response. First, we don't have nearly enough data in paragraph 139 of the majority opinion to quantify uncertainty. Presumably, uncertainty should be quantified to take into account a reasonable worst case situation. There's no way to tell from the limited data whether this is such a case, or whether this data is anomalous, or whether the real range of error is even higher than suggested by this limited data. Second, at some point in the process, the question of whether LNDD accurately estimated uncertainty fades into irrelevance, and the real question is whether LNDD can perform this testing at all. If an adverse analytical finding is based on a delta-delta of 3 or greater, and if the lab's uncertainty is plus or minus 3, it strikes me that the entire test is a wild guess.

For the moment, I'd appreciate it if you'd focus on the meaning of the data, and leave the ISL to one side. We have to know the facts before we can apply the law.

Larry said...

Ali, following up on my 11:02 AM post, there's another strange thing about the data that I wanted to raise.

I do not understand why the statistics for the urine blanks work out so much better than the statistics for FL's S17 sample. Both the sample and the blanks are "live", "dirty" matrixes, and in theory should be equally difficult to analyze. I don't want to draw too many conclusions from such a limited amount of data ... but it seems like either FL's sample is unusually difficult to process for some reason, or else that the blanks are unusually easy to process for some reason. Care to speculate?

I've been reviewing the transcript of the testimony on this issue. Ayotte claimed in her testimony that the mismatch in the numbers from the original test to the reprocessing was (in her judgment) a problem with the 11-keto integration (transcript p. 898, pdf p. 731). If you ignore the data where 11-keto is involved, the data does improve a great deal (of course, you're ignoring half the data when you do this). Dr. Ayotte appears to have testified that she saw the delta values for each metabolite, and that only the "B" sample 11-keto was "off". I'm not quite sure whether this really means anything - if they had to measure 6 metabolites and they could measure 5 with some amount of accuracy and could not even get close with the sixth, I don't think this is any better for LNDD than if they had equally poor results on all six metabolites. You can give me your opinion on this.

Russ, you are asking about the mismatch (shown on the chart contained in paragraph 139, p. 44 of the majority opinion) between the CIR delta-delta readings obtained with LNDD's SOP manual data process, and the three non-SOP and more automated data processing methods requested by the FL team? That IS troubling.

The testimony on p. 288, pdf pp. 179-180, indicates that the MassLynx software was not able to reprocess the 5A-P sample blank urine (shown on the paragraph 139 chart as -3.66), so you might throw that number out of the mix. There's really no explanation for why these various methods for reprocessing produce such different numbers. Unfortunately, the numbers produced by the non-SOP automatic methods of data processing are often worse for FL than the numbers produced via LNDD's manual SOP data processing.

I DO often get the feeling that chromatography and mass spectrometry are not ready for prime time.

Russ said...

Larry,
"Russ, you are asking about the mismatch (shown on the chart contained in paragraph 139, p. 44 of the majority opinion) between the CIR delta-delta readings obtained with LNDD's SOP manual data process, and the three non-SOP and more automated data processing methods requested by the FL team? That IS troubling."

Yes, that was the main reason for my post but the same logic seems good (from the high level view) for most of the strangenesses.

As to:
"I DO often get the feeling that chromatography and mass spectrometry are not ready for prime time."

It does look that way from this case and especially from LNDD's operations with these machines.

I think I would trust most GC/MS or IRMS results run by labs the way they really should be run - with due care, discipline and skills, providing they also vetted their results. Of course we would also insist on good chromatography!

BTW Ali commented earlier on noise sources, chemical and electronic, I think. I would suggest that you can ignore the electronic noise with this equipment by HP as being below the horizon. The capability to produce flat bottom lines in the plots between well-separated sample elutions illustrates this, I think; thus, it is a non-contributor for practical purposes.

Russ

blackmingo said...

Larry,

Nice discussion. Have you read Arnie Baker's "The Wiki Defense" yet? If you haven't yet got it you should. He has some pages devoted to this topic -I think you'll find pages 74, 185 and 151 address points you've brought up in the last two posts.

We're finally on the way back to the sun -happy solstice.

Dan

Ali said...

Larry,

Regarding your question about the blank urine. In general, there's less going on in the blank IRMS chromatograms than in the sample ones. Less going on, so it's easier for them to process the data with reasonable precision (that's precision, not accuracy).

Another important observation is one Brenna made. Good precision provides no assurance of good accuracy when peaks interfere. In other words, just because you make the same measurement each time, it doesn't mean it's right.

What LNDD demonstrated with their >1.9 difference was a lack of precision. Their uncertainty figure should account for their precision and their accuracy. If just the precision is greater than 1.9, where does that leave their uncertainty ?

Again, the fact that the 5aA results show reasonable precision doesn't mean that they are accurate. If you look back a bit, I referenced 7 papers, all of which highlight the fact that overlapping peaks have a detrimental impact on accuracy. Not precision, accuracy.

Have LNDD accounted for this ? I think they haven't. So not only have they demonstrated poor precision, they have ignored the impact on accuracy that overlapping peaks have.

We could be talking of systematic errors > 3 quite easily when you start to add all this up.

Ali

Russ said...

Larry, Ali,

Ali said:
"Have LNDD accounted for this ? I think they havn't. So not only have they demonstrated poor precision, they have ignored the impact on accuracy that overlapping peaks have.

We could be talking of systematic errors > 3 quite easily when you start to add all this up."

Well I'd like to rummage around in this for a moment.

First to bring up linearity again.
Consider an airplane altimeter. The kind that use air pressure may have a great precision but unless the air pressure is stable and the airplane only circles the airport, the accuracy is left wanting. If it is night or foggy, the requirment to calibrate the altimeter to the airports local air pressure before and after takeoff (and possibly during a long flight) can be life or death to the SOB's (soles on board - yes they used to report that statistic). Now a military jet with it's unlocked access to dual frequency GPS will have a precision picture of it's altitude which will be at variance with the barometric altimeter according to the local variations of air pressure during the trip. The GPS data would be like the true data accuracy you are searching for but isn't available.

Now I think this paint's a picture of what can happen to the data from the LNDD MS and IRMS equipment when the linearity is not verified regularly (SOP or not).

Given Dr Davis' claim (as I interpret it) that the machines go through periods of out of spec non-linear performance, and the fact that this is essentially ignored in the actual test time frame, it would stack up as another unknown that may contribute to quantization errors or elution time errors in accuracy of results.

Further, for the B tests (weren't they actually run on the newer isoprime2?), if the mouse ears changed the machine's magnetic path from its designed path, there is another contributor for those tests.

I seem to remember from Dr Davis that there were questions about pressure monitoring and verification having problems, so in addition to the different ramp and pressure adjustments selected, unreliable use of that may contribute to more uncertainty. This is especially true if the pressure varied even a tiny bit during the tests.

I know some of this has been considered to varying degrees but I do not recall anything that can dismiss them as concerns, as I still see them.

Russ

Russ said...

ooops....

soles on board should have read souls on board :-).

See, an example of how you define your data can make a huge difference (2x in this case).

Russ

Larry said...

Blackmingo, no, I have not read Baker's "The Wiki Defense" yet. I can't keep up with the reading here as it is. If I get a few minutes of extra free time, I plan to start reading other WADA cases, to get a feeling for how the ISL and the WADA rules are being interpreted and to see if there are any good "precedents" we can rely upon in formulating legal arguments. As it is, I can barely keep up with all of the threads here.

Russ, the issue of linearity was addressed by the majority arbitrators, as you probably know. The majority acknowledged that LNDD did not measure linearity on a monthly basis as required by its SOP, and ruled that this was an ISL departure. However, the majority ruled that USADA satisfied the "burden flip" and proved that this departure did not cause the adverse analytical finding, since LNDD apparently did perform a linearity check within 30 days of the testing of the S17 "A" and "B" samples. I have some problems with the majority on how they handled the linearity issue, but they did address the issue, at least in part. There was actually a great deal in Dr. Davis' testimony that I don't think was ever addressed by the majority arbitrators. I was actually looking through this testimony a bit yesterday, to review what Dr. Davis had to say about the EDF reprocessing. I think that Dr. Davis' testimony got somewhat lost in our effort to understand the LDP and the chromatographs. We haven't paid the same kind of attention to how the machines work and how they need to be used and maintained.

Ali, thanks for distinguishing precision from accuracy. Good point. When I tried to raise the +/- 1.9 to something closer to +/- 3.0, I was probably mixing precision along with accuracy. In any event, we should try to stay on one topic at a time, so we'll focus on precision versus uncertainty.

So, Ali, from a scientific standpoint, what would you propose to do in a case where there's such a large gap between the LNDD's stated SOP uncertainty and the uncertainty inherent in the data itself? I can imagine two approaches:

1. The LNDD's SOP uncertainty should be set higher than +/- 0.8. I think this is the wrong approach, because we don't have enough data. I'm not a statistician, but my guess is that you'd set uncertainty based on a wider set of data, and you'd pick an uncertainty on a reasonable worst case basis. So, the mere fact that one result out of 1000 or 10,000 fell outside of your uncertainty range would NOT necessarily mean that your uncertainty was too small.

2. We conclude that the problem is not with LNDD's SOP uncertainty, but with the data they obtained for S17. I think this is the only defensible conclusion. So you have to DO something with this data, and I think the only scientifically defensible thing you could do would be to toss it out the window. I don't see how you could adjust the data or apply a greater uncertainty factor to the data.

Thoughts, reactions? We're pretty close to jumping into the ISL to see whether there's anything the ISL has to say about this situation, but I'd like to know your take from a scientific point of view. Do you follow approach 1 or 2? If 1, where do you reset the uncertainty factor? If 2, is there any way to salvage the data?

Ali said...

Larry,

I'd have to go for option 2. The reason being that, as you imply, there isn't enough knowledge to go for option 1.

I don't know how they managed to get a difference > 1.9. That just indicates that their SOP isn't up to the job.

There are really two issues here: the one which led to that big difference when reprocessing the same data, and another which deals with non-baseline separated peaks.

If you process a chromatogram which has interference between peaks, you may get exactly the same result every time you do it. The question is, is that the correct result ?

As everybody else seems keen on analogies, here's mine. Say you're a carpet fitter and you develop a means of measuring the length of carpet you require by measuring the distance from one wall to another. When you developed this technique, you were working with a nice flat floor, so this method always gave perfect results. However, you get a job in an old Victorian house where the floors are twisted and bowed. You go about your business and measure the distance from one wall to another, but when you cut the carpet to that length, it doesn't fit. You make the same measurement a hundred times and it's always the same. But it's always wrong. The reason being that you developed your SOP based on nice flat floors and when you apply it to non-flat floors, it doesn't work. It gives you the wrong answer.

Say you're just an inch short every time on a 30 foot length of carpet. Doesn't sound like much does it, but remember that for the IRMS results, we're talking parts per thousand. So -6 is 6 parts per thousand out. We're not looking for big errors.

Ali
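To put a number on the carpet analogy above, here is a minimal arithmetic sketch. The function name is hypothetical; the only inputs are the one inch and 30 feet from the comment.

```python
INCHES_PER_FOOT = 12

def error_per_mil(error_inches, length_feet):
    """Express a length error as parts per thousand of the total length."""
    return 1000 * error_inches / (length_feet * INCHES_PER_FOOT)

carpet_error = error_per_mil(1, 30)   # one inch short on a 30 foot run
print(f"1 inch in 30 feet = {carpet_error:.2f} parts per thousand")
```

On these numbers the inch is a little under 3 parts per thousand, so a shift of 6 parts per thousand would correspond to a little over two inches on the same carpet.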

Larry said...

good analogy, Ali.

For my option 2, you agree that from the scientific standpoint, you'd have to throw out the data? That there's no way to salvage it?

Ali said...

Larry,

I think either you take the view that the data is no good (i.e. you're dealing with an uneven floor) and that you're not going to change your SOP. In which case, you accept that you can't measure uneven floors so you don't do those jobs. Throw out the data.

Or alternatively, you recognise that your SOP is only good for flat floors so you change the SOP so that it can measure uneven floors as well.

The question you're asking is, can you do that ? Can you accurately measure the true value when you have non-ideal conditions ?

Well, we all know that Brenna thinks you can improve matters by using curve fitting, but how much investigation has really been done into that ? His research was based on ideal conditions where all variables were under experimental control. No matrix interference, just two ideal peaks.

We also know that Brenna and WM-A didn't agree over the results of that research. If that's where the current level of understanding is in the field of IRMS amongst the experts, then I certainly wouldn't be prepared to stick my neck out and claim that we should be trying to recover results from poor chromatography.

This is a serious matter we're discussing. In my opinion, there's no option but to throw it out and start again.

Ali

Larry said...

Ali -

I think your uneven floor analogy is a proposed explanation for why LNDD could not measure CIR with the precision specified in the SOP. But I don't want to base a legal argument on this explanation. We don't know if this explanation is scientifically correct without some kind of further testing. Moreover, we can't start our analysis by looking at the unevenness of the floor. We don't have any way to measure this unevenness, and no way to gauge how uneven is too uneven.

We started this conversation from a more certain place, with the difference you pointed out between the SOP CIR readings from the original July-August 2006 testing and the SOP CIR readings obtained with the EDF reprocessing. That difference is objective and quantifiable. Moreover, we can compare this difference to the SOP uncertainty of +/- 0.8, to reach a conclusion as to whether this difference is "too big".

OK, let me go back to the ISL and see if there's anything there to turn an unexpectedly large uncertainty reading into an ISL departure.

Ali said...

Larry,

I've never proposed to understand or explain why LNDD are not very good at what they do.

I don't understand why you would even ask me that.

If LNDD have demonstrated a failure to meet their own performance criteria, it is they who should be asked to explain it.
Why should the Landis camp have to explain why the lab fecked up ? How would they know ?

Ask USADA why their lab couldn't repeat the results within ~2 delta units. What was the reason for that ?

Then wait for an explanation.

Then ask them about overlapping peaks and whether that could contribute further error ?

This isn't a one way street here. Both sides have cases to support. I think it's unreasonable to expect the defendant to have to explain the reason for the shortcomings of the prosecution's case. Surely all he has to do is point them out ?

Ali

Larry said...

Ali, what the ...?

I thought YOU'D brought up the uneven floor analogy to help explain why LNDD could not produce numbers within their stated 0.8 uncertainty. I responded that it's a good analogy, but the uneven floor is a judgment call (what is uneven, what is too uneven, all that doctor versus doktor stuff).

How did we get from "doctor versus doktor" to your thinking that I wanted you to explain LNDD to me? To be sure, we ARE interested in understanding how LNDD screwed up, but we have a lot of information on this -- almost too much information, in fact. It's like trying to explain a traffic accident that took place on an icy road, both cars were speeding, one driver was drunk and the other stoned, no one's brakes were working, the headlights were out, and fog had reduced the visibility to zero. What caused the accident?

Why would I expect you to explain that to me?

Heck if I can figure out what I wrote that you didn't understand, but whatever it was, I wouldn't worry about it.

Michael said...

Larry,

I think Ali brings up a basic question that most of us 'normal people' get: Why do the rules require Floyd to 'prove' the Lab messed up? Shouldn't Floyd be able to point out several reasons why he believes the results are inaccurate and have the Lab/USADA explain why he is incorrect?

The way the rules are now, the labs get to stand behind the following statement, "the results are correct because we say they are", which sure isn't fair to Floyd.

I know I believe Floyd raised several questions that the arb panel conveniently ignored. Heck, that was obvious from reading the majority arbs' decision - it didn't flow, didn't make much sense and was WAY too long and scientific. I would have been embarrassed to sign it if I were one of the arbs. But I guess they knew no one would hold them to the stake for their decision.

Mike

Larry said...

Mike -

In any litigation I've ever been involved with, at least one of the parties felt like they got screwed. It didn't matter whether it was a litigation or an arbitration or any other form of due process. It didn't matter what was the governing law, or whether there was a judge or a jury. No one loses in litigation and feels that they were treated justly. I've never seen it happen.

Often, both sides felt they got screwed. The process is unpleasant, and it's rare that anyone finishes the process and feels good about it. Even the winners.

How would YOU have reacted to the process if FL had won? I've never met anyone who lost a case and thought that the process was fair. And I'm not just talking about crazies. The process creates partisans, and this is the case even with people who are good, intelligent and rational. Partisans legitimately see the process, from their legitimately partisan point of view, in a particular way.

I'm not certain exactly what happened in the arbitration. I'm not sure the process mattered. It was pretty obvious that two of the three arbitrators were hard cases. The notoriety of the case interfered with considerations of pure justice, as is always true for notorious cases. The fact that FL fought as hard as he did worked against him in some ways -- I don't think the ADA system wanted to leave us with the impression that the way (or the only way) to win an ADA case was to hire a $2 million law firm.

There were a number of small things that hurt FL's case. The retesting of the "B" samples was bad for FL. It pretty much eliminated any idea that the adverse analytic finding resulted from a single "bad day" at LNDD. Also, the Lemond incident was bad for FL, as completely irrelevant as the incident was to the facts of the case. The Lemond affair eliminated any lingering public sympathy for FL, so the arbitrators knew that there would be no public outcry if they sided against FL.

In the last analysis, what may have mattered more than anything else is that the case became about much more than FL. The entire WADA anti-doping system was put on trial. If FL had won, it would have been a body blow to the entire system. On the other side was the feeling, one that seems to be endemic in the ADA world, that sport (and cycling in particular) is riddled with dopers. In such an atmosphere, it's easy to imagine that the arbitrators were not going to side with FL unless he was able to make an overwhelming case.

My own personal point of view is that FL got screwed. I don't have the same sense that Judge Hue has, that the process is hopelessly unfair. But there are certain aspects of this case that get under my skin. I understand the reason why the WADA rules might limit discovery, but so long as the details of a lab's implementation of the ISL are set forth in the lab's SOP, I think that SOP should be available for discovery. I think that the science was badly flawed - I think that any adverse finding needed to be based on a decent identification of IRMS peaks using a retention time analysis and that the full mass spectrum data should be part of the LDP. I agree 100% with Judge Hue that the pool of ADA arbitrators is way, WAY too small - it's a closed club, and the arbitrators knew that the price they'd pay for siding with Landis would be the loss of their club membership.

Doubtless I am now a partisan too, and doubtless my ability to view this case objectively has been compromised. I know this for a fact, because my ability to analyze this case is not what it should be.

I don't know if this answers your question.

Michael said...

Larry,

Great response. You said exactly what I was thinking only more eloquently.

I also think this answers Ali's question.

The scary thing is that in Floyd's case and the Mitchell Report, the only people truly getting screwed are the athletes. There are no repercussions for the organizations (WADA, UCI, USADA, etc.) or Mr. Mitchell himself. They simply get to destroy the athletes' reputations. I wonder if they sleep well at night.

Heck, has anyone heard a peep from the USADA regarding the sprinter that just had her decision overturned in arbitration? Not that I'm aware of.

So much for honesty and 'doing the right thing' in this world.

Mike

Ali said...

Larry.

Mike was correct in his interpretation of what I said. It may have read like I was a bit pissed off, but that wasn't the case. I was simply trying to highlight something which makes no sense to me.

My floor analogy was a reflection of what I believe to be the truth (at this point in time). I think LNDD can't measure uneven floors.

That's not some absurd accusation that I'm throwing out. Everybody agrees that it's not easy to do and many don't even try. That's what they've published in their research papers.

Given that, why shouldn't the Landis camp be allowed to ask USADA to explain the > 1.9 difference between processing the data on two different occasions ? (which is actually a different thing from the uneven floor source of error). The only people who could possibly know with any certainty are those who did the processing. We can all sit here and speculate that it may have been this or it may have been that, but even I know that speculation is never going to win a case.

This is a clear nonconformance to their own claimed accuracy and if there is even a shred of scientific validity to this case, somebody should have to explain why that happened. To be honest, I see this as more of an "in" to opening up a more detailed inspection of their SOP for the data processing aspects.

There are a number of key questions which I believe the Landis camp should ask and which USADA should be compelled to answer.

For a start, the EDFs are the evidence in this case. Is it acceptable for that evidence to be withheld from the defendant so that his own independent experts cannot inspect it ? That crap about possible tampering is totally bogus. All they need to do is copy the CD for goodness sake. Anything found will be assessed against the original data.

Seriously, are they really saying that the Landis camp cannot see the evidence ? How normal is that ?

It just makes no sense to me. If the evidence was as hard as they claim, why are they so bashful about showing it to us ?

Ali

Michael said...

Ali,

What gets my goat is the deletion of the hard drive data.

I wouldn't be surprised if CAS throws the whole case out based on this simple fact.

Mike

bostonlondontokyo said...

Larry -
Your general thoughts on the case (above) were brilliantly stated - it's nice to hear what people are 'thinking' aside from just the science discussions. To quote one piece of your take on the case:

"There were a number of small things that hurt FL's case. The retesting of the "B" samples was bad for FL. It pretty much eliminated any idea that the adverse analytic finding resulted from a single "bad day" at LNDD. Also, the Lemond incident was bad for FL, as completely irrelevant as the incident was to the facts of the case. The Lemond affair eliminated any lingering public sympathy for FL, so the arbitrators knew that there would be no public outcry if they sided against FL."

I think this is very true, and has often been swept under the rug by Landis supporters. Some of us who followed the case from the beginning feel that Floyd lied to the public. It's been hard to reconcile. It's been hard to understand why Floyd's best friend and manager would have made such a horrific phone call, why Floyd did not adequately answer his involvement in Geoghegan's witness threatening, why Floyd lied about his natural testosterone levels. I don't think it's just an issue of Floyd losing the Public Relations campaign; it raised serious questions about his honesty and his sportsmanship. That can only be made more brittle and difficult when his supporters rush to a science-failure argument as each of these other arguments failed.

This also points to the fact that humans make mistakes - on all sides. Floyd's numerous suggestions about his elevated levels were fraught with human error. Still, we have inhuman and rather Orwellian expectations on the part of labs, scientists and methods.

I'm just glad that you've raised some of the pertinent points about why some of the public has been turned against Landis, and not placed in a position in which trust is granted without question.

Ali said...

Mike,

I'm sure I read somewhere (and I'm getting confused now about where bits of info are coming from) that, yes, the drive was wiped, but that the data was backed up before that happened ?

Maybe that referred to a different data set ?

Who knows.

Confused Ali

tbv@trustbut.com said...

BLT, I'm not sure I understand what it is you think Landis lied about.

You think The Call is an example, and I can't say that I do -- I know Geoghegan as a bull-dog and a loose cannon, and it struck me at the time as a completely "Will Thing" to do. Landis testified under oath about it, with cross-examination, so I don't understand what he didn't address "adequately". You don't think USADA asked him hard enough questions about it?

Then you claim he "lied about his natural testosterone levels". I'm not sure I know what you are talking about. I don't recall a time where he spoke about them in a way that could be called a lie. He was certainly confused and misinformed at the beginning, and if that is what you are referencing, I think the conclusion is not justified.

Again, in your next to last question, you discuss "elevated levels" which Landis never suggested.

That the fumbling public attempts to provide information were ill-formed and taken as excuses was certainly no good from a PR point of view, but I haven't seen any lies in it. That The Call was a fiasco is undeniable, but it hasn't been tied personally to Landis either.

Landis wasn't a star surrounded by a posse trying to protect his interests. The bumbled statement written by Buxeda he read to TV cameras from machine-translated Spanish was a sign of naivete, not duplicity.

Maybe he should have had a staff of handlers from the start, but he didn't. Using Geoghegan, an old teammate and mentor, may have been hoping for too much, since the guy cracked. Up until that moment, it was hard to say they had done a bad job.

Not to put too fine a point on it, but his reputation was totally trashed in August of 2006. That it had been rehabilitated enough for The Call to affect anyone's judgement on anything was quite a comeback in its own right.

None of which has anything to do with the merits of the case.

TBV

tbv@trustbut.com said...

Ali,

As I understand it, the morning before the experts arrived to extract data, LNDD copied what they deemed relevant to some CDs and wiped the drive.

If there was data of interest they didn't deem relevant, it was erased. We don't know if there was full-scan data on the disk or not, but we think it was collected as a matter of course. It may have been deleted earlier.

One can draw a number of hypotheses from this. (1) There was no mass-spec data present, consistent with a disregard for its relevance and importance. They aren't aware they have a specificity issue, and they don't collect it, examine it, or keep it. (2) It was present, and they didn't want it to come to light, and so came up with this plan to eradicate it.

It remains unexplained why they felt the need to copy some data off and wipe the drive that morning. We have not heard an innocent explanation that seems to hold any water.

TBV

Larry said...

BLT, thanks for the nice words.

There are a few particular points in your message that I strongly disagree with. I don't want to get into a protracted discussion of these points, since I'm having trouble enough keeping up with all of my FL posts. But just for the record, (1) I think that FL has been extremely truthful and forthcoming with the public, and (2) while the Geoghegan business was ugly, there's no evidence that FL had anything to do with it. USADA also lied, and USADA also engaged in some practices I found to be ugly.

No one comes out of a prizefight or a litigation looking prettier for the process.

I want to stay as focused as possible on the science, and the application of the ISL to the process. BLT, I respect the things you have to say here, and I hope you understand why I don't want to discuss them in depth.

(TBV, if you want to run with this particular discussion, then of course, go ahead. I'm going to try and stay out of it.)

Mike Solberg said...

Ali, I was looking for some information about specificity, but in an old DPF thread I found a comment on uncertainty by duckstrap.

I know even less about statistics than I do about chemistry, so I haven't been able to comment on all this measurement of uncertainty stuff. But it seems to me that in trying to figure this out you are going to run into the brick wall of non-discovery. It seems like LNDD cannot be forced to divulge the information you need to properly evaluate this.

syi

Larry said...

TBV and Ali -

I don't think that the data handling issue has ever been adequately discussed, not on this forum, not anywhere. (And certainly not in the following post!)

From my review of GDC 1076-79, it appears that the IRMS data from the Isoprime was backed up to CD on October 31, 2006 by an outside firm with a contract with LNDD. The backup was performed by removing the hard drive from the Isoprime and connecting the hard drive to a computer with a CD burner. There was roughly 9 months of data on the CD. The October 31 CDs are sometimes referred to as the "master" CDs. On January 30, 2007, the S17 "A" data was copied to another CD. A further CD was created on the morning of April 26, 2007, just before Dr. Davis arrived at the LNDD for the EDF reprocessing, and this CD contained only the data from the "A" and "B" samples of S17 without additional data. These facts are contained in a document at GDC 1076-79 signed by Simon Davis, so I think these facts are reliable.

Beyond these simple facts, things get murky. My guess is that the data files from the Isoprime hard drive were erased by the same firm that did the CD backup, just after the files were backed up to CD. I suspect that the hard drive on LNDD's Isoprime (limited as it would have to be by the version of the OS2 operating system used by the Isoprime) probably had a capacity of 2GB or less, so the lab WAS probably required to periodically archive this data. Also, given that the LNDD lab techs did not know how to copy files to this hard drive (see testimony of Dr. Davis p. 1764, pdf pp. 1514-1515), it's likely that they had to rely on the outside firm to erase these files. For all these reasons, it seems likely to me that the hard drive was erased on October 31, after the CD backups were made. However, the hard drive data COULD have been erased at any time between October 31, 2006 and April 26, 2007.

There is also the issue of the time stamp for the data files. I've never paid much attention to this in my computer work, but when I copy and paste a file on a Windows computer, the "date created" time stamp for the new file is the time the new file was pasted, and not the time the old file was created. If I delete the old file after pasting the new file, then I lose the ability to tell when the old file was created (or so it would seem). There is software available that can copy a file and give the new file the same create time as the old file ... but it appears that LNDD did not use software like this when it created the backup CDs. So the old file time stamps were lost when the files were copied to CD, and thus the time stamps on these files on the CDs were the same as the times indicated above when the CDs were created. The importance of this lost time stamp information is a topic for a later discussion.
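The copy-and-lose-the-time-stamp behaviour described above can be sketched with Python's standard library. This is an illustration only: it uses the file's modification time as a stand-in for the Windows "date created" field, the file names are hypothetical, and it says nothing about what software LNDD or its contractor actually used.

```python
import os
import shutil
import tempfile
import time

def copy_both_ways(old_mtime):
    """Copy one file with and without metadata; return both copies' mtimes."""
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "original.dat")
        with open(original, "w") as handle:
            handle.write("placeholder data\n")
        # Back-date the original so a fresh stamp is easy to spot.
        os.utime(original, (old_mtime, old_mtime))

        plain = os.path.join(folder, "plain_copy.dat")
        preserved = os.path.join(folder, "preserved_copy.dat")
        shutil.copy(original, plain)        # contents and permissions only
        shutil.copy2(original, preserved)   # also tries to keep timestamps

        return os.path.getmtime(plain), os.path.getmtime(preserved)

one_year_ago = time.time() - 365 * 24 * 3600
plain_mtime, preserved_mtime = copy_both_ways(one_year_ago)
print("plain copy stamped at:    ", time.ctime(plain_mtime))
print("preserved copy stamped at:", time.ctime(preserved_mtime))
```

The plain copy is stamped with the time of the copy, while the metadata-preserving copy keeps the back-dated stamp. Once the original is deleted, only the second kind of copy still records when the data was last written.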

If your focus is on ISL departures, there are enough ISL departures here to occupy us for months. The data handling practices at LNDD were unspeakably shoddy. I'll give you a contrast: one of my clients backs up its complete data set to an off-site location once every two hours, and can reconstruct the data as it existed on any particular day going back about 4 years. Plus they redundantly log every time the data is accessed and every time the data is changed. And they are a tiny company, and IMHO not doing anything all that unusual in today's world! In contrast, LNDD was backing up its data once every nine months, preserving NO changes to these files, and maintaining nothing resembling an "audit trail". The average 10 year old backs up her IM chat history with more care than LNDD took with data that is critical to the careers of thousands of athletes, not to mention to the worldwide effort to combat doping in sports.

I also cannot understand how LNDD could back up 9 months' worth of IRMS data on a single CD. In his testimony, Dr. Davis indicated that every IRMS chromatogram contains over 100,000 data points. Right now, I'm working with a data file (simple comma-delimited text file) with about 8,000 records, no record longer than 35 characters, and it's about 700KB. If I expand this to 100,000 records, we'd be talking about a 10MB file. This would mean that LNDD could store the results for about 60 IRMS tests on a single CD. Wouldn't you think that LNDD did more than 60 IRMS tests in a nine-month period? This leads me to wonder whether LNDD saved only a portion of the available data to the EDF files. But see the testimony of Dr. Davis at pp. 1890-1895 (pdf pp. 1629-1634), which seems to indicate that all data originally generated on the IRMS was available in the EDF. No question, however, that the data created by the LNDD technicians in their manual processing of the data - moving peak starts and ends and adding new data points - was NOT saved to the EDF.

BTW ... we've focused a great deal on the "missing" mass spectrum data. Nothing I've written above, and nothing I can find on line, speaks to what might have happened to the mass spectrum data. As you know, the mass spectrum data would have been captured by the MS, not the Isoprime. To my knowledge, we have no information on how LNDD handled the MS data. It may still exist, for all I can tell.

Larry said...

Mike, Happy Holidays and congrats on nailing down the hyperlink thing.

Why do you think that your posted DPF thread indicates that Ali lacks the information he needs to prove uncertainty? Admittedly we don't have a lot of data, but the EDF manual reprocessing numbers were so far off from the numbers obtained in the original processing, there must be something we can say about this.

I've read the DPF page you cited, and it's interesting (most every conversation between Duck and OMJ is interesting), but I don't see what it has to do with uncertainty. Can you explain?

Ali said...

TBV,

If data is deliberately deleted in the middle of a high profile doping investigation by a "scientific" lab, it's safe to assume that they are trying to hide something.

If they're just trying to clean up disc space, how much do CDs cost nowadays ? A few pence each ? Hey, let's splash out and use a DVD. What the feck, it's Christmas.

Sorry, but something smells bad here and you know what, I'm going to figure out what that is. We've already had one whitewash on this case. The CAS hearing will not be the same.

The Landis camp needs to pre-order their defence data. If the ADAs can't provide the evidence, the evidence doesn't exist. If they can, then let's see it, in detail, so we can analyse it.

I've spent a lot of my personal time on this now. I don't want that time to be wasted. Let's get moving.

Ali

Mike Solberg said...

Ah, yes, I was wondering if anyone would notice the hyperlink thing! All thanks goes to you for the explanation.

I didn't mean anything specific about the link to the DPF thread. Duckstrap said:

"Yet it is true that in the Catlin and Maitre papers, inter-assay variability in the delta values (variation between different determinations on the same urine sample, i.e. sample prep + GC-IRMS, compared over several months) was on the order of 0.3-0.4 o/oo. BTW, this would be the appropriate variation to consider, in my view, when figuring out the sample uncertainty, not the instrument variation arising from different injections of the same previously prepared sample (these were the "QC" samples in the Catlin and Maitre papers)--hence my concern/suspicion that they have underestimated the sample uncertainty."

That just sounded to me like the same sort of subject that Ali has been talking about, so I linked it.

It has sounded like Ali needs more information to understand the demonstrated differences. I just don't think that will be forthcoming.

Ali, with regard to this: "The Landis camp needs to pre-order their defence data." Don't hold your breath.

syi

Russ said...

Larry,
How do you know that the Oct 31, 2006 backup to CD was a single CD?

Here it says CDs (2nd ref):
"The October 31 CDs are sometimes referred to as the "master" CDs. On January 30, 2007, the S17 "A" data was copied to another CD. A further CD was created on the morning of April 26, 2007, just before Dr. Davis arrived"


Also of note, if the firm wiped the drive on Oct 31, 2006, then the Isoprime was apparently down and unusable until Dr. Davis helped reload it nearly six months later. This seems to suggest that the drive was probably erased just before Dr. Davis's arrival, as has been circulating.

Russ

Larry said...

Russ -

See GDC 1076.

"[T]he original backup of the data [illegible] the analysis of the "A" and "B" samples of stage 17 was carried out on October 31, 2006. This is the process that was performed by removing the internal hard disk of the [Isoprime] instrument, connecting it to a PC with a CD-writer and then re-installing the hard disk back in place inside the instrument. The data were stored on two CD-ROMs, labelled as "Backup du 31/10/2006 Data: 010206 -> 251006 CD1" and "Backup du 31/10/2006 Data: 010206 -> 251006 CD1". these are to be cosidered the two "master" CDs."

I had assumed that, since the two CDs were given identical labels, their contents were also identical - in effect, that LNDD made an extra copy of the backup CD, which would have been a good idea (you know, one copy for the lab, another copy to leak to L'Equipe ... ;^) ). However, it's also possible that the data files on the hard drive were too large for a single CD, and that the contents were thus copied onto two CDs (in which case the second CD should have been given a different label).

I did not mean to imply that the entire hard drive was wiped clean, only that the data files had been erased. That could have happened on October 31 and the drive would still have been operational. Also, I don't think Dr. Davis said anything about reloading the OS/2 operating system and the original Isoprime software onto the hard drive, which he (or someone else) would have had to do if the drive had been wiped clean. That would have been a neat trick in any event!

Michael said...

How come none of the Arbs stepped up to the plate and asked USADA why the data was deleted?

Nuts!

wschart said...

Michael:

I assume (and this is somewhat supported by Campbell's comments in his dissent) that the arbs were cherry picking what they felt would support a pre-ordained verdict. Why raise questions that would possibly cast doubt on that verdict?

Michael said...

Oh... But what about the 'Search for Truth'??

And add as much sarcasm to that statement as possible!!

Here's what I think is the worst thing that happens to Floyd: CAS agrees with the arbs and reduces his suspension to July '08 to make it look like CAS cares.

Best thing, which doesn't really matter: CAS finds for Floyd, but he's already served almost a two-year suspension, so who cares???

Mike

bostonlondontokyo said...

TBV and Larry - Thanks for the comments and Larry, I understand you completely, and my comments were more or less moving into territory beyond your thoughts and more into mine.

I think what we're dealing with here are perceptions, and that's why I couched many of my comments with the conditional phrases, for example: 'there are some who feel that Floyd lied to the public.' That's a very different statement than 'Floyd lied to the public.'

There are opinions that colour the case that do NOT have any bearing on test results or SOPs. These, for those viewing the case strictly as a legal brief, are left out of discussion because they are not relevant. This forum is not only a legal forum, so it seems appropriate that other aspects can be discussed.

Some have complete disbelief that Floyd has in any way shown any 'lack of character', where I feel that this is a very biased reading of the facts surrounding Floyd's actions. Many of these facts are not legally relevant, but they have given more context for people viewing the case from the outside.

A good example of this is the use of language. I notice that (for brevity's sake) the attempt at witness tampering by Geoghegan has, at TBV, been truncated to the phrase 'The Call' - well, it was much more than a call. People also connect the dots. Floyd was identified as posting a thinly veiled threat to LeMond on a web forum, and the coincidence that both he and his manager/friend were doing the same thing but were completely unaware of each other's actions leaves a HUGE question mark, and to many implies an attempt to harass LeMond, who was known to be a witness in an arbitration. Just because this never became a legal issue doesn't mean that it hasn't left people wondering just why two buddies would be so crass and juvenile as to engage in this kind of frat-boy threat. It could make people wonder how sportsmanlike a person is if he would engage in such behaviour. I don't think it's hard to see how people would come to that conclusion.

The fascinating portrait of Landis in the 2007 article in the NYTimes magazine also raised many questions about the man behind the story. For those who read the article carefully, we're given a portrait of a man who is far from an angel (a good deal of drinking, somewhat obsessed with reporting about him on the Web, disconnection with his family because of aforementioned obsession with the Web, etc.) This was a reporter's view, and Landis did not dispute any of that information, so we can take it for what it is. Many people on TBV commented about this story.

My overall picture here is that while we look at the charts and discuss the facts, it doesn't change the fact that some people do not view Landis as credible. This is not a fact but a perception. I'm still waiting for a thorough statement from Landis explaining the connection or lack of connection between his Web threats to GL and Geoghegan's threats to GL by phone. I don't think we even know why Geoghegan had GL's phone number, or why Floyd would have given it to anyone else who wasn't an official member of his legal team.

Floyd has spoken about this, but always as a politician, using phrases such as 'it was a bad experience for everyone involved,' and never really explained how this could possibly have happened in the first place. I know plenty of mothers out there who found this behaviour inexcusable, whether it be Will's threat-call to GL or Landis' Web threat, and not something that any child would look up to as an example of good sportsmanship. Later referring to this as a 'bad joke' made matters even worse.

Again, Floyd's character is not on trial, and for good reason. The case IS about the test results and whether or not he doped. I can't even say that the case IS about whether or not he doped, because the arbitration dealt mostly with interpretation of the results and questioning the practices behind those results.

I'll summarize with this - Floyd has at times made it QUITE a challenge to view his case without bias. I do not follow the crazy ranting and raving cycling sites that just crush athletes back and forth, nor do I want to speak ill of Landis, but it's hard to bang the drum for fairness for someone who does not, in my mind or many others', represent someone that I would like to emulate or someone that I respect. However, I keep as open a mind as possible.

Mike Solberg said...

BLT, I hate this whole subject, so I'm not going to say more, but just to answer one insinuation you make:

I don't think we even know why Geoghegan had GL's phone number, or why Floyd would have given it to anyone else who wasn't an official member of his legal team.

Will had GL's phone number because Floyd had GL's phone number. Floyd had GL's phone number because GL called him. Will and Floyd used to sync phones to keep up to date on professional (and personal?) contacts. GL's number was in Will's phone long before the hearing.

Other than that, my only comment is that people are complex, and particularly complex in terribly difficult and stressful circumstances. In my view, 'nuf said.

syi

jrdbutcher said...

BLT,

You are more than welcome to your opinion. A couple of quick points:

1) Floyd is quite connected to his family. The article you cited made an incorrect assertion. My opinion on this is based upon a limited sample of personal knowledge. The article author apparently has next to no personal knowledge. You've extrapolated from that article and have gotten facts wrong in this instance. IMHO.

2) Not excusing WG for his wrong act wrt GL, Floyd doesn't either. The business relationship was severed as a result. I'm guessing/hoping they are still friends. IMO, Floyd is quite capable of separating/forgiving the bad act from the friendship.

3) GL pressed himself into the subject of Floyd's AAF. Further, his attention was unwanted and unwelcome. For whatever reason, GL persisted and divulged some shockingly private information out of context. GL was found to have no standing at the hearing. He had nothing but innuendo to add. The Majority Award Arbs couldn't even bring themselves to give him any standing. The GL affair was an all around bad show and PR nightmare.

4) I blame USADA for throwing GL under the bus, at the hearing, in an effort to discredit Floyd in a way that had nothing to do with the facts of the case.

5) I feel sorry for GL. He was a great rider. Now he is a USADA/WADA dupe. IMHO.

6) Too much time spent on GL.

Ali said...

syi,

'Nuff said indeed.

Also, if occasionally drinking too much and spending too much time on the internet makes you a bad person, then I must be one bad mother, because I do both.

Ali

bostonlondontokyo said...

Ali and Mike and others - thanks for the info about synching phones (I didn't know this was possible). I think the only point here is that people view circumstances differently, and people weigh them differently.

Ali, you took my read of the Landis article totally out of context, and very personally - I'm not sure why you did that. I did not say that someone who drinks or surfs the web is a bad person. I know you're not very fond of me, but please don't take my comment like that - look at the way it was twisted around - not cool.

bostonlondontokyo said...

In light of how difficult a subject this is for people on tbv, I'm going to drop discussion of the 'character' issue... I think minds are completely set, although I keep my own open when people tell me facts, and I thank them for that. However, this isn't about having the winning opinion, and clearly I unnerve some when I even suggest that Landis isn't my hero.

jrdbutcher said...

BLT,
The notion of hero worship is, wrt me, also an incorrect assumption. In fact, I view terminology related to hero worship/me as being derogatory. The source of my motivation to post here can be found elsewhere. If you want to engage in hero worship or to identify yourself through others, best of luck to you. FWIW.

Ali said...

BLT,

I'm not exactly sure how you got the impression that I'm "not very fond of you". I don't even know you and I don't think I've ever responded to more than one or two of your posts before.

As for taking things out of context, if I did (and I'm not sure why you would think that), then it was done unintentionally.

Ali

Jon said...

Blackmingo-
Back to the Brenna testimony, since someone implied that synthetic testosterone detection does not require a threshold.

USADA argument under brief: "Under the ISL and the applicable technical documents, LNDD's statement and its conclusion was correct because LNDD was not required to take into account uncertainty in the measurement or its IRMS delta values."

Q: SUH: Do you--let me ask it this way, when you look at this particular difference, [Stage 17 Andro-11Ketoetio -3.51 per mil] do you see that it doesn't exceed -3 per mil standard?

A: BRENNA: It exceeds -3 per mil; it doesn't exceed -3.8, which would be THRESHOLD for declaring it positive.

It is clear from the testimony that Brenna concluded that the uncertainty had to be included in order for a metabolite to reach the -3 per mil THRESHOLD for detection of exogenous testosterone use. If the uncertainty of +/- 0.8 per mil applies to andro-11ketoetio, then it is reasonable to conclude that the uncertainty would have to be considered for all metabolites for all tests, manual, auto, and with differing software packages, OS/2 or Masslynx, after the uncertainty had been determined by experimental research. Since testosterone is an endogenous steroid, there has to be some criterion to determine the difference between C12 and C13, and the -3 per mil threshold is the accepted method to determine this difference. USADA's argument that LNDD's andro-11ketoetio result of -3.51 per mil could stand without the inclusion of the uncertainty should have been dismissed.
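
The arithmetic in Brenna's answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `is_positive` and its parameters are made up here, the -3 per mil criterion and the 0.8 per mil uncertainty are the figures from the testimony quoted above, and treating the comparison as strict ("more negative than") is my reading, not something in the record.

```python
def is_positive(delta_diff, criterion=-3.0, uncertainty=0.8):
    """Return True if a delta-delta difference (per mil) is more negative
    than the criterion once the measurement uncertainty is allowed for.

    Hypothetical helper for illustration only; not LNDD's or WADA's procedure.
    """
    # Effective threshold: the criterion pushed further negative by the uncertainty.
    effective_threshold = criterion - uncertainty
    return delta_diff < effective_threshold


andro_11keto = -3.51  # Stage 17 Andro-11Ketoetio difference quoted in the testimony

# Ignoring uncertainty (USADA's position): compared against -3.0
without_uncertainty = is_positive(andro_11keto, uncertainty=0.0)

# Including uncertainty (Brenna's answer): compared against -3.8
with_uncertainty = is_positive(andro_11keto)

print(without_uncertainty, with_uncertainty)
```

Run against -3.51, the value counts as positive only when the uncertainty is set to zero, which is the whole disagreement in one line.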