Thursday, November 08, 2007

Idiots look at Data, Part IV: Baselines

In Part III, we basically counted peaks. Now we'll look at the baselines. We should be wary that the vertical scales are not all the same.



ucla: flat



ex92: sloping, bumpy




ex88: major slope





ex90: major slope




ex86: major slope, bumpy





usada173: sloping



usada349: sloping



ex87: modestly sloping



ex84: flat, bumpy



ex93: flat



ex85: sloping, bumpy



ex89: sloping, bumpy



s3a: flat



s3b: rising



Assessment

Test                Slope  Bumps  S+B
UCLA                1      1      2
Ex 92 3-Jul         3      2      5
Ex 88 13-Jul        5      2      7
Ex 90 14-Jul        5      1      6
Ex 86               5      3      8
USADA 173 20-Jul    3      1      4
USADA 349 20-Jul    3      1      4
Ex 87 22-Jul        2      2      4
Ex 84 23-Jul        1      3      4
Ex 93 control       1      2      3
Ex 85 control       2      3      5
Ex 89 control       2      2      4
Shackleton top      1      1      2
Shackleton bottom   2      1      3
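
The S+B column is simply the slope score added to the bumps score. As a minimal sketch (the scores are copied from the assessment table above; the names SCORES and totals are made up for illustration), the totals can be recomputed and ranked like this:

```python
# Scores copied from the assessment table: (test, slope, bumps).
SCORES = [
    ("UCLA", 1, 1),
    ("Ex 92 3-Jul", 3, 2),
    ("Ex 88 13-Jul", 5, 2),
    ("Ex 90 14-Jul", 5, 1),
    ("Ex 86", 5, 3),
    ("USADA 173 20-Jul", 3, 1),
    ("USADA 349 20-Jul", 3, 1),
    ("Ex 87 22-Jul", 2, 2),
    ("Ex 84 23-Jul", 1, 3),
    ("Ex 93 control", 1, 2),
    ("Ex 85 control", 2, 3),
    ("Ex 89 control", 2, 2),
    ("Shackleton top", 1, 1),
    ("Shackleton bottom", 2, 1),
]


def totals(scores):
    """Return (test, slope + bumps) pairs, highest combined score first."""
    combined = [(name, slope + bumps) for name, slope, bumps in scores]
    return sorted(combined, key=lambda pair: pair[1], reverse=True)


for name, total in totals(SCORES):
    print(f"{name:<20}{total}")
```

Adding the two scores with equal weight is a choice, not a law; a different weighting would reorder the middle of the list.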

23 comments:

Mike Solberg said...

We really need to see more examples of good quality chromatograms from real life samples. It is hard to judge when we only have one exemplary good sample. I understand you want to stick with that one because it was used in the hearing, but more examples would be helpful.

syi

DBrower said...

We already added the shackletons, making 3 good ones. Find a few more and I will retrofit them.

TBV

Unknown said...

TBV,

I'm still kind of questioning this whole exercise. You've definitely shown the shoddy work done by LNDD in layman's terms. But does this prove/disprove Floyd used ext. testosterone? Or does this prove/disprove that the testing is very difficult to accomplish?

Mike

PS I'm on Floyd's side.

DBrower said...

The purpose of this series looking at the data is to determine whether the data is good, and whether it appears more or less likely to suffer from the kind of problems identified in Integration for Idiots.

It's not going to prove anything, but it may lend confidence to other conjectures, and illuminate the issues.

TBV

Unknown said...

Thanks TBV.

I wish we could somehow get this in mainstream media...like a Big Sports Illustrated article!

Mike

m said...

TBV,

I think you are distorting the evidence here.


1) As I pointed out below you should be using the Landis F2 graphs if anything, not the F3's.

2) But even as to the F3's, what matters is the baseline around the 5A and 5B and Pregnane. These are flat in the few that I glanced at. Yes there may be sloping in some of the areas far away from the 5A or 5B, but I don't think this poses a problem.

Larry said...

M -

How is this a distortion of the evidence? We're just looking at a bunch of baselines, one on top of the other, in a format that allows us to compare them. I don't see where TBV and Ali have drawn any conclusions from this presentation.

OK, if you want, you can question the criteria TBV and Ali are using in their assessment chart. Maybe it's the wrong criteria, maybe they're not applying the criteria correctly, maybe adding a "slope" value to a "bumps" value is not the best way to assess the level of background noise. But I don't think this is a "distortion". Frankly, after all the talk in this case of "good" chromatography and "bad" chromatography, I personally welcome the introduction of some criteria here.

Why NOT look at the F3? That's the fraction where LNDD found the doping violation.

Also, I don't understand your point about only looking at a portion of the baseline. I understand that we ULTIMATELY care about the noise under the peaks of interest, since those are the peaks we're trying to measure. But as we learned in Integration for Idiots Part II, you can't SEE the noise under an individual peak. If you have good peak separation (which we don't always get), you can see the noise level at the start and the end of the peak, but the noise UNDER the peak is invisible. The only way to infer the noise under a peak is to look at the general pattern of noise on the graph. Why wouldn't you use the entire graph to help determine this pattern?
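
One way to picture the idea of inferring the hidden baseline from the visible parts of the graph: fit a straight line to the points outside the peak and read that line off underneath it. This is only a sketch under assumptions. The trace is invented, the baseline is assumed to be a straight line, and the function names are hypothetical; it is not the method used by any lab or instrument software.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def baseline_under_peak(times, signal, peak_start, peak_end):
    """Estimate the baseline inside [peak_start, peak_end] from points outside it."""
    outside = [(t, s) for t, s in zip(times, signal)
               if t < peak_start or t > peak_end]
    slope, intercept = fit_line([t for t, _ in outside],
                                [s for _, s in outside])
    return [slope * t + intercept for t in times if peak_start <= t <= peak_end]


# Invented trace: a baseline falling from 10 to 8, plus a peak from t=8 to t=12.
times = list(range(21))
true_baseline = [10 - 0.1 * t for t in times]
peak = [0, 0, 0, 0, 0, 0, 0, 0, 1, 4, 6, 4, 1, 0, 0, 0, 0, 0, 0, 0, 0]
signal = [b + p for b, p in zip(true_baseline, peak)]

estimate = baseline_under_peak(times, signal, 8, 12)
print(estimate)
```

Here the estimate matches the true baseline only because the invented baseline really is a straight line. If the baseline bends or bumps under the peak, the points on either side cannot reveal it.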

DBrower said...

M,

I appreciate your concern, however, I don't think I'm going where you are worried, in no small part because I'm not credible to make specific qualitative determinations. That would be pointless. I'm going somewhere else, and I'm still building foundation.

You have given me some good ideas, and I appreciate the feedback.

TBV

Mike Solberg said...

I need to ask an "idiots" question:

When you talk about "Integration for Idiots" does the word "integration" refer to the meaning used in calculus? Is this "Determining-the-area-under-the-slope for Idiots"? Or is this some more pedestrian use of "integration"?

syi

DBrower said...

It's more-or-less as used in calculus, but I hope not to need that level of math for explanation. Once you are at that level, then you'll never explain it to an arbitrator, much less the public at large.
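
For readers who want to see the calculus meaning in action, here is a small sketch, assuming the trace is a list of sampled points and the baseline is already known. The numbers and function names are hypothetical. It adds up the area under the signal with the trapezoidal rule after subtracting the baseline, then repeats the sum with a slightly wrong baseline.

```python
def trapezoid_area(times, values):
    """Area under a sampled curve by the trapezoidal rule."""
    area = 0.0
    for i in range(1, len(times)):
        area += 0.5 * (values[i] + values[i - 1]) * (times[i] - times[i - 1])
    return area


def peak_area(times, signal, baseline):
    """Area of the peak alone: signal minus the assumed baseline, integrated."""
    corrected = [s - b for s, b in zip(signal, baseline)]
    return trapezoid_area(times, corrected)


# Invented numbers: a triangular peak of height 4 on a flat baseline of 2.
times = [0, 1, 2, 3, 4]
signal = [2, 4, 6, 4, 2]

right_area = peak_area(times, signal, [2.0] * 5)   # baseline placed correctly
wrong_area = peak_area(times, signal, [1.9] * 5)   # baseline placed 0.1 too low
print(right_area, wrong_area)
```

With the baseline in the right place the area is that of the triangle (base 4, height 4). Placing the baseline 0.1 too low adds a strip 0.1 high across the whole width of the peak.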

TBV

m said...

Larry,

To be more precise I'm saying they are distorting the conclusions that can be drawn from the evidence they've introduced. The clear implication of these charts is that the total number of errors in the LNDD chromatograms is higher than in the UCLA, and that somehow that "matters".

I'm saying it doesn't "matter" wrt the doping findings unless it affects the metabolite measurements. If there is some isolated sloping baseline way out in the corner, it doesn't affect the metabolite. That should be made clear.

If that's not their point, then what is their point? Just that the LNDD chromatograms are messier than the one isolated cherry-picked UCLA chromatogram that they put up? So what?

bostonlondontokyo said...

So, the idiots lessons aren't actually leading to a conclusion? Now I'm very confused - I've been following these and thought that there was a new legal argument coming from the analysis, but no? If it's simply an intellectual exchange, that's fine, but it was said a few times that we should wait, that it was all building up to something. I'll keep reading, but I do hope something comes to light that I haven't seen already!

Larry said...

M -

I don't know yet what TBV and Ali are going to conclude. In their "Integration for Idiots" course, they built some foundation for what they might conclude here. But I haven't heard them conclude ANYTHING yet. (I get the feeling from some of TBV's posts that they're not sure WHAT to conclude yet, that they're just walking through the science and the evidence.)

So far, I think that TBV and Ali have been reasonable, they've answered all questions (well, most of them, I still have a few unanswered posted questions) and they're willing to go back to earlier posts as needed to explain statements in later posts. I think we can be patient and see where all this is going to lead.

On your more specific points, we DO have TBV and Ali's analysis of background noise in Part II of the Integration series, and they argued there that the more complicated the pattern of the background noise, the more difficult it is to analyze the peaks. No one there had a problem with that statement. But now we're dealing with real-world background noise results, so you're raising questions we could not address back in Part II of Integration.

I think you're right -- we care about noise that affects the measurement of the IRMS peaks of interest for a given test. But to reiterate, we don't know the noise level under any given peak, because the peak subsumes that noise level. We can only make assumptions about the hidden noise level under any single peak by looking at the noise level shown by the graph.

There are a number of cases where I think we'd want to look at the entire graph to determine conditions that might apply under a peak of interest. For example, look at ex88, where TBV and Ali conclude that the noise graph is a "major slope." From looking at the entire graph, I think the noise slopes down most sharply towards the beginning of the graph, and levels off quite a bit (but not entirely) once you reach the middle of the graph. It seems useful to me to look at the entire graph to determine this pattern -- then you can more accurately determine the noise conditions in the immediate vicinity of the peaks of interest. In contrast, if you just look at a small portion of the graph, you're likely to reach the wrong conclusion (i.e., that the slope remains constant and as severe as we see at the beginning of the graph, or that there's no slope at all).

Another example is ex86. TBV and Ali characterize this graph as "major slope, bumpy." THAT's an important characterization, because it raises the possibility that there could be a bump in noise underneath a peak we're trying to measure. Now, when I look at the entire graph, it seems to me that the slope is bumpier towards the beginning and less bumpy towards the end. That's information I'd want to have when doing my analysis. If the peak I'm trying to measure is at the end of the graph, I could argue that there was less chance of interference from a noise bump. In any event, I think you'd want to look at the entire graph to see if there were any patterns like this.

If you're arguing that we should ignore some aberrant, isolated variation in noise that is distant from the peaks in question and that seems unlikely to be repeated under any of the peaks in question, I'd agree. As an example, you can see on the UCLA noise graph that the noise starts flat at the very beginning of the graph, then ramps up. I think we could and should ignore that "ramp up", and that TBV and Ali should not count that ramp-up as a "bump" or in their computation of "slope."

Let's revisit what TBV and Ali are trying to prove here.

1. I think the main point is that there are considerable variations in the noise levels, and in the patterns of noise levels, shown in these graphs. They're not all the same.

2. TBV and Ali have tried to assign criteria to these differences, so that (for example) UCLA gets a "2" and ex86 gets an "8". I don't think you'd argue that there's no difference in the noise level shown in these two graphs (you might eventually argue that the differences are not significant for our purposes). TBV and Ali have concluded that the noise level and variations in ex86 are the most problematic of the bunch. If you want to see the worst possible effects of noise levels on test results, and if you're looking for a real world example, you could do worse than look at ex86.

I may be reading something into what TBV and Ali are saying, but I think that's all they've said so far. They have not said, for example, that ex 86 is unacceptable chromatography. (I expect they WILL say this, but they haven't said it yet.)

m said...

Larry,

WRT 88, I think you have illustrated my point that you should focus on the point of analysis.

The downward slope occurs at the far left beginning of the graph. By the middle around the 5B and 5A, peaks 6 and 7, the baseline actually slopes up just a teeny teeny little bit. Take a look at the original graph of 88 at part III.

http://bp2.blogger.com/_xX3hgPBOgag/RzNJDurfVHI/AAAAAAAAA0I/uuH6Vx3Mby4/s1600-h/ex.88.landis-f3-zoom1.png


As to background removal and noise, I think you are wrong. If the baseline is flat you can correct for the noise. That was what Meier was saying in his testimony about slide 16 wrt the lower left and middle examples in the slide.

So I don't think the noise around the 5B and 5A peak in 88 poses a problem.

Larry said...

TBV and Ali, you've introduced (I think) a new concept here: the "noise bump". I didn't think there could be a noise bump, I thought that any small bumps I've seen in these graphs were from unidentified substances.

Maybe this leads into a second question, what is noise? I'd thought that noise in chromatography was like static for an FM radio: in the process of searching for signal, you're going to pick up noise. What is this noise? It's just stuff in the background that's hard to avoid because the technology isn't perfect. The noise is the background radiation from the big bang, or it's from the vacuum cleaner running in the next room -- some natural source of interference.

Using the radio analogy, I wouldn't expect the noise level to be constant across the extent of a chromatography graph, just like noise might not be constant across the radio dial. This is probably more an AM thing than an FM thing, but you might have more noise around 550 AM and less noise around 1600 AM. But I would NOT expect a bump in noise at, say, 1050 AM.

Pushing the radio analogy further: the better your equipment, the lower the noise level. But also, if you're trying to pick up a weak station, you have to turn up the radio volume higher and you're going to hear more static.

So ... is chromatography noise like radio noise? What causes it and how do you avoid it? Why would you get a noise bump, and how would you avoid noise bumps? When LNDD did their tests and got bumpy noise graphs, was there someone vacuuming in the next room?

BLT, I think that TBV and Ali are trying to explain to us what is good chromatography and bad chromatography, and why it matters. I don't think TBV and Ali are necessarily leading up to a legal argument. I counsel patience.

m said...

Larry,

Are TBV and Larry qualified to teach us what is good chromatography? Perhaps I should have asked this at the very beginning.

Larry said...

M -

You mean, "are TBV and Ali qualified to teach us", right? LOL! For the record, I am NOT qualified, and was there really any doubt? ;^)

About qualifications, that's a fair question, I don't know. I don't know OMJ's qualifications either. I've always figured that people are free on line to anonymously state their opinions, and that if someone says something on line that's wrong, someone more expert will catch it and correct it. Probably a naive opinion on my part. I've probably gotten away myself with a statement or two that could have been called into question by someone more expert!

Regarding ex88, that's a very interesting example! To make it easier for others to see the graph you're talking about, here's the link: ex 88.

On ex88, there's a slightly up-sloping line drawn under peaks 6 and 7 on the graph you've cited, but the line doesn't look original to the graph. It looks like someone drew it in. And IMHO (very H), I don't think this line correctly describes the noise slope under peaks 6 and 7, which I think is slightly down-sloping. There's a problem I see with this up-sloping line: it's drawn based on the assumption that peak 6 and peak 7 (as shown on the graph) both end at the noise baseline. I think that there's interference between peak 6 and peak 7, and between peak 7 and the unnumbered peak after peak 7, obscuring the true end of each peak, which is actually lower than shown.

There are other apparent examples of this on ex88, for unnumbered peaks between peaks 4 and 6. Look for the very tall unnumbered peak just to the right of peak 4. We'll call this tall, unnumbered peak "peak 4a". Then look for the tallest peak between peak 4a and peak 6 (it's the first in a cluster of three peaks). We'll call this peak "4b". Look at the apparent line of noise under the three small peaks to the left of 4b - it appears to slope up. Look at the apparent slope under peak 4b and the two peaks to the right of 4b - it appears flat or maybe with a slight up tilt. In both cases, this apparent slope interrupts the down slope we can see in the noise across the graph. But it appears that this interruption is an illusion, caused by peak interference. I'd probably conclude that the noise slope under these peaks continued downward, following the pattern we see across the graph.

For an illustration, see the two peaks drawn on figure 12 at Integration Part III. On figure 12, you don't have a rise and fall in the noise level under these two peaks - the noise level is constant. It's just that the two peaks interfere, and you can't see the point where the end of the first peak and the beginning of the second peak reach the noise baseline.

I'd be more inclined to buy into your argument if we could see evidence to the left of peak 6 or to the right of peak 7 that the noise slope had changed from the downward slope we'd seen to that point. I don't see any such evidence. The slope change you're describing is only apparent underneath the peaks themselves. And as I keep saying, the noise level under a peak is obscured by the peak itself - you have to infer this noise level by looking at the noise level on the graph where there are no peaks.

So, IMHO, ex88 is a perfect example of why we SHOULD consider the pattern of noise across the entire graph.

M, this discussion is relevant to the question I asked TBV and Ali about the nature of chromatography noise. Do we expect chromatography noise to follow a general pattern across the graph? Or is it more random (as you're suggesting), where local conditions are a better indicator than general conditions?

(By the way, when we've been asking what TBV and Ali are trying to accomplish here ... maybe it's to get us to actually look at these graphs and think about what they mean!)

Ali said...

m.

We've given you the tool to determine for yourself what is good and bad chromatography. Have a play around with it.

Also, your 11:02 comment says that you think it doesn't matter whether there is a very noisy background. Do you think that the noise is selective? That it decides to stop when our peak of interest comes along? If you have a noisy background before and after a peak, it is fair to conclude that the noise is contributing during the peak as well. That's the nature of noise.

DBrower said...

I wouldn't dream of trying to teach anyone good chromatography. But I am qualified to look at a graph and count things I see.

More later.

TBV

m said...

Larry,

The blind arguing with the blind is not very fruitful.

I can only ask you to read Meier's testimony with respect to slide 16 where he states that graphs that look exactly like the 5A and 5B in 88, or the figure 12 that you referred to, can be corrected for. Look at the middle bottom graph which shows a sloping baseline just like in your figure 12. He says you can correct for that.

Ali,

The same response applies to your 1:29 PM question. I read Meier to state you can correct for the noise under the 5A and 5B peaks.

Ali said...

m,

WMA is correct. In the middle bottom diagram of his presentation, it shows two non-interfering peaks on a sloping background. Not a problem.

Unfortunately, I believe all 5a peaks on the Landis plots are subject to interference. In fact, in my opinion, they look more like the bottom right picture of WMA's presentation, than the middle one. You can decide for yourself if that's the case.
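
The difference between non-interfering and interfering peaks can be sketched numerically. Assuming two equal Gaussian-shaped peaks on a zero baseline (an idealization; the heights, widths, separations, and function names are all made up), the code below finds the lowest point of the combined trace between the two peak centers. When the peaks are far apart the trace drops back to the baseline between them. When they are close it never gets there, so the visible "valley" sits well above the real baseline.

```python
import math


def gaussian(t, center, height, width):
    """Idealized peak shape; real chromatographic peaks are rarely this clean."""
    return height * math.exp(-0.5 * ((t - center) / width) ** 2)


def valley_above_baseline(separation, height=100.0, width=1.0, steps=200):
    """Lowest value of the summed trace between two equal peaks on a zero baseline."""
    lowest = float("inf")
    for i in range(steps + 1):
        t = separation * i / steps  # sample from the first center to the second
        total = (gaussian(t, 0.0, height, width)
                 + gaussian(t, separation, height, width))
        lowest = min(lowest, total)
    return lowest


far_apart = valley_above_baseline(10.0)   # well separated peaks
close = valley_above_baseline(3.0)        # overlapping peaks
print(far_apart, close)
```

In the overlapping case, a baseline drawn through the valley would sit far too high and cut area out of both peaks.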

bostonlondontokyo said...

Ali - you mentioned that consistent interference could also be influencing a peak, which makes perfect sense. I know very little about how these results are achieved, but if it's obvious that the interference could have affected the peaks, wouldn't that have been taken into account in the analysis, or was it ignored? And another question: if you observe the interference, there are still differences (the best thing I could call it is 'contrast'). In other words, if a very tall peak is where it is, can we assume that the interference is constant - meaning that even if there is interference, there is still a peak - but simply a 'messy' peak? I'm asking a billion questions here, sorry.

Ali said...

bostonlondontokyo,

You asked whether, if it was obvious that noise could have affected a peak, that would be taken into account in the analysis. Good question. The answer is: how would you know? All you get back is a plot of the net contribution of all peaks/noise during a particular time period. You can't split them up and say "I got peak A at time T and also got noise/peak X at the same time." You have no idea what caused the response you see. All you know is that the total response at a certain time was X. That may have been made up of a single peak (ideal) or a peak plus many other background/noise/peak sources. So, you may be aware of noise before and after your peak, but can you say exactly what the noise was during your peak?

Your tall peak question is good. You can certainly conclude that there is a significant response at that time and if you can tie it up to GC/MS plots and identify it, good job. Now to the interference question. The answer is that interference is either totally hidden or masks where your peak starts and stops. The effects depend on the magnitude and o/oo value of your interfering peak, relative to the peak of interest. It can cause shifts, making your peak area look smaller, or bigger than it actually is. The measured area is what determines the ultimate o/oo value for your peak. That value is very sensitive to the areas you measure for the peaks. Small mistakes lead to big errors. The tool's there for you to confirm that yourself.
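
To put a rough number on "small mistakes lead to big errors", here is a sketch that assumes the o/oo value is a delta value: the measured ratio of two peak areas (heavy trace over light trace) compared with a reference ratio, expressed in parts per thousand. That is a simplification of what real IRMS software does. The reference ratio, the areas, and the function name below are illustrative assumptions, not values from the Landis data.

```python
R_STANDARD = 0.0112  # assumed reference ratio, for illustration only


def delta_per_mil(heavy_area, light_area, r_standard=R_STANDARD):
    """Relative deviation of the measured area ratio from the standard, in o/oo."""
    return (heavy_area / light_area / r_standard - 1.0) * 1000.0


# Invented areas, constructed so the true answer is -25 o/oo.
light = 1000.0
heavy = light * R_STANDARD * (1 - 0.025)

true_delta = delta_per_mil(heavy, light)
# Same peak, but the heavy-trace area is overestimated by one part in a thousand.
shifted_delta = delta_per_mil(heavy * 1.001, light)

print(true_delta, shifted_delta, shifted_delta - true_delta)
```

Under these assumptions, an area error of 0.1% in one trace moves the result by roughly one full o/oo unit, because the delta scale multiplies the ratio by a thousand.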