Showing posts with label idiots. Show all posts

Monday, January 28, 2008

Idiots Return to Brenna and WM-A, Part II

Alasdair, once Ali, continues...

In Part 1, we managed to recreate Brenna's results. In Part 2, we're going to explain exactly how they occurred with reference to both the m45 and m44 signals and what happens when peaks overlap. However, before we do that, we need a little more background on what the software does.

Some of you may be thinking "We understand the logic, but m45 leads m44, not the other way round, you idiot."

Well, that's true, but hold on...



Brenna stated that his software compensated for the misalignment between m44 and m45, so we know that an attempt was made to align them perfectly.

That never happened though, because if it had, he would not have observed the results that he did. There are many factors which make perfect alignment extremely difficult: the m44 and m45 signals are sampled quantities; in practice, they are not perfectly the same width and shape; and the m45 peak is hundreds of times smaller in magnitude than the m44 peak and more susceptible to chemical and electrical noise.

So Brenna's software tried but failed to achieve PERFECT alignment.

Older software may not even try to compensate for the m44-m45 time difference. Newer software may try but fail, leaving a small finite misalignment one way or the other. Consider this: to achieve perfect alignment and avoid errors when peaks overlap, you need perfect peaks.

At the end of the day, whether your software compensates or not, it looks like you're going to be left with an error in the alignment of m44 and m45. We don't know whether the OS2 software used for Floyd's Stage 17 compensates or not. We don't know if it behaves like Brenna predicted or whether it behaves like Meier-Augenstein predicted. After we've explained the results, we'll tackle that question.



Before we get into what can go wrong, let's see what is required for accurate CIR measurement. Figure 8 shows an m44 peak, with its corresponding m45 component superimposed on it. The two peaks are perfectly aligned. The integration limits are shown as two vertical lines. Remember, m44 and m45 exist as two separate signals, and we need to show both to explain what’s happening. The software needs to calculate the area under the m45 peak, calculate the area under the m44 peak, and divide the m45 area by the m44 area. The result is the CIR for this substance.
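The recipe above — integrate each trace between the limits, then divide — can be sketched in a few lines of Python. This is a toy model with a synthetic Gaussian peak and made-up numbers of our own choosing, not anything from real instrument software:

```python
import math

def gaussian(t, center, width, height):
    # a synthetic chromatographic peak (real peaks are skewed, not Gaussian)
    return height * math.exp(-0.5 * ((t - center) / width) ** 2)

dt = 0.05                                  # sample spacing, seconds
times = [i * dt for i in range(1200)]      # time axis, 0 to 60 s
m44 = [gaussian(t, 30.0, 3.0, 1000.0) for t in times]
m45 = [0.011 * y for y in m44]             # m45: a scaled-down copy, perfectly aligned

def area(signal, times, lo, hi):
    # trapezoidal integration between the two vertical integration limits
    pts = [(t, y) for t, y in zip(times, signal) if lo <= t <= hi]
    return sum((y0 + y1) / 2 * (t1 - t0)
               for (t0, y0), (t1, y1) in zip(pts, pts[1:]))

# the CIR: area under m45 divided by area under m44
ratio = area(m45, times, 15.0, 45.0) / area(m44, times, 15.0, 45.0)
print(round(ratio, 6))   # recovers the 0.011 scale factor
```

With the m45 trace a perfect scaled copy of the m44 trace, the computed ratio simply recovers the scale factor — which is the point of Figure 8: alignment plus generous limits gives the right answer.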

Let’s say the m45 signal leads (occurs before) or lags (occurs after) the m44 peak by some small amount. As long as there’s no other peak interfering with ours, that’s not a problem. We just position our integration limits so that they encompass both peaks and we’re good to go.


Figure 8. m44 and m45 perfectly aligned. No overlap




Right, let’s introduce another peak to the left of our first peak. Figure 9 shows two peaks of the same CIR overlapping by some arbitrary amount. We use the same method as Brenna to set up our integration limits. Let’s examine what’s happening with Pk2. We can see that our integration limits exclude the left-hand tail of both the m44 and m45 peaks. That means that we’ve lost those areas from our calculation. That’s not a problem, because the m44 and m45 are perfectly aligned, so we lose the same relative proportions of each.

We’re also including the right-hand tail of Pk1. Again, that’s not a problem because in this case both peaks have the same CIR, so the contribution from Pk1 has the same relative proportions of m44 and m45 as Pk2 (it wouldn’t matter if Pk1 were much bigger than Pk2; the previous statement holds true).


Figure 9. m44 and m45 perfectly aligned. Overlapping peaks

So with perfectly aligned m44 and m45, it won’t matter how much those peaks overlap, because they have the same CIR.




Figure 10 shows our m45 leading the m44 by some amount (we haven’t labelled the peaks again as they are clearly identifiable from Fig.9). We’ve had to exaggerate this so that you can see what’s happening. What you’re seeing is m45 leading m44 by an enormously large 1 second. [To put that in context, if Figure 4 had used a 1 second lead instead of a 0.1 second lead, the measured value for that peak (which has a true value of –27) would not have been –28.4, it would have been –41.0!]

We don’t need to look too hard to see that the m44 situation remains the same. We lose the left-hand tail of Pk2, but gain an equal amount of the right hand tail of Pk1. Total area same as before. It’s all going wrong with our m45 peak though. We’ve lost more of our Pk2 m45 than in Fig.9 and we’re gaining less of Pk1’s m45. That’s upset the balance. When we calculate our areas for Pk2 now, the ratio will make it look like we have less m45 than before. That makes our d45 figure more negative and it makes it look more probable that doping was involved.
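Here's a toy simulation of that effect (our own construction with synthetic Gaussian peaks, not the actual software): two overlapping peaks with identical true ratios, a perpendicular drop at the valley of the m44 trace, and the m45 trace optionally shifted earlier. The lagging peak's measured ratio drops when m45 leads:

```python
import math

def gauss(t, c, w, h):
    return h * math.exp(-0.5 * ((t - c) / w) ** 2)

dt = 0.01
times = [i * dt for i in range(6000)]   # time axis, 0 to 60 s

def traces(lead):
    # Pk1 at 25 s, Pk2 at 30 s, overlapping; both have identical true ratios.
    # The m45 copy is moved 'lead' seconds earlier than the m44 trace.
    m44 = [gauss(t, 25.0, 2.0, 1000.0) + gauss(t, 30.0, 2.0, 1000.0) for t in times]
    m45 = [0.011 * (gauss(t + lead, 25.0, 2.0, 1000.0) +
                    gauss(t + lead, 30.0, 2.0, 1000.0)) for t in times]
    return m44, m45

def pk2_ratio(m44, m45):
    # perpendicular drop: cut at the valley minimum of the m44 trace,
    # then integrate everything to the right as Pk2, on both traces
    i_valley = min(range(2600, 2900), key=lambda i: m44[i])  # search 26-29 s
    return sum(m45[i_valley:]) / sum(m44[i_valley:])

aligned = pk2_ratio(*traces(0.0))   # perfect alignment: ratio is the true 0.011
leading = pk2_ratio(*traces(1.0))   # m45 leads m44 by an exaggerated 1 s
print(leading < aligned)            # True: Pk2's ratio drops, d45 goes more negative
```

The exaggerated 1 second lead makes the effect obvious; with a realistic 100-150 ms lead the bias is smaller but in the same direction.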


Figure 10. m45 leads m44. Overlapping peaks




Figure 11 shows the opposite situation, m45 lagging m44. Again, all’s well on the m44 front, no change again.

So what’s happening with our m45 now?

We’re losing less m45 from Pk2 than in Figure 9 and we’re gaining more of Pk1’s m45. That’s upset the balance again, but this time it’s going to increase our m45 areas. That makes our Pk2 d45 figure less negative and it makes it look less probable that doping was involved.


Figure 11. m45 lags m44. Overlapping peaks




There you have it. The mystery of the overlapping peak and why the observations of Brenna and Meier-Augenstein were both valid. It depends on the software system you’re using and how it performs with regard to aligning m44 and m45.

All very interesting, but what about the $64,000 question … What is the LNDD system like?

The reprocessing of the Stage 17 EDFs provides our only real insight into the characteristics of the LNDD equipment. Specifically, we’re going to look at the results obtained from the blank sample. This is useful for us because: We know it shouldn’t test positive; It was processed in an automatic mode by the software; It was also processed in a manual mode by the technicians.

The F3 for the blank showed a degree of overlap between the 5bP peak and the 5aP peak (not to the extent that Floyd’s overlapped, but it was there).

In auto mode, the 5aP for the blank came out as delta –3.65 (“delta” refers to the numerical difference between the d45 value of the substance under investigation and the d45 value of a reference substance generated by the athlete which is known not to be affected by the doping in question). The threshold for a doping positive is delta –3. Anything more negative is considered indicative of doping; however, the uncertainty that LNDD claims is +/– 0.8, so more negative than delta –3 is suspicious, but delta –3.8 is the absolute threshold. As you can see, the known clean blank escaped testing positive only because it was within the uncertainty claimed by LNDD. That was when processed automatically by the software, without intervention. When LNDD redid it using their manual method, they got a result of delta –1.87. It is worth noting that they knew this was the blank and that it should not test positive.
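As a sanity check on the arithmetic, those thresholds can be written down directly (the classification labels are our own shorthand, not LNDD terminology):

```python
def classify(delta, threshold=-3.0, uncertainty=0.8):
    # delta: measured d45 of the target substance minus that of the reference.
    # More negative than the threshold is indicative of doping, but only
    # beyond threshold - uncertainty is it outside LNDD's claimed error band.
    if delta < threshold - uncertainty:
        return "positive"
    if delta < threshold:
        return "suspicious (within claimed uncertainty)"
    return "negative"

print(classify(-3.65))   # the blank, processed automatically
print(classify(-1.87))   # the blank, after manual reprocessing
```

The blank's automatic result lands squarely in the "suspicious" band; only the claimed uncertainty keeps it from being a positive.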

So what do these results tell us?

Suppose the LNDD system behaved like the one Brenna used in his research and made the measured d45 appear less negative than it really is. That would imply two things. Firstly, the blank sample was a "positive", and really more negative than the measured delta –3.65. Secondly the LNDD manual method made the result worse, not better – they moved it in the wrong direction, away from the true value. (It also provides a stunning example of the influence of manual methods to achieve desired or expected results.)

This argues that we can’t accept that the LNDD equipment had the same characteristics as Brenna’s with respect to alignment of the m45 and m44 signals. If it did, the blank must have come from someone who had been doping with testosterone and the LNDD manual adjustment method is laughably inadequate.

Let’s consider the possibility that the LNDD system achieves perfect m45 and m44 alignment. Apart from the fact that it’s probably not achievable, it would mean that the blank, non-doping sample only just escaped a false positive. What does that say about the threshold of delta –3? Also, it would mean that the LNDD manual adjustment method changed a correct result into an incorrect one by more than delta –1.6. That’s way beyond their claimed uncertainty figure of +/– 0.8 and would call their competence into question.

We’re only left with one other option, and that is that the LNDD system results in the m45 leading the m44 by some finite amount, resulting in the 5aP appearing more negative than it really was. That would explain why the blank almost tested positive. That would explain why LNDD manually altered the result for the blank to make it look less negative. It would also explain why all of Floyd’s 5aP results appeared far more negative than they actually were.

Far more negative? We’ve established that the error is proportional to degree of overlap. Look at the blank F3 chromatograms. The degree of overlap between the 5bP and the 5aP is negligible, but that was enough to make the blank almost test positive. Now look at Floyd’s F3s.

Noticeably more overlap.

For those calling us idiots, let's remember that we are not talking about the actual lag between the peaks, but lags produced when the software attempts to account for the actual lag and fails to do a perfect job. Such software works more or less adequately when there are no overlapping peaks. As we've shown, when there are minor errors in this compensation, overlapped peaks can be systematically measured with inaccurate results.

We've shown here that it is possible to reconcile Brenna's study and WM-A's theory in a way that leaves it less likely that LNDD got correct results.

Which seems like a perfect place to stop.


Sunday, January 27, 2008

Idiots return to Brenna and WM-A, Part I

Alasdair has returned from his Hermit's cave having shed his 'Ali' persona, and now revisits some key points in the battle of the experts.

Forward to Part II.

Let's pay a return visit to Brenna’s 1994 paper (Curve Fitting for Restoration of Accuracy for Overlapping Peaks …), which we first looked at in November.

Why are we doing this?

Well this paper formed the basis for a clear difference of opinion between Brenna and Meier-Augenstein during the hearing. All of Floyd’s IRMS F3 chromatograms showed a degree of overlap between the 5bP and 5aP peaks. If this resulted in an unknown error and that error biased the result in a more negative direction, it would/should have put those results in question (more negative equates to higher probability that testosterone was not generated by the athlete’s body).

Meier-Augenstein argued that the error was of unknown magnitude and in a more negative direction. Brenna argued that the error was very small and in a less negative direction (i.e. the error actually made it look less probable that the athlete doped).

Meier-Augenstein’s opinion no doubt comes from his years of work and research in this discipline. We can trace Brenna’s opinion back to a paper he published which, amongst other things, detailed his observations when peaks of almost identical carbon composition overlap. His observations were that a systematic error occurs when peaks overlap and that error (not small, by the way) made the CIR for the substance being investigated look less negative than it really was.

What he didn’t investigate in that paper was why that happened. Understanding results is usually a prerequisite to applying them to different situations. Brenna chose not to take that precaution in his testimony.



Enter the idiots, tattered old spreadsheet in hand. We decided to finish this job off by solving the mystery of the overlapping peak and attempting to prise Meier-Augenstein’s fingers from Brenna’s neck.

Let’s start with a review of this test.

The GC/C/IRMS process separates the C12 and C13 isotopes and produces two individual signals, the m44 and the m45 peaks (corresponding to the C12 and C13 isotopes, respectively). The relative size of these two signals is proportional to the carbon isotope ratio (CIR) of the substance under investigation. The CIR is actually calculated as the area of the m45 peak divided by the area of the m44 peak, using integration. In an ideal world, the m45 peak can be thought of as a scaled-down version of the m44 peak (hundreds of times smaller). In practice, these two peaks do not align perfectly on the time axis, though the error is very small (typically 150 milliseconds for a peak that may be approximately 35 seconds wide at the base).

Brenna mentions in his paper that his software corrects the misalignment of m44 and m45. He also states that he uses the “perpendicular drop” method to separate peaks (a vertical line positioned at the minima of the intersection of the two peaks and drawn down to the background level). Can our spreadsheet cope with these demands? Darned right it can!

The goal of this first part is to recreate Brenna’s observations, so let’s dust off our spreadsheet and fire her up. We’re going to define two peaks, each with a d45 value of –27. We’re going to set the time difference between m45 and m44 to exactly zero and then make sure we’re measuring them accurately. Figure 1 shows the output. Two peaks, no overlap, both reporting a correct d45 of –27. [note: d45 defines the CIR of a substance relative to an international standard in parts per 1000, so –27 reads as a CIR which is 27 parts per thousand below the standard].
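For readers following along, the d45 bookkeeping converts between a raw 45/44 ratio and parts per thousand relative to a standard. A minimal sketch (the standard ratio below is a placeholder chosen for illustration, not the actual value of the international standard):

```python
# The standard ratio below is a placeholder chosen for illustration only;
# it is not the actual value of the international standard.
R_STANDARD = 0.0112

def d45(ratio, r_std=R_STANDARD):
    # express a raw 45/44 ratio in parts per thousand relative to the standard
    return (ratio / r_std - 1.0) * 1000.0

def ratio_from_d45(delta, r_std=R_STANDARD):
    # invert: recover the raw ratio for a given d45
    return r_std * (1.0 + delta / 1000.0)

r = ratio_from_d45(-27.0)   # a peak whose true d45 is -27
print(round(d45(r), 6))     # round-trips back to -27.0
```

The simulation's "true value of –27" is simply a ratio 2.7% below the standard, and any bias in the measured area ratio shows up directly as a shift in d45.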


Figure 1. Exact m45 and m44 alignment. No overlap between peaks.

It’s worth noting here that we only show the m44 trace, as appears to be common practice when presenting these results, but remember that there’s another trace (the m45) not shown.

Right, let’s introduce some overlap. Figure 2 shows the results we get. Note that we’re still measuring the true value. No error there.


Figure 2. Exact m45 and m44 alignment. Moderate overlap between peaks.


Let’s increase the overlap. Figure 3 shows a big overlap. Unfortunately, there’s still no error. What’s happening here?



Figure 3. Exact m45 and m44 alignment. Significant overlap between peaks.

OK, in our ideal world with two equal peaks, both having perfect peak responses and perfectly aligned m45 and m44 responses, it would appear that overlap has no effect on our results.

The question now, is what factors were at work to produce the results that Brenna observed?

Let’s go back to the drawing board. We know that m45 tends to lead m44 so we’ll scrap our assumption of perfect peak alignment and allow the m45 response to lead the m44 response by some nominally small value, say 100 ms.

Figure 4 shows our two peaks again. Same situation as Figure 2, but with the m45 response leading the m44 response. Now we’re seeing something change. The measured value is wrong, and it is more negative (remember, the true values of our peaks are both –27).


Figure 4. m45 leads m44 by 100ms. Moderate overlap between peaks.

Figure 5 shows an increased overlap. Same situation as Figure 3, but with the m45 response leading the m44 response. The overlap has increased and so has our error. This is starting to resemble Brenna’s research, apart from one major difference – our error is going in the opposite direction to his observations.

Having the m45 response lead the m44 response gives the results that Meier-Augenstein predicted – an error which makes the results appear more negative. That was also his stated reason for the response he predicted. He must have observed systems where the m45 signal led the m44 signal by some finite amount.


Figure 5. m45 leads m44 by 100ms. Significant overlap between peaks.

Right, let’s review what we’ve observed so far. In our ideal world, with no sources of error other than the alignment of the m45 and m44 responses, perfect alignment of m45 and m44 results in no error when peaks of identical CIR overlap. We can infer that Brenna could not have had perfect alignment of the m45 and m44 signals. Having the m45 lead the m44 results in a systematic error, proportional to the degree of overlap, which biases the result in a more negative direction.

Let’s see what happens if we have m45 response lagging the m44 response by some small amount, again 100ms. Figure 6 shows the same situation as Figure 4, but with the m45 response lagging the m44 response. Right, now we’re getting somewhere. An error which makes the result less negative. So far so good.



Figure 6. m45 lags m44 by 100ms. Moderate overlap between peaks.

Figure 7 shows an increased overlap. Same situation as Figure 5, but with the m45 response lagging the m44 response. The overlap has increased and so has our error. This looks just like the results Brenna observed. A systematic error which is proportional to degree of overlap and results in a less negative bias.


Figure 7. m45 lags m44 by 100ms. Significant overlap between peaks.

OK, so we can provide an explanation which satisfies both Brenna’s observations and Meier-Augenstein’s predictions. It’s all to do with the relative positioning of the m45 and m44 responses. Perfect alignment and there’s no error. A small misalignment (in our case, approximately 0.3% of the peak width) and a systematic error is observed proportional to degree of overlap and biased in either the more negative or less negative direction (depending on whether m45 leads or lags m44).

This is probably a good point to establish some ground rules. Perfect alignment is unachievable. This is not an analogue system with continuous signals. It is a digital system which samples the m44 and m45 responses at some fixed frequency. Regardless of what algorithm is used by the software to try and align the two signals there will always be a finite error in that process. There are many other factors which make perfect alignment of m45 and m44 an impossibility in practice, e.g. slightly different peak widths, slightly different peak shapes, etc. All these factors conspire to make the job of the software a demanding one.

The next part will demonstrate why we observe these results and what we can infer from the LNDD results.

Forward to Part II.


Tuesday, November 20, 2007

An Idiot Looks at [Brenna 94]

Dr. Brenna was an author on a 1994 paper that has been cited variously in the case, both for and against Landis, by the usual suspects. Contributor Ali has gotten a copy, and files this evaluation...

By Ali


Curve Fitting for Restoration of Accuracy for Overlapping Peaks in Gas Chromatography/Combustion Isotope Ratio Mass Spectrometry

by Keith J. Goodman and J. Thomas Brenna

(Hereafter [Brenna 94])


The purpose of this review is to summarize their findings and highlight any aspects that may have relevance to the matter at hand – the Landis case.

The background of the paper is that overlapping peaks in IRMS analysis result in inaccurate calculation of o/oo values, which we've looked at before. It says,

The conventional algorithm resulted in systematic bias related to degree of overlap

And it attempts to offer a new algorithm involving curve fitting to get better results. It also offers some experimental results on the effects of overlaps, which are of interest to us.



The conventional algorithm separates the peaks with a vertical line at the centre of the valley between them, taking that line down to what is assumed to be the background level. This is then used as an integration limit.
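In code, the conventional algorithm amounts to locating the valley minimum between the two maxima and using it as a shared integration limit. A rough sketch (toy data; the one-peak-per-half search is a simplification of ours, not the paper's implementation):

```python
def perpendicular_drop(signal, dt):
    # find the two maxima (simplification: one in each half of the window),
    # cut at the minimum of the valley between them, and integrate each side
    half = len(signal) // 2
    peak1 = max(range(half), key=lambda i: signal[i])
    peak2 = max(range(half, len(signal)), key=lambda i: signal[i])
    valley = min(range(peak1, peak2), key=lambda i: signal[i])
    area1 = sum(signal[:valley]) * dt
    area2 = sum(signal[valley:]) * dt
    return area1, area2

# two overlapping triangular 'peaks', background already subtracted
sig = [0, 2, 6, 10, 6, 3, 5, 8, 5, 1, 0]
a1, a2 = perpendicular_drop(sig, dt=1.0)
print(a1, a2)   # 24.0 22.0
```

Everything to the left of the drop is credited to the first peak and everything to the right to the second, which is exactly where the tail-swapping errors discussed below come from.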

It's what our spreadsheet does, and it appears to have been employed on a number of Landis’s IRMS F3 chromatograms to separate the 5B and 5a peaks, either from each other or, more frequently, from some unidentified small peak that appears between and overlaps both.

The paper gives a brief description of the GC/C/IRMS setup. An integration time of 0.25 s was specified, which we believe to be analogous to the sampling rate. Peak start and stop are detected using only the 44 plot (due to its superior signal-to-noise ratio). This differs from the LNDD process described by Brenna, where the 45/44 ratio plot is used to determine the start and stop times. Peak maxima are detected on each plot (44, 45 and 46) to identify the time shifts between the three detectors, and the previously determined integration interval is applied to all three plots. Background is identified by a straight line fitted between peak start and stop points (which may be at the same level for constant background or at different levels for sloping background).

Due to the extra plumbing involved in the GC/C/IRMS process, both chromatographic efficiency and peak shape are detrimentally affected. In other words, you tend to get more peak overlaps and the peaks are generally not Gaussian, but are skewed (exhibiting an extended tail). Therefore, fitting a pure Gaussian peak to real data would not yield the best results.

To compensate for the inaccuracies involved in the conventional algorithm, the paper proposes to assess the ability of four different curve-fitting algorithms to recover the true peak shape and o/oo values of the overlapping peaks. What these four functions are may be of interest to some but aren’t relevant to this review.

Two substances exhibiting near identical o/oo values were used to experimentally generate a series of overlapping peaks. With equal sized peaks, at varying degrees of overlap (between 0% and 70%), it was observed that the conventional algorithm exhibited a depleted C13 ratio for the leading peak (-8 o/oo at maximum overlap) and an enhanced C13 ratio for the lagging peak (+8 o/oo at maximum overlap). This result was described as “unexpected”. The degree of error was proportional to the degree of overlap.

Brenna's testimony at the hearing leaned heavily on these measurements.

Application of the proposed curve fitting functions to the peaks yielded an improvement in peak area and o/oo recovery.

Similar experiments were run with a 10:1 ratio of peaks (leading peak ten times bigger than lagging peak). Using the conventional algorithm, at 40% overlap, detection of the smaller, lagging peak became problematic due to interference from the leading peak’s tail. Beyond 40% overlap it was not distinguished as an individual peak.

The general trend of the leading peak being depleted and lagging peak becoming enhanced was observed for those cases where the two peaks were distinguishable.

With the peaks reversed and the leading peak being the smaller, the conventional method detected the smaller, leading peak at all degrees of overlap, and it reflected a similar depletion trend as had previously been observed.

The larger lagging peak exhibited very little error at all degrees of overlap. Curve fitting in all cases appeared to offer some advantages and generally improved the ability to recover the true o/oo of the peaks.

The curve fitting aspect is not strictly relevant to the Landis case, as it would appear that this was not used by LNDD. The conventional method and the results obtained are of more interest.

The first observation is that these results confirm Dr Brenna’s testimony of the leading peak’s C13/C12 ratio becoming depleted and the lagging peak’s C13/C12 ratio becoming enhanced. This contradicts Dr Meier-Augenstein’s testimony.

A significant factor affecting this contradiction is the effect of the m45 signal leading the m44 signal by approximately 150 ms.

Uncorrected, if the left-hand integration limit for the lagging peak is taken as the minima of the valley between the peaks and that is applied directly to both the m44 and m45 plots (not time shifted), then one would expect the C13/C12 ratio of the lagging peak to become depleted, having had a relatively larger proportion of C13 (m45) chopped off, compared to the slightly retarded C12 (m44) signal.

It would appear that performing the correction described by Brenna would resolve this issue but clearly it doesn’t. If we assume identical but scaled down peaks between the m44 and m45 plots, with no time shift (or a corrected time shift), the instantaneous 45/44 ratio would be homogeneous, presenting a constant value across the integration interval. In that case, the overlap of two peaks having identical o/oo values should have little or no impact on the measured value.

So why did overlap have such a significant effect in Brenna’s study, and why were the results both contradictory to Meier-Augenstein’s opinion and described as “unexpected” by Brenna?

One possible explanation may lie in the integration time of 0.25 s.

In sampling at 0.25 s intervals, how accurately will the peak maxima times on the m45 and m44 plots be identified? That’s what determines the required time shift so that they line up exactly. Remember that the required time shift is approximately 0.15 s and we’re sampling at 0.25 s. A degree of error appears unavoidable. What if this error resulted in overcompensation for the time shift, resulting in a correction that had the m44 plot leading the m45 plot by some small but significant amount? This would reverse the effects described by Dr Meier-Augenstein and result in the observations made by Dr Brenna. [un-UPDATE: incorrection removed, don't ask.]

To this idiot, this seems the most likely reason.
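The arithmetic behind that conjecture is easy to demonstrate (our illustration of the idea, using the paper's 0.25 s integration time and the approximate 0.15 s lead; where the maxima happen to fall on the sample grid determines whether the estimated shift comes out as 0 or 0.25 s, never the true 0.15 s):

```python
def snap(t, period):
    # a sampled system can only report a peak maximum at a sample instant
    return round(t / period) * period

period = 0.25        # integration (sampling) time from the paper, seconds
true_lead = 0.15     # m45 genuinely leads m44 by roughly 150 ms

t_m44 = 100.00               # true m44 maximum (arbitrary instant)
t_m45 = t_m44 - true_lead    # true m45 maximum, 0.15 s earlier

measured_shift = snap(t_m44, period) - snap(t_m45, period)
residual = round(true_lead - measured_shift, 2)   # left over after 'correcting'

print(measured_shift)   # 0.25: only multiples of 0.25 s are available
print(residual)         # -0.1: after correction, m45 effectively *lags* m44
```

When the estimate lands on 0.25 s, "correcting" by that amount leaves the m44 plot leading the m45 plot by 0.10 s — an overcompensation in exactly the direction needed to produce Brenna's observations.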

In a later post, we'll look at a paper in the GDC collection, in which Brenna looks at quantization errors. These are ones where you have insufficient or mismatched sample times.


Sunday, November 11, 2007

Idiots look at Data, Part VIII: The Insecurity Index

Throughout our look at chromatograms and parts of chromatograms, we've been counting things that look like they might be problems in the data set. We are not saying they are problems, we're saying they are things that may cause concern. The higher the number, the more careful we want to be about interpreting the data.

Adding up all the numbers we get an aggregate we'll call the Idiot's Insecurity Index (I3), pronounced, "aye-yi-yi". It consists of:

  • The number of peaks.
  • Rating of the baseline slope on a 1-to-5 scale.
  • Rating of bumpiness of the background on a 1-to-5 scale.
  • On the peaks of interest, the total count of shoulders, leading or trailing edges, connections to neighbors above the baseline, and the number of neighbors within one peak-width at the baseline.
It is not a threshold; there are no fixed limits. It applies to all chromatograms. Higher values are cause for notice, evaluation, and confirmation.
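Computing the index is nothing more than a sum; as a sketch (parameter names are our shorthand for the components listed above):

```python
def i3(peaks, slope, bumpiness, shoulders, edges, connections, neighbors):
    # slope and bumpiness are 1-to-5 ratings; everything else is a raw count
    return peaks + slope + bumpiness + shoulders + edges + connections + neighbors

# the UCLA and LNDD reference-pulse scores worked through in this post
print(i3(peaks=3, slope=1, bumpiness=1, shoulders=0, edges=0, connections=0, neighbors=0))  # 5
print(i3(peaks=3, slope=1, bumpiness=1, shoulders=0, edges=3, connections=0, neighbors=0))  # 8
```

The value carries no meaning on its own; like a code-complexity metric, it only flags where closer inspection is warranted.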



Let's look at two trivial examples, one from UCLA and one from LNDD, and score them.


Figure 1: UCLA reference pulses.

UCLA: 3 peaks, flat slope = 1, no bumps = 1, total shoulders = 0, total edges = 0, total connections = 0, neighbors in one peak = 0, total = 5.
Figure 2: LNDD reference pulses


LNDD: 3 peaks, flat = 1, no bumps = 1, shoulders 0, edges = 3, connections = 0, neighbors, charitably = 0, total = 8. Given a choice, it might be better for the pulses to be spaced a little further apart, and the tails on them might indicate a problem somewhere in the system, we think.

So given the I3 and the scores we made all along, where are we with our look at the data in previous parts? We could make you go to another post or page, like the computer hardware review sites do, but we'll be nice:


Test           Pks  Slp  Bmp  Shldr  Edg    Conn   Nbrs   I3
UCLA           12   1    1    2 2 0  0 1 0  0 0 0  0 0 0  19
Ex 92 3-Jul    36   3    2    2 1 2  0 1 0  2 2 0  3 1 1  56
Ex 88 13-Jul   33   5    2    2 2 1  0 0 0  2 2 0  3 3 1  56
Ex 90 14-Jul   36   5    1    2 1 1  0 0 0  2 2 0  2 2 1  55
Ex 86          47   5    3    2 2 1  0 0 0  2 2 0  2 2 1  69
USADA 173      29   3    1    2 2 2  0 1 0  0 2 0  2 2 1  47
USADA 349      27   3    1    1 2 1  0 1 0  2 2 0  2 2 1  44
Ex 87 22-Jul   39   2    2    1 1 0  0 1 0  1 2 1  1 2 3  56
Ex 84 23-Jul   32   1    3    1 1 0  0 1 0  2 2 1  2 2 3  51
Ex 93 control  22   1    2    1 1 1  1 1 0  0 2 1  0 2 3  38
Ex 85 control  35   2    3    1 1 1  1 2 0  0 2 1  1 2 2  54
Ex 89 control  36   2    2    1 1 1  1 1 0  1 2 1  1 4 3  57
Shack fig3a    17   1    1    0 1 1  0 1 1  0 2 0  0 4 0  29
Shack fig3b    19   2    1    0 0 2  0 1 0  0 2 0  0 1 0  27

This is bogus, you say. However, it follows thinking used in Software Engineering in measures such as the McCabe complexity, or the Halstead volume, or arguably function points. A chromatogram with an I3 of two is like a software subroutine that does nothing: Useless, but absolutely correct. On the other hand, one with an I3 of 200 is like a 10 page software function with a McCabe of 2000 -- it might appear to be correct, but how do you really know without looking very closely indeed?

Bigger numbers mean more stuff.

More stuff means more opportunity for error. The more stuff you have, the more careful you need to be about checking assumptions and pre-conditions.

In an earlier post, M has made comments suggesting it is unfair or improper to make some of the comparisons made here. We disagree; as shown above, the methodology is perfectly applicable to a straight line background or a series of reference pulses. It is a measure of the potential for problems, not an assertion there are problems.

M also suggested one reason it was unfair was that the chemistry in the F3 fraction is more difficult than that of the F2 fraction. It is true the F3 chemistry is more difficult, and we appreciate that admission from M. It raises the very question we'd like to ask.

How do you tell if the chemistry does the job properly?

One indicator is to look at the I3 of the resulting chromatograms.

Thanks to M's diligence, we found the Shackleton chromatograms that also reveal the 5bA and 5aA, so we do have fair, like-to-like comparisons. They appear to be much cleaner by I3 score than those produced by LNDD.

What did Shackleton do that LNDD didn't? This bears investigation.

When we started this series, we said that the preconditions for correctness in the integration that computes the numbers in a CIR result are:
  • Clean, unambiguous baselines suggesting good chemical separation of the prepared samples. This is reflected in the count of the peaks in the chromatogram. Good chemistry gives fewer peaks to be concerned about, and fewer unknowns floating about.
  • Significant (a debatable term) baseline (chromatographic) separation of peaks. We've demonstrated that co-elutes can cause unexpected skews of significant magnitude.
  • Absence of shoulders suggesting unidentified peaks. Where there are shoulders or tails, there may be unidentified co-elutes.
  • Measurement of nearby peaks to consider their potential for influence. We may back away from this thought, but it seems like you ought to know the CIR of every adjacent peak in case it is co-eluting in some way.
Idiots such as ourselves, looking at all the LNDD chromatograms, can see stuff we don't see in the UCLA and Shackleton chromatograms, including some very odd baselines in places.

Maybe LNDD's chemistry isn't separating as well as it ought to, and needs to.

If there is lots of stuff around, it is going to go somewhere. A high I3 score makes it prudent to be sure the peaks being measured contain only what they are purported to contain.

As we demonstrated in "Integration for Idiots", presence of unexpected material can invalidate any reported numeric results.

We are thus left with some questions to seed further discussion.
  • What does UCLA do to ensure purity of peaks?
  • What did Shackleton do to ensure purity of peaks?
  • What did LNDD do to ensure purity of peaks?

That is the end of "Idiots look at Data" for us.

Feel free to chop apart individual assessments and argue whether certain pixels represent particular flaws, and whether they have particular numeric significance in a particular test. This doesn't much interest us. At a scientific level, either the protocol is flawed and there is flawed data being processed and reported, or it is good data and good results. At the moment, indications such as the I3 suggest the data may not be good. A good process will be able to demonstrate the data is good.

We have said for a long time that if we can get confirmation the data is clean and pure, we're prepared to accept the numeric conclusions at a scientific level.

If there is no validation that the data is clean and pure enough to trust, there is a different, legal question of whether the reported results are correct.

Full Post with Comments...

Idiots look at Data, Part VII: Tail-enders

In Part VI, we tried to grade the regions containing mid-graph payload peaks. Here in Part VII, we look at payload peaks near the end of the graph. Again, the idea is to count things that look like they might be possible problems, namely:

  • shoulders on a peak of interest
  • leading or trailing edge on a peak of interest
  • connection to neighbor above apparent baseline
  • neighbor within one peak width
This is not intended to be a definitive count, just a sense of what might be going on and in need of a second look.
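The bookkeeping here is trivial, but for concreteness, a sketch (names and data layout are ours, purely illustrative) of how the per-test counts roll up into the "total" column of the assessment tables:

```python
from typing import Dict, Tuple

# One entry per chromatogram: (shoulders, edges, connects, neighbors).
# The counts come from eyeballing the graphs; nothing here is measured.
Flags = Tuple[int, int, int, int]

def total_flags(counts: Flags) -> int:
    """Unweighted sum of the four eyeball counts: the 'total' column."""
    return sum(counts)

def assessment(table: Dict[str, Flags]) -> Dict[str, int]:
    """Totals for every test, keyed by test name."""
    return {test: total_flags(c) for test, c in table.items()}

# Example with two rows of counts from this post:
print(assessment({"ucla": (0, 0, 0, 0), "usada173": (2, 0, 0, 1)}))
# {'ucla': 0, 'usada173': 3}
```

An unweighted sum treats a shoulder as no worse than a close neighbor, which is debatable; it's a tally, not a metric.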

As before, click on an image for bigger. For the most part, these are scaled 2x as wide as tall.

[MORE]

ucla: shoulders 0, edges 0, connects 0, neighbors 0

ex92: shoulders 2, edges 0, connects 0, neighbors 1

ex88: shoulders 1, edges 0, connects 0, neighbors 1

ex90: shoulders 1, edges 0, connects 0, neighbors 1

ex86: shoulders 1, edges 0, connects 0, neighbors 1

usada173: shoulders 2, edges 0, connects 0, neighbors 1

usada349: shoulders 1, edges 0, connects 0, neighbors 1

ex87: shoulders 0, edges 0, connects 1, neighbors 3

ex84: shoulders 0, edges 0, connects 1, neighbors 3

ex93: shoulders 1, edges 0, connects 1, neighbors 3

ex85: shoulders 1, edges 0, connects 1, neighbors 2

ex89: shoulders 1, edges 0, connects 1, neighbors 3

s3a: shoulders 1, edges 1, connects 0, neighbors 0

s3b: shoulders 2, edges 0, connects 0, neighbors 0

Assessment

| Test | shoulders | leading or trailing edge | connect above baseline | neighbors within one peak width | total |
|---|---|---|---|---|---|
| UCLA | 0 | 0 | 0 | 0 | 0 |
| Ex 92 3-Jul | 2 | 0 | 0 | 1 | 3 |
| Ex 88 13-Jul | 1 | 0 | 0 | 1 | 2 |
| Ex 90 14-Jul | 1 | 0 | 0 | 1 | 2 |
| Ex 86 | 1 | 0 | 0 | 1 | 2 |
| USADA 173 20-Jul | 2 | 0 | 0 | 1 | 3 |
| USADA 349 20-Jul | 1 | 0 | 0 | 1 | 2 |
| Ex 87 22-Jul | 0 | 0 | 1 | 3 | 4 |
| Ex 84 23-Jul | 0 | 0 | 1 | 3 | 4 |
| Ex 93 control | 1 | 0 | 1 | 3 | 5 |
| Ex 85 control | 1 | 0 | 1 | 2 | 4 |
| Ex 89 control | 1 | 0 | 1 | 3 | 5 |
| Shackleton top | 1 | 1 | 0 | 0 | 2 |
| Shackleton bottom | 2 | 0 | 0 | 0 | 2 |

Full Post with Comments...

Saturday, November 10, 2007

Idiots look at Data, Part VI: Mid Graph

In Part V, we tried to grade IS regions. In Part VI, we look at the mid-graph regions of interest. Again, the idea is to count things that look like they might be possible problems, namely:

  • shoulders on a peak of interest
  • leading or trailing edge on a peak of interest
  • connection to neighbor above apparent baseline
  • neighbor within one peak width
This is not intended to be a definitive count, just a sense of what might be going on and in need of a second look.
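Of the four, the "neighbor within one peak width" criterion is the easiest to pin down numerically. A sketch, assuming each peak has been reduced to an (apex retention time, base width) pair; that representation is our assumption for illustration, not anything a lab data system exports in this form:

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (apex retention time, base width): assumed shape

def within_one_peak_width(rt_a: float, width_a: float, rt_b: float) -> bool:
    """True if peak B's apex lies within one width of peak A's apex."""
    return abs(rt_b - rt_a) <= width_a

def count_close_neighbors(peaks: List[Peak], index: int) -> int:
    """Count peaks whose apex falls within one width of peaks[index]."""
    rt, width = peaks[index]
    return sum(1 for i, (rt_b, _) in enumerate(peaks)
               if i != index and within_one_peak_width(rt, width, rt_b))

# Three made-up peaks: the second apex is 0.4 units from the first
# (width 0.5), so it counts; the third, at 12.0, does not.
print(count_close_neighbors([(10.0, 0.5), (10.4, 0.5), (12.0, 0.5)], 0))  # 1
```

Our eyeball counts are doing exactly this, just with pixels instead of numbers.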

As before, click on an image for bigger. For the most part, these are scaled 2x as wide as tall.

[MORE]

ucla: shoulders 2, edges 1, connects 0, neighbors 0

ex92: shoulders 1, edges 1, connects 2, neighbors 1

ex88: shoulders 2, edges 0, connects 2, neighbors 3

ex90: shoulders 1, edges 0, connects 2, neighbors 2

ex86: shoulders 2, edges 0, connects 2, neighbors 2

usada173: shoulders 2, edges 1, connects 2, neighbors 2

usada349: shoulders 2, edges 1, connects 2, neighbors 2

ex87: shoulders 1, edges 1, connects 2, neighbors 2

ex84: shoulders 1, edges 1, connects 2, neighbors 2

ex93: shoulders 1, edges 1, connects 2, neighbors 2

ex85: shoulders 1, edges 2, connects 2, neighbors 2

ex89: shoulders 1, edges 1, connects 2, neighbors 2

s3a: complete co-elution of 5aB and 5bA; shoulders 1, edges 1, connects 2, neighbors 4

s3b: shoulders 0, edges 1, connects 2, neighbors 1

Assessment

| Test | shoulders | leading or trailing edge | connect above baseline | neighbors within one peak width | total |
|---|---|---|---|---|---|
| UCLA | 2 | 1 | 0 | 0 | 3 |
| Ex 92 3-Jul | 1 | 1 | 2 | 1 | 5 |
| Ex 88 13-Jul | 2 | 0 | 2 | 3 | 7 |
| Ex 90 14-Jul | 1 | 0 | 2 | 2 | 5 |
| Ex 86 | 2 | 0 | 2 | 2 | 6 |
| USADA 173 20-Jul | 2 | 1 | 2 | 2 | 7 |
| USADA 349 20-Jul | 2 | 1 | 2 | 2 | 7 |
| Ex 87 22-Jul | 1 | 1 | 2 | 2 | 6 |
| Ex 84 23-Jul | 1 | 1 | 2 | 2 | 6 |
| Ex 93 control | 1 | 1 | 2 | 2 | 6 |
| Ex 85 control | 1 | 2 | 2 | 2 | 7 |
| Ex 89 control | 1 | 1 | 2 | 2 | 6 |
| Shackleton top | 1 | 1 | 2 | 4 | 8 |
| Shackleton bottom | 0 | 1 | 2 | 1 | 4 |

Full Post with Comments...

Friday, November 09, 2007

Idiots look at Data, Part V: Internal standards

In Part IV, we tried to grade the baselines. In Part V, we start looking at regions of interest, beginning with the Internal standard. The idea is to count things that look like they might be possible problems, namely:

  • shoulders on a peak of interest
  • leading or trailing edge on a peak of interest
  • connection to neighbor above apparent baseline
  • neighbor within one peak width
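A shoulder is the fuzziest of these calls. One rough numeric cue, on an idealized noise-free trace: a clean peak's slope rises, crosses zero once at the apex, and falls, while a shoulder shows up as a dip in a rising slope or a bump in a falling one. A toy sketch (our logic, not anyone's software; real systems would smooth the trace first):

```python
def slope(signal):
    """First differences of a sampled trace."""
    return [b - a for a, b in zip(signal, signal[1:])]

def shoulder_candidates(signal, eps=0.0):
    """Indices (into the slope array) where the slope dips while still
    positive (leading-edge shoulder) or bumps up while still negative
    (trailing-edge shoulder). Noise-free toy logic only."""
    d = slope(signal)
    hits = []
    for i in range(1, len(d) - 1):
        if d[i] > eps and d[i] < d[i - 1] and d[i] < d[i + 1]:
            hits.append(i)   # dip in a rising slope
        elif d[i] < -eps and d[i] > d[i - 1] and d[i] > d[i + 1]:
            hits.append(i)   # bump in a falling slope
    return hits

# A clean triangular peak has no candidates; a peak with a flat spot
# on its leading edge flags one.
print(shoulder_candidates([0, 1, 2, 3, 2, 1, 0]))                    # []
print(shoulder_candidates([0, 1, 3, 4, 6, 9, 10, 9, 6, 3, 1, 0]))   # [2]
```

On real, noisy data this would flag spurious candidates everywhere, which is exactly why we count shoulders by eye and call the result debatable.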
As before, click on an image for bigger. For the most part, these are scaled 2x as wide as tall.

[MORE]

ucla: Think the last one is the IS (if any). shoulders 2, edges 0, connects 0, neighbors 0

ex92: Umm, shoulders 2, edges 0, connects 2, neighbors 3

ex88: shoulders 2, edges 0, connects 2, neighbors 3

ex90: shoulders 2, edges 0, connects 2, neighbors 2

ex86: shoulders 2, edges 0, connects 2, neighbors 2

usada173: shoulders 2, edges 0, connects 0, neighbors 2

usada349: shoulders 1, edges 0, connects 1, neighbors 2

ex87: shoulders 1, edges 0, connects 1, neighbors 1

ex84: shoulders 1, edges 0, connects 2, neighbors 2

ex93: shoulders 1, edges 1, connects 0, neighbors 0

ex85: shoulders 1, edges 1, connects 0, neighbors 1

ex89: shoulders 1, edges 1, connects 1, neighbors 2

s3a: No IS apparent.

s3b: No IS apparent.

Assessment

| Test | shoulders | leading or trailing edge | connect above baseline | neighbors within one peak width | total |
|---|---|---|---|---|---|
| UCLA | 2 | 0 | 0 | 0 | 2 |
| Ex 92 3-Jul | 2 | 0 | 2 | 3 | 7 |
| Ex 88 13-Jul | 2 | 0 | 2 | 3 | 7 |
| Ex 90 14-Jul | 2 | 0 | 2 | 2 | 6 |
| Ex 86 | 2 | 0 | 2 | 2 | 6 |
| USADA 173 20-Jul | 2 | 0 | 0 | 2 | 4 |
| USADA 349 20-Jul | 1 | 0 | 1 | 2 | 4 |
| Ex 87 22-Jul | 1 | 0 | 1 | 1 | 3 |
| Ex 84 23-Jul | 1 | 0 | 2 | 2 | 5 |
| Ex 93 control | 1 | 1 | 0 | 0 | 2 |
| Ex 85 control | 1 | 1 | 0 | 1 | 3 |
| Ex 89 control | 1 | 1 | 1 | 1 | 4 |
| Shackleton top | n/a | n/a | n/a | n/a | n/a |
| Shackleton bottom | n/a | n/a | n/a | n/a | n/a |

Full Post with Comments...