Idiots Return to Brenna and WM-A, Part II
Alasdair, once Ali, continues...
In Part 1, we managed to recreate Brenna's results. In Part 2, we're going to explain exactly how they occurred with reference to both the m45 and m44 signals and what happens when peaks overlap. However, before we do that, we need a little more background on what the software does.
Some of you may be thinking "We understand the logic, but m45 leads m44, not the other way round, you idiot."
Well, that's true, but hold on...
Brenna stated that his software compensated for the misalignment between m44 and m45, so we know that an attempt was made to align them perfectly.
That never happened though, because if it had, he would not have observed the results that he did. There are many factors which would make perfect alignment extremely difficult, including: the m44 and m45 signals are sampled quantities; in practice, they are not perfectly the same width and the same shape; the m45 peak is hundreds of times smaller in magnitude than the m44 peak and more susceptible to chemical and electrical noise.
So Brenna's software tried but failed to achieve PERFECT alignment.
Older software may not even try to compensate for the m44-m45 time difference. Newer software may try but fail, resulting in a small finite misalignment either one way or the other. Consider this: To achieve perfect alignment and avoid errors when peaks overlap, you need perfect peaks.
At the end of the day, whether your software compensates or not, it looks like you're going to be left with an error in the alignment of m44 and m45. We don't know whether the OS2 software used for Floyd's Stage 17 compensates or not. We don't know if it behaves like Brenna predicted or whether it behaves like Meier-Augenstein predicted. After we've explained the results, we'll tackle that question.
Before we get into what can go wrong, let’s see what is required for accurate CIR measurement. Figure 8 shows an m44 peak, with its corresponding m45 component superimposed on it. The two peaks are perfectly aligned. The integration limits are shown as two vertical lines. Remember, m44 and m45 exist as two separate signals, and we need to show both to explain what’s happening. The software needs to calculate the area under the m45 peak, calculate the area under the m44 peak and divide the m45 area by the m44 area. The result is the CIR for this substance.
Let’s say the m45 peak leads (occurs before) or lags (occurs after) the m44 peak by some small amount. As long as there’s no other peak interfering with our peak, that’s not a problem. We just position our integration limits so that they encompass both peaks and we’re good to go.
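For anyone who wants to play with the numbers, here is a minimal sketch of that area-ratio calculation. It assumes idealised Gaussian peaks, and every value in it (the ratio of 0.0118, the peak position, width and height, the function names) is made up for illustration rather than taken from a real chromatogram. It computes the CIR for an isolated peak with the m45 leading, aligned and lagging, using limits wide enough to take in both peaks.

```python
import math

def gaussian_area(height, center, sigma, lo, hi):
    """Exact area under a Gaussian peak between the limits lo and hi."""
    full_area = height * sigma * math.sqrt(2.0 * math.pi)
    scale = sigma * math.sqrt(2.0)
    return 0.5 * full_area * (math.erf((hi - center) / scale)
                              - math.erf((lo - center) / scale))

# Illustrative numbers only; none of them come from a real chromatogram.
TRUE_CIR = 0.0118            # assumed m45/m44 area ratio of the substance
CENTER, SIGMA = 100.0, 2.0   # m44 peak position and width, in seconds
M44_HEIGHT = 1000.0

def measured_cir(m45_offset, lo, hi):
    """Area ratio when the m45 peak sits m45_offset seconds after the
    m44 peak (a negative offset means m45 leads)."""
    area44 = gaussian_area(M44_HEIGHT, CENTER, SIGMA, lo, hi)
    area45 = gaussian_area(M44_HEIGHT * TRUE_CIR, CENTER + m45_offset,
                           SIGMA, lo, hi)
    return area45 / area44

# Generous limits that take in the whole of both peaks.
lo, hi = CENTER - 12.0, CENTER + 12.0
results = {offset: measured_cir(offset, lo, hi) for offset in (-1.0, 0.0, 1.0)}
for offset, cir in results.items():
    print(f"m45 offset {offset:+.1f} s: CIR = {cir:.6f}")
```

In this toy model all three offsets give essentially the same ratio, because nothing of either peak falls outside the limits.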
Right, let’s introduce another peak to the left of our first peak. Figure 9 shows two peaks of the same CIR overlapping by some arbitrary amount. We use the same method as Brenna to set up our integration limits. Let’s examine what’s happening with Pk2. We can see that our integration limits exclude the left hand tail of both the m44 and m45 peaks. That means that we’ve lost those areas from our calculation. That’s not a problem, because the m44 and m45 are perfectly aligned, so we lose the same relative proportions of each.
We’re also including the right hand tail of Pk1. Again that’s not a problem because in this case, both peaks have the same CIR so the contribution from Pk1 has the same relative proportions of m44 and m45 as Pk2 (it wouldn’t matter if Pk1 was much bigger than Pk2, the previous statement holds true).
So with perfectly aligned m44 and m45, it won’t matter how much those peaks overlap, because they have the same CIR.
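The same point can be checked numerically. The sketch below again assumes Gaussian peaks and invented values (peak positions, widths, heights and the 0.0118 ratio are all hypothetical). It integrates Pk2 with the left limit in several different places and with Pk1 either the same size as Pk2 or five times bigger, always with m44 and m45 perfectly aligned.

```python
import math

def gaussian_area(height, center, sigma, lo, hi):
    """Exact area under a Gaussian peak between the limits lo and hi."""
    full_area = height * sigma * math.sqrt(2.0 * math.pi)
    scale = sigma * math.sqrt(2.0)
    return 0.5 * full_area * (math.erf((hi - center) / scale)
                              - math.erf((lo - center) / scale))

# Illustrative numbers only.
TRUE_CIR = 0.0118                       # both peaks share this assumed CIR
SIGMA = 2.0
PK1_CENTER, PK2_CENTER = 100.0, 106.0   # Pk1 elutes first, Pk2 sits on its tail

def pk2_cir(pk1_height, pk2_height, lo, hi):
    """Area ratio inside Pk2's limits with m44 and m45 perfectly aligned."""
    area44 = area45 = 0.0
    for height, center in ((pk1_height, PK1_CENTER), (pk2_height, PK2_CENTER)):
        area44 += gaussian_area(height, center, SIGMA, lo, hi)
        area45 += gaussian_area(height * TRUE_CIR, center, SIGMA, lo, hi)
    return area45 / area44

hi = PK2_CENTER + 12.0
ratios = [pk2_cir(pk1_height, 1000.0, lo, hi)
          for pk1_height in (1000.0, 5000.0)   # equal peaks, then a much bigger Pk1
          for lo in (102.0, 103.0, 104.0)]     # different left-limit positions
for ratio in ratios:
    print(f"Pk2 CIR = {ratio:.6f}")
```

Every combination returns the true ratio to within rounding, which is just the argument above in numbers: with perfect alignment, whatever is cut off or borrowed carries m45 and m44 in the same proportion.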
Figure 10 shows our m45 leading the m44 by some amount (we haven’t labelled the peaks again as they are clearly identifiable from Fig.9). We’ve had to exaggerate this so that you can see what’s happening. What you’re seeing is m45 leading m44 by an enormously large 1 second. [To put that in context, if Figure 4 had used a 1 second lead instead of a 0.1 second lead, the measured value for that peak (which has a true value of –27) would not have been –28.4, it would have been –41.0!]
We don’t need to look too hard to see that the m44 situation remains the same. We lose the left-hand tail of Pk2, but gain an equal amount of the right hand tail of Pk1. Total area same as before. It’s all going wrong with our m45 peak though. We’ve lost more of our Pk2 m45 than in Fig.9 and we’re gaining less of Pk1’s m45. That’s upset the balance. When we calculate our areas for Pk2 now, the ratio will make it look like we have less m45 than before. That makes our d45 figure more negative and it makes it look more probable that doping was involved.
Figure 11 shows the opposite situation, m45 lagging m44. Again, all’s well on the m44 front, no change again.
So what’s happening with our m45 now?
We’re losing less m45 from Pk2 than in Figure 9 and we’re gaining more of Pk1’s m45. That’s upset the balance again, but this time it’s going to increase our m45 areas. That makes our Pk2 d45 figure less negative and it makes it look less probable that doping was involved.
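Here is a rough sketch of the lead and lag cases side by side. As before it assumes two equal Gaussian peaks with the same CIR, a left limit at the valley between them on the m44 trace, and made-up values throughout (including the 0.1 second offset). The error is reported as the relative change in the area ratio in per mil, which is only approximately the shift you would see in a d45 figure.

```python
import math

def gaussian_area(height, center, sigma, lo, hi):
    """Exact area under a Gaussian peak between the limits lo and hi."""
    full_area = height * sigma * math.sqrt(2.0 * math.pi)
    scale = sigma * math.sqrt(2.0)
    return 0.5 * full_area * (math.erf((hi - center) / scale)
                              - math.erf((lo - center) / scale))

# Illustrative numbers only.
TRUE_CIR = 0.0118
SIGMA = 2.0
PK1_CENTER, PK2_CENTER = 100.0, 108.0
HEIGHT = 1000.0
CENTERS = (PK1_CENTER, PK2_CENTER)

def pk2_error_per_mil(m45_offset):
    """Relative error in Pk2's area ratio, in per mil, when every m45 peak
    sits m45_offset seconds after its m44 peak (negative means m45 leads)."""
    lo = 0.5 * (PK1_CENTER + PK2_CENTER)   # valley between the two equal m44 peaks
    hi = PK2_CENTER + 12.0
    area44 = sum(gaussian_area(HEIGHT, c, SIGMA, lo, hi) for c in CENTERS)
    area45 = sum(gaussian_area(HEIGHT * TRUE_CIR, c + m45_offset, SIGMA, lo, hi)
                 for c in CENTERS)
    return (area45 / area44 / TRUE_CIR - 1.0) * 1000.0

lead_error = pk2_error_per_mil(-0.1)     # m45 leads m44, as in Figure 10
aligned_error = pk2_error_per_mil(0.0)   # perfect alignment, as in Figure 9
lag_error = pk2_error_per_mil(+0.1)      # m45 lags m44, as in Figure 11
print(f"lead: {lead_error:+.2f}  aligned: {aligned_error:+.2f}  lag: {lag_error:+.2f}")
```

In this model the lead makes Pk2 look more negative, the lag makes it look less negative, and perfect alignment leaves it untouched. The size of the error depends entirely on the assumed peak shapes and spacing, so only the signs should be read as meaningful.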
There you have it. The mystery of the overlapping peak and why the observations of Brenna and Meier-Augenstein were both valid. It depends on the software system you’re using and how it performs with regard to aligning m44 and m45.
All very interesting, but what about the 64,000 dollar question … What is the LNDD system like?
The reprocessing of the Stage 17 EDFs provides our only real insight into the characteristics of the LNDD equipment. Specifically, we’re going to look at the results obtained from the blank sample. This is useful for us because: We know it shouldn’t test positive; It was processed in an automatic mode by the software; It was also processed in a manual mode by the technicians.
The F3 for the blank showed a degree of overlap between the 5bP peak and the 5aP peak (not to the extent that Floyd’s overlapped, but it was there).
In auto mode, the 5aP for the blank came out as delta –3.65 (“delta” refers to the numerical difference between the d45 value for our substance under investigation and the d45 value of a reference substance generated by the athlete which is known not to be affected by the doping in question). The threshold for a doping positive is delta –3. Anything more negative is considered indicative of doping. However, the uncertainty that LNDD claim is +/- 0.8, so more negative than delta –3 is suspicious, but delta –3.8 is the absolute threshold. As you can see, the known clean blank escaped testing positive only because it was within the uncertainty claimed by LNDD. That was when processed automatically by the software, without intervention. When LNDD redid it using their manual method, they got a result of delta –1.87. It is worth noting that they knew this was the blank and that it should not test positive.
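The arithmetic behind those figures is simple enough to lay out in a few lines. The threshold, the uncertainty and the two blank results are the ones quoted above; the function name and the three labels are just our own shorthand, not anything LNDD uses.

```python
THRESHOLD = -3.0      # delta threshold for a doping positive
UNCERTAINTY = 0.8     # uncertainty claimed by LNDD (+/-)
ABSOLUTE_THRESHOLD = THRESHOLD - UNCERTAINTY   # delta -3.8

def classify(delta):
    """Hypothetical three-way reading of a delta value."""
    if delta < ABSOLUTE_THRESHOLD:
        return "positive"
    if delta < THRESHOLD:
        return "suspicious"   # past the threshold but inside the uncertainty
    return "clear"

blank_auto = -3.65     # blank 5aP, automatic processing
blank_manual = -1.87   # blank 5aP, LNDD manual processing

margin = blank_auto - ABSOLUTE_THRESHOLD   # how far the blank was from a positive
manual_shift = blank_manual - blank_auto   # how far manual processing moved it

print(f"auto:   {blank_auto:+.2f} -> {classify(blank_auto)}")
print(f"manual: {blank_manual:+.2f} -> {classify(blank_manual)}")
print(f"margin to absolute threshold: {margin:.2f}")
print(f"shift from manual processing: {manual_shift:.2f}")
```

So the automatic result sat 0.15 short of the absolute threshold, and the manual reprocessing moved it by 1.78, which is more than twice the claimed uncertainty.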
So what do these results tell us?
Suppose the LNDD system behaved like the one Brenna used in his research and made the measured d45 appear less negative than it really is. That would imply two things. Firstly, the blank sample was a "positive", and really more negative than the measured delta –3.65. Secondly the LNDD manual method made the result worse, not better – they moved it in the wrong direction, away from the true value. (It also provides a stunning example of the influence of manual methods to achieve desired or expected results.)
This argues that we can’t accept that the LNDD equipment had the same characteristics as Brenna’s with respect to alignment of the m45 and m44 signals. If it did, the blank must have come from someone who had been doping with testosterone and the LNDD manual adjustment method is laughably inadequate.
Let’s consider that the LNDD system achieves perfect m45 and m44 alignment. Apart from the fact that it’s probably not achievable, it would mean that the blank, non-doping sample only just escaped a false positive. What does that say about the threshold of delta –3? Also, it would mean that the LNDD manual adjustment method changed a correct result to become incorrect by more than delta –1.6. That’s way beyond their claimed uncertainty figure of +/- 0.8 and would call their competence into question.
We’re only left with one other option, and that is that the LNDD system results in the m45 leading the m44 by some finite amount, resulting in the 5aP appearing more negative than it really was. That would explain why the blank almost tested positive. That would explain why LNDD manually altered the result for the blank to make it look less negative. It would also explain why all of Floyd’s 5aP results appeared far more negative than they actually were.
Far more negative? We’ve established that the error is proportional to the degree of overlap. Look at the blank F3 chromatograms. The degree of overlap between the 5bP and the 5aP is negligible, but that was enough to make the blank almost test positive. Now look at Floyd’s F3s.
Noticeably more overlap.
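To see how the error responds to overlap, here is one more sketch along the same lines. It holds an assumed m45 lead of 0.1 seconds fixed and moves two equal Gaussian peaks closer together. All the numbers are invented, and the real peaks are neither Gaussian nor equal in size, so treat this as an illustration of the trend and nothing more.

```python
import math

def gaussian_area(height, center, sigma, lo, hi):
    """Exact area under a Gaussian peak between the limits lo and hi."""
    full_area = height * sigma * math.sqrt(2.0 * math.pi)
    scale = sigma * math.sqrt(2.0)
    return 0.5 * full_area * (math.erf((hi - center) / scale)
                              - math.erf((lo - center) / scale))

# Illustrative numbers only.
TRUE_CIR = 0.0118
SIGMA = 2.0
HEIGHT = 1000.0
LEAD = 0.1   # assumed residual m45 lead, in seconds

def pk2_error_per_mil(separation):
    """Relative error in Pk2's area ratio, in per mil, for two equal peaks
    the given number of seconds apart, with m45 leading m44 by LEAD."""
    centers = (100.0, 100.0 + separation)
    lo = 100.0 + 0.5 * separation   # valley between the two m44 peaks
    hi = centers[1] + 12.0
    area44 = sum(gaussian_area(HEIGHT, c, SIGMA, lo, hi) for c in centers)
    area45 = sum(gaussian_area(HEIGHT * TRUE_CIR, c - LEAD, SIGMA, lo, hi)
                 for c in centers)
    return (area45 / area44 / TRUE_CIR - 1.0) * 1000.0

separations = (12.0, 10.0, 8.0, 6.0)   # smaller separation means more overlap
errors = [pk2_error_per_mil(s) for s in separations]
for separation, error in zip(separations, errors):
    print(f"separation {separation:4.1f} s: error = {error:+.2f} per mil")
```

With the lead held constant, the error is negative at every separation and gets steadily more negative as the peaks close up.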
For those calling us idiots, let's remember that we are not talking about the actual lag between the peaks, but lags produced when the software attempts to account for the actual lag and fails to do a perfect job. Such software more or less works adequately when there are no overlaps in peaks. As we've shown, when there are minor errors in this compensation, overlapped peaks can be systematically measured with inaccurate results.
We've shown here that it is possible to reconcile Brenna's study and WM-A's theory in a way that leaves it less likely that LNDD got correct results.
Which seems like a perfect place to stop.
















































