Tuesday, December 05, 2006

Through the T/E tests, looking at calibration

A new correspondent who wishes to avoid the tar-baby effect sends the following exceptionally detailed consideration of the T/E test chronology, with emphasis on calibration steps.

Warning, heavy reading follows.


By an Anonymous Correspondent

I have no particular expertise in GC/MS testing or steroid testing, but I think that I develop some reasonable, understandable arguments. I hope that you and your readers find this interesting. I am not associated with any of the parties in the dispute.

[There are so many page citations here TBV won't link them individually. You can pull up pages by opening another browser window and going to the page index at the archive.org library. -TBV]

I offer some observations and speculations about the analysis of the ratio of Testosterone (T) and Epitestosterone (E) in Landis’s Stage 17 sample.

I observe that the calculation of T/E as the ratio of raw GC/MS responses is demonstrated to be imprecise by a review of the calibration sample analyses documented in the LDP. The concentrations of T and E are known for calibration samples and the ratio of the concentrations can be accurately calculated and compared with the recorded T/E values. The error percentages are, in chronological order:
  • USADA0060: 11.77%;
  • USADA0051: 3.19% (see below);
  • USADA0207: 25.01%, 20.59%, 6.19%;
  • USADA0086: 10.78%, -2.98%, -3.15%;
  • USADA0270: 4.43%, -1.08% and -0.47%.
The accuracy of the calculation of T/E as a ratio of the raw responses depends upon the responsiveness of T and E being equal and invariant with variation in concentrations. A review of the data below indicates that this dependency is not met. I speculate that the target responses of calibration samples are not integrated in the same manner as the target responses of test samples; that if the integration of responses for calibration samples adopted the manner of test samples, the errors above would be even larger; and that the larger errors are present in existing test sample calculations of T/E.
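The error percentages above can be reproduced, at least approximately, from figures quoted later in this piece. A minimal sketch, assuming the "blu t30 e5" calibration sample has a true T/E of 6.00 and using the rounded TR/ER values given further down (7.500, 6.647 and 6.266); the function name is mine, not the lab's:

```python
def ratio_error_pct(measured_ratio, true_ratio):
    """Percentage error of a measured T/E relative to the known T/E."""
    return (measured_ratio / true_ratio - 1.0) * 100.0

# Rounded TR/ER values quoted later for the three "blu t30 e5" runs.
TRUE_TE = 6.00
for label, tr_er in [("USADA0216", 7.500), ("USADA0094", 6.647), ("USADA0285", 6.266)]:
    print(label, round(ratio_error_pct(tr_er, TRUE_TE), 2))
```

The results land close to the 25.01%, 10.78% and 4.43% listed above; small differences come from using rounded inputs.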

The lab protocol acknowledges 30% uncertainty in the measured T/E results. This reflects allowable error magnitudes of +42.86% and -30% relative to the actual T/E value. The error percentage of 25.01% for a pristine calibration sample consumes a significant portion of the allowable 42.86% error budget. For a field sample, the lab protocol requires substance identification of only 80% of the measured peak, potentially allowing another 20% of error. Other factors will contribute to further uncertainty for field samples. I speculate that the characterization of the uncertainty at 30% for T/E measurement as prescribed for field samples, corresponding to a 42.86% error, is significantly too low.
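One reading that reproduces the +42.86% and -30% figures is sketched below. It is an assumption about how the bounds were derived, not something stated in the lab protocol: the upper bound treats the actual value as possibly being as low as (1 - u) times the measured value, and the lower bound is simply -u.

```python
def error_bounds_pct(uncertainty):
    """Return (upper, lower) error bounds in percent for a fractional uncertainty u.

    Assumed interpretation: upper = 1/(1-u) - 1, lower = -u.
    """
    upper = (1.0 / (1.0 - uncertainty) - 1.0) * 100.0
    lower = -uncertainty * 100.0
    return upper, lower

for u in (0.30, 0.20):
    upper, lower = error_bounds_pct(u)
    print(f"u={u:.0%}: +{upper:.2f}% / {lower:.2f}%")
```

The same formula gives +25.00% and -20.00% for a 20% uncertainty, which matches the T quantification figures used later.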

The alternative method of calculating the T/E ratio is to accurately calculate the concentration of T and E in the sample and to take the ratio of those concentrations. This requires performing an analysis of calibration samples to characterize the GC/MS responsiveness of T and E relative to an Internal Standard (IS), in this case methyltestosterone.

I will use the following abbreviations:

XR is the target response of substance X.
XC is the concentration of substance X in ng/mL.
IR is target response of the internal standard, methyltestosterone.
IC is the concentration of the internal standard, methyltestosterone.
TR is the target response of testosterone.
TC is the concentration of testosterone in ng/mL.
ER is the target response of epitestosterone.
EC is the concentration of epitestosterone in ng/mL.

For a substance X in a test sample which is to be quantified as a concentration, (XR/IR)/ (XC/IC) is calculated for calibration samples. As we will see, the value of (XR/ IR)/ (XC/ IC) may vary with the concentration of X, so potentially multiple calibration samples at varying concentrations must be analyzed. The concentration of X in a test sample can be calculated as ((XR/ IR) * IC) of the field sample divided by (XR/ IR)/ (XC/ IC) of the calibration sample that best matches the expected concentration.
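The two steps just described can be written out as a short sketch. The function names and all of the numbers in the example are hypothetical; only the formulas come from the paragraph above.

```python
def response_factor(xr, ir, xc, ic):
    """(XR/IR)/(XC/IC), evaluated on a calibration sample of known concentration."""
    return (xr / ir) / (xc / ic)

def concentration(xr, ir, ic, factor):
    """Concentration of X in a field sample: ((XR/IR) * IC) / factor."""
    return (xr / ir) * ic / factor

# Hypothetical calibration sample: 40 ng/mL of X, 100 ng/mL of internal standard.
k = response_factor(xr=6000.0, ir=10000.0, xc=40.0, ic=100.0)

# Hypothetical field sample measured against the same amount of internal standard.
xc_field = concentration(xr=3000.0, ir=9500.0, ic=100.0, factor=k)
print(k, xc_field)
```

Feeding the calibration sample's own responses back through `concentration` returns its known concentration, which is a quick sanity check on the algebra.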

I observe that the T/E confirmation tests assume that the calculation of (XR/ IR)/ (XC/ IC) is insensitive to the concentration of X. Each calibration sequence establishes two values of (XR/IR)/ (XC/IC), one for T and one for E, regardless of concentration. For T, the values are documented in USADA0208, USADA0088 and USADA0273. For E, the values are documented in USADA0209, USADA0089 and USADA0274. The resulting errors in E quantification of calibration samples are
  • USADA0207: 28.55%, 14.36%, -3.19%;
  • USADA0086: 16.51%, 13.86%, -3.01%;
  • USADA0270: -3.07%, 3.00%, -0.43%.
The resulting errors in T quantification of calibration samples are
  • USADA0207: 12.53%, 3.78%, -0.90%;
  • USADA0086: 1.98%, 13.79%, -2.89%;
  • USADA0270: -7.80%, 3.43%, -0.62%.
The expected uncertainty for T quantification is 20%, which corresponds to +25.00% and -20.00% errors relative to the actual concentration. The expected uncertainty for E quantification is 30%, which corresponds to +42.86% and -30.00% errors relative to the actual concentration. I speculate that for pristine calibration samples, demonstrated errors of 13.79% and 28.55%, against respective budgets of 25% and 42.86% for field samples, indicate that the uncertainty budgets are too low.

I observe that the values of (XR/IR)/ (XC/IC) within calibration sequences vary significantly and speculate that such variability is not suitable for being resolved to a constant. For E, the differences of the maximum and minimum values expressed as a percentage of the minimum value are
  • USADA0204: 29.60%;
  • USADA0207: 32.78%;
  • USADA0086: 20.12%;
  • USADA0268: 38.21%;
  • USADA0270: 3.44%.
(USADA0204 and USADA0268 reflect calibrations for which there is incomplete information.) For T, the corresponding differences are
  • USADA0207: 13.56%;
  • USADA0086: 17.17%;
  • USADA0270: 12.19%.
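The spread measure used in these lists is the difference between the largest and smallest value of (XR/IR)/ (XC/IC) in a sequence, as a percentage of the smallest. A minimal sketch with made-up values (the function name and the three numbers are hypothetical, chosen only to show the arithmetic):

```python
def spread_pct(values):
    """(max - min) / min, in percent, for a sequence of response factors."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0

# Hypothetical response factors from a three-level calibration sequence.
example_factors = [0.80, 0.92, 1.00]
print(round(spread_pct(example_factors), 2))
```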
I observe that the documentation of the T/E confirmation tests describes the calculation of the responses for calibrations as “Extract & Integrate” across a 1 minute interval centered on the expected peak retention time (IS at 29.94 min., E at 18.52 min., T at 19.32 min.). I speculate that this is a different process than used during the analysis of test samples, where the instrument identifies the interval of a single peak and integrates across typically a much smaller interval. (I am surprised that test documentation does not include the interval across which a response was actually integrated.) I speculate that if the calibration analysis uses a different process than the test sample analysis then the accuracy of the test sample analysis is reduced. This discrepancy may be leading to overestimating the E response ratio in some calibration samples and hiding the overestimation of T/E as calculated by TR/ER.

Consider the three calibration runs of “blu t30 e5” (USADA0285, USADA0094, USADA0216). As the number and size of the extra peaks decrease in the interval from 18.01 to 19.01 min., the responsiveness of E per amount relative to T per amount ((ER/EC)/(TR/TC)) decreases. The results are increasingly less responsive to E compared to T, resulting in a higher TR/ER.

For the B confirmation (USADA0285), there are two significant additional peaks at 18.38 and 18.62. The value of (ER/EC)/ (TR/TC) is 0.958, the value of TR/ ER is 6.266. For the 24-jul-06 A confirmation (USADA0094), there is a significant additional peak at 18.42 and a minor peak at 18.97. The value of (ER/EC)/ (TR/TC) is 0.903 (lower), the value of TR/ ER is 6.647 (higher). For the 22-jul-06 A confirmation (USADA0216), there is only a small extra peak at 18.39. The value of (ER/EC)/ (TR/TC) is 0.800 (lowest), the value of TR/ ER is 7.500 (highest). (All of the T/E values should be 6.00.)

I speculate that if the extraction and integration occurred for a single response peak of the “blu t30 e5” sample, the value of (ER/EC)/ (TR/TC) would be consistently between 0.7 and 0.6 and there would be a consistent overestimation of between 42.87% and 66.76% for T/E by TR/ ER. I suspect this overestimation of T/E is already occurring for the non-calibration samples tested with low concentrations of E. The extent of this error matches or exceeds the uncertainty budget established by the lab protocol without consideration of other sources of errors.
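The link between the relative responsiveness and TR/ER follows from the definitions: (ER/EC)/(TR/TC) equals (TC/EC)/(TR/ER), so TR/ER is the true T/E divided by the relative responsiveness. A minimal sketch (function names are mine) using the rounded values quoted above:

```python
def predicted_tr_er(true_te, rel_resp):
    """TR/ER implied by a true T/E and a relative responsiveness (ER/EC)/(TR/TC)."""
    return true_te / rel_resp

def overestimation_pct(rel_resp):
    """Percentage by which TR/ER overstates the true T/E."""
    return (1.0 / rel_resp - 1.0) * 100.0

for rel in (0.958, 0.903, 0.800):
    print(rel, round(predicted_tr_er(6.00, rel), 3))

for rel in (0.7, 0.6):
    print(rel, round(overestimation_pct(rel), 2))
```

With these rounded inputs the three calibration runs come out near 6.26, 6.65 and 7.50, and relative responsiveness values of 0.7 and 0.6 give overestimates of roughly 43% and 67%, in line with the range quoted above.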


Let’s review the chronology of T/E analysis.

At or about 12:59 PM on 21-Jul-06, a calibration sample (“2107 rcl 028 2007 H1”) containing 40 ng/mL of T and E was analyzed (USADA0060, USADA0061). The value TR/ER can be calculated as 1.117714, exhibiting an error of 11.77% relative to the actual T/E. (TR/IR)/ (TC/IC) is calculated as 1.571831. (ER/IR)/ (EC/IC) is calculated as 1.406291. (These values are used below.)

At or about 3:56 PM on 21-Jul-06, a calibration sample (“2107 blu 2107 H1”) was analyzed (USADA0051, USADA0052). The value of TR/ER can be calculated as 1.031866. The concentrations are documented at 8.5 ng/mL for T and 9.2 ng/mL for E. The values used to calculate the quantities (TR/IR)/ (TC/IC) and (ER/IR)/ (EC/IC) are 1.563029 for T and 1.399506 for E, corresponding closely to the values calculated from the prior calibration.

I speculate that this sample was actually a blank urine sample spiked to 10 ng/mL of both T and E. If the data is evaluated on this basis it provides more accurate coefficients for evaluating field samples that contain approximately 10 ng/mL concentrations. The value of (TR/IR)/ (TC/IC) is 1.328575. The value of (ER/IR)/ (EC/IC) is 1.287545. The value of TR/ER is 1.031866, exhibiting an improved 3.19% error relative to the actual T/E.
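The re-evaluated coefficients follow from rescaling. Since the coefficient is (XR/IR)/ (XC/IC), replacing the documented concentration with an assumed one multiplies it by documented/assumed. A minimal sketch, where the 10 ng/mL figure is the speculation stated above and the function name is mine:

```python
def rescale_factor(factor, documented_conc, assumed_conc):
    """Re-express a response factor for a different assumed concentration."""
    return factor * documented_conc / assumed_conc

kt_10 = rescale_factor(1.563029, documented_conc=8.5, assumed_conc=10.0)
ke_10 = rescale_factor(1.399506, documented_conc=9.2, assumed_conc=10.0)
print(round(kt_10, 6), round(ke_10, 6))
```

Both results agree with the 1.328575 and 1.287545 quoted above to within rounding in the last digit.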

At or about 16:01 on 21-Jul-06, an A sample confirmation calibration occurred (USADA0204). We only have the IS and E responses of that analysis. The maximum (ER/IR)/ (EC/IC) value is 29.60% larger than the minimum.

At or about 7:36 PM on 21-Jul-06, the steroid screening analysis occurred for Landis’s A sample from Stage 17 of the TdF (USADA0054, USADA0055). The value of TR/ER can be calculated as 4.943417 and the T/E value is recorded as 4.9. The quantification of T and E concentrations uses the coefficient established by the 40 ng/mL calibration analysis (1.571345, 1.406035). The value of TC/EC is 4.423358 and is likely to be a more accurate calculation of T/E. The concentration of T is calculated as 60.6 ng/mL. The concentration of E is calculated as 13.7 ng/mL.
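Because both concentrations are computed against the same internal standard, IR and IC cancel in the ratio and TC/EC reduces to (TR/ER) multiplied by the E coefficient over the T coefficient. A minimal sketch using the figures quoted above (the function and variable names are mine):

```python
def te_from_concentrations(tr_er, k_t, k_e):
    """TC/EC = (TR/ER) * (k_e / k_t), where k_x = (XR/IR)/(XC/IC)."""
    return tr_er * k_e / k_t

te_screen = te_from_concentrations(tr_er=4.943417, k_t=1.571345, k_e=1.406035)
print(round(te_screen, 4))
```

This reproduces the quoted TC/EC of about 4.42 and is also consistent with 60.6 / 13.7.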

I speculate that a more accurate calculation of E concentration is obtained by using the value of (ER/IR)/ (EC/IC) established by the 10 ng/mL calibration analysis, i.e. 1.287545. A concentration of approximately 15 ng/mL is closer to 10 ng/mL than 40 ng/mL. The resulting value of TC/EC is 4.049337 or 4.0 rounded to a single decimal place. The value of T/E is 4.0! The concentration of T is calculated as 60.58124 ng/mL. The concentration of E is calculated as 14.961 ng/mL.
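The recalculation can be sketched as follows. It assumes the E concentration scales inversely with the coefficient (same responses, different divisor) and starts from the rounded 13.7 ng/mL, so the last digits differ slightly from those quoted; the names are mine.

```python
# Figures quoted in the text.
EC_OLD = 13.7          # ng/mL, computed with the 40 ng/mL coefficient
KE_40 = 1.406035       # (ER/IR)/(EC/IC) from the 40 ng/mL calibration
KE_10 = 1.287545       # (ER/IR)/(EC/IC) assuming a 10 ng/mL calibration
TC = 60.58124          # ng/mL, T concentration left unchanged

# Pick the calibration level nearest to the estimated E concentration.
nearest_level = min((10.0, 40.0), key=lambda c: abs(c - EC_OLD))

# Same responses, different coefficient: EC scales by KE_40 / KE_10.
ec_new = EC_OLD * KE_40 / KE_10
te_new = TC / ec_new
print(nearest_level, round(ec_new, 3), round(te_new, 1))
```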

The irony is that the most accurate calculation of T/E for the Stage 17 Landis sample with the data available in the evening of 21-Jul-06 is 4.0, significant to a single decimal point. If this value is used, there is no further testing, no AAF, no suspension, no LDP and no TBV. Instead, the less accurate value of 4.9 is used in accordance with WADA protocols and the testing moves on to the A confirmation tests.

At or about 18:02-18:32 on 22-Jul-06, the first A confirmation analysis occurs (USADA0212, USADA0213) and includes a test for free T and E (USADA0214).

At or about 20:05 to 21:07 on 22-Jul-06, the calibration analysis occurs for the above confirmation analysis (USADA0207, USADA0216, USADA0217, and USADA0218).

The confirmation analysis is remarkable because the magnitude of the IS response is so much smaller than that of the calibration by a factor of 3 or more. This results in estimates of the concentrations of T and E which are 3 times greater than subsequent testing, but does not appear to have a significant effect on TR/ER or TC/EC. The results probably led to the wrong initial impression that the T concentration level was high in absolute terms.
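Why a weak IS response inflates both concentrations but leaves the ratio alone can be seen from the concentration formula: IR appears in the denominator of both TC and EC. A minimal sketch in which every response value and the IS concentration are hypothetical, and the two coefficients are the rounded calibration values quoted in this piece (1.09 and 0.996):

```python
def concentration(xr, ir, ic, factor):
    """((XR/IR) * IC) / factor."""
    return (xr / ir) * ic / factor

TR, ER, IC = 1000.0, 100.0, 100.0     # hypothetical responses and IS concentration
K_T, K_E = 1.09, 0.996                # coefficients quoted for this calibration

results = {}
for label, ir in (("normal IS", 300.0), ("weak IS", 100.0)):  # IS response 3x smaller
    tc = concentration(TR, ir, IC, K_T)
    ec = concentration(ER, ir, IC, K_E)
    results[label] = (tc, ec, tc / ec)
    print(label, round(tc, 1), round(ec, 1), round(tc / ec, 3))
```

Both concentrations triple when the IS response drops by a factor of 3, while TC/EC is unchanged.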

The calculated value of TR/ER is 10.70794. The calculated value of TC/EC is 9.791359 using a value of 1.09 for (TR/IR)/ (TC/IC) and 0.996 for (ER/IR)/ (EC/IC) (USADA0208, USADA0209). The discrepancy between TR/ER and TC/EC is 9.36% relative to TC/EC.

I speculate that it is more accurate to use the value of (TR/IR)/ (TC/IC) and (ER/IR)/ (EC/IC) from the calibration test that is closest to the concentrations of the field sample, i.e. “blu t30 e5” (USADA0216). (This calibration also has the fewest extraneous peaks within the interval of integration.) The values are respectively 0.968596 and 0.774823. These generate a value of 8.565759 for TC/EC and this is probably a more accurate estimate of the T/E value than 10.7. I am unable to speculate as to the quantitative uncertainty of this better estimate.

In order to convey the potential consequences of the speculated E response integration across a 1 minute interval for “blu t30 e5” (USADA0216), if the value of ((ER/EC)/ (TR/TC)) from “blu t30 e5” is further reduced from 0.8 to 0.7 or 0.6 to reflect elimination of the peak at 18.39 minutes, the resulting values of TC/EC are respectively 7.50 and 6.42. A value of 0.7 for ((ER/EC)/ (TR/TC)) reflects a contribution of slightly more than 10% by the extra peak. (It is impossible to accurately estimate the contribution from USADA0216.)
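The sensitivity of TC/EC to the relative responsiveness can be tabulated directly, since TC/EC is TR/ER multiplied by (ER/EC)/ (TR/TC). A minimal sketch using the figures quoted above (variable names are mine):

```python
TR_ER = 10.70794                      # raw response ratio from the first A confirmation
K_T, K_E = 0.968596, 0.774823         # "blu t30 e5" coefficients (USADA0216)

# Relative responsiveness of E to T, (ER/EC)/(TR/TC) = K_E / K_T.
rel_measured = K_E / K_T
te_measured = TR_ER * rel_measured

# Speculated lower values if the extra peak were excluded from the E response.
te_speculated = {rel: TR_ER * rel for rel in (0.7, 0.6)}

print(round(rel_measured, 3), round(te_measured, 3))
for rel, te in te_speculated.items():
    print(rel, round(te, 2))
```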

The analysis for “free” T & E calculates the T concentration to be 1.06 ng/mL and the E concentration to be 0.10 ng/mL. This calculation apparently uses a value of 1.089734 for (TR/IR)/ (TC/IC) and a value of 0.946082 for (ER/IR)/ (EC/IC). Presumably, these values are evaluated from the analysis documented by USADA0222, but the actual target responses are not available.

In the calibration analysis, the (TR/IR)/ (TC/IC) values vary by 13.56% and the (ER/IR)/ (EC/IC) values vary by 32.78%.

At or about 13:28 on 24-Jul-06, the A confirmation test was done again (USADA0092, USADA0093). This time the problem with the response of IS appears to be corrected. There is no documentation of a test for free T & E at this time, although an analysis of 2 ng/mL calibration concentrations appears to have occurred at 11:55 (USADA0100).

At or about 15:01 to 16:02 on 24-Jul-06, the calibration analysis for the above confirmation test was performed (USADA0086, USADA0094, USADA0095, USADA0096).

The calculated value of TR/ER is 11.43506. The calculated value of TC/EC is 11.80192 using a value of 0.895 for (TR/IR)/ (TC/IC) and 0.923 for (ER/IR)/ (EC/IC) (USADA0088, USADA0089). The discrepancy between TR/ER and TC/EC is -3.11% relative to TC/EC.

In the calibration analysis the (TR/IR)/ (TC/IC) values vary by 17.17% and the (ER/IR)/ (EC/IC) values vary by 20.12%.

At or about 17:15:46 on 24-Jul-06, the calibration done above is used to calibrate the GC/MS machine (USADA0083).

At or about 10:42 AM on 25-Jul-06, the Landis sample is re-tested with the steroid screening analysis. The value of TR/ER is 5.097012 and the analysis assigns a T/E value of 5.1. The concentration of T is calculated at 49.7 ng/mL using a value of 2.541328 for (TR/IR)/ (TC/IC). The concentration of E is calculated at 11.1 ng/mL using a value of 2.232433 for (ER/IR)/ (EC/IC). There is no information provided on the calibration analysis that established these coefficients. There is a 13.84% discrepancy between the calculated TR/ER and TC/EC values.

At or about 9:24:50 AM on 3-Aug-06, some T/E confirmation calibration samples were analyzed (USADA0268). This is the morning of the B sample analysis. There is incomplete information and only the IS and E responses are provided. Significantly, the E responses demonstrate the highest variation of all of the confirmation calibrations, varying by 38.21%.

At or about 18:43-20:16 on 3-Aug-06, the B sample T/E confirmation analysis is performed 3 times. An additional analysis occurs for quantification of “free” T and E in the sample. (USADA0277-USADA0284)

At or about 15:01-16:02 on 3-Aug-06, the analysis of the calibration samples for the B analysis is performed. (USADA0270, USADA0285, USADA0286, USADA0287).

At or about 17:10 on 3-Aug-06, the analysis of a sample “TP TE 2” is performed (USADA0292). The resulting target response values are not documented, but the resulting values of (TR/IR)/ (TC/IC) and (ER/IR)/ (EC/IC) can be inferred from the quantification of “free” T and E referenced above. The interval across which the responses are integrated, whether it is a single peak or a minute, is unknown.

The results of the analysis of the calibration samples are almost magical in how little variation in (ER/IR)/ (EC/IC) is demonstrated. While this quantity had previously varied by between 20.12% and 38.21% within a sequence of 3 calibration samples of varying concentration, this time it varies by only 3.44% and the sample with the lowest concentration of E is most responsive to E! The (ER/IR)/ (EC/IC) of “blu t30 e5” is 1.557855, while the values of “blu t180 e30” and “blu t360 e60” are respectively 1.466039 and 1.516447. (Larger is more responsive.) I have speculated above that this consistency is the consequence of integrating responses across a 1 minute interval and not across the interval of a single peak. The consequence of such a discrepancy is that the quantity of E in the B confirmation analysis will be underestimated. The variation in (TR/IR)/ (TC/IC) is more typical at 12.19% compared with other values of 13.56% and 17.17%.

The constant value of (TR/IR)/ (TC/IC) is evaluated to be 1.50 (USADA0273). The constant value of (ER/IR)/ (EC/IC) is evaluated to be 1.51 (USADA0274). (In both cases, the calculation is incorrect in calculating the slope of a line that does not pass through the origin at (0, 0), but there are not any significant consequences to this particular error.)
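To illustrate the distinction being drawn, here is a small sketch comparing the slope of a least-squares line that is free to have an intercept with the slope of a line forced through (0, 0). The data points are invented (a line with slope 1.5 and a small offset), and this is not a reconstruction of the lab's actual computation, which the documents do not show in full.

```python
def slope_through_origin(xs, ys):
    """Least-squares slope of y = m*x (line forced through the origin)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def slope_with_intercept(xs, ys):
    """Least-squares slope and intercept of y = m*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical calibration points: three concentration levels, responses with an offset.
xs = [5.0, 30.0, 60.0]
ys = [1.5 * x + 0.5 for x in xs]

m_free, b_free = slope_with_intercept(xs, ys)
m_origin = slope_through_origin(xs, ys)
print(round(m_free, 4), round(b_free, 4), round(m_origin, 4))
```

With a small intercept the two slopes differ by under 1% in this made-up case, which fits the remark that the choice has no significant consequence here.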

The values of TR/ER for the analysis of the 3 samples used to determine the value of T/E are 10.8964, 11.00193 and 11.14201, demonstrating significant inter-sample analysis consistency. The discrepancies between TR/ER and TC/EC are minor and respectively 2.49%, 2.63% and 2.76%. These results are consistent with a calibration that results in constant values of (TR/IR)/ (TC/IC) and (ER/IR)/ (EC/IC) that are almost identical. The consistency does not mean that the constants or the calibrations are accurate. The concentrations of T are computed at 63.15, 61.64 and 60.18 ng/mL. The respective concentrations of E are computed as 5.94, 5.75 and 5.55 ng/mL.
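The three discrepancies can be checked from the numbers in this paragraph. A minimal sketch (the function name is mine; the concentrations are the rounded values quoted, so the last digit may wobble):

```python
def discrepancy_pct(tr_er, tc, ec):
    """Percentage by which TR/ER differs from TC/EC, relative to TC/EC."""
    return (tr_er / (tc / ec) - 1.0) * 100.0

# (TR/ER, TC in ng/mL, EC in ng/mL) for the three B sample analyses.
b_runs = [
    (10.8964, 63.15, 5.94),
    (11.00193, 61.64, 5.75),
    (11.14201, 60.18, 5.55),
]
b_discrepancies = [discrepancy_pct(*run) for run in b_runs]
print([round(d, 2) for d in b_discrepancies])
```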

The analysis of the sample for quantification of “free” T and E uses a value of 0.866643 for (TR/IR)/(TC/IC) and a value of 0.674294 for (ER/IR)/(EC/IC), presumably from the analysis documented on USADA0292. (These values are significantly reduced from the values of 1.50 and 1.51 described above indicating reduced responsiveness to the lower concentrations.) These values are consistent with most of the calibrations in that the T is more responsive than the E. The concentration of “free” T is evaluated to be 1.2 ng/mL. The concentration of “free” E is evaluated to be 0.44 ng/mL. I speculate that if there is an error in the concentration of “free” E, it is probably underestimated as the responsiveness is probably reduced at a concentration that is 22% of the calibration concentration.

The concentration of “free” E at 0.44 ng/mL is greater than 5% of the evaluated concentration of E at 5.7 ng/mL (USADA0288). This indicates contamination, according to some interpretations of WADA protocols. I speculate that the evaluated concentration of E at 5.7 ng/mL is too low as a consequence of inconsistent analysis of the calibration sample. If the evaluation of the concentration of E is increased by 35.27%, the threshold for contamination is no longer crossed.

That is the end of the story so far. Other commentary has described the issues with the identification of the substances contributing to a response peak.



marc said...

Dynamite. That's to say: Absolutely explosive.


Anonymous said...

ORG here ....


Like I asked about Duckstrap, what is the background of this author?

tbv@trustbut.com said...

I know nothing of this correspondent at this time, so you have to look at the report at face value only, no bona-fides.

Marc thinks it's Dy-no-mite; if so, the implications aren't clear to me yet.


Anonymous said...

For us slow people....Can you give us the executive summary of all of this?????

tbv@trustbut.com said...

I'd try, but I'm feeling slow on the material myself.

There is commentary of sorts at DPF in this thread, but it's still pretty technical.

The import (if any) of this will probably take a while to be understood.