Friday, August 08, 2008

Nature's not into it

The recent opinion piece in Nature by Donald Berry and the accompanying editorial ought to give pause to the crusaders in WADA World, but probably won't.

Nature is perhaps one of the five most respected science publications in the world. When it reviews a critical piece, and editorializes in support, the conclusions are well-founded, and carry weight with scientists, academics and policy makers who are not the ones having their ox gored.

The editorial makes a conclusion with which we wholeheartedly agree:

[Berry] outlines what he sees as problems with the way doping tests are conducted. He argues that anti-doping authorities have not adequately defined and publicized how they arrived at the criteria used to determine whether or not a test result is positive. The ability of an anti-doping test to detect a banned substance in an athlete is calibrated in part by testing a small number of volunteers taking the substance in question. But Berry says that individual labs need to verify these detection limits in larger groups that include known dopers and non-dopers under blinded conditions that mimic what happens during competition.

Nature believes that accepting 'legal limits' of specific metabolites without such rigorous verification goes against the foundational standards of modern science, and results in an arbitrary test for which the rate of false positives and false negatives can never be known. By leaving these rates unknown, and by not publishing and opening to broader scientific scrutiny the methods by which testing labs engage in study, it is Nature's view that the anti-doping authorities have fostered a sporting culture of suspicion, secrecy and fear.

Detecting cheats is meant to promote fairness, but drug testing should not be exempt from the scientific principles and standards that apply to other biomedical sciences, such as disease diagnostics. The alternative could see the innocent being punished while the guilty escape on the grounds of reasonable doubt.

The statistical argument is that, absent knowledge of true doping rates and of the rates of false positives and false negatives, many doping tests as now conceived are untrustworthy. This argument is compelling to all those who do not have skin in the game of the current enforcement system. It should seem clear that the zeal to catch "dirty dopers" and "protect clean athletes" has led the enforcers to a system where they do not wish to be bothered with the kinds of validation that would lead to truly trustworthy testing. By policy, the WADA Code and its interpretation by the CAS do not entertain arguments about the scientific validity of its tests, as supported by statistical validation. Once the experts of WADA World decide on a test and its positivity criteria, it is "deemed" to be correct for purposes of enforcement. The process of "results management" and adjudication is not designed to determine actual truth, but to provide the appearance of due process to those who have been "deemed" to be guilty dopers.
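
To see why those unknown rates matter, here is a minimal sketch in Python of the standard Bayes' rule calculation for how much a positive result should be trusted. The function name and every number in it are our own hypothetical choices for illustration; none of them come from Berry, LNDD or WADA.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that an athlete who tests positive actually doped (Bayes' rule)."""
    # Share of all athletes who dope AND test positive.
    true_positives = prevalence * sensitivity
    # Share of all athletes who are clean AND test positive anyway.
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs, for illustration only: the real values are unknown.
for prevalence in (0.01, 0.10, 0.50):
    ppv = positive_predictive_value(prevalence, sensitivity=0.90, specificity=0.95)
    print(f"doping rate {prevalence:.0%}: P(doped | positive) = {ppv:.1%}")
```

The same test gives very different answers depending on the assumed doping rate, which is why a positivity criterion says little on its own without all three numbers.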

There is no question there is doping. There is no question the vast majority of doping positives are, in actual truth, signs of doping. In saying so, we repeat that we have never been trying to apologize for those who are, in actual truth, guilty of doping.

We do wish to make it clear that we desire a reliable and trustworthy system, whose results can be seen as fair to the facts and truth of each individual case.

Berry used some of the details of the Landis case as an example, and we included his illustration yesterday, which we repeat now.

This illustration elegantly shows the four metabolites used in the Landis tests. We can ignore the left hand versions completely, as they are redundant. The right hand versions get at the statistical point as we understand it.

In the top right scatter plot, we have all the reported values associated with 5bA and Etio in the testosterone tests LNDD has run. This data is from the tables in Ex 26, LNDD 433 through LNDD 436. The ones that LNDD has "deemed" negative tests are green; the ones it has "deemed" positive are marked in red. The test that condemned Landis is a dot marked in blue. It should be visually apparent that it is a lot closer to the blob of green dots than it is to the scattering of red ones. We'll also note a very curious red dot that looks smack in the middle of a bunch of "negative" tests. How would you like to be that accused athlete?

In the bottom right plot, we have the values associated with the 5aA and Androsterone for the same tests; and we see the Landis dot is solidly into the area "deemed" positive by LNDD. Mr. Young, representing USADA, referred to this at the CAS appeal as a "screaming six", which is colorful and assonant to be sure, but what does a single "screaming six" really say?

The question being raised by Berry is, "Are the boundaries where LNDD and WADA have changed colors valid?" Is it actually true that a data point there indicates doping, or do we have reason to believe, statistically, that it can be a "false positive" even with those values?

Berry makes the mathematical observation that we do not have the data to support the conclusions that WADA would have us believe.

And the editors of Nature agree -- we do not have the data to support the conclusions WADA makes.

While Berry explicitly offers no opinion on Landis' actual culpability, he also makes a side point that has been denied by CAS and the WADA regular support group:

In arbitration hearings, the AAA threw out the result of the LNDD's initial screening test because of improper procedures. In my opinion, this should have invalidated the more involved follow-up testing regardless of whether or not sensitivity and specificity had been determined. Nevertheless, the AAA ruled the spectrometry results sufficient to uphold charges of doping.

The WADA support group has raised some other objections to Berry's analysis in various places, offering more Landis data to attempt to prove their point, and even some of the commenters here have been confused. The claim is made that Landis "failed" other tests at the Tour, and therefore the likelihood of the first being a false positive is reduced.

We believe this is a tautology that assumes the conclusion ("he doped"), and not a valid argument.

If one understands Berry's point, repeatedly testing the same or similar samples (as done with Landis) and getting similar values doesn't prove anything about the correctness of the criteria. If the first set of values was a false positive, so are all the other ones, because the method isn't statistically valid. The defenders of the LNDD testing are trying to use consistency of measurement as an indication of validity, which it is not. It merely indicates that the LNDD can reproduce unenlightening numbers.

We are ultimately back to one of the questions raised in the very first argument on the case, the ADRB submission: whether the WADA criterion is, and should be, a single metabolite, or "all metabolites".

We know that UCLA's validation study, done to a more rigorous standard than that apparently used by LNDD or WADA, concluded multiple metabolites were necessary to reliably determine the truth. We know that the similar criteria used by the Sydney lab also require more than one. This begins to address Berry's concerns, but is not accepted by WADA or by the LNDD, and the AAA and CAS panels rejected the argument. The arbitrators took the position that whatever WADA says is so, and WADA says one is good enough to destroy a career.

Here at TBV, the series Larry's Curb Your Anticipation looked at the validation process in considerable detail, and came away more puzzled than convinced that what WADA does proves what it is presented as being.

Berry's analysis is a succinct, accessible and authoritative summation of why WADA's methods are not reliable, and have been oversold to the public and to its sponsoring stakeholders. This oversell is arguably part of a bluff to scare off would-be dopers. Any innocents who may happen to get caught up are collateral damage, and well, no one is innocent, they are all dopers, so what matter?

This is an incredibly cynical view. We deserve better of our enforcement agencies, just as we demand better of our athletes.

We know that WADA "deems" its science to be unassailable, and under the WADA Code, it is effectively impossible to obtain valid, independent review.

Nature doesn't buy it, and neither do we.

Selections of relevant comments and mail we have received about the Berry article.

The response from the IOC?

International Olympic Committee (IOC) medical director Patrick Schamasch, contacted ahead of the commentary's publication, would not comment directly on the study but said: "What we are doing in the area of doping is the most advanced in terms of certitude."

Wow. Worthy of Dick Pound himself. What a joke. Someone with nothing but an academic interest in the issue, and someone with a huge political and financial interest in the issue - who you gonna trust here? If you are the AAA or CAS arbs, obviously the latter.

A radio interview of Berry:

SIMON SANTOW: We hear both from the drug-testing authorities that they're going to catch cheats and that they have a new test and that nobody should assume that they can get away with cheating.

How much of that do you think is bluff and how much of that is actually based on good science?

DONALD BERRY: I (laughs), I don't think any of it is good science, frankly, from what I've seen.

Some of it may be bluff and indeed that may be the best way of handling the circumstances. But when they've got a new test, that new test has to be validated. Indeed they are probably catching dopers but just because there is lots of doping going on, and not so much that their test is all that good.

At Ars Technica:

Herein lies the problem - the World Anti-Doping Agency has neither conducted nor published the necessary studies to establish sensitivity and specificity in these metabolite ratio tests. An analysis of the case of Floyd Landis from the Tour de France, assuming reasonable values of sensitivity and specificity, indicates that there was between an 8 and 34 percent chance of registering a false positive. Given that he had eight different opportunities to test positive (far more than the average Tour rider because Mr. Landis was a front-runner throughout the race), the case against him suddenly looks substantially weaker.
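
The arithmetic behind a range like that can be sketched in a few lines of Python. This is our own illustration, not Berry's calculation as published: it assumes the eight tests are independent of one another and tries specificities of 99 and 95 percent, which happen to bracket the quoted figures. The function name is hypothetical.

```python
def chance_of_any_false_positive(specificity, n_tests):
    """Chance that a clean athlete fails at least one of n independent tests."""
    # A clean athlete passes a single test with probability `specificity`,
    # so passing all n independent tests has probability specificity ** n.
    return 1.0 - specificity ** n_tests


# Assumed specificities (illustrative) and the eight tests mentioned above.
for specificity in (0.99, 0.95):
    risk = chance_of_any_false_positive(specificity, n_tests=8)
    print(f"specificity {specificity:.0%}: chance of a false positive = {risk:.0%}")
```

The point of the sketch is that each additional test is another chance for an innocent rider to be flagged, so the risk grows with the number of tests even when each one looks accurate.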

What's missing is any indication of the trueness of LNDD's lab method. This is the critical point, the point that Berry tried to make with the chart. In order for a lab method to be fit for purpose, and in order to properly determine the margin of error applicable to a given lab method, you need to determine both trueness and precision. Otherwise, all you've determined is that the method is consistent, but you haven't determined if the method is consistently wrong.
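
A small Python sketch may make the distinction between trueness and precision concrete. The readings and the reference value below are made up for illustration and have nothing to do with actual LNDD data; the function name is likewise hypothetical.

```python
from statistics import mean, stdev


def trueness_and_precision(measurements, reference_value):
    """Return (bias, spread) for repeated measurements of a known sample."""
    # Trueness: how far the average reading sits from the known true value.
    bias = mean(measurements) - reference_value
    # Precision: how tightly the readings cluster around their own average.
    spread = stdev(measurements)
    return bias, spread


# Hypothetical readings of a reference sample whose true value is -3.0.
readings = [-6.1, -6.0, -5.9, -6.0]
bias, spread = trueness_and_precision(readings, reference_value=-3.0)
print(f"bias = {bias:.2f}, spread = {spread:.2f}")
```

In this made-up case the readings agree closely with each other, yet all of them sit far from the true value: consistent, and consistently wrong. Measuring the bias requires a sample whose true value is known, which is exactly what repeat testing of the same athlete's samples cannot supply.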

Look at that Nature editorial again. They couldn't have used much stronger language. For the most prestigious scientific publication on the planet (arguably) to make such a bold statement is worth noting.

They are not merely saying "we ought to do our tests right". Tell me if I'm reading too much in, but they are actually saying, "current practices are wrong". And it's not just this Berry guy saying it, it's the Nature editorial staff.

Imagine you're a WADA scientist reading this. It's not coming from some nutcase on the internet or some high-paid Floyd flunky. It's coming from Nature.

We all have good intentions
But all with strings attached

- Gang of Four


Mike said...

Excellent TBV. Thank you for taking the time to write something so well said.

I am less "confused" now. Perhaps even to the point of understanding.

One element of this that interests me is that Berry notes in his article that he was provided with the data by the Landis defense team. I wonder if that means that Suh, et al, are still pursuing further "evidence" for future civil action?

This Berry commentary would be just the kind of thing they would need to get at the fundamental issues behind the Landis case.


syi said...

Maybe I missed it, but when (or has) or picked up on the Nature article? Or are they quietly ignoring it?

bill hue said...

Oh I used to be disgusted. And now I try to be amused....

(The Angels Wanna Wear My) Red Shoes - Declan Patrick McManus

When there is no scientific foundation or validity, only faith provides the rickety base upon which the ADAs' and Richard Young's principles stand.

No legal system is based in faith, not even religious ones.

No wonder the sycophants react so strongly to the Nature article. The very foundation of their world view has no gravitas.

Thomas A. Fine said...

Thanks for highlighting the Nature editorial. It's been largely overlooked in all of this, but I actually think it's the bigger news here.


jrdbutcher said...

Thank you TbV and thank you to Nature.

Much respect to bill hue. I've gotta love a judge who can quote Elvis Costello and use his given name.

jrdbutcher said...

Practice makes perfect, but only if you practice perfectly. Otherwise, you just get very good at doing it incorrectly.

Floyd Landis was Richard Kimbled and I’m still waiting for Tommy Lee Jones to come along and set the record straight.

apoch said...

IANAL, so hopefully one of the contributors more educated than me (that's about all of you) can answer this question. Is USADA funded by US tax dollars, and if so, are they subject to a Freedom of Information request? Could someone use one of these to force USADA to disclose how the tests they've been using have been verified?

Thomas A. Fine said...

apoch, they are partially funded by taxes. The FOIA requests probably wouldn't work from what I understand - I suggested it long ago.

At any rate, USADA aren't the ones with the knowledge of how the tests are done - the labs have that (each their own it seems).


woody said...

Wow, an 8-34% chance of an FP?

You have better odds playing "The Lottery".

"The Lottery" is a short story by Shirley Jackson.

Mike said...

Donald Berry was just on NPR's Talk of the Nation: Science Friday. I only caught the very end of it, so I don't know how long he was on. I imagine the segment will be on their website before long.

I wonder if Bonnie Ford or Eddie Pells will cover this. I also wonder why the New York Times has not covered it.

I actually looked into an FOIA request when I was trying to get Exhibit 26 last year. I don't remember the details but the description on the government website made me think it was a non-starter.

dailbob said...

Thanks for writing this. Really thoughtful and well done.

I just read the Berry article today. My major issue all along has been with maintenance of peak identification on the IRMS. I never stopped to think about whether this test was fit for purpose beyond repeatability and reproducibility. Not being involved in drug development, I haven't had to think about the statistical concepts of specificity and sensitivity, i.e., it's not something I would discuss at work, because I don't need to do large double-blind, placebo-controlled tests to develop personal care products. We have a person at work with a doctorate in statistics, so I hope to get my brain wrapped around this better. In the meantime, one can only hope, based on Berry's article, that the specificity of the LNDD tests for testosterone is substantially higher than 99%.