Trial By Error: New Paper Seeks to Reframe Poor Findings in CODES Trial of CBT for Non-Epileptic Seizures

By David Tuller, DrPH

The CODES trial investigated cognitive behavior therapy (CBT) as a treatment for dissociative seizures (DS), a sub-category of what is now called functional neurological disorder (FND). The intervention was a course of CBT specifically designed to address the variety of factors presumed to be triggering the seizures. (I have previously critiqued CODES here, here, and here.)

However, the trial was a bust, with null findings for the self-reported primary outcome–seizure reduction 12 months after randomization. In fact, in what must have been a major embarrassment for the investigators, the group that did not receive the intervention reported a greater reduction of seizures than the group that did, although this difference was not statistically significant.

(In the past, seizures not believed to have been caused by abnormal electrical signals have generally been called “psychogenic non-epileptic seizures.” The new term is meant to be less insulting; patients often resent being told their conditions are psychologically driven.)

Since these null CODES findings were published in 2020, FND experts have tried to reframe them by, among other strategies, suggesting that seizure reduction wasn’t the most appropriate or relevant primary outcome after all. The investigators themselves raised this notion in the initial paper reporting the CODES results. In an accompanying commentary, a colleague of the investigators promoted a similar notion, suggesting that quality-of-life measures were perhaps a better primary outcome than seizure reduction.

Now they’re at it again. In a new paper called “Reflections on the CODES trial for adults with dissociative seizures: what we found and considerations for future studies,” published this month by BMJ Neurology Open, the key CODES investigators present additional analyses of the trial data and try to argue that the results weren’t really so bad. The paper includes the following sentence: “Overall, inspection of our data does not support others’ suggestions that our treatment did not sustainably reduce DS frequency.”

This is a bizarre remark. It obviously implies that the data from CODES support the idea that the specialized treatment did, in fact, “sustainably reduce DS frequency.” However, CODES provided no evidence that the treatment did any such thing. The primary goal of the trial was not even to investigate whether participants in the intervention arm had reduced seizure frequency but whether the intervention showed benefit—that is, whether those who received the intervention did better than those who did not. And that didn’t happen.

In CODES, both arms experienced some seizure reduction, but the intervention did not provide any advantages in that regard. The reduction in seizure frequency cannot be attributed to the intervention, even if the investigators now appear to be claiming otherwise.

(I could be misinterpreting the above sentence, but I don’t think so. I think the investigators truly believe, notwithstanding the evidence, that their trial documented some impact from the intervention.)

With 368 participants, CODES was the largest clinical trial to date of a treatment for FND. The senior author was the factually and mathematically challenged Trudie Chalder, a professor of cognitive behavior therapy at King’s College London (KCL). The press release from KCL scammed the public by burying the disastrous findings for the primary outcome and instead touting the trial as a big success—a claim based on some subjective secondary outcomes with modestly positive findings that really mean nothing at all.

The new paper explains that, in the CODES model, “DS are maintained by a vicious circle of behavioural, cognitive, affective, physiological and social factors of which fear and avoidance are particularly salient.” This framework, the paper notes, “lends itself to the application of CBT interventions, particularly graded exposure to feared (avoided) situations and seizure interruption and control techniques.”

In the new paper, the investigators provide, perhaps inadvertently, a clue into why CODES was destined to be a failure. As they explain, seizure reduction six months after the end of treatment was the primary outcome in a pilot study of CBT for DS, published in 2010: “In the pilot RCT [randomized controlled trial], 6 months after treatment there was an observed post-randomisation difference in favour of the DS-CBT group, but it could not be shown to be statistically significant.”

Exactly–the pilot study had null results. And yet the investigators were able to convince funders that the evidence warranted a test of the intervention in a full-scale trial. Is there something wrong with this picture? Why is anyone surprised that the full trial also had null results for seizure reduction at follow-up?

In the new paper, the investigators again present creative reasons to re-interpret the null findings from CODES. They note that the CODES comparison arm provided more than the standard care patients would have received outside the trial context. The participants in the comparison arm received some of the explanatory information and coping guidance that was available to those in the intervention arm, even though they did not receive the intervention’s active CBT component. From the perspective of the investigators, then, the null results for the primary outcome seem to mean that both arms benefited from the approach embodied by the intervention—not that the intervention was ineffective.

(I think I’m understanding their point, although I can’t be sure.)


Primary and secondary outcomes

The new paper includes a lengthy discussion of the choice of primary outcome. The investigators first mention that funders required it—even though they themselves have a long history of defending seizure reduction as the primary outcome. In the pilot study, the investigators explicitly rejected the idea that other metrics might be more suitable. Presumably they took that step after careful consideration of other possibilities.

Here’s what they wrote in the pilot:

“Our CBT approach is predicated on the assumption that PNES represent dissociative responses to arousal, occurring when the person is faced with fearful or intolerable circumstances. Our treatment model emphasizes seizure reduction techniques especially in the early treatment sessions. While the usefulness of seizure remission as an outcome measure has been questioned, seizures are the reason for patients’ referral for treatment.”

That reasoning still makes sense. Since the investigators specifically designed the intervention to achieve seizure reduction based on their hypothetical understanding of the disorder’s etiology, it is not immediately clear why seizure reduction should not be the primary outcome. If they are now abandoning this metric as not so important after all, are they also questioning the biopsychosocial theories that informed the creation of the intervention? If not, why not?

Failure of an intervention should lead smart investigators to question their assumptions—but that doesn’t seem to have happened with CODES. The investigators still seem to believe the trial should be viewed as a success, making much of the fact that nine of their 16 secondary measures had findings that were statistically significant. But let’s be clear: This was an unblinded study relying on self-reported (or, in one case, physician-reported) outcomes—a trial design subject to an enormous amount of possible bias. It would be unexpected for the intervention group not to report modestly better outcomes from bias alone.

(The primary outcome and three of the secondary outcomes involved patients’ reports of the number of seizures. The self-reporting of seizures has the appearance as well as some aspects of objectivity, but it is still subjective and potentially influenced by bias.)

My colleague Philip Stark, a professor of statistics at UC Berkeley, made the following assessment of CODES:

“The trial did not support the primary clinical outcome, only secondary outcomes that involve subjective ratings by the subjects and their physicians, who knew their treatment status. This is a situation in which the placebo effect is especially likely to be confounded with treatment efficacy. The design of the trial evidently made no attempt to reduce confounding from the placebo effect. As a result, it is not clear whether CBT per se is responsible for any of the observed improvements in secondary outcomes.”

I highlighted Professor Stark’s assessment in a 2020 post, which also included my own observations about the secondary outcomes. Here’s the relevant passage:

“The investigators included 16 secondary outcomes in the study, measured either through questionnaires or the seizure diaries, and reported statistically significant findings for nine of them: seizure bothersomeness, longest period of seizure-free days in the last six months, health-related quality of life, psychological distress, work and social adjustment, number of somatic symptoms, self-rated overall improvement, clinician-rated overall improvement, and satisfaction with treatment. Although many of these findings were modest, the array appeared impressive.

Yet the seven outcomes that failed to achieve statistically significant effects also constituted an impressive array: seizure severity, freedom from seizures in the last three months, reduction in seizure frequency of more than 50% relative to baseline, anxiety, depression, and both mental and physical scales on a different instrument assessing health-related quality of life than the one that yielded positive results.

So parsing these findings, CBT participants reported that the seizures were less bothersome than in the SMC group, but not less severe. They reported benefits on one health-related quality-of-life instrument, but not on two separate scales on another health-related quality-of-life instrument. They reported less psychological distress, but not less anxiety and depression. When viewed from that perspective, the results seem somewhat arbitrary, with findings perhaps dependent on how a particular instrument framed this or that construct.

“If investigators throw 16 packages of spaghetti at the wall, some of them are likely to stick. The greater the number of secondary outcomes included in a study, the more likely it is that one or more will generate positive results, if only by chance. Given that, it would make sense for investigators to throw as many packages of spaghetti at the wall as feasible, unless they have to pay a statistical penalty for having boosted their odds of apparent success.

The standard statistical penalty involves accounting for the expanded number of outcomes with a procedure called correcting (or adjusting) for multiple comparisons (or analyses). In such circumstances, statistical formulae can be used to tighten the criteria for what should be considered statistically significant results–that is, results that are very unlikely to have occurred by chance.

The CODES protocol made no mention of correcting for this large number of analyses, or comparisons. The CODES statistical analysis plan included the following, under the heading of “method for handling multiple comparisons”: ‘There is only a single primary outcome, and no formal adjustment of p values for multiple testing will be applied. However, care should be taken when interpreting the numerous secondary outcomes.’

In other words, the investigators decided not to perform a routine statistical test despite their broad range of secondary outcomes. It is fair to call this a questionable choice, or at least one that departs from the approach advocated by many trial design experts and statisticians, such as Professor Stark, my Berkeley colleague. A self-admonition to take care “when interpreting the numerous secondary outcomes” is not an appropriate substitute for an acceptable statistical strategy to address the potpourri of included measures.

Despite this lapse, it appears that someone–perhaps a peer-reviewer?–questioned the decision to completely omit this statistical step. A paragraph buried deep in the paper mentions the results after correcting for multiple comparisons, with no further comment on the implications. Of the nine secondary outcomes initially found to be statistically significant, only five survived this more stringent analysis: longest period of seizure-free days in the last six months, work and social adjustment, self-rated overall improvement, clinician-rated overall improvement, and treatment satisfaction.

Let’s be clear: These are pretty meager findings, especially since they are self-reported measures in an open-label trial. For example, it is understandable and even expected that those who received CBT would report more “treatment satisfaction” than those who did not receive it. It is also understandable that a participant who received a treatment and the clinician who treated that participant would be more likely to rate the participant’s health as improved than when compared to the SMC group. And a course of CBT could well help individuals with medical problems adjust to their troubling condition in work and social situations.

“None of this means that the core condition itself has been treated–especially since those who did not receive CBT had better results for the primary outcome of seizure reduction at 12 months.”
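
To put rough numbers on the spaghetti problem described in the passage above, here is a minimal Python sketch. It assumes, purely for illustration, that the 16 secondary outcomes are independent and that each is tested at the conventional 0.05 threshold; real trial outcomes are correlated, so the figure is a ballpark and not a CODES result. The p-values passed to the adjustment are invented for the example and are not taken from the trial. The Holm procedure shown is only one common way of correcting for multiple comparisons, and nothing here indicates which method the CODES paper actually used.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** num_tests


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values may never decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Assumption: 16 independent outcomes, each tested at alpha = 0.05.
fwer_16 = familywise_error_rate(16)
print(f"Chance of at least one spurious hit in 16 tests: {fwer_16:.2f}")

# Hypothetical p-values, invented for illustration only.
hypothetical_p = [0.001, 0.004, 0.012, 0.03, 0.04, 0.20]
adjusted_p = holm_adjust(hypothetical_p)
nominal_hits = sum(p < 0.05 for p in hypothetical_p)
survivors = [p for p, adj in zip(hypothetical_p, adjusted_p) if adj < 0.05]
print(f"Nominally significant: {nominal_hits}")
print(f"Still significant after Holm: {len(survivors)}")
```

Under the independence assumption, the chance of at least one false positive across 16 tests comes out at roughly 56%. With the made-up p-values, five results clear the unadjusted threshold but only three survive the correction, which is the general pattern the passage describes: some nominally significant findings drop away once the penalty is applied.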


Even as they present all of their additional analyses, the CODES investigators are ignoring the admonition they themselves included in their protocol—that “care should be taken when interpreting the numerous secondary outcomes.” As this latest paper shows, they have taken zero such care. The new paper doesn’t even mention that only five of the secondary outcomes were statistically significant after adjustment for multiple comparisons—a telling omission. The whole thing reads like a desperate attempt to portray their intervention as having had some meaningful effect. The CODES data tell a different story.

7 thoughts on “Trial By Error: New Paper Seeks to Reframe Poor Findings in CODES Trial of CBT for Non-Epileptic Seizures”

  1. Wow, if this is what “science” is we’ve strayed far from rationality, causality, and falsifiability. Into a land of murky correlation where data are reported or withheld based on their results in order to cherrypick a result. When the result has been chosen before the study starts, why run the study at all?

    George Orwell might have added to his famous sentence “Science is Ignorance”: Freedom is Slavery, War is Peace, Science is Ignorance.

  2. “(In the past, seizures not believed to have been caused by abnormal electrical signals have generally been called “psychogenic non-epileptic seizures.” The new term is meant to be less insulting; patients often resent being told their conditions are psychologically driven.)”

    As I understand it, the most significant predisposing factor for the development of these dissociative/psychogenic/pseudo- seizures is having or having had actual seizures.

    It seems almost ineluctably clear to me that these ‘dissociative’ seizures are the brain’s attempt to PREVENT an actual seizure from taking place.

    Surely, the brain must do everything in its power to stave off a seizure when it’s about to take place, but when doctors look at the seizure as a clinical entity in retrospect, they have no way of distinguishing the aspects of the seizure which are the brain’s attempt to prevent the seizure, and the aspects which are essential for a ‘real’ seizure.

    But sometimes, the brain’s attempt to stave off the seizure succeeds, and we call that a dissociative seizure in blissful and I think increasingly willful ignorance that an actual seizure might well have taken place if the brain had not deployed the stratagem of a dissociative seizure in a timely fashion.

  3. JIMM – that is very interesting. I have had several NES and I’ve always maintained that they feel adaptive. I don’t understand why they call them dissociative because I don’t dissociate – I am entirely present, not afraid of anything. It does feel like my brain is going into rest mode, like a computer that’s overwhelmed. I just have to wait them out. But I saw a neuropsych for a couple of years who was a CBT fanatic and insisted I expose myself to noise, crowds etc that I find overwhelming. My seizures got worse so I stopped seeing him, and decided to listen to my brain instead. I haven’t had a seizure since.

  4. It was the plasmid makers what done it …

    “Laboratory-made plasmids, a workhorse of modern biology, have problems. Researchers performed a systematic assessment of the circular DNA structures by analysing more than 2,500 plasmids produced in labs and sent to a company that provides services such as packaging the structures inside viruses so they can be used as gene therapies. The team found that nearly half of the plasmids had design flaws, including errors in sequences crucial to expressing a therapeutic gene.”

    Does “anythink” work anymore?

  5. It would seem to me that non-epileptic seizures have externally imposed recovery criteria that can’t just be defined away by moving the goalposts, although with a couple of caveats:

    a) In most legal jurisdictions, actual epilepsy, where not kept under control by drugs or whatever, makes you legally ineligible to drive a car. (Taking anti-epileptic medication and not having had a seizure for X months counts as recovered.) Jurisdictions seem to take different views on non-epileptic seizures.

    b) Many people with non-epileptic seizures say they feel ok to drive a car, because they have sufficient warning of seizure onset.

    But with those caveats: will the driver licensing agency reinstate the patient’s driving license? If no, then they are clearly not recovered, at least not in the eyes of vehicle licensing.

    “Sure, I’m still having seizures, but I’m not too bothered by them,” which seems to be the new CBT idea of “recovery,” typically isn’t going to cut it for getting your driver’s license back.

  6. The clear problem with calling them *psychogenic* non-epileptic seizures is that it puts a postulated cause in the name of the diagnosis without establishing that that is what causes them. The DSM, for example, tends to avoid hypothesizing causes in the names of conditions.

    “Non-epileptic” is more acceptable, as it’s a diagnosis of exclusion … demonstrating that a seizure wasn’t epilepsy may be easier than identifying an unknown cause.
