Collective Action Problems and Comparative-Effectiveness Research: The Case Study of Evidence-Based Rheumatology

The footnotes fell out in copy/paste, so here’s a document with them intact: K.Sack_Problems in Evidence-Based Rheumatology_09-09-09.

Comparative-effectiveness research (CER) has recently been bolstered by over a billion dollars in federal funding under the American Recovery and Reinvestment Act, becoming a household name in the ensuing media blitz. In a series of New England Journal of Medicine articles, leading experts articulated what they perceive as the primary challenge of CER: it must generate and promote increased synthesis, production, and use of information, while personalizing medical decisions instead of anointing some faceless, ration-happy HMO beast.

In this early Obama Administration spotlight, the most foundational challenge of CER has gone unarticulated: how problems are defined presupposes the realm of their possible solutions, and illnesses with treatments examined by CER are no exception. Existing definitions of some problems mask collective action problems that hinder efficient public good production in the learning and doing components of healthcare that CER examines.

Rheumatology presents an optimal case study of this phenomenon. Autoimmune diseases in the aggregate are a top-ten killer of American women, yet organized groups of actors involved in the learning and doing components of evidence-based rheumatology – researchers, clinicians, patient advocacy groups, and law-makers – do not typically conceive of the diseases as a whole. Autoimmune diseases are frequently concurrent and broadly genetic (so that an individual with one autoimmune disorder is significantly more likely to have a sibling or child with a similar or different autoimmune disorder), manifest overlapping symptoms, and exhibit similar types of disease progression. Yet they are nearly always studied, treated, and organized around (in terms of patient/family support, education, and advocacy) as if their patients’ interests were completely separate by disease or disease subtype.

This peculiar partitioning – a sort of unintentional divide and conquer of a natural class, akin to the division Offe and Wiesenthal famously describe between blue- and white-collar wage laborers – is manifest in the vast majority of rheumatology clinical trial research designs. Rather than utilizing appropriate matching techniques, researchers systematically close relevant patients out of treatment and control groups. This practice cripples the generalizability of results, severely and unnecessarily limiting the public good (knowledge) produced by trials.

A prime example of this designing-down is the use of recursive partitioning to limit many lupus-related clinical trials to the estimated 40-60% of lupus patients with nephritis (a classification but not diagnostic criterion for lupus under American College of Rheumatology guidelines). This closes out patients with less severe disease only in a very limited respect, as other common facets of lupus (such as autoimmune vasculitis and the attendant accelerated atherosclerosis, or coagulation problems from thrombocytopenia to hypercoagulation tendencies) can be equally disabling and life-threatening. Clinical trial inclusion criteria for lupus patients on the basis of disease severity rather than ACR classification criteria would thus look very different from the existing norm. Rather than being geared toward patients with the most severe disease type, most lupus trials are geared toward patients with the particular disease subtype of lupus nephritis. Results are then assumed to be generalizable to the rest of lupus patients (which, if true, contradicts the logic underlying excluding these patients in the first place). Knowledge regarding the general lupus patient population is thus not produced during trials; it is produced ad hoc (if it can properly be understood to be produced at all) in the rest of lupus patients, in uncontrolled use. In fact, the use of recursive partitioning that impairs generalizability, combined with routine multiple-treatment interference from concurrent corticosteroid use unproven by randomized controlled trials (RCTs) and a focus on short-term results in a chronic and relapsing-remitting disease process, means that most of the data gleaned from extant lupus trials is incomplete at best, and empirically invalid at worst. Narrow disease subtyping, combined with other common research design flaws in lupus RCTs, causes suboptimal and inefficient production of medical knowledge as a public good, when it can be said to produce knowledge at all.

Another example is the classification of patient-participants as having Graves’ disease, Hashimoto’s disease, or autoimmune thyroiditis – autoimmune thyroid disease (AITD) classifications that exist in a fluid, unpredictable triangle of disease progression, remission, and transition. Graves’ can become Hashimoto’s (when the autoimmune attack burns out the gland), which can also be considered autoimmune thyroiditis depending on its presentation (progressive or relapsing-remitting) – the latter of which can also include hyperthyroid autoimmune episodes that would generally be called Graves’. Antibody titers vary in AITD: there is no binary diagnostic. In some cases, diagnosis is easily made on sight (on the basis of clinical presentation); in other cases, these diagnostic distinctions are more art than science. Thus, treatments with an action appropriate to both hyperthyroid and hypothyroid equilibrium points in diseases that can easily progress to either point (or to temporary remission), most likely by selectively suppressing the immune system that mistakenly attacks the gland in the first place, should be tested on patients with all AITD subtypes.

Instead, the literature contains little randomized controlled experimentation focusing on any of these diseases, and existing outcome comparisons tend to limit themselves to one AITD subtype. The definition of the problem presumes the bounds of the possible set of solutions, such that needed information is not being produced and the information that is produced is not optimally useful. When closely related disease subtypes are separated in research as a norm, the production of medical knowledge is needlessly limited along with the definition of the problem.

Autoimmune gastritis as distinguished from pernicious anemia is yet another such definitional distinction used or assumed in research that unnecessarily limits knowledge production. Both conditions are characterized by anti-parietal antibodies with resultant hindered nutrient absorption and increased risks of nausea, indigestion, anemia, and stomach and oral cancers. Typically, autoimmune gastritis (a chronic disease state sometimes associated with autoimmune thyroid disease) is considered a precursor to pernicious anemia (an acute disease state). Yet information regarding how, when, and why the former (sometimes) progresses to the latter remains to be produced, because the two are seldom related in research designs. Again, if the focus of trials is learning about and tinkering with an underlying disease mechanism, there is simply no known, functional reason to delimit these diseases in research, limiting their participant group and potential generalizability without realizing a gain (and in this case clearly taking a loss) in quality or quantity of information produced. The public good of knowledge is not optimally produced by medical professionals for patients, because the subgroup at issue is not defined as a subgroup and does not organize as one. The definitional problem masks the collective action one, which hinders knowledge production at the learning stage (research design, data generation, analysis, and synthesis).

Concurrency in autoimmune phenomena further highlights the problematic nature of systematically enforcing these somewhat nebulous and arbitrary distinctions among autoimmune diseases in clinical research trials. Over a third of lupus patients also suffer from pernicious anemia, autoimmune gastritis, AITD, or another distinct autoimmune disorder. Arthritis, fibromyalgia (related to arthralgia but generally considered a more diffuse body pain disorder), migraine, and Raynaud’s (a vasoconstrictive anomaly which shares an underlying mechanism with migraine) can be symptoms of several autoimmune diseases, but can also be considered distinct conditions – and treated as such in clinical trials. Such research designs fail to recognize that unit heterogeneity in an RCT with matching and other relatively simple methods is not a threat to the validity of the data generated. Rather, such heterogeneity broadens the scope conditions and overall breadth of the resultant information.
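To make the point about "relatively simple methods" concrete, here is a minimal sketch of one such method: stratified permuted-block randomization, in which a heterogeneous pool of patients is randomized separately within each diagnosis so that treatment and control arms stay balanced across disease types. This is stratification rather than matching proper, and it is offered only as an illustration. The function name, the (patient_id, diagnosis) data layout, the block size, and the example patients are all hypothetical and do not come from any actual trial protocol.

```python
import random
from collections import defaultdict


def stratified_block_randomize(patients, block_size=4, seed=0):
    """Assign each patient to 'treatment' or 'control' within a diagnosis stratum.

    patients: list of (patient_id, diagnosis) tuples (hypothetical layout).
    Within each stratum, arms are dealt out in shuffled blocks that are half
    treatment and half control, so every full block is exactly balanced.
    """
    if block_size % 2 != 0:
        raise ValueError("block_size must be even")
    rng = random.Random(seed)  # fixed seed so the illustration is reproducible

    # Group patient ids by diagnosis (the stratifying variable).
    by_stratum = defaultdict(list)
    for patient_id, diagnosis in patients:
        by_stratum[diagnosis].append(patient_id)

    assignment = {}
    for diagnosis in sorted(by_stratum):
        ids = by_stratum[diagnosis]
        for start in range(0, len(ids), block_size):
            # One balanced block: equal numbers of each arm, in random order.
            block = ["treatment", "control"] * (block_size // 2)
            rng.shuffle(block)
            for patient_id, arm in zip(ids[start:start + block_size], block):
                assignment[patient_id] = arm
    return assignment


# Hypothetical mixed-diagnosis pool: 8 lupus, 4 AITD, 4 MS patients.
example_patients = (
    [(f"L{i}", "lupus") for i in range(8)]
    + [(f"T{i}", "AITD") for i in range(4)]
    + [(f"M{i}", "MS") for i in range(4)]
)
arms = stratified_block_randomize(example_patients)
```

Because every stratum in this example fills whole blocks, each diagnosis ends up split evenly between arms. A pooled trial built this way could still estimate effects within each disease type while also supporting comparisons across types.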

Given substantial overlap in terms of antibody and clinical presentations, genetic risk, and concurrency, it is puzzling that autoimmune diseases are not typically grouped together to test more diverse groups of rheumatology patients in clinical trials. In several of the multi-system, unpredictable, complex autoimmune diseases that frequently take several years and multiple physicians to diagnose on a case-by-case basis, antibodies often shift and evolve during the course of the illness. While there are combinations of potentially definitive lab and clinical results, there are no silver-bullet diagnostics for multiple sclerosis, lupus, mixed connective tissue disease (now generally considered a subtype of lupus), Graves’, scleroderma/CREST, and some other autoimmune diseases. Given that the same classes of drugs (and indeed the exact same drugs, in dosages and combinations that vary between and within cases as much as they do between and within autoimmune disease types and subtypes) are often prescribed for all of these diseases, it is inefficient to design clinical trials that pretend these phenomena are completely distinct in known ways.

The tragic inefficiency of the current system of rheumatology subgroup partitioning in clinical trials is that it causes new information on commonalities and differences between autoimmune diseases to be generated at a much lower rate than might otherwise be the case. The recent case of research into the use of the immunosuppressant CellCept (mycophenolate mofetil, MMF) in various autoimmune diseases is illustrative. One or two well-designed trials with samplings of participants with several autoimmune disorders and appropriate matching techniques could have shown (much more quickly and cheaply) what we now know from a large and growing body of piecemeal (disease- and disease subtype-specific) trials and case reports: the drug is relatively effective and well-tolerated across several diseases that share some broadly common underlying mechanism of harmful immune hyperactivity, including Graves’ ophthalmopathy, autoimmune thyroiditis, uveitis, lupus nephritis, persistent lupus myelitis (potentially indistinguishable from MS), neuropsychiatric lupus, MS, systemic sclerosis, CNS sarcoidosis, renal limited sarcoidosis, sarcoidosis, myasthenia gravis, treatment-resistant HIV (which is frequently related to immune hyperactivity), and other, more general autoimmune phenomena including Raynaud’s in a transplant patient and non-specific autoimmune vasculitis.

It is not hyperbole to observe, based on accepted mortality and morbidity rates, that thousands died waiting for new medical knowledge that was slower coming because rheumatology research designers lacked the appropriate working conceptualization of autoimmune diseases as a subfield (rheumatology) sharing a common mechanism (immune hyperactivity) treated by the general mechanism (immunosuppression) of the drug at issue (MMF). Not only do economies of scale apply in obvious ways (administering clinical trials from design to funding, reporting, etc.), but knowing that the same medication works for multiple autoimmune diseases is another piece of information this combination of many trials and ad hoc cases strongly suggests – invaluable information which could have been generated far more clearly and efficiently by one hypothetical rheumatology-wide trial that included patients with several autoimmune disease types.

Exceedingly narrow disease and patient subgroup definition is fascinating on a case-by-case basis, but it hinders researchers and clinicians – medical knowledge producers – who routinely abuse these distinctions to structure clinical trials for autoimmune disease patients according to tiny subgroups rather than larger samplings across an aggregate. This abuse, in addition to the abuse of short-term proxies for long-term results, and multiple-treatment interference due to the rampant use of long-standard treatments that are unproven by RCTs, reflects a collective action problem on the part of the producers, because knowledge as a public good is one of their goals too, and their individual, uncoordinated actions work against this collective interest.

A similar problem hurts rheumatology research designs from the opposite direction: medical care consumers (i.e., patients) also engage in individual, uncoordinated actions that frustrate their collective interests. For consumers, the primary problem is also one of clinical trial structure predetermining the possible bounds of the solution set, but in the distinct terms of outcomes of interest. Outcomes of interest to healthcare producers (researchers and clinicians) are as an aggregate different from outcomes of interest to individual healthcare consumers (patients and patient advocates), but producers as a class structure trials within which the consent of consumers is binary at any given time. There certainly is a conception of patients having the right and the ability (on a spectrum of abilities) to engage in informed consent with regard to what the outcomes of interest are in clinical trials: remission versus symptom management, prioritization of symptoms to be managed, and what constitute acceptable side effects.

However, this construction of consent is always informed by the deeply culturally ingrained hierarchy of doctor over patient. This asymmetry is reinforced by the very high costs borne by consumers (patients and their families/advocates), who operate as unorganized individuals with incomplete information and must become fluent in the relevant medical literature while ill. The disparity is also caused in part by other variables bearing on individual patients’ agency: increased disease severity decreases patients’ ability to resist or contest producer preferences, as do lower education level, class, race, gender, and age. Increased disease severity also decreases patients’ ability to opt out of being patients (to decline medical interventions with serious side effects or other costs, and to refrain from fashioning a disease-related socio-political identity for organizing purposes). Decreased disease severity increases those same abilities.

The iterative effects of these tendencies present a not-so-special case of Albert Hirschman’s familiar exit-voice framework: the more patients opt out of defining themselves as patients, the fewer patients organize to demand change to the status quo. Thus autoimmune disease patients with low exit costs (i.e., less severe disease course or type) organize less: there are more MS and lupus patient advocacy groups than there are Graves’ or AITD groups, although AITD is a far more common set of conditions. This exiting weakens autoimmune patient advocacy efforts as a whole, resulting in under-production of the public good of evidence-based rheumatology.

Collective action problems also hinder the efficient use of existing information in rheumatology. Individual doctors prioritize their individual patients’ health on an ad hoc, case-by-case basis. From this perspective, “evidence” in evidence-based medicine comes from the results of aggregates of cases in which patients agree to treatments, goals, and side effects as defined by researchers. It ignores outliers – which matter 100% if one of their patients is one – and it uses short-term proxies to extrapolate unknown long-term results with uncertain generalizability to any given patient.

There are very real and serious problems with some of the methods, concepts, and research designs common in rheumatology trials. Yet when ad hoc data is privileged over the creation and use of knowledge as a public good, doctors as a group are less capable of serving the interests of their patients as a group. Their micro actions have undesirable macro effects: less information is produced and the available data is not predictably applied, so rheumatology is not evidence-based.

This irrational status quo is inertial in part due to the lack of a central rheumatology professional organization with the authority to enforce recommendations. Patients and patient advocates/groups cannot generally enforce standards in autoimmune diseases, because the diseases are so complex, incompletely understood, and unpredictable; and because hierarchy reigns, and doctors are in charge. Even if they were able to completely dictate doctor behavior, there are so few completed, useful evidence-based rheumatology trials in existence that it is unclear what the rational market demand should be (other than the demand for more and better evidence-based rheumatology).

Insurers like Medicare, which still utilizes the vastly outdated ICD-9-CM for rheumatology and other diagnostic coding, are in a similar position to patients in that the transaction costs of obtaining, verifying, and synthesizing full information on the current state of evidence-based rheumatology are so large that on the whole, they satisfice. Individual doctors run the autoimmune disease treatment show, and their micro actions (doing the best they can with every case in isolation) hurt doctor and patient macro interests (systematically gathering more evidence and doing more evidence-based rheumatology).

In a prime example combining all these “doing” problems, the 2007 guidelines for assessing immunocompetency (i.e., screening, monitoring, and reporting infections before and during treatment) in clinical trials for autoimmune diseases failed, because no one enforced them: patient groups did not know to advocate for their mainstream adoption, no governmental body was charged with enforcing them, and no insurance company decided to deny coverage for treatments utilized without explicit compliance with the guidelines. Moreover, front-line rheumatology treatments include anti-inflammatories and immunosuppressants that are either off-label for particular autoimmune diseases, or are simply unproven with RCTs (like prednisone); the scope conditions of the guidelines ignored the real need for immunocompetency guidelines in rheumatology trials and in clinical practice. The guidelines were written by and for a committee at the Autoimmunity Centers of Excellence (ACE), which was created by the National Institute of Allergy and Infectious Diseases and is cosponsored by the National Institute of Diabetes and Digestive and Kidney Diseases and the NIH Office of Research on Women’s Health. While these associations sound impressive, ACE entirely lacks any manner of enforcement teeth, and the committee that wrote the guidelines within ACE, although its members are individually prominent in their field, was self-constituted.

As national healthcare reform receives more funding and evidence-based medical researchers continue to learn more about what works and what does not, it is more important than ever to ask how problems are being defined before we try to solve them. Autoimmune diseases in the aggregate are a top-ten killer of American women; yet despite significant overlap in their known or theorized etiologies, symptoms, types of progression and concurrencies, and other features, they are not generally conceptualized for purposes of research, education, and patient advocacy as a class. This case study of collective action problems in evidence-based rheumatology illustrates one general instance of a serious framing threat to the efficiency and potentially to the validity of comparative-effectiveness research as an au courant tool for collective and individual good.