
THE SAN MIGUEL (2008) TRILOGY

by Semmel Weis

     First published 05 Jul 2017      Last edited 30 Jul 2017 11:00pm

HISTORICAL PROGRESS IN MULTIPLE MYELOMA THERAPY:
CAN WE BE SURE WE'VE SEEN ANY?

The one dependable regularity that jumps out at us from the Ozaki (2015) Kaplan-Meier graphs below is that Overall Survival of multiple myeloma patients shows historical improvement.  That is, each of sections a to d offers a comparison of Overall Survival during 1990-2000 with Overall Survival during 2001-2012, and the curves for the latter period are always higher, indicating longer survival during the more recent period.

More precisely, in section a, the comparison is made within a single graph, in which the 2001-2012 red curve can be seen to be higher than the 1990-2000 black curve.  The same conclusion holds in section b, though now there are two graphs to contend with, the leftmost covering 1990-2000 and the rightmost covering 2001-2012: the blue curve in the right graph, say, is higher than the blue curve in the left graph, and the same holds within each pair of curves for the remaining three colors.  The same holds for the pair of graphs at c, and again at d.

The Ozaki graphs, however, tell us nothing about the cause of that evident historical improvement.

Although our first impulse may be to assume that the improved survival is caused by improved treatment, it might instead be the case that an appearance of improved survival is caused by earlier diagnosis of the disease.  That is, the farther back we go in history, the more advanced a patient's multiple myeloma had to be before it was diagnosed, so that the typical multiple myeloma patient a few decades ago might have been a stage III sufferer with little time left to live, which is to say, with only a short post-diagnosis life expectancy.  Today, in contrast, there exist many new tests for early indicators of multiple myeloma, so that the typical multiple myeloma patient of today may be a stage I sufferer with many years of life left, and therefore typified by a much longer post-diagnosis longevity.

And of course the recognition that two variables may affect post-diagnosis survival invokes no obligation to believe that only one can be at work at a time.  Rather, it may be the case that both variables contribute to post-diagnosis longevity, and of course may contribute unequally.  That is, it may be the case that modern therapy does extend life, while at the same time earlier diagnosis makes its contribution by also extending post-diagnosis longevity, and probably with one or the other making a greater contribution.

Furthermore, it may be the case that the two variables work against each other, as for example by contemporary chemotherapy shortening life by one year while early detection expands post-diagnosis longevity by three years, whose net effect would be a post-diagnosis longevity gain of two years today compared to several decades ago.
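The arithmetic of the last two paragraphs can be sketched in a few lines.  Everything here is a hypothetical illustration: the six-year disease course and the diagnosis delays are invented numbers, and only the "one year lost to chemotherapy, three years gained from early detection" figures come from the example above.

```python
def post_diagnosis_survival(years_onset_to_death, years_onset_to_diagnosis):
    """Years lived after diagnosis, given a disease course measured from onset."""
    return years_onset_to_death - years_onset_to_diagnosis

# Hypothetical patient: the untreated disease runs 6 years from onset to death.
natural_course = 6.0

# Several decades ago: diagnosed late (5 years after onset), treatment assumed to do nothing.
past = post_diagnosis_survival(natural_course, 5.0)

# Today: diagnosed 3 years earlier (2 years after onset), while treatment
# is assumed to SHORTEN life by 1 year.
treatment_effect = -1.0
recent = post_diagnosis_survival(natural_course + treatment_effect, 2.0)

gain = recent - past
print(past, recent, gain)  # 1.0 3.0 2.0
```

In this toy case the recent patient dies a year sooner in absolute terms, yet shows two additional years of post-diagnosis survival.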

In short, despite initial impressions to the contrary, the Ozaki (2015) graphs do not rule out the possibility that contemporary treatment of multiple myeloma is ineffective, and do not rule out the possibility that contemporary treatment actually shortens life.  More on these possibilities further below.


Ozaki et al (2015)       Months=160

Ozaki (2015) Overall Survival graphs extending to Months=160

Figure 1.  Overall Survival according to the time periods (a), age groups (b), initial therapies (c), and best responses (d) in comparison of 1990–2000 vs 2001–2012.      Ozaki et al, Trends of survival in patients with multiple myeloma in Japan:  a multicenter retrospective collaborative study of the Japanese Society of Myeloma, Blood Cancer Journal, Published online 18 Sep 2015.      BLOOD

Two other features in the Ozaki (2015) display might be noticed now because they will be discussed below.

One feature is the Survival Domain, which is the "Months after initial therapy" during which the Ozaki patients were followed, extending in each graph to an admirable 160 months.

The other feature, the Terminal Plunge, is the occasional curve showing initially-high Overall Survival (the curve floats above other curves), but then suddenly plunges all the way to zero — as exhibited to best advantage by the green curve opposite the giant asterisk (added here).  A single curve can be seen plunging to zero in each of the d graphs, though from a decreasing height.  Three curves in the first of the b graphs reach zero, though over such a small distance as to not deserve to be called a plunge.

COMPENSATING FOR THE MEAGER SURVIVAL DOMAIN
IN SAN MIGUEL (2008)

The Ozaki (2015) graphs above, extending as they do to Months=160 (13.3 years), dwarf San Miguel's (2008) reaching only Months=30 (2.5 years), as can be seen in the graph immediately below.  Chemotherapy for multiple myeloma is often criticized for accomplishing no more than weakening disease indicators in the near term, but ultimately failing to extend survival (a failure that would have become evident — goes the criticism — had the research been permitted to probe farther into the future).


No Association Between Response Rates and Survival in Newly Diagnosed Multiple Myeloma

14 Feb 2017     By Leah Lawrence     CANCERNETWORK

There was no association between conventional response outcomes, such as complete response (CR) or very good partial response (VGPR), and survival in patients with newly diagnosed multiple myeloma, according to the results of a meta-regression analysis published recently in the European Journal of Hematology.  [...]

“The lack of association between response rates and hard clinical outcomes raises concerns about fast track approval of new drugs, based on results from trials utilizing solely these surrogate markers,” the researchers wrote.


In medical research, a "response" is an indication of improved health.  A "surrogate marker" is an easy-to-obtain measurement which is thought to correlate with a measurement that is more important, but more difficult to obtain, in this case long-term survival.

And so San Miguel (2008), by cutting off its Overall-Survival graph at Months=30, not only fails to address the above concern but deepens it:


San Miguel et al (2008)       Months=30

SAN MIGUEL (2008) Overall-Survival Figure 1 Panel B Overall Survival graph

  Figure 1.   Kaplan–Meier Curves for Overall Survival.


San Miguel et al, Bortezomib plus Melphalan and Prednisone for Initial Treatment of Multiple Myeloma, New England Journal of Medicine, 2008, 359, 906-917.      ClinicalTrials.gov number NCT00111319      San Miguel (2008)

The Johnson&Johnson answer to this San Miguel (2008) inadequacy has been to conduct two of what it calls "updates" or "updated analyses" or "profound follow-ups" and similar expressions.  These have produced the two Overall-Survival graphs below, which sport ever-expanding domains: in Mateos (2010) the graph reaches Months=51, and in San Miguel (2013) it reaches Months=78.

What San Miguel (2008) claims to prove is that adding the chemotherapy drug Bortezomib to Melphalan and Prednisone will bring health benefits to multiple myeloma patients, the most important among them being longer survival.  The Control Group, then, can be referred to as the Melphalan plus Prednisone Group, or MP Group, and the Bortezomib Group can be called the BMP Group.  Since the only difference between the two groups is presumed to be Bortezomib, if the Bortezomib Group ends up with less multiple myeloma, and especially if it lives longer, then it can be concluded that it was the Bortezomib that produced these health benefits.

VELCADE being the Johnson&Johnson trade name for Bortezomib explains why Mateos (2010) preferred to call the Bortezomib Group the VMP Group, and why San Miguel (2013) preferred to call it the VcMP Group.  In other words, BMP, VMP, and VcMP refer to exactly the same group, the one that included Bortezomib in its chemotherapy cocktail.

Mateos et al (2010)       Months=51

Mateos (2010) Kaplan-Meier Overall Survival graph

Mateos et al, Bortezomib Plus Melphalan and Prednisone Compared With Melphalan and Prednisone in Previously Untreated Multiple Myeloma: Updated Follow-Up and Impact of Subsequent Therapy in the Phase III VISTA Trial, Journal of Clinical Oncology, 2010, 28, 2259-2266.     Mateos (2010)


San Miguel et al (2013)       Months=78

J&J VISTA graph which discloses the meaning of Number Of Patients At Risk

San Miguel et al, Persistent Overall Survival Benefit and No Increased Risk of Second Malignancies With Bortezomib-Melphalan-Prednisone Versus Melphalan-Prednisone in Patients With Previously Untreated Multiple Myeloma, Journal of Clinical Oncology, 2013, 31, 448-455.     San Miguel (2013)

The Overall-Survival graph above differs slightly between different reports of the San Miguel (2013) data.  The version shown happens not to be the one in the Journal of Clinical Oncology, but it may nevertheless be the most widely disseminated version, being found for example in  FDA  and  NCCN  and  USDRUGBASE  (on these three sometimes very long pages, searching for "Figure 2" will take you directly to the above graph).  This version is distinguished in two ways:  (1) it acknowledges that "patients at risk" really means "patients remaining" (see the asterisked footnote immediately above), and (2) it acknowledges that the right-most column of Patients-Remaining counts should consist not of blanks but of zeroes.  Neither of these distinctions was present in the two earlier graphs in the above Johnson&Johnson Bortezomib-promoting trilogy.

The Patients-Remaining counts in the table below are taken from the three Johnson&Johnson, Bortezomib-promoting, Kaplan-Meier graphs immediately above.  Because San Miguel (2013) presents Patients-Remaining at six-month intervals, it is only the corresponding six-month-interval counts from San Miguel (2008) and Mateos (2010) that are reproduced.

Comparing the numbers within any yellow column serves to confirm that Mateos (2010) and San Miguel (2013) present no data other than from the 344 Bortezomib and 338 Control patients who had been originally introduced by San Miguel (2008):

PATIENTS REMAINING IN THREE GRAPHS OF OVERALL SURVIVAL OF 344+338 = 682 PATIENTS

                   Months =       0    6   12   18   24   30   36   42   48   54   60   66   72   78
San Miguel (2008)  Bortezomib   344  300  235  115   36    0
                   Control      338  301  220  116   29    0
Mateos (2010)      Bortezomib   344  300  288  270  246  221  124   54    1    0
                   Control      338  301  262  240  216  185  103   41    3    0
San Miguel (2013)  Bortezomib   344  300  288  270  246  232  216  199  176  158   78   34    1    0
                   Control      338  301  262  240  216  196  168  153  133  112   61   24    3    0
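The comparison across reports can also be done mechanically.  The following is a minimal sketch that copies the counts from the table above and lists the months at which all three reports give the same Patients-Remaining count; the variable and function names are ours, chosen for illustration.

```python
MONTHS = list(range(0, 84, 6))  # 0, 6, ..., 78

# Patients-Remaining counts transcribed from the table above.
remaining = {
    "San Miguel (2008)": {
        "Bortezomib": [344, 300, 235, 115, 36, 0],
        "Control":    [338, 301, 220, 116, 29, 0],
    },
    "Mateos (2010)": {
        "Bortezomib": [344, 300, 288, 270, 246, 221, 124, 54, 1, 0],
        "Control":    [338, 301, 262, 240, 216, 185, 103, 41, 3, 0],
    },
    "San Miguel (2013)": {
        "Bortezomib": [344, 300, 288, 270, 246, 232, 216, 199, 176, 158, 78, 34, 1, 0],
        "Control":    [338, 301, 262, 240, 216, 196, 168, 153, 133, 112, 61, 24, 3, 0],
    },
}

def months_where_all_agree(group):
    """Months covered by every report at which all reports give the same count."""
    rows = [study[group] for study in remaining.values()]
    shared = min(len(row) for row in rows)  # only compare months all reports cover
    return [MONTHS[i] for i in range(shared) if len({row[i] for row in rows}) == 1]

print(months_where_all_agree("Bortezomib"))  # [0, 6]
print(months_where_all_agree("Control"))     # [0, 6]
```

The starting counts of 344 and 338 recur in every report, which is what identifies the three graphs as describing the same 682 patients.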

Whether or not the above repeated expansion of the survival domain for the 344+338 = 682 patients in San Miguel (2008) can be justified is a question that will be addressed below under the heading, "How Do The Two Groups Really Differ?"

THE PHENOMENON OF THE TERMINAL PLUNGE

We have already noted the existence of a particularly-striking Terminal Plunge in the green curve opposite the giant asterisk in the Ozaki (2015) Kaplan-Meier graphs above.  The significance of the Terminal Plunge begins to be appreciated in the comment by Jim Worthey who reconstructed one such Terminal Plunge from what he had seen at an American Society of Clinical Oncology (ASCO) presentation (graph on the left below), and to which he added three question marks to draw attention to the exact location of that puzzling Terminal Plunge, and to which he also added the sort of grim-reaper imagery that such a graph is capable of conjuring up (look for "grim reaper" farther down on the left below).

[A]t day 589, the probability of survival drops abruptly to zero.  It looks like all the patients dropped dead on that day.    www.jimworthey.com/~ 

From a corporate point of view, such a grim-reaper graph is anathema because it shakes investor confidence:

Wall Street not Impressed:  The poster was presented on Saturday, June 2, and on Monday June 4, GenVec followed up with a news release and made the poster available on their web site.  Later on that Monday, the stock took a big drop.  Company management felt that analysts and investors were misreading the data, and issued a further news release on June 5, announcing a conference call on June 6.  The stock price has not recovered (as of June 15).    www.jimworthey.com/~ 

If The Terminal Plunge Is Not Erased (left)
Jim Worthey Overall Survival curve with Terminal Plunge

If The Terminal Plunge Is Erased (right)
Jim Worthey Overall Survival curve with Terminal Plunge removed

Images Which The Corresponding Graphs Above
Might Evoke In The Patient's/Physician's/Investor's Mind


The Grim Reaper mowing down cholera victims In short:  This pharmaceutical promises 70% survival into the indefinite future.

Jesus healing the sick

The column headings above ("If The Terminal Plunge" followed by either "Is Not Erased" or "Is Erased") serve to suggest how easily the researcher is able to remove an embarrassing and product-disgracing Terminal Plunge, the removal requiring only the placement of the study-cutoff date before the death of the longest-enrolled patient.

To consider a specific situation: imagine a San Miguel (2008) researcher at the bedside of the longest-enrolled patient in the Bortezomib Group, a patient who is faring poorly and threatening to die.  If the researcher waits for that patient to die, and then cuts off the clinical trial (meaning stops the trial and abandons further collection of data), he will be stuck with the Wall-Street-disappointing curve on the left below, Terminal-Plunging wart and all.  If, on the other hand, the researcher cuts off the clinical trial before that longest-lasting patient dies, the researcher is rewarded with the Wall-Street-pleasing curve on the right below, his longest-enrolled patient having become forever categorized as shed and not dead, and in consequence the Terminal-Plunging wart becoming erased.  That's how a Kaplan-Meier curve works, which may be one more reason to shun it, in addition to the reasons discussed in KAPLAN-MEIER.
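The mechanism can be checked with a bare-bones Kaplan-Meier calculation.  What follows is a minimal sketch on ten invented patients, not data from any trial: three deaths early on, six patients censored along the way, and one longest-enrolled patient whose fate at Months=30 is either recorded as a death or cut off just beforehand.  As a simplification, the other nine records are kept identical in both scenarios.

```python
def kaplan_meier(times, died):
    """Return [(time, survival)] at each time where at least one death occurs.

    times: follow-up duration per patient.
    died:  True if death was observed, False if the patient was censored
           (still alive when last seen, e.g. at study cutoff).
    """
    records = sorted(zip(times, died))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        deaths = leaving = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(records) and records[i][0] == t:
            deaths += int(records[i][1])
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve

# Hypothetical follow-up times in months for ten patients.
times = [4, 9, 14, 16, 18, 20, 22, 24, 26, 30]

# Scenario 1: the researcher waits, and the longest-enrolled patient dies at 30.
died_if_waiting = [True, True, True] + [False] * 6 + [True]
# Scenario 2: cutoff comes first, so the same patient is censored, not dead.
died_if_cut_early = [True, True, True] + [False] * 7

plunge = kaplan_meier(times, died_if_waiting)
flat = kaplan_meier(times, died_if_cut_early)

print(f"final survival if waiting:   {plunge[-1][1]:.2f}")  # 0.00
print(f"final survival if cut early: {flat[-1][1]:.2f}")    # 0.70
```

With only one patient left at risk, a single death multiplies the curve by zero, while censoring that same patient leaves the curve at 70% with no further step.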

An unexpected cutoff in the course of a clinical trial appears to raise no eyebrows when a justification can be offered, as for example the justification offered by San Miguel (2008) that an interim analysis had shown that the data had already proven the superiority of the Bortezomib Group, so nothing was to be gained by continuing the study.  However, such a superiority of one group over another may prevail for months, perhaps even years, during any given clinical trial, which gives the researcher prolonged discretion as to when he invokes said superiority to justify cutoff.  Pushing the researcher in the direction of not cutting off early is the need to show Bortezomib benefit extending into the distant future.  On the other hand, pushing the researcher in the direction of cutting off early is the need to avoid a Terminal Plunge.  The San Miguel (2008) Survival Domain extending to only Months=30 might be explained by an imminent Terminal Plunge needing to be preempted.

Here is the San Miguel (2008) announcement of that trial's cutoff, the superiority of Bortezomib having been demonstrated according to the outcome of "time to progression", meaning "time to disease indicators worsening", which worsening is widely reputed to be the ultimate fate of multiple myeloma patients:

On the basis of the third analysis (with a data cutoff of June 15, 2007), the data and safety monitoring committee recommended that the study be stopped, since the prespecified statistical boundary (an alpha level of 0.0108) for the primary end point of time to progression had been crossed (hazard ratio in the Bortezomib group, 0.54; P<0.001).    San Miguel (2008) pp. 908-909

Therefore, whenever we see Terminal-Plunge-uncontaminated curves like the two Bortezomib-flattering ones on the right below from San Miguel (2008), or in the Overall-Survival graphs from Mateos (2010) and San Miguel (2013) that we have seen above, we will now be able to appreciate the good luck with which Johnson&Johnson was blessed by its cutoff happening to have taken place while their longest-lasting Bortezomib patient was still alive, which thereby avoided stigmatizing their clinical trial with a pair of curves as unwelcome and as stock-price-crashing as those on the left below:

IMAGINARY and UNFLATTERING SAN MIGUEL (2008) KAPLAN-MEIER GRAPH (left)
necessitated by research cutoff following the death of the longest-enrolled patient in the Bortezomib Group
San Miguel (2008) Kaplan-Meier graph imagining a possible Terminal Plunge

PUBLISHED and FLATTERING SAN MIGUEL (2008) KAPLAN-MEIER GRAPH (right)
necessitated by research cutoff preceding the death of the longest-enrolled patients in both groups
SAN MIGUEL (2008) Figure 1 Panel B

It is possible that two details visible in the Published-and-Flattering graph above support the hypothesis that the Bortezomib Curve had been cut off to avoid a Terminal Plunge.  The two details are: (1) the Patients-Remaining numbers underneath the graph show Bortezomib patients approaching zero faster than Control patients, and (2) the Bortezomib Curve stops earlier than the Control Curve.

HOW DO THE BORTEZOMIB AND CONTROL GROUPS REALLY DIFFER?

What randomization is supposed to accomplish in San Miguel (2008) is pre-treatment equality of the two groups of patients.  "Pre-treatment" means before any patients are treated with any drugs.  If the groups are equal before treatment, and if only one of the groups is then treated with a drug, then any later health differences between groups can be taken to have been caused by that drug.

The equality that is sought is equality in the average characteristics of the members of each group, as for example their average height and weight and age and gender and hemoglobin count and so on.  Whether the groups are equal in size does not matter, so that if for some reason one group had 80 patients and another group had 100 patients, this would be unobjectionable.

San Miguel (2008) went to impressive lengths to convince the reader that randomization has indeed succeeded in achieving pre-treatment equality of groups.  His Table 1 shows 41 rows of numbers comparing pre-treatment measurements made on Bortezomib and Control Groups, and which measurements do indeed appear to be very similar, as for example the "median Serum β2-microglobulin" level was 4.2 in the Bortezomib Group, and 4.3 in the Control Group and the "median Albumin" level was 3.3 in the Bortezomib Group, and also 3.3 in the Control Group, and so on.

And San Miguel (2008) meticulously laid out the drug protocol (the prescription which despite being tersely expressed is capable of being deciphered) dictating exactly how much of each drug each patient must take on every one of the days in his/her 54-week treatment:

They received nine 6-week cycles of Melphalan (at a dose of 9 mg per square meter of body-surface area) and Prednisone (at a dose of 60 mg per square meter) on days 1 to 4, alone or in combination with Bortezomib (at a dose of 1.3 mg per square meter), by intravenous bolus on days 1, 4, 8, 11, 22, 25, 29, and 32 during cycles 1 to 4 and on days 1, 8, 22, and 29 during cycles 5 to 9.  The planned 54-week treatment corresponded to the standard duration of Melphalan–Prednisone therapy.
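The protocol quoted above can be unpacked into a planned dose count and a planned cumulative Bortezomib dose.  This is a minimal sketch using only the figures in the quote; the variable names are ours, and the result describes what was intended, not what any patient actually received.

```python
CYCLE_WEEKS = 6
DOSE_MG_PER_M2 = 1.3  # Bortezomib dose per administration, per square meter

# Days on which Bortezomib is scheduled within a cycle, per the quoted protocol.
early_days = [1, 4, 8, 11, 22, 25, 29, 32]  # cycles 1 to 4
late_days = [1, 8, 22, 29]                  # cycles 5 to 9
early_cycles, late_cycles = 4, 5

total_cycles = early_cycles + late_cycles
planned_weeks = total_cycles * CYCLE_WEEKS
planned_doses = early_cycles * len(early_days) + late_cycles * len(late_days)
planned_total = planned_doses * DOSE_MG_PER_M2

print(total_cycles, planned_weeks, planned_doses)  # 9 54 52
print(f"{planned_total:.1f} mg per square meter")  # 67.6 mg per square meter
```

So a patient who completed the protocol as written would receive 52 Bortezomib injections over 54 weeks.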

The question "How Do The Bortezomib And Control Groups Really Differ?", then, seems really easy to answer — the Bortezomib Group really got Bortezomib (in the pre-specified dosages, and on the pre-specified dates, and over the pre-specified 54 weeks), and the Control Group really didn't get any Bortezomib.  What could be more obvious?

However, the seemingly obvious is not at all what happened.  The reality is that San Miguel only intended to give the 344 Bortezomib-Group patients their meticulously-specified 54-week BMP treatment, but he utterly failed, and for which reason we would be more accurate to describe these 344 patients not as the "Bortezomib Group" but as the "Intended Bortezomib Group".  And, similarly, San Miguel only intended to give the 338 Control-Group patients their meticulously-specified 54-week MP treatment which lacked Bortezomib, but he utterly failed to do that as well, and for which reason we would be more accurate to describe these 338 patients not as the "Control Group" but as the "Intended Control Group".

Let's take a look at how the intention failed.  As our chief interest at the moment is how much Bortezomib got doled out, we will be proceeding mainly down the left-hand column in the Patient Disposition diagram below.

RED DOT  

Allocated, N=344 and 338.  The Patient Disposition diagram starts off with the randomized patients that we have seen in all three of the Johnson&Johnson Overall-Survival graphs above, and that we will continue to encounter below: 344 patients in the Intended BORTEZOMIB Group and 338 patients in the Intended CONTROL Group.

San Miguel SUPPLEMENTARY APPENDIX


Supplementary Appendix to San Miguel (2008)
Bottom of the original diagram cropped out because irrelevant to Overall Survival.
Colored dots and headings added.
New England Journal of Medicine

Fewer Days Of Treatment

BLUE DOT  

Did not receive treatment, N=4.  It turns out that no Bortezomib was ever given to 4 of the patients in the Intended-Bortezomib Group, which is being regarded here as reducing the number of days of treatment to zero.  Let the enormity of this admission not be overlooked — of the 344 patients in the Intended-Bortezomib Group, 4 received no Bortezomib whatever, and in fact did not receive the other two chemicals intended to be administered alongside either, the Melphalan and the Prednisone, and yet these 4 patients remained card-carrying members of the Intended-Bortezomib Group, and their longevity data did contribute to the Bortezomib Curve in the Overall-Survival graph.  The effect of including these 4 was to weaken the seeming effect of Bortezomib, whatever that effect may have been:  if Bortezomib kills patients, here will have been 4 patients who were said to have taken Bortezomib but who did not die; if Bortezomib cures patients, here will have been 4 patients who were said to have taken Bortezomib but who were not cured.

What we have been given a first tiny peek of here is the Alice-in-Wonderland world of Intention-To-Treat (ITT) research, in which deviation from research protocol is swept under the rug, and the research published and trusted as if that deviation had not in fact invalidated its conclusions:


Under ITT, study participants are analyzed as members of the treatment group to which they were randomized regardless of their adherence to, or whether they received, the intended treatment.  For example, in a trial in which patients are randomized to receive either treatment A or treatment B, a patient may be randomized to receive treatment A but erroneously receive treatment B, or never receive any treatment, or not adhere to treatment A.  In all of these situations, the patient would be included in group A when comparing treatment outcomes using an ITT analysis.  Eliminating study participants who were randomized but not treated or moving participants between treatment groups according to the treatment they received would violate the ITT principle.


Detry, Michelle A., and Lewis, Roger J. The Intention-to-Treat Principle: How to Assess the True Effect of Choosing a Medical Treatment, Journal of the American Medical Association, 2014, 312(1), 85-86.     Detry (2014)
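The grouping rule in the Detry (2014) quote can be made concrete with a toy example.  The five patients below are hypothetical; the sketch only contrasts counting patients by the arm to which they were randomized (ITT) with counting them by the treatment they actually received.

```python
from collections import Counter

# Hypothetical patients: (arm assigned by randomization, treatment actually received).
patients = [
    ("A", "A"),   # adhered to assignment
    ("A", "B"),   # randomized to A, erroneously received B
    ("A", None),  # randomized to A, never received any treatment
    ("B", "B"),
    ("B", "B"),
]

# Intention-to-treat: every patient stays in the arm of randomization.
itt_groups = Counter(assigned for assigned, _ in patients)

# Grouping by treatment received instead, which the quote says would violate ITT.
as_treated_groups = Counter(
    received for _, received in patients if received is not None
)

print(itt_groups)         # Counter({'A': 3, 'B': 2})
print(as_treated_groups)  # Counter({'B': 3, 'A': 1})
```

Under ITT, group A holds three patients although only one of them was actually given treatment A.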

If it were a matter of only 4 patients (or 5 if we include the blue-dot patient in the right-hand column above) out of a grand total of 682 patients in the trial that had not in reality received any treatment but whose health was recorded as if they had, it might indeed be best to consider this deviation too trivial to merit consideration — but we are about to see the numbers climbing.

YELLOW DOT  

Discontinued treatment, N=139.  Turns out that 139 other Intended-Bortezomib patients discontinued treatment, meaning that, for one reason or another, they did not continue treatment for the full 54 weeks.  From being told no more than this, all we can infer for certain for each of these 139 patients is that he/she got anywhere from no more than a single day of Bortezomib treatment, all the way up to one day short of the full 54 weeks of treatment.  But however short of the full 54 weeks of Bortezomib these 139 patients fell, they remained members in good standing of the Intended-Bortezomib Group, and their longevity data did contribute to the Bortezomib Curve in the graph.  Again, the effect of including these 139 patients was to weaken the seeming effect of Bortezomib.  At the two extremes: if 54 weeks of Bortezomib kills, here will have been 139 patients less likely to die because they have received less than the lethal dose; if 54 weeks of Bortezomib cures, here will have been 139 patients less likely to be cured because they have received less than the curative dose.

GRAY DOT  

Treatment ongoing, N=47.  And it turns out also that another 47 Intended-Bortezomib patients were still getting treatment when the study was cut off, so that for purposes of the San Miguel (2008) graphs, they would be 47 patients whose Bortezomib intake was briefer than the full 54 weeks, and the same cautions that applied to the blue-dot patients, and to the yellow-dot patients, must now be applied to the gray-dot patients as well.

BLACK DOT  

Bortezomib alone discontinued, N=63.  Yet another group of 63 Intended-Bortezomib patients discontinued only Bortezomib, but did continue Melphalan and Prednisone, as is acknowledged in the quote below, but because these patients did keep taking the latter two drugs they were disqualified from being counted under "Discontinued Treatment" in the Patient Disposition diagram above (which is why there are no black dots in that diagram).

Bortezomib alone was discontinued in an additional 63 patients [...].    [San Miguel, 2008, p. 911]

But what does it mean for an Intended-Bortezomib-Group patient to be taking not the prescribed BMP but only MP?  It means that such a patient is being treated exactly like an Intended-Control-Group patient, because MP is what Control-Group patients are supposed to be getting — and yet that patient's data is being fed into the Intended-Bortezomib-Group curve, the BMP curve.  This is exactly the possibility described in the JAMA Detry (2014) box quote above which explained ITT, only we see it happening in San Miguel (2008) not just to one patient but to 63 of them.


Summing the four bold-font numbers above gives 4+139+47+63 = 253 Intended-Bortezomib patients who did not complete the 54 weeks of Bortezomib treatment, which is 253/344 = 74% of all Intended-Bortezomib patients.  Therefore, only 26% of all Intended-Bortezomib Group patients completed the 54 weeks of treatment.
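The tally can be reproduced as follows.  This is a minimal sketch that takes the four counts at face value and, like the sum above, assumes the four categories do not overlap; the dictionary labels are ours.

```python
intended_bortezomib = 344  # patients randomized to the Intended-Bortezomib Group

# Patients who fell short of the full 54 weeks of Bortezomib, by category.
shortfalls = {
    "did not receive treatment": 4,
    "discontinued treatment": 139,
    "treatment ongoing at cutoff": 47,
    "Bortezomib alone discontinued": 63,
}

incomplete = sum(shortfalls.values())
completed = intended_bortezomib - incomplete

print(incomplete, completed)                      # 253 91
print(f"{incomplete / intended_bortezomib:.0%}")  # 74%
print(f"{completed / intended_bortezomib:.0%}")   # 26%
```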

Reduced Dosage Per Day

But getting fewer than 54 weeks of treatment is only one of the two main ways in which Actual Total Bortezomib Load fell short of the Intended Total Bortezomib Load.  The other way is that the dosage delivered per day fell short as well.  That is, whenever adverse reactions were observed, the dosage was reduced.  It would be correct to assume, in the two quotes below, that a dose that was "managed" or "modified" was a dose that was reduced:

The dose of Melphalan or Bortezomib was reduced if there was any prespecified hematologic toxic effect or grade 3 or 4 nonhematologic toxic effect; Bortezomib-associated neuropathic pain and peripheral sensory neuropathy were managed with the use of established dose-modification guidelines.    [San Miguel, 2008, p. 907, italics added]

[P]rompt modification of the Bortezomib dose according to established guidelines is important to avoid severe neurotoxicity and ensure reversibility.    [San Miguel, 2008, pp. 913-914, italics added]

Is the above-quoted list of adverse reactions that justified dose reduction complete, or could the list be extended to include a score of other adverse reactions that also led to dose reduction?  We aren't told.  And what proportion of the patients did have their dosage reduced — as few as 5%, or as many as 95%?  Can't tell.  Maybe all?  Yes, maybe all!  And how large would the dose reductions have been — perhaps some patients were taken down to 50% of their Intended Dosage, and others down to 10%?  Don't know.

Did there exist even a single patient who endured the full 54-week treatment at full dosage?  Can't find any reference to any such patient.

Reduced Total Bortezomib Load

The conclusion that a casual and uncritical reading (as distinguished from the reading we are doing) of San Miguel (2008) encourages is that if future patients are given the San Miguel (2008) Intended Drug Protocol, then whatever adverse reactions and survival durations were reported in San Miguel (2008) can reasonably be expected to be the same in all future therapy that follows the same Intended Drug Protocol.  The reality is, however, that the Actual Total Bortezomib Load delivered in San Miguel (2008) was probably much smaller than the Intended Total Bortezomib Load, but how much smaller is impossible to say.  Therefore, therapists and researchers who do subject their patients to the full San Miguel (2008) Intended Drug Protocol are possibly delivering a Total Bortezomib Load far greater than the Actual Total Bortezomib Load delivered per patient during San Miguel (2008).

And therapists or researchers who in the interests of safety are delivering to today's patients a Total Bortezomib Load smaller than the Intended Total Bortezomib Load described by San Miguel (2008) have no idea of whether their notion of "smaller" bears any resemblance to the San Miguel (2008) Actual Total Bortezomib Load, and for which reason it might be fair to describe their personal concoction of a drug protocol as flying by the seat of their pants.

Johnson&Johnson Research Methodology Underestimates Bortezomib Toxicity

By far the most important information that can be had from any clinical trial is whether or not its treatment kills.  But if whenever a patient begins to sicken, treatment is either reduced or discontinued, then few patients will die.  If San Miguel's (2008) Intended Drug Protocol could be counted on to produce a high death rate whenever actually implemented, the Johnson&Johnson trilogy of clinical trials above would have failed to detect this because they avoided actually implementing it.  This is not at all to say that Johnson&Johnson researchers should have tested their patients at the full Intended Drug Protocol no matter how many they killed; it is to say that they have an obligation to warn that they rarely — perhaps never — gave patients the full Intended Drug Protocol, and so have a poor idea of what the effects of the full Intended Drug Protocol might be.

Does Johnson&Johnson satisfy this obligation to warn that its Intended Drug Protocol must not be taken seriously?  Quite the contrary.  Johnson&Johnson affirms the opposite.  Whereas the truth is that few, and perhaps even none, of the San Miguel (2008) patients received the full Intended Drug Protocol, the following Spicka (2011) Clinical Trial Commentary asserts that all of them did:


Clinical Trial Commentary

Altogether 682 patients were enrolled and prospectively randomized in this trial.  All patients received nine 6-week cycles of oral Melphalan (9 mg/m2) and Prednisone (60 mg/m2) on days 1–4, either alone or with Bortezomib administered intravenously (1.3 mg/m2 on days 1, 4, 8, 11, 22, 25, 29 and 32 during the first four cycles and on days 1, 8, 22, 29 during remaining course of therapy).
Ivan Spicka, Maria-Victoria Mateos, K Redman, Meletios A Dimopoulos, and Paul G Richardson.  An overview of the VISTA trial: newly diagnosed, untreated patients with multiple myeloma ineligible for stem cell transplantation, Immunotherapy, 2011, 3(9), 1033-1040.    Spicka (2011)     [Bold emphasis added.  The above reference to "682 patients" serves to confirm that the publication under review is indeed San Miguel (2008)]

Perhaps the five authors of the above erroneous "all" are guilty only of a rushed reading of the original San Miguel (2008), and for that reason overlooked its deviation from the Intended Drug Protocol?  This defense washes poorly — four of the five authors of the above "Clinical Trial Commentary" were among the 21 authors of San Miguel (2008) and also among the 20 authors of Mateos (2010), and also among the 22 authors of San Miguel (2013), namely:

Ivan Spicka
Maria-Victoria Mateos
Meletios A Dimopoulos
Paul G Richardson
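
The size of the Intended Total Bortezomib Load implied by the schedule quoted in the Clinical Trial Commentary above can be worked out with a few lines of arithmetic.  The sketch below is only an illustration: the constant and function names are invented here, and it assumes that the Intended Total Bortezomib Load is simply the sum of all scheduled injections, with every cycle delivered in full.

```python
# Intended bortezomib schedule, transcribed from the Spicka (2011) quote above.
# All names below are hypothetical labels chosen for this sketch.
DOSE_MG_PER_M2 = 1.3                        # mg per square metre per injection
TOTAL_CYCLES = 9                            # nine 6-week cycles
EARLY_CYCLES = 4                            # "the first four cycles"
EARLY_DAYS = [1, 4, 8, 11, 22, 25, 29, 32]  # injection days in cycles 1-4
LATE_DAYS = [1, 8, 22, 29]                  # injection days in cycles 5-9


def intended_injections():
    """Number of bortezomib injections if every cycle is delivered in full."""
    late_cycles = TOTAL_CYCLES - EARLY_CYCLES
    return EARLY_CYCLES * len(EARLY_DAYS) + late_cycles * len(LATE_DAYS)


def intended_total_load_mg_per_m2():
    """Sum of all scheduled injections, per square metre of body surface."""
    return round(intended_injections() * DOSE_MG_PER_M2, 1)


print(intended_injections())            # 52
print(intended_total_load_mg_per_m2())  # 67.6
```

On these assumptions the full protocol amounts to 52 injections, or 67.6 mg/m2, and it is against this figure that any Actual Total Bortezomib Load would need to be compared.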

Subsequent Therapy Increases Total Bortezomib Load

After Bortezomib patients had completed their clinical-trial treatments, or by one path or another had escaped them, they were free to continue any subsequent therapy they chose, which could have been more Bortezomib therapy.  As the years went by, then, and the original San Miguel (2008) administration of Bortezomib receded into the past, some patients may have ended up receiving the same amount of Bortezomib after cutoff as they had received before cutoff.  Other patients may have ended up receiving twice as much, or three times as much.  Thus, in plotting Overall Survival five years after the first cutoff, not only might many patients have received vastly less Bortezomib than had been intended (as documented higher above), but other patients may have received vastly more (as is being suggested in the instant paragraph).  The broader conclusion now called for is that because Actual Drug Load may deviate from Intended in both directions, less and more, the possible range of Bortezomib actually received becomes staggeringly broad, and the assumption that Intended-Bortezomib patients have absorbed exactly the Intended Bortezomib Protocol Load is all the more unjustified.

The Intended Control Group Got Bortezomib

Surprising as it may seem, lots of Intended Control Group patients received Bortezomib treatment.

A statement written in large bold font on the ClinicalTrials.gov web site suggests that Johnson&Johnson, which owns Millennium Pharmaceuticals, was recruiting patients for off-trial Bortezomib therapy in the vicinity of the trial itself:

Millennium Pharmaceuticals, Inc. has indicated that access to an investigational treatment associated with this study is available outside the clinical trial.    ClinicalTrials.gov Identifier: NCT00111319

Among those taking advantage of the above offer to try Bortezomib might have been the Intended-Control-Group patients seeking subsequent therapy after the San Miguel (2008) cutoff:

Further enrollment was halted, and patients receiving Melphalan and Prednisone were offered VELCADE in addition.   USDRUGBASE

It is uncertain whether the Control-Group patients receiving Bortezomib (VELCADE) in the quote above are the same as the 54 Control-Group patients receiving Bortezomib in the quote below, or whether there is even any overlap.

Of 121 patients in the control group who received subsequent therapy, 54 (45%) received therapy that included Bortezomib.    [San Miguel, 2008, p. 909]

It is conceivable that some or many or all of the above 54 Intended Control Group patients who received Bortezomib were patients who dropped out of the clinical trial in order to be able to get Bortezomib, and may have done so early in their treatment, and may have stayed on Bortezomib so long that their dosage experience resembled that of an Intended Bortezomib Group Patient.  Of course the published San Miguel (2008) data are too limited to permit such a detailed question to be answered, leaving the reader free to contemplate some startling worst-case scenarios.

Although the particulars of Intended-Control-Group patients receiving Bortezomib are far from clear, that such did happen on a substantial scale is unmistakable, and given the San Miguel (2008) commitment to ITT doctrine, we can be sure that the survival data of this particular group of Bortezomib-treated patients was fed into the Control-Group curve in the Kaplan-Meier Overall-Survival graphs.

In view of the probability that the Intended Bortezomib Group got vastly less Bortezomib than had been intended (but we don't know how much less), along with the freshly-discovered probability that the Intended Control Group got vastly more Bortezomib than had been intended (intended had been zero, but they did get a lot, exactly how much we don't know), we are forced to recognize that it is remotely possible (not likely, but at the same time not absolutely impossible) that as the years went by, the Intended Control Group ended up with a greater Total Bortezomib Load than the Intended Bortezomib Group.

One of the things wrong with Johnson&Johnson extending survival domain up to Months=78, then, is that the labels on the two curves ("Bortezomib" and "Control") become increasingly inaccurate.  It is only during the first few days of treatment that the groups may have been close to the ideal of the Intended Bortezomib Group getting Intended-Protocol Bortezomib and the Control group getting none.  Every additional day that passed, however, saw the patients increasingly deviating from their Intended Drug Protocol, so that the initial Overall-Survival separation between the two curves continuing much the same year after year up to Months=78 (6.5 years) — when in fact the putative cause of that separation had vanished — is capable of striking some readers as incredible.

Confounding By Background Therapy

We see in the passage below from the Adverse Reactions section of San Miguel (2008) that Bortezomib seems to cause adverse reactions, and we may expect that some of these adverse reactions will be treated with medications, and that one of these adverse reactions, "herpes zoster" (commonly known as "shingles"), is managed by the administration of an "antiviral prophylaxis" (which we will assume for the sake of argument is the drug Valacyclovir), and that as Bortezomib triggering "herpes zoster" is expected from the outset, the Valacyclovir was administered from the outset too.  At least that's what the word "prophylaxis" suggests — that the treatment was administered to alleviate some adverse reaction that was expected, and was not merely a treatment to alleviate some adverse reaction that had already begun to happen:

Peripheral sensory neuropathy was reported more frequently in the Bortezomib group, including grade 1 neuropathy in 49 patients (14%), grade 2 in 58 patients (17%), grade 3 in 43 patients (13%), and grade 4 in 1 patient (<1%).  [...]  All grade 3 or 4 gastrointestinal symptoms were more frequent in the Bortezomib group than in the control group (19% vs. 5%), as was any grade of herpes zoster (13% vs. 4%); the incidence of herpes zoster was reduced to 3% in patients in the Bortezomib group who were receiving antiviral prophylaxis.    (San Miguel, 2008, p. 911.   Bold emphasis added.)

Supposing, then, that Bortezomib is expected to spark a herpes zoster eruption, and that Valacyclovir is given as "background therapy" to all Intended-Bortezomib-Group patients to alleviate the expected symptoms, the Valacyclovir therefore plays the role of a confound, and an early-introduced confound to boot, giving it more time to produce an effect.  The Intended-Bortezomib-Group might now be more properly called the Intended-Bortezomib-Plus-Valacyclovir Group.  Given how little we know about how much Bortezomib the groups got, and given that we have reason to expect that everybody in the Intended-Bortezomib Group got Valacyclovir, whereas in the Intended-Control Group perhaps only the 4% who came down with herpes zoster got it, maybe it would have been fitter to call the Intended-Bortezomib-Group the Valacyclovir Group.

Confounding By Open Label

If patients know what condition they are in, then they will be motivated to take steps which undo the pre-treatment equality of groups.  Control-Group patients, for example, may see themselves as wasting time while their cancer grows, may want to drop out so they can take advantage of some alternative therapy which does hold out some promise, and if pursuing some other treatment while still enrolled in the bortezomib study, may conceal this from the bortezomib researcher.  And what would be a Control-Group subject's motivation for enduring 54 weeks in a therapy that nobody expected to bring much benefit?  The Control-Group patient's only imaginable motivation would be that he has been promised bortezomib therapy as a reward for completing his 54 weeks of health-destroying servitude.  From the researcher's point of view, however, such a promise should be totally unacceptable, because the researcher needs to know how long Control-Group patients live as compared to Bortezomib-Group patients, but what he's going to get beyond the first year or so is Control-Group patients who have switched to taking bortezomib.

And if physicians and nurses and technicians and lab workers and data-handlers all know who is in what condition, each will feel obligated on a thousand-and-one occasions to serve the interests of the sponsor by propelling the data in the direction expected by the sponsor.  Another test of creative thinking seems to be called for, and I will comply by supplying only a single instance of the sort of thing that can happen.

So, then, a research assistant's immediate task is to randomly assign one of a pair of patients to the Bortezomib Group, and the other to the Control Group.  The two patients before him have been selected because all their measures and tests are highly similar, and all the assistant needs to do is, say, flip a coin.  But despite the measurements making the two patients seem very similar on paper, looking at them paints quite a different picture.  John High seems energetic and alert and upbeat and has a spring in his step and sits erect; Jack Low seems feeble and dozy and depressed and drags his feet and slumps in his seat.  The research assistant has a strong sense that John High has a good stretch of life ahead of him and Jack Low is dying.  To put Jack Low into the Bortezomib Group, thinks the assistant, would be a disservice to Johnson&Johnson, as Low's early death will make bortezomib look like it killed him.  And so the assistant does not flip a coin to determine which group each patient will go in, as we imagine him being required to do, but puts John High into the Bortezomib Group, and Jack Low in the control group.  The assistant doesn't feel he's committing any crime, since the tests show the two patients to be very similar — so what difference where they go? — and in any case, he believes bortezomib is a good drug, and so tweaking the data to make it seem just a tiny bit better than it already looks can't hurt anybody.  In fact, thinks the assistant, making bortezomib look as good as possible is a worthwhile, even a noble, thing to do, as that will speed the drug to market so that it may all the sooner begin alleviating suffering and saving lives.

The way the assignment of patients to groups should have been done?  A computer program should have assigned John and Jack to their conditions randomly.
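
A minimal sketch of what such a program might look like is given below.  It is an illustration only and not a description of how San Miguel (2008) allocated anybody; the function name `assign_pair` and the optional `rng` argument are inventions of this sketch.  The point it makes is that the assistant's impression of which patient looks healthier never enters the computation.

```python
import random


def assign_pair(patient_a, patient_b, rng=None):
    """Randomly place one member of a matched pair in each group.

    Hypothetical helper: the caller's opinion about either patient plays
    no part, because only the random shuffle decides who goes where.
    """
    rng = rng or random.Random()
    pair = [patient_a, patient_b]
    rng.shuffle(pair)  # each ordering is equally likely
    return {"Bortezomib": pair[0], "Control": pair[1]}


# Example use with the two patients from the story above.
allocation = assign_pair("John High", "Jack Low")
print(allocation)
```

Passing a seeded `random.Random` instance makes the allocation reproducible for audit, which is one reason for letting the caller supply the generator instead of hiding it inside the function.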

If it is easy to think of one such example of how open-label can skew results, it will not be too difficult to think of a hundred.  But, really, if you won't be convinced without many examples, then put your own creativity to the test — think up examples yourself.  While demanding evidence of the creativity of others, don't shame yourself by proclaiming that you yourself have none.

Double-blind is not a dispensable decoration that the researcher is free to discard if he wishes.  It is the bedrock of solid research, without which the would-be discoverer of scientific truth is left with nothing.

Linear Causal Chain Or Branching Tree?

Does any elaboration need to be added to the simple causal chain below to represent that "Bortezomib may activate shingles, shingles calls for Valacyclovir prophylaxis, Valacyclovir alleviates shingles symptoms"?

Bortezomib  ➡  Shingles  ➡  Valacyclovir  ➡  Alleviation of Shingles

The answer is, Yes, something more elaborate is needed.  The more that is needed is the understanding that the causal sequence above is a single branch in a tree of infinite size and complexity which reflects the large number of adverse reactions that are triggered by the use of any drug, as begins to be revealed in the tree diagram below, which starts out by displaying on its upper-left the three San Miguel (2008) drugs: Bortezomib, Melphalan, and Prednisone.  Pretending that this tree diagram is interactive, it can be said that clicking on Bortezomib has called up the orange column showing 25 Bortezomib adverse reactions, each accompanied by a possible background therapy.  Had it been Melphalan that had been clicked instead of Bortezomib, the orange-column list would have shown 25 adverse reactions to Melphalan, and so on.  The name of each adverse reaction in the orange column is accompanied by a "1" to signify that this is the first time that particular adverse reaction has come to our attention, proceeding as we do from left to right.

The summary statistics at the base of the orange column signify that 25 different adverse reactions have been recognized as possible so far, each of them attributable to only a single cause or "trigger" — Bortezomib.

As our interest at the moment is the confounding drug, Valacyclovir, we click on it in the orange column to show 25 of its adverse reactions in the yellow column.  The uppermost adverse reaction in the yellow column is "anorexia 2", the "2" indicating that anorexia is being seen for the second time in the tree diagram, the first time having been in the orange column.  The second adverse reaction at the top of the yellow column is "chest pain 1", the "1" indicating that this is chest pain's first appearance, there having been no chest pain in the orange column.

The summary statistics at the bottom of the yellow column indicate that the number of different adverse reactions possible so far has shot up to 39, 28 of them triggered by only a single drug (either Bortezomib or Valacyclovir), and 11 of them triggered by two drugs (both Bortezomib and Valacyclovir).

We go on to imagine clicking in the yellow column on "jaundice 1 Plasbumin" which has the effect of displaying 25 Plasbumin adverse reactions in the blue column, and whose summary statistics show the number of different adverse reactions observed so far shooting up to 49, with 8 of them triggered by all three drugs, namely Bortezomib, Valacyclovir, and Plasbumin, and so on.

Obviously, it is possible to keep going without end, as has begun to happen by someone having clicked in the blue column on "nausea 3 Aprepitant", which opened up a green column ready to display 25 adverse reactions to background-therapy-drug Aprepitant, with only the lack of room on the page preventing still another list of adverse reactions being printed in the green column.

BEGINNING OF A
BRANCHING TREE DIAGRAM
IDENTIFYING POTENTIAL ADVERSE REACTIONS IN SAN MIGUEL (2008)

CHEMO
DRUGS
SAN MIGUEL
(2008)

Bortezomib ➡
Melphalan
Prednisone
  ADVERSE
  REACTIONS
  TO
  BORTEZOMIB

MEDICATIONS
FOR
ADVERSE
REACTIONS

  anemia 1 Iron
  anorexia 1 Prozac
  arthralgia 1 Tylenol
  back pain 1 Celecoxib
  breathing difficulty 1 Asthmanefrin
  constipation 1 Constulose
  cough 1 Guaifenesin
  diarrhea 1 Loperamide
  dizziness 1 Primperan
  fatigue 1 Modafinil
  fever 1 Naprosyn
  herpes zoster 1 Valacyclovir ➡
  hypokalemia 1 KCl 
  insomnia 1 Temazepam
  leukopenia 1 Echinacea
  lymphopenia 1 Gamma globulin 
  nausea 1 Aprepitant
  neuralgia 1 Oxcarbazepine
  neutropenia 1 Filgrastim
  pneumonia 1 Levofloxacin
  rash 1 Depo-Medrol 
  swelling 1 Furosemide
  thrombocytopenia 1  Decadron
  vomiting 1 Benadryl 
  weakness 1 Vitamin B12

TRIGGERS PER    CUMULATIVE
 ADVERSE      ADVERSE
 REACTION     REACTIONS
 1          25


CumulativeAdverseReactions=25

  ADVERSE
  REACTIONS
  TO
  VALACYCLOVIR

MEDICATIONS
FOR
ADVERSE
REACTIONS

  anorexia 2 Prozac
  chest pain 1 Nitroglycerin
  chills 1 Aleve
  constipation 2 Constulose
  cough 2 Guaifenesin
  cramps 1 Advil
  depression 1 Zoloft
  diarrhea 2 Loperamide
  dizziness 2 Primperan
  fever 2 Naprosyn
  hair loss 1 Rogaine
  headache 1 Aspirin
  hives 1 Claritin
  jaundice 1  Plasbumin ➡
  nausea 2 Aprepitant
  rash 2 Depo-Medrol 
  seizures 1 Diamox
  speech slurred 1 Yakov's Elixir
  swelling 2 Furosemide
  trembling 1 Xanax
  urine bloody 1  Plaquenil
  vision blurred 1 Vitamin D
  voice loss 1 Paracetamol
  vomiting 2 Benadryl  
  weakness 2 Vitamin B12

TRIGGERS PER   CUMULATIVE
 ADVERSE     ADVERSE
 REACTION    REACTIONS
 1         28
 2         11

CumulativeAdverseReactions=39

  ADVERSE
  REACTIONS
  TO
  PLASBUMIN

MEDICATIONS
FOR
ADVERSE
REACTIONS

 
  anaphylactic shock 1  Epinephrine  
  anxiety 1 Valium
  blood pressure zigs 1  Vasotec
  breathing difficulty 2  Asthmanefrin 
  chest pain 2 Nitroglycerin
  chills 2 Aleve
  confusion 1 Aricept
  cough 3 Guaifenesin
  cramps 2 Advil
  diarrhea 3 Loperamide
  erythema (skin red) 1  Colchicine
  fever 3 Naprosyn
  headache 2 Aspirin
  heart rate irregular 1 Cordarone
  hives 2 Claritin
  hypersalivation 1  Hyoscyamine
  itching 1 Elidel
  nausea 3 Aprepitant ➡
  rash 3 Depo-Medrol 
  sweating 1 Glycopyrrolate
  swelling 3 Furosemide
  tinnitus 1 Nortriptyline
  vision blurred 2 Vitamin D
  vomiting 3 Benadryl 
  weakness 3 Vitamin B12

TRIGGERS PER    CUMULATIVE
 ADVERSE      ADVERSE
 REACTION     REACTIONS
 1          31
 2          10
 3           8
CumulativeAdverseReactions=49
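
The summary statistics under each column of the tree diagram above can be re-derived mechanically from the three lists of adverse reactions.  The sketch below transcribes those lists (dropping the medication names, and spelling "anaphylactic" correctly) and counts, after each column is added, how many distinct adverse reactions have appeared and how many drugs trigger each one.  The function name `tally` is an invention of this sketch.

```python
from collections import Counter

BORTEZOMIB = [
    "anemia", "anorexia", "arthralgia", "back pain", "breathing difficulty",
    "constipation", "cough", "diarrhea", "dizziness", "fatigue", "fever",
    "herpes zoster", "hypokalemia", "insomnia", "leukopenia", "lymphopenia",
    "nausea", "neuralgia", "neutropenia", "pneumonia", "rash", "swelling",
    "thrombocytopenia", "vomiting", "weakness",
]
VALACYCLOVIR = [
    "anorexia", "chest pain", "chills", "constipation", "cough", "cramps",
    "depression", "diarrhea", "dizziness", "fever", "hair loss", "headache",
    "hives", "jaundice", "nausea", "rash", "seizures", "speech slurred",
    "swelling", "trembling", "urine bloody", "vision blurred", "voice loss",
    "vomiting", "weakness",
]
PLASBUMIN = [
    "anaphylactic shock", "anxiety", "blood pressure zigs",
    "breathing difficulty", "chest pain", "chills", "confusion", "cough",
    "cramps", "diarrhea", "erythema", "fever", "headache",
    "heart rate irregular", "hives", "hypersalivation", "itching", "nausea",
    "rash", "sweating", "swelling", "tinnitus", "vision blurred", "vomiting",
    "weakness",
]


def tally(columns):
    """Return (CumulativeAdverseReactions, {triggers: number of reactions})."""
    triggers = Counter()
    for reactions in columns:
        triggers.update(set(reactions))       # one trigger per drug listing it
    per_trigger_count = Counter(triggers.values())
    return len(triggers), dict(sorted(per_trigger_count.items()))


print(tally([BORTEZOMIB]))                           # (25, {1: 25})
print(tally([BORTEZOMIB, VALACYCLOVIR]))             # (39, {1: 28, 2: 11})
print(tally([BORTEZOMIB, VALACYCLOVIR, PLASBUMIN]))  # (49, {1: 31, 2: 10, 3: 8})
```

The three printed results agree with the figures shown at the base of the orange, yellow, and blue columns.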

How different the view of multiple myeloma treatment that such a tree diagram offers!  Most obvious of all is the increase in complexity compared to the one-line causal chain shown higher up, and compared to the dozen or so one-line causal chains that medical staff might be keeping in mind when treating a patient.  Let's spend a few minutes exploring just how much more complicated this new view is than even is suggested by the tree diagram above.

In the first place, it would be possible in the left-most column, the gray one except where Bortezomib has been clicked, to click on Melphalan instead, and after that click on Prednisone, so as to produce two other tree diagrams underneath the one we see above, that would be similar in appearance, but of course would have different contents.

And then in the orange column where we have clicked Valacyclovir, there are 24 other background-therapy medications that might have been clicked, and the same for yellow and blue.  If my arithmetic is correct, the total number of tree diagrams that could be drawn underneath each other, each similar in size and appearance to the one showing above, but of course each different in content, is

3 * 25^3 = 46,875

And that's not all.  So far, we've only seen 2 background-therapy medications (Valacyclovir and Plasbumin) added to the basic chemotherapy of BMP, but my impression is that the number of additional background-therapy medications given a typical Multiple Myeloma patient might be closer to 6.  That would call for the tree diagram above to be extended four color-columns to the right, say green (as has already been suggested and begun), then maybe purple, red, and black.  The number of possible tree diagrams now soars to more than 18 billion:

3 * 25^7 = 18,310,546,875

And that's not all.  I'm thinking that my guess of the total number of 6 medications (beyond 3 chemo) might have been conservative.  Below is Multiple Myeloma patient Tom Brokaw reporting that he takes 24 pills daily, so if three are basic chemo, that leaves 21 extra, so that our tree diagram above needs to be expanded by 19 more color-columns to the right than are currently showing, and so that the number of possible tree diagrams that might describe patients taking that many pills would become astronomical, and for practical purposes indistinguishable from infinite:

3 * 25^22 = 1.7053 * 10^31
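
The three counts of possible tree diagrams given above all follow one formula: 3 choices in the chemo column, multiplied by 25 choices in each color column.  The sketch below simply re-does that arithmetic; the names are inventions of this sketch, and it keeps the simplifying assumption of exactly 25 adverse reactions per column.

```python
CHEMO_DRUGS = 3              # Bortezomib, Melphalan, Prednisone
REACTIONS_PER_COLUMN = 25    # simplifying assumption used in the tree diagram


def possible_tree_diagrams(color_columns):
    """Number of distinct click-paths through the given number of color columns."""
    return CHEMO_DRUGS * REACTIONS_PER_COLUMN ** color_columns


print(f"{possible_tree_diagrams(3):,}")     # 46,875
print(f"{possible_tree_diagrams(7):,}")     # 18,310,546,875
print(f"{possible_tree_diagrams(22):.4e}")  # 1.7053e+31
```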

Tom Brokaw:
Learning to Live With Cancer

By TOM BROKAW     01 Oct 2016

"Combating cancer is a full-time job that, in my case, requires 24 pills a day, including one that runs $500 a dose."    NYT


And that's not all.  The simplistic view adopted above has been that for each adverse reaction, there exists a single background-therapy medication, which is a bit of an underestimate.  Taking as an example an adverse reaction that seemed rare, and for which a diversity of remedies seemed unlikely to exist, I looked up in WebMD the adverse reaction "hypersalivation" (otherwise known as "Sialorrhea") which appears in the blue column above, and found 37 medications used to treat it, despite 29 of them being "Off Label":

WebMD logo
WebMD OffLabel definition
Hypersalivation treatments

Our repeated re-calculations, higher above, of the number of possible tree diagrams having already fully earned the qualifiers "astronomical" and "practically infinite", it would be profitless to continue performing such re-calculations to ever higher values.  But even without re-calculation of the expanding number of possible tree diagrams, there remain further complications that need to be recognized.

For one, it has been assumed above that only one adverse reaction in each list of 25 will be medicated, whereas it is also possible to choose 2 to be medicated, or 3, or 4, and all the way up to choosing 25 to be medicated.  The number of ways of choosing
0 drugs out of 25 drugs is "25 Choose 0" = 25C0 = 1 way of choosing zero drugs,
1 drug   out of 25 drugs is "25 Choose 1" = 25C1 = 25 ways of choosing one drug,
2 drugs out of 25 drugs is "25 Choose 2" = 25C2 = 300 ways of choosing two drugs,
and so on, as is elaborated below, though incompletely:

      25C0 = 1
      25C1 = 25
      25C2 = 300
      25C3 = 2,300
      25C4 = 12,650
      25C5 = 53,130
      25C6 = 177,100
      25C7 = 480,700
      25C8 = 1,081,575
      25C9 = 2,042,975

         [...]

      25C24 = 25
      25C25 = 1
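
The entries elided from the table above can be filled in with Python's standard `math.comb`, as in the short sketch below.  Adding up all 26 entries gives 2^25 = 33,554,432, which is the number of different subsets of a 25-item list of adverse reactions that might be chosen for medication.

```python
from math import comb

N = 25  # adverse reactions in one column of the tree diagram

# ways[k] is "25 Choose k": the number of ways of picking k of the 25 to medicate.
ways = [comb(N, k) for k in range(N + 1)]

for k, count in enumerate(ways):
    print(f"25C{k} = {count:,}")

total = sum(ways)
print(f"total = {total:,}")  # total = 33,554,432
```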

And let's not forget also that the effect of a drug depends on dosage, as for example a drug lacking effect at low dose, then becoming beneficial at a higher dose, but ultimately proving to be lethal at some still higher dose, such that a tree diagram trying to be comprehensive and helpful would need to represent each background-therapy drug at, say, five dosages, which elaboration would make the color columns in the tree diagram five times as tall as they are now.

And let's not forget that a more comprehensive and helpful tree diagram would give space not merely to each drug's destructive effects, but to its constructive effects as well, which would require a doubling of the height of each of the tree diagrams, or perhaps less than a doubling if the constructive effects were fewer.

And let's not forget the complication of interaction — that the effect of a drug may depend on other drugs that may be present within a patient.  And, reciprocally, a drug brings with it not only its own list of reactions, but also the power to alter the reactions triggered by previously-administered drugs.

Also needing to be taken into account is that the effects of some drugs last longer than others.  Illustrative of that range are the biological half-lives of Adenosine being less than 10 seconds, and of Bedaquiline being 5.5 months.

And everything that can be speculated about the effects of introducing a medication comes with a corresponding, but radically different, set of effects of withdrawing that medication.

And perhaps placing the greatest obstacle in the path of comprehension is that every list of adverse reactions consists of little better than some anonymous people's off-the-cuff attributions of causality between whatever coincidence of events (drug taking and patient suffering) happens to have caught their attention at one time or another.

In view of all of the above, are we able to do anything more constructive than cower in awe and dread of the overwhelming complexity that has begun to be laid bare before us?  Cowering in awe and dread at overwhelming complexity, though, is not that bad — at least it is better than guiding cancer therapy while blind to complexity and while relying on belief in a dozen or so single-line causal chains.

Nevertheless, it may be possible to draw a few conclusions even while cowering.

WHAT CAN WE CONCLUDE?

(1) Kaplan-Meier Graphing Is Defunct

The Kaplan-Meier Overall-Survival graph in San Miguel (2008) has already been objected to (KAPLAN-MEIER) for showing 83% survival in the Bortezomib Group at Months=30 while the Patients-Remaining counts under the same graph confess that no researcher had ever seen a Bortezomib patient who had survived 30 months.  A similar incongruity presents itself above in the Kaplan-Meier Overall-Survival Graphs for Mateos (2010) and San Miguel (2013).

A new maneuver that users of Kaplan-Meier have discovered is that of keeping survival curves from plunging to zero by the artifice of invoking research cutoff prior to their longest-enrolled patient's death.

A fair evaluation of Kaplan-Meier can be arrived at from their admission that in the absence of the subject losses of fled/shed, their curves revert to conventional survival functions:

When no losses occur at ages less than t, the estimate of P(t) in all cases reduces to [...] the observed proportion of survivors.
Kaplan, E. L. and Meier, P., Nonparametric estimation from incomplete observations, Journal of the American Statistical Association, 1958, 53, 457–481, 457.     www.biecek.pl/~

But if that is the case, then we may say that it is only when data are riddled with patient losses that Kaplan-Meier graphing is able to step in and lift survival rates way above the proportion of patients actually seen to be surviving.

But if data marred by non-random loss is the only data that Kaplan-Meier are able to inflate, and if non-random loss ruins a study's ability to justify cause-effect conclusions, then it follows that the only data that Kaplan-Meier can inflate is worthless data.
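
The property quoted from Kaplan and Meier above can be checked numerically.  The sketch below is a bare-bones version of the product-limit calculation applied to ten hypothetical patients; the months, the loss pattern, and all names are invented for illustration and are not taken from San Miguel (2008).  With no losses the estimate equals the observed proportion of survivors; when two of the patients are instead recorded as lost, the estimate at the same month rises above the proportion of enrolled patients actually seen alive.

```python
def kaplan_meier(times, died, t):
    """Product-limit estimate of the proportion surviving past month t.

    times[i] is the month at which patient i died or was lost;
    died[i] is True for a death and False for a loss.
    """
    survival = 1.0
    death_times = sorted({tm for tm, d in zip(times, died) if d and tm <= t})
    for death_time in death_times:
        at_risk = sum(1 for tm in times if tm >= death_time)
        deaths = sum(1 for tm, d in zip(times, died) if d and tm == death_time)
        survival *= 1 - deaths / at_risk
    return survival


def observed_proportion(times, t):
    """Fraction of all enrolled patients actually followed alive past month t."""
    return sum(1 for tm in times if tm > t) / len(times)


# Hypothetical data: ten patients, month of death or loss.
MONTHS = [2, 5, 7, 9, 12, 15, 20, 24, 30, 36]
NO_LOSSES = [True] * 10                    # every patient followed until death
WITH_LOSSES = [True, False, False, True,   # patients at months 5 and 7 were lost
               True, True, True, True, True, True]

print(observed_proportion(MONTHS, 10))
print(kaplan_meier(MONTHS, NO_LOSSES, 10))
print(kaplan_meier(MONTHS, WITH_LOSSES, 10))
```

In this invented example, six of the ten patients are seen alive past Month=10 in both scenarios, yet only the no-loss scenario returns that same proportion from the product-limit calculation.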

It bears repeating that legitimate researchers doing quality work involving no subject loss have occasionally found it convenient to employ Kaplan-Meier software for its ability to create conventional "observed proportion of survivors" graphs which are not in the least prettified by Kaplan-Meier inflation.  It might be expected that in the future such researchers will prefer to create the same graphs without using Kaplan-Meier software, and thus without inviting unfair suspicion that their work has been enhanced by Kaplan-Meier prestidigitation.

(2) Cytotoxic Chemotherapy Violates The Obligation To Do No Harm

The Tree Diagram above shows two changes in adverse-reaction measures that result from increasing the number of drugs administered to the patient:

  1. CumulativeAdverseReactions increases with number of drugs administered, as for example reaching 49 when only 2 additional background-therapy medications (Valacyclovir and Plasbumin) have been added to the three fundamental chemotherapy drugs (Bortezomib, Melphalan, and Prednisone), which invites the question of how high CumulativeAdverseReactions will go beyond 49 when a more typical number of additional medications is applied, say 6 additional medications, or even the 21 additional medications which we saw Tom Brokaw taking above.  Perhaps CumulativeAdverseReactions often reaching 60 or 70 would be a conservative guess, and which might suggest the hypothesis that the larger the CumulativeAdverseReactions, the likelier it becomes that one or more of the adverse reactions that were counted in that CumulativeAdverseReactions will coincide with weaknesses in the patient's constitution, or one might say will coincide with gaps in the patient's defences, and so will manifest as serious adverse reactions.

  2. TriggersPerAdverseReaction increases with number of drugs administered as well, as for example in the Tree Diagram above when fever, nausea, vomiting, and weakness each came to be triggered by all three available triggers: Bortezomib (orange), Valacyclovir (yellow), and Plasbumin (blue).  Perhaps the TriggersPerAdverseReaction reaching 10 or 15 for some adverse reactions would be a conservative guess, and which might suggest the hypothesis that the larger an adverse reaction's TriggersPerAdverseReaction, the likelier is that adverse reaction to appear in full strength and to prove unresponsive to attempts at alleviation.

No one knows what the health effects are from a barrage of chemicals triggering 60 or 70 CumulativeAdverseReactions, and simultaneously escalating TriggersPerAdverseReaction for many of these CumulativeAdverseReactions to 10 or 15 — but the best guess might be that the agony of the patient will intensify, and his life will be cut short.

The patient may be in danger of finding himself trapped in a vicious circle in which his currently-strongest adverse reaction is treated with yet another palliative drug, and which new drug is yet again erroneously visualized as triggering only a single beneficial effect, but which new drug yet again triggers twenty or so adverse reactions, and so which yet again boosts both CumulativeAdverseReactions and TriggersPerAdverseReaction, and so which lifts some other adverse reaction to prominence, and so on and so on.

Alleviation by means of drug de-escalation affords the patient no avenue of escape, as it would inevitably trigger withdrawal syndromes which would be indistinguishable from, and would blend with, adverse reactions, creating a hodgepodge of profound suffering beyond anybody's power to comprehend or control.

What can a physician do in this predicament but tinker and futz?  Let's try a little more of this, and maybe a little less of that, and if that doesn't work, we'll try something else.  Year in and year out.

Meanwhile, the suffering patient experiences spontaneous ups and downs, as every patient unexpectedly and inexplicably does, and sooner or later is bound to experience a particularly noteworthy up right after the doctor has performed some dosage tweak, and everybody rejoices in post hoc ergo propter hoc bliss, believing they have witnessed progress in the understanding and control of the disease.

The danse macabre is continued until the patient dies — at least that is unanimously acknowledged to be the promise of chemotherapy in multiple myeloma.

(3) The San Miguel (2008) Clinical Trial Fails To Meet Minimal Standards of Scientific Method

The use that the world expects from the San Miguel (2008) data is that it will answer the simple question of whether the taking of Bortezomib will extend survival, which can be depicted as the shortest of causal chains:

Increased Bortezomib  ➡  Extended Survival

Scientific method offers a sure and simple method of answering this question, which can be done by satisfying The Five Requirements:

The Five Requirements
which San Miguel (2008) needed to obey
  1. Constitute two groups of patients whose random allocation guarantees their pre-treatment equality on every conceivable dimension.

  2. Preserve the pre-treatment equality of subject characteristics throughout the course of the experiment by not losing subjects.

  3. Administer the drug-protocol dosage of Bortezomib to the Bortezomib Group but not to the Control Group.

  4. Introduce no difference between the two groups other than the Bortezomib difference (which is to say, permit no confounding).

  5. Observe which group lives longer.

The logic of the experiment is so simple that it may seem impossible to get it wrong.  The fact of the matter is, however, that more than a few researchers find it impossible to get it right.

Did San Miguel (2008) Satisfy Requirement #1:  Randomly assign subjects to groups?

The process of randomly assigning subjects to groups is easily gotten wrong, but San Miguel (2008) merely states that he accomplished it without describing exactly how he accomplished it, and so it is impossible to say whether he got it right or not.  Giving him the benefit of the doubt, let us say that Requirement #1 was satisfied.

San Miguel (2008) Violated Requirement #5:  Observe which group lives longer

Anyone who can see through the smoke-and-mirrors of Kaplan-Meier graphs to what's really happening to survival possesses a supernatural power that many of us wish we were permitted to share.

San Miguel (2008) Violated Requirement #3:  Administer The Drug Protocol To Only One Of The Groups

Administering Bortezomib to one group but not the other means giving the exact protocol dosage of Bortezomib to every patient in the Intended Bortezomib Group, and giving no Bortezomib to any patient in the Intended Control Group, which has been shown above to be something San Miguel (2008) spectacularly failed to do.  Most thoroughly documented is that Bortezomib-Group patients received far less Bortezomib than the protocol dosage, many even as little as none, and that Control-Group patients, who were supposed to get none, often got a lot.  The failure to comply with Protocol Dosage was encouraged by commitment to an Intent-To-Treat philosophy which advocates ignoring what drugs a patient actually took and instead acting as if every patient always took the drugs the researchers at the beginning of the experiment hoped he would take.

San Miguel (2008) Violated Requirement #4:  Permit No Confoundings

Where our outline of a proper experiment requires administration of Bortezomib to one group but not the other, the meaning intended is that the only differential treatment of groups must be the Bortezomib difference, because then the cause of any later health difference would have had to be the Bortezomib.  If the researchers introduce other differences between groups, extraneous differences known as confoundings or confounding variables, then those confoundings, and not the Bortezomib, could be responsible for any health differences that followed.

Furthermore, it is to be expected that once even a single background-therapy confounding is introduced (like Valacyclovir administered to Bortezomib Patients because they are expected to suffer shingles eruptions), other background-therapy confoundings will follow.  What these are, we can't say, but we can see in the yellow column in the Tree Diagram the sorts of things that can happen.  For example, noticing in the yellow column the adverse reactions of jaundice, seizures, and bloody urine, we recognize that these may erupt and be treated with new background-therapy confounding medications (perhaps Plasbumin, Diamox, and Plaquenil), and that these same adverse reactions and their corresponding medications might not appear in the Control Group because it had not received Valacyclovir.  Allowing confounding by Valacyclovir is already bad enough, but it can be expected to trigger its own unique adverse reactions demanding to be treated, leading to further confounding medications, and so on.

Valacyclovir is relied on here only as an example of the large number of ways that a confounding avalanche can be triggered.  Bortezomib can be counted on to trigger adverse reactions unique to the Bortezomib Group, and these are not considered to be confoundings because they are the health effects which are expected to differ between groups, and which are counted not as undesirable extraneous causes, but as interesting health consequences.  It is the background-therapy treatment of these health differences, particularly their treatment by means of medications, that constitutes confounding.  If health differences are observed over the course of the study, then the researchers won't know whether to blame, or credit, Bortezomib or Valacyclovir or Plasbumin or Diamox or Plaquenil, or some particular two of them acting together, or some particular three of them acting together, or some particular four of them acting together, or all five of them acting together.  We have already witnessed how quickly alternative ramifications lead to astronomical complexity, and can anticipate that the same could be shown erupting in the instant case of a simple and straightforward experiment being invaded by a confounding Valacyclovir whose introduction seemed to not matter because by itself it was expected to have only a single and beneficial effect on herpes zoster, and expected to have no effect on survival, and expected to not trigger a cascade of adverse reactions which might in turn require background therapy.
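The count of candidate explanations in the five-drug example can be made concrete.  The following minimal sketch merely enumerates every non-empty combination of the five drugs named above; it illustrates the counting only, and the four background-therapy drugs remain the guesses made in the text, not medications disclosed by San Miguel (2008).

```python
from itertools import combinations

# The five drugs named in the example above. The last four are guesses
# made in the text, not medications disclosed by San Miguel (2008).
DRUGS = ["Bortezomib", "Valacyclovir", "Plasbumin", "Diamox", "Plaquenil"]


def candidate_explanations(drugs):
    """Return every non-empty combination of drugs, smallest first."""
    found = []
    for size in range(1, len(drugs) + 1):
        found.extend(combinations(drugs, size))
    return found


explanations = candidate_explanations(DRUGS)
for size in range(1, len(DRUGS) + 1):
    count = sum(1 for combo in explanations if len(combo) == size)
    print(f"{size} drug(s) acting together: {count} possibilities")
print(f"total candidate explanations: {len(explanations)}")
```

With five drugs there are already 31 candidate explanations (5 + 10 + 10 + 5 + 1), and each additional confounding medication roughly doubles the total.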

On the question of exactly how many confounding medications were administered, or to how many patients, or in what doses, San Miguel (2008) cannot be said to be informative.  In fact, no background-therapy medication is ever named.  The Valacyclovir discussed here is only a guess prompted by the San Miguel (2008) acknowledgment of using an "antiviral prophylaxis".  It would not matter if it had been some other antiviral prophylaxis that had been used, as that other antiviral prophylaxis would come with its own list of adverse reactions, and its substitution in our arguments would not change our conclusions.

San Miguel (2008) Violated Requirement #2:  Preserve The Pre-Treatment Equality Of Subject Characteristics Throughout The Course Of The Experiment By Not Losing Subjects

We need to look a little more closely at randomization guaranteeing pre-treatment equality of groups.

Randomly-constituted groups are said to be equal in every conceivable respect.  Of course, what is really meant by this abbreviation is that randomly-constituted groups become more and more equal in every conceivable respect as their N-per-group increases.
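How equality improves with N-per-group can be sketched in a small simulation.  The code below assigns simulated patients to two groups by a seeded pseudo-random shuffle and reports the average gap between the two groups' mean ages.  The function names, the age distribution, and the seed are illustrative assumptions, not anything taken from San Miguel (2008).

```python
import random
from statistics import mean


def randomly_assign(patients, rng):
    """Split patients into two equal-sized groups by a pseudo-random shuffle."""
    shuffled = list(patients)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:2 * half]


def average_age_gap(n_per_group, trials, rng):
    """Average absolute difference in mean age between two random groups."""
    gaps = []
    for _ in range(trials):
        # Hypothetical ages: normally distributed, mean 65, SD 10 years.
        ages = [rng.gauss(65, 10) for _ in range(2 * n_per_group)]
        group_a, group_b = randomly_assign(ages, rng)
        gaps.append(abs(mean(group_a) - mean(group_b)))
    return mean(gaps)


rng = random.Random(2008)  # fixed seed so the allocation can be reproduced
for n in (5, 20, 100, 500):
    gap = average_age_gap(n, 200, rng)
    print(f"N-per-group={n:>3}: average age gap = {gap:.2f} years")
```

Under these assumptions the gap shrinks steadily as N-per-group grows, and the same would hold for any other characteristic substituted for age.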

"Randomly-constituted" does not mean carelessly or haphazardly or irregularly or unsystematically or arbitrarily or casually or impulsively constituted.  It means — practically speaking — being constituted under the guidance of a string of digits, conventionally referred to as "pseudo-random", and produced most typically by a computer program.  I will employ a reliance on computer-generated random numbers as the sine qua non of random assignment — if computer-generated numbers are relied upon, it will be assumed that they are relied upon properly and correctly, and so that the groups thus produced qualify as randomly-constituted, and so that the assumption of pre-treatment equality can be trusted.  And if groups are constituted without reliance on computer-generated numbers, the groups will be said to be not randomly-constituted but naturally-constituted, meaning that "nature" somehow created them employing her own unfathomable procedures.  Naturally-constituted groups cannot be assumed to be equal, but rather can be expected to be unequal in a large number of ways, several of the inequalities often seeming large and important, and many other inequalities seeming small and less important.

Small And Large Numbers Of Patients Per Group

It is furthermore the case that a large health effect can be demonstrated with an N-per-group even as small as, say, 5.  A large health effect might be control-group mice surviving on average one week, and experimental-group mice surviving on average six weeks.

An N-per-group of 15 or 20 is enough to demonstrate moderate health effects — say control-group mice surviving on average one week, and experimental-group mice surviving on average two or three weeks.

An N-per-group in the hundreds could be justified in only one way: that the effect being demonstrated is teensy, say control-group mice surviving on average seven days, and experimental-group mice surviving on average nine days.

Same goes for experiments on people.

As conducting experiments with many subjects is slower and costlier than conducting experiments with few subjects, and as the demonstration of small effects makes less of a contribution to medical science than the demonstration of large effects, then the report of an experiment involving hundreds of subjects can be expected to elicit the question Why?
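The relation between effect size and N-per-group can be sketched with the textbook normal-approximation formula for comparing two means, n = 2 * ((z_alpha + z_power) * SD / effect)^2.  The standard deviation of 10 days, the 5% two-sided significance level, and the 80% power used below are assumptions chosen purely for illustration; survival times are not really normally distributed, and the approximation is crude at very small N.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_days, sd_days=10.0, alpha=0.05, power=0.80):
    """Approximate N-per-group needed to detect a difference in mean survival.

    Normal-approximation formula for a two-sided, two-sample comparison of
    means with equal group sizes and a common standard deviation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd_days / effect_days) ** 2)


# Effects taken from the mouse examples above, expressed in days.
# "2.5 weeks" is the midpoint of the "two or three weeks" in the text.
examples = {
    "large (1 week vs 6 weeks)": 35,
    "moderate (1 week vs 2.5 weeks)": 10.5,
    "teensy (7 days vs 9 days)": 2,
}
for label, effect in examples.items():
    print(f"{label}: about {n_per_group(effect)} per group")
```

Under these assumptions the large effect needs only a handful of subjects per group, the moderate effect about 15, and the teensy effect several hundred, which is the pattern described above.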

Loss Of Subjects In One Group

Let us imagine that 200 patients were divided randomly into 100 in GroupA and 100 in GroupB, and that GroupA immediately lost 20 subjects, resulting in three groups: GroupAkept with N=80, GroupAlost with N=20, and GroupB with N=100.

In such a situation, the researcher's inclination will likely be to carry on the experiment with the two groups to which he continues to have access, GroupAkept having N=80 and GroupB with N=100.

If the 20 GroupAlost subjects had been removed from the original GroupA randomly, which is to say by relying on computer-generated numbers, then all three resulting groups would continue enjoying the status of randomly-constituted groups, and the average characteristics of the patients in the three groups could be assumed to continue being equal, and then the experiment would not be ruined because GroupAkept N=80 would enjoy pre-treatment equality with GroupB N=100, the inequality of their N-per-group not mattering.

However, a truly random loss of patients is a very artificial event that can be imagined, but which would happen in real life under only the most peculiar circumstances, which for practical purposes might be summed up as never.  Patients are often lost during clinical trials, as we have seen in San Miguel (2008), but the loss is not dictated by computer-generated numbers, and so the groups created by the loss must be considered to be naturally-constituted, and so a retention of group equality cannot be assumed.  The three groups now must be regarded as unequal.  Most importantly, GroupAkept does not equal GroupAlost and therefore GroupAkept does not equal GroupB.  The experiment is ruined because pre-treatment inequality has been manufactured.  The experiment is ruined because GroupAkept is a naturally-constituted group, which brings with it the obligation to expect that it differs from GroupB in several large and seemingly-important ways and also in many other small and seemingly-less-important ways.  Research relying on GroupAkept and GroupB is failed research because whatever health differences appear could have been caused not by the drug administered to only one group, but by the many other differences that can be trusted to exist between all naturally-constituted groups.

The critic of such failed research is not obligated to discover what these loss-created differences between GroupAkept and GroupB actually were, or to demonstrate that these differences could plausibly affect health.  For one thing, he cannot discover loss-created differences because he does not have access to the data, and for another, his ability to imagine plausible differences having plausible health effects amounts to him passing a creativity test, and it is not part of scientific method to approve defective research whenever a critic fails a creativity test.  To discredit the research, it is sufficient to point to the researcher's failure to guarantee pre-treatment equality of groups.

Nevertheless, even though it is inessential to effective criticism, it does not hurt to illustrate how subject loss may have ruined the experiment.  For example, it may be that the 20 patients who were lost quit because their first dose of the drug was so toxic that they were convinced that continuing to take it would kill them, and so that the 20 patients lost tended to be patients who were particularly sensitive to the drug, and so more likely to be killed by it, and so their loss removed 20 or so early deaths from the experimental-drug group, and so which made the experimental-drug group look healthier and longer-living than it would have looked had the twenty stayed in.
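The scenario just described can be simulated.  In the sketch below every quantity (the sensitivity scale, the survival model, the seed) is a hypothetical assumption chosen only to show the direction of the bias, and a truly random loss of 20 patients is included for contrast.

```python
import random
from statistics import mean


def simulate_group(n, rng):
    """Return (sensitivity, survival_months) pairs for n hypothetical patients.

    Assumed model: more drug-sensitive patients die sooner on the drug.
    """
    patients = []
    for _ in range(n):
        sensitivity = rng.random()  # 0 = robust, 1 = very sensitive
        survival = 30 - 20 * sensitivity + rng.gauss(0, 3)
        patients.append((sensitivity, max(survival, 0.0)))
    return patients


def mean_survival(patients):
    """Mean survival in months over a list of (sensitivity, survival) pairs."""
    return mean(survival for _, survival in patients)


rng = random.Random(2008)
group_a = simulate_group(100, rng)

# Non-random loss: the 20 most sensitive patients quit after the first dose.
by_sensitivity = sorted(group_a, key=lambda p: p[0])
kept_selective = by_sensitivity[:80]

# Random loss, for contrast: 20 patients removed by computer-generated numbers.
kept_random = rng.sample(group_a, 80)

print(f"all 100 patients:        {mean_survival(group_a):.1f} months")
print(f"80 kept, selective loss: {mean_survival(kept_selective):.1f} months")
print(f"80 kept, random loss:    {mean_survival(kept_random):.1f} months")
```

Under this assumed model the selectively-kept 80 look longer-lived than the full group of 100, although nothing about the drug has changed; the randomly-kept 80 should stay close to the full group.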

And if it is possible for me to almost-instantaneously imagine one way that subject loss may have affected the health results, it must be possible for a more creative person willing to give it fifteen minutes' attention to imagine twenty ways, and it must be possible for a team of five creative people working for a couple of hours to imagine a hundred ways, and we may be grateful that such demonstrations of critic creativity and tenacity are not required.  It is sufficient for the critic to no more than point out that it is impossible to prove drug effectiveness in an experiment relying on naturally-constituted groups.

Notice that in the above example, the subjects lost are ones carrying something like a sensitivity to the experimental drug, or in other words an inherent characteristic present in the subjects on the day they were being randomly assigned to Experimental and Control Groups.  If on that pre-treatment day a test of drug sensitivity had been given, and the 20 most sensitive subjects had been expelled from the experimental group right then, then that would most clearly have been a destruction of the pre-treatment equality of groups.  Or, if exactly that same culling had taken place immediately after the first drug dose, or immediately after the tenth, the destructive effect would have been identical, although now the culling would not have been pre-treatment.  Therefore, losing equality of subject characteristics is destructive no matter when it occurs, and starting off with pre-treatment equality but losing it can be just as bad as never having had it in the first place.

Equal Loss Of Subjects In Both Groups

Might an experiment seemingly invalidated by subject loss be redeemed by pointing out that subject loss took place in both groups — say particularly in a case where the number of subjects lost was exactly the same in both groups?

All right, let's start all over again with randomly-constituted GroupA and GroupB, each having N-per-group = 100.  Let us imagine also that in the middle of the experiment, 20 patients in each group suddenly walk out of the experiment, thus creating four new groups: GroupAkept N=80 and GroupAlost N=20, also GroupBkept N=80 and GroupBlost N=20.  Now it should be easy to remember that GroupAkept does not equal GroupAlost, and to conclude also that for exactly the same reason GroupBkept does not equal GroupBlost.  But it may now be tempting to fall into the error of imagining that the 20 patients in each of the two lost groups were equal, and so that the 80 patients in the two kept groups must also be equal, and so that the experiment is not ruined, because the personal-characteristics equality of GroupAkept and GroupBkept, the two kept groups that are continuing on in the experiment, has been maintained, with the reduction to 80 patients per kept group not mattering.

By way of justification of the opposite view (that personal-characteristics equality of subjects in GroupAkept and GroupBkept has been lost), it should be sufficient to invoke the rule-of-thumb that we have adopted — the rule-of-thumb that to get randomly-constituted groups, and therefore groups whose equality can be trusted, requires reliance on a computer printout of random numbers.  No computer printout being credited with dictating subject loss in our example means no random assignment can be assumed, and therefore means no equality of groups can be trusted.

But there remains the creative-imagination test, inessential though it should be, which requests that at least one scenario be imagined in which the loss of an equal number from each group might leave behind unequal groups.  All right, how about this?  The twenty who walked out of the Experimental Group walked out after their first infusion of chemotherapy revealed to them what agony the drug causes; and the twenty who walked out of the Control Group walked out because they discovered that they were in the group in which not even an alleviation of symptoms was expected, let alone a cure.  Entirely different reasons for walking out indicate entirely different personal characteristics of those who walked, which makes GroupAlost unequal to GroupBlost, from which it follows that GroupAkept is unequal to GroupBkept, and so that the latter two cannot be used in an experiment.
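This walk-out scenario can also be sketched as a simulation.  The two traits, their uniform distributions, and the rule that the highest scorers are the ones who leave are all hypothetical assumptions made for illustration; in particular, the idea that Control-Group leavers are those with the heaviest symptoms is only one way of modelling the reason given above.

```python
import random
from statistics import mean


def make_group(n, rng):
    """n hypothetical patients, each with two independent traits in [0, 1)."""
    return [{"sensitivity": rng.random(), "symptoms": rng.random()} for _ in range(n)]


def drop_highest(group, trait, n_lost):
    """Non-random loss: the n_lost patients scoring highest on trait walk out."""
    ranked = sorted(group, key=lambda p: p[trait])
    return ranked[:len(group) - n_lost]


def kept_groups(n_per_group, n_lost, rng):
    """Return (GroupAkept, GroupBkept) after equal-sized but unlike losses."""
    group_a = make_group(n_per_group, rng)  # Experimental Group
    group_b = make_group(n_per_group, rng)  # Control Group
    # Assumed reasons: A's leavers are the most drug-sensitive,
    # B's leavers are those with the heaviest symptoms.
    a_kept = drop_highest(group_a, "sensitivity", n_lost)
    b_kept = drop_highest(group_b, "symptoms", n_lost)
    return a_kept, b_kept


a_kept, b_kept = kept_groups(100, 20, random.Random(2008))
for trait in ("sensitivity", "symptoms"):
    a_mean = mean(p[trait] for p in a_kept)
    b_mean = mean(p[trait] for p in b_kept)
    print(f"mean {trait}: GroupAkept={a_mean:.2f}  GroupBkept={b_mean:.2f}")
```

Both kept groups have N=80, yet under these assumptions GroupAkept tends to be less drug-sensitive and GroupBkept tends to carry lighter symptoms, so an equal head-count does not restore equality of characteristics.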

And so the San Miguel (2008) meticulous attention to establishing pre-treatment equality of patient characteristics within the two groups turns out to have been for nothing, because a day or so later the floodgates of non-random subject loss were opened, and that equality was washed away.

San Miguel (2008) Flunks The Test Of Research Integrity

Of the Five Requirements that had to be satisfied to achieve scientific validity, San Miguel (2008) blatantly violated four.  Violating just one would have proven fatal to the validity of the study's conclusions, but violating four actually imbues San Miguel (2008) with an alternative utility — as exercise material in scientific-method courses teaching students to distinguish scientific research from Madison-Avenue-inspired infomercial creation.

ELABORATING THE RED BOX WARNING

To the earlier unofficial RED BOX WARNING AGAINST KAPLAN-MEIER might now be added the following:

This unofficial RED BOX WARNING is not yet an FDA BLACK BOX WARNING, but should be:

Cancer patients are also advised that what commonly passes for scientific research can be divided into two categories — genuine research complying with the requirements of scientific method and whose conclusions are valid, and drug-promoting research which disregards requirements of scientific method and whose conclusions are unfounded.  Signs that research may be promotional are:

  1. the publication having as many as 20 authors,
  2. the research being open-label, when blind would have been better and double-blind best,
  3. a very large enrollment of subjects,
  4. large subject loss (many fled/shed),
  5. researcher creation of shed subjects by cutting off research,
  6. reliance on the distorting power of Kaplan-Meier graphs,
  7. Kaplan-Meier survival curves not being allowed to hit zero,
  8. deviation from protocol dosages,
  9. adoption of Intent-To-Treat (ITT) categorization of patients,
  10. background-therapy treatment of adverse reactions while the clinical trial is in progress, especially if the drugs used in such adverse-reaction-alleviating therapy are not disclosed,
  11. the research being funded by pharmaceutical companies,
  12. several authors admitting receiving payment from, or being employed by, or having an equity interest in pharmaceutical companies who have a financial interest in the clinical trial.

Whereas publication in peer-review journals, or approval by regulatory agencies, or recommendation by health authorities should be signs of valid research, at the moment none of these is a dependable sign.


Life-and-death decisions are being made hourly the world over on the basis of the above Johnson&Johnson research trilogy, and other studies which use similar methodology and which are thus open to similar criticism.  Therefore, it is incumbent on those who bear responsibility for such research to either

  • publish rebuttals so that the criticism can be qualified or withdrawn, and so that patient anxiety concerning the integrity of oncological care can be allayed,
or to
  • broadcast acknowledgements that the research at issue is in fact defective and its conclusions undependable, and to dedicate efforts to upgrading future research.   ▢
