Aynsley Kellow: COVID-19 and the Problem with Official Science

 

All models are wrong but some are useful, which cannot be said of the casualties projected for a disease that has proven about as lethal as a seasonal flu. Recall that the Hong Kong flu of 1968-69 killed an estimated one million people worldwide but did not stop the Woodstock festival. Yet driven by fear and folly we have trashed an entire economy.

No lesson seems to be so deeply inculcated by the experience of life as that you never should trust experts. If you believe the doctors, nothing is wholesome: if you believe the theologians, nothing is innocent: if you believe the soldiers, nothing is safe. They all require to have their strong wine diluted by a very large admixture of insipid common sense.  —Lord Salisbury, 1877

The desideratum in modern governance is evidence-based policy. It is not always achieved or, as we have seen with the coronavirus panic, even possible. It is far preferable, however, to policy-based evidence, exemplified by the rushed production of new evidence that had not passed the US National Oceanic and Atmospheric Administration's quality assurance processes, in a desire to influence Donald Trump's deliberations over whether to withdraw the US from the Paris Agreement (as I describe in my forthcoming essay in Climate Change: The Facts 2020).

 

In climate science the problem appears intractable. Scholars such as Roger Pielke Jr, in his book The Honest Broker, have pointed out that there is no linear relationship between science and public policy: scientific findings rarely lead to a single policy conclusion. Despite this, the epistemic community in climate science seems to think that ever more scary model-driven scenarios will lead policy-makers to do “the right thing”.

Pielke has recently exposed just how disgraceful this conduct has become, in a series of articles in Forbes and now an article submitted to an academic journal. He gives chapter and verse on the Intergovernmental Panel on Climate Change adopting an extreme emissions scenario, Representative Concentration Pathway 8.5 (RCP8.5), to produce an alarmist special report last year in the lead-up to the annual Conference of the Parties to the Framework Convention on Climate Change. This scenario, in which more coal is burned than is known to exist, was developed as an extreme case, yet it has come to be adopted as “business as usual” and underlies claims by environment groups that we are facing a “climate emergency”.

Pielke also chronicled the role of billionaire renewables investors Tom Steyer and Michael Bloomberg in having RCP8.5 adopted by the US, and then by the IPCC, as business as usual. Both were unsuccessful Democratic Party candidates for the presidency of the United States, and while Bloomberg featured in Michael Moore’s cinematic own goal, Planet of the Humans, Steyer did not. Steyer made a fortune in coal, including Whitehaven in Australia, before undergoing a Damascene conversion while hiking in the Adirondacks with Bill McKibben, founder of the group 350.org, which agitates for divestment from fossil fuels, a call many (including universities) have answered, despite ample evidence that divestment does not work: it simply cheapens shares for less-moralistic investors. McKibben does feature in Planet of the Humans.

As this example shows, good science upon which to base public policy is sometimes hard to come by, which raises the question of how we identify “good science”. If we draw on the philosophy of science, we find science advancing by the Kuhnian predominance and displacement of paradigms. But this activity is clearly political and, as Paul Feyerabend pointed out, scientists engage in all manner of undesirable behaviour in order to prevail, enhance their reputations and secure tenure and research grants. Such conduct extends to noble-cause corruption, where scientists think it might advance some good cause, in the mistaken belief that the science-policy connection is linear.

Karl Popper put his faith in a kind of scientific liberalism, pointing out that empirical propositions, unlike those of mathematics, cannot be proven and should only ever be accepted as true provisionally, and it should be anticipated that they might be proven wrong. Disagreement is therefore an essential part of the scientific method. Scientific propositions have to be falsifiable, and attempts at falsification should be encouraged. We should prefer those propositions that have survived repeated attempts at falsification.

The US Supreme Court, in Daubert v Merrell Dow, developed some rules for judging science in expert testimony that owe much to Popper, with a hint of Kuhn. The five Daubert factors are:

1/ the science must use methods and procedures that may be tested

2/ it must take into account the known and potential rate of error in these processes

3/ it must have been subjected to review by the expert’s peers in his or her field

4/ there must be standards that control the operation of techniques, and

5/ the methods must be accepted for use in the relevant scientific community.

The emphasis is very much on science as method.

 

Official science

Feyerabend went further than Popper, and advocated a kind of “scientific anarchy”. As far as he was concerned, “official science” was anathema, since it inherently quashed dissent and produced a kind of scientific monoculture. He was not alone. In his famous farewell address, Dwight Eisenhower warned against a powerful scientific elite along with the military-industrial-political complex that dominated defence policy. Feyerabend would certainly not approve of intergovernmental official science, such as that delivered by the IPCC.

Official science, usually provided by Chief Scientists, Chief Medical Officers and the like, has been to the fore in the COVID-19 pandemic, with politicians relying almost totally on their advice. This provides political leaders with the authority that comes with expertise, but it filters out other perspectives. It is almost as if politicians have never heard the advice against putting all one’s eggs in one basket.

The justification for channelling scientific advice through a single adviser is that it avoids confusing policy-makers, who want “clear and simple statements about the need for action, and that the admission of the inherent uncertainty blunted the capacity to win support for urgent implementation of mitigation policies”. Thus did the late Ian Castles describe, in 2009, the views of then Academy of Science President Jim Peacock, reporting on the relative lack of policy influence of an Academy of Social Sciences policy paper, “Uncertainty and Climate Change: The Challenge for Policy”, to which I contributed a paper (along with meteorologist John Zillman and economist Warwick McKibbin).

The paper argued (quite correctly) that “An in-depth understanding of the nature and significance of … uncertainties is essential for the formulation of properly informed national and international action on the greenhouse issue.” This was correct if for no other reason than that a cap-and-trade policy is not appropriate when there is uncertainty about what size the cap should be. For this reason, the economists who developed emissions trading for sulphur dioxide rejected it for carbon dioxide. An appreciation of uncertainty, in other words, was central to developing an appropriate policy response, even if Jim Peacock was correct in his assertion that governments would be more likely to act on clear, compelling, simplified advice.

That is fine—as long as the advice is accurate. The problem is, that requires a kind of scientific infallibility that is profoundly anti-scientific, because good science is that which looks best after vigorous contestation.

 

Official science and the COVID-19 crisis

The reactions of various governments to the emergence of the SARS-CoV-2 virus (the virus that causes COVID-19, but whose name was too closely identified with SARS, and therefore China, for the World Health Organisation’s liking) have highlighted the dangers of relying on limited sources of information and advice. “Official science” has the effect of narrowing down possible sources of information.

The modelling from the Doherty Institute in Melbourne that has informed the Australian government, for example, cites basic data from a WHO Situation Report dated April 4, 2020. Interestingly, this date falls after the government introduced lockdown measures. The report also cites literature dated March 27. It is possible, of course, that an earlier draft of the report was the source of guidance to the government, but only on April 16 did modelling using actual Australian data commence.

The Doherty report explicitly used several assumptions that had been used in the Imperial College London report by Ferguson et al that had spooked the UK government into a strict lockdown—though the UK did not close the borders or require testing and quarantine. Doherty also effectively benchmarked against another Imperial College report, stating, “Our findings are consistent with a recently published model that relates the clinical burden of COVID-19 cases to global health sector capacity, characterised at high level.”

This underscores the extent of the reliance by several governments on the Imperial College modelling. Ferguson and his colleagues used the report of the WHO-China Joint Mission on Coronavirus Disease 2019 (published on February 28), and they based their Infection Fatality Rate (IFR) on a paper published on March 13: Verity R., Okell L.C., Dorigatti I. et al, “Estimates of the severity of COVID-19 disease”.

The modelling by Ferguson et al was influential in the UK and the US. Its projections were 2.2 million deaths in the US, and it helped lead to the policies adopted there. Clearly, it was also influential in Australia. It has been the subject of extensive criticisms by other scientists, including a team at Oxford.

As many have pointed out, the results of the Imperial College modelling were not subjected to peer review, nor was the code used in the model made available, so they would not have met the Daubert standard had the coronavirus been the subject of litigation rather than public policy-making. The use of the IFR from the Verity, Okell, Dorigatti et al paper suggests that the modelling rested on a broader evidence base than it actually did, because that paper again turns out to be from the Imperial College team. All but one of the thirty-three co-authors were from Imperial College, the single outlier being from Queen Mary College, London. Last on the author list is Neil Ferguson, the head of the modelling team, and he was listed as one of two corresponding authors.

The Imperial College modelling paper was unrefereed, and so too was the Verity, Okell, Dorigatti et al paper. The latter was posted at medrxiv.org, and that website made clear what this meant: “This article is a preprint and has not been peer-reviewed … It reports new medical research that has yet to be evaluated and so should not be used to guide clinical practice” (original emphasis). While not fit to guide clinical practice, it somehow ended up guiding public policy.

The problem with estimating an IFR or Case Fatality Rate (CFR) is that accurate data upon which to base the numerator and denominator have been elusive. The high rate of asymptomatic cases has been a problem for the denominator, though one that diminished once testing was extended beyond those who were symptomatic. The numerator has problems of its own: in jurisdictions like the UK and US there has been pressure for all deaths with the coronavirus, and even some merely suspected of being so, to be counted as deaths from the coronavirus.
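To make the arithmetic concrete, here is a minimal sketch in Python. The figures are purely hypothetical, invented for illustration and not taken from any of the studies discussed; the point is simply how an undercounted denominator inflates an apparent fatality rate.

```python
# Hypothetical illustration only: the figures are invented, not real data.
deaths = 30                 # numerator: deaths attributed to the disease
confirmed_cases = 2_000     # detected (largely symptomatic) infections
true_infections = 10_000    # detected plus undetected (asymptomatic) infections

apparent_cfr = deaths / confirmed_cases   # 1.5 per cent
true_ifr = deaths / true_infections       # 0.3 per cent

print(f"Apparent CFR: {apparent_cfr:.2%}")
print(f"True IFR:     {true_ifr:.2%}")
# Missing four in five infections makes the apparent rate five times the
# true rate; an inflated numerator (deaths "with" counted as deaths "from")
# pushes it higher still.
```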

The Imperial College team looked at the Diamond Princess data as well as WHO and other Wuhan data. Nevertheless, they estimated the crude CFR from China at 3.67 per cent, and their “best estimate” of the overall CFR was 1.4 per cent. This seemed modest compared with the Wuhan rate, but Stanford University epidemiologist John Ioannidis was able on March 17 to use the Diamond Princess data (the best early data, based on a closed population with testing and reliable mortality figures) to derive a mid-range CFR of 0.3 per cent—less than a quarter of the figure used by Verity et al and the Imperial College team.

There are, at the time of writing, fifty-one studies based upon polymerase chain reaction or serological testing that give a mean IFR of 0.27 per cent—less than a quarter of the rate assumed in the Imperial College modelling. Ioannidis also published online on March 19, in the European Journal of Clinical Investigation, a peer-reviewed opinion piece, “Coronavirus disease 2019: The harms of exaggerated information and non‐evidence‐based measures”, but this and other warnings appear to have been ignored. Jonathan Fuller, writing in Boston Review, discusses the contrast between those epidemiologists who prefer models (Ferguson) and those in the “evidence-based medicine” movement who maintain high standards of clinical evidence and rely on randomised trials (Ioannidis). With COVID-19, the models have clearly prevailed over “evidence-based medicine”.

The modelling commissioned in Australia from the Doherty Institute employed several assumptions used in the Ferguson modelling and reported that their findings were “consistent with a recently published model” that was, indeed, the Imperial College model. The Doherty report cited a WHO Situation Report dated April 4, but it ignored the Ioannidis paper in the European Journal of Clinical Investigation and the essay Ioannidis published on March 17 on the Statnews website, “A fiasco in the making? As the coronavirus pandemic takes hold, we are making decisions without reliable data”.

In that essay, Ioannidis first presented his analysis of the Diamond Princess data, the only reliable (non-Chinese) data based on extensive testing of a population that had been confined in close contact, while warning of the dangers of making rushed policy on the basis of poor-quality data. Doherty modellers Professor Jodie McVernon and Professor James McCaw, in an article in Cosmos on April 1, stated: “This virus, if allowed to spread uncontrolled, is much more infectious and severe than any recorded influenza pandemic.” They were wrong, at least on severity, and Ioannidis was correct.

In other words, an analysis based on higher-quality data was not preferred to that of Neil Ferguson and his team at Imperial College, which was self-referential, unrefereed, and did not disclose its methods (the model code). The code was disclosed later, and was roundly criticised by an anonymous software engineer, “Sue Denim” (“pseudonym”), in a review published on the Lockdown Sceptics website, for its use of Fortran (first developed in 1953 and long considered outdated) and for elements that meant even running the same data through it yielded different results (a point illustrated in the sketch following the quotation below). The model itself thus violated a key Daubert standard! Moreover, the code disclosed was a later, patched-up version, rather than the original code used in the influential report, presumably because that was in even worse shape. “Sue Denim’s” conclusion was scathing:

All papers based on this code should be retracted immediately. Imperial’s modelling efforts should be reset with a new team that isn’t under Professor Ferguson, and which has a commitment to replicable results with published code from day one.
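On the reproducibility complaint, the general point is easy to illustrate. The sketch below is a hypothetical toy in Python, not the Imperial College model: it simply shows that a stochastic simulation is only replicable, and therefore checkable by others, if its random-number generator is seeded and its execution is otherwise deterministic.

```python
# Hypothetical toy simulation, not the Imperial College model: it illustrates
# why unseeded stochastic code cannot be replicated run-to-run.
import random

def toy_epidemic(initial_infected, days, seed=None):
    """Crude branching process: each infected person infects 0-2 others per day."""
    rng = random.Random(seed)   # seed=None draws on system entropy each run
    infected = initial_infected
    for _ in range(days):
        infected += sum(rng.randint(0, 2) for _ in range(infected))
    return infected

# Unseeded: identical inputs, different outputs on every run.
print(toy_epidemic(10, 5), toy_epidemic(10, 5))

# Seeded: identical inputs give identical outputs, so the result can be checked.
print(toy_epidemic(10, 5, seed=42), toy_epidemic(10, 5, seed=42))
```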

 

Track records and cultural influences

The question arises as to why anyone would prefer Ferguson’s model results to the evidence-based medicine approach of Ioannidis. The preference is even more astounding when one considers the track record of modelling by Professor Ferguson and the Imperial College team.

In 2001 the Imperial College team produced modelling on foot-and-mouth disease that advocated widespread culling of animals in neighbouring farms, even without evidence of infection. In excess of six million cattle, sheep and pigs were slaughtered, at a cost to the UK economy of £10 billion. A subsequent critical review published in a veterinary journal concluded:

The models that supported the contiguous culling policy were severely flawed, being based on data from dissimilar epidemics; used inaccurate background population data, and contained highly improbable biological assumptions about the temporal and quantitative parameters of infection and virus emission in infected herds and flocks.

In 2002, Ferguson predicted that between 50 and 50,000 people would likely die from exposure to BSE (mad cow disease) in beef, and up to 150,000 if it spread to sheep. In the UK, there have been only 177 deaths from vCJD, the human form of the disease.

In 2005, Ferguson told the Guardian that up to 200 million people could be killed from bird flu. Only 282 people died worldwide from the disease between 2003 and 2009.

In 2009, advice by Ferguson and his Imperial College team formed the basis for a government estimate that swine flu would lead to 65,000 UK deaths. Swine flu killed 457 people in the UK.

There is a pattern here, and it should have sounded alarm bells (but did not) about the team’s modelling for COVID-19. Why not? One answer lies in the fact that the issue carried with it a great deal of uncertainty, which made all the more important the cultural beliefs that those dealing with it brought to the task.

The Canadian ecologist C.S. “Buzz” Holling once identified three “myths of nature” that he observed in ecosystem managers and that were later adopted by scholars of risk like John Adams. The myth “Nature Benign” sees nature as predictable, bountiful, robust and resilient—able to absorb disturbances with little harm and suggesting a laissez-faire management approach as appropriate. (The myth can be depicted visually by a concave surface on which a marble rests.) Adams calls those who subscribe to this myth “Individualists”.

Those who adhere to the myth of “Nature Ephemeral” see nature as existing in a delicate balance, likely to be disrupted by human intervention and plunged into catastrophic collapse. (Imagine a marble on a convex surface.) Adams calls those who subscribe to this myth “Egalitarians”.

Those who adhere to the myth labelled “Nature Perverse/Tolerant” essentially combine the first two: nature is robust within limits, but large enough shocks can lead to catastrophe, suggesting an interventionist management style. Adams calls those who subscribe to this myth “Hierarchists”.

Such myths are important in filtering our appreciation of the world, and are especially important when uncertainty is great. We can begin to see the relationship between some political ideologies and stances on questions of environmental risk, but more importantly for our subject here, we can see why the Victorian Chief Health Officer, who has published on the “climate emergency” that is the product of a misguided representation of an extreme emissions scenario (RCP8.5), might lead his government down a more catastrophist path—especially when his deputy has tweeted her support of Greta Thunberg.

Cultural filters affect what assumptions and estimates are made when there is little hard evidence to go on, and the central analyses in this case are replete with assumptions and estimates. A content analysis reveals that the paper reporting the Imperial College modelling refers to assume/assumption (or related words) twenty-seven times and to estimate (or variants) thirteen times. The Verity et al paper on which it draws includes thirty-six references to assume/assumption and a staggering 120 references to estimate. For the analysis from the Doherty Institute, the numbers are forty-four references to assume/assumption and twenty-two to estimate.
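A content analysis of this kind is straightforward to reproduce. The sketch below is a minimal Python illustration (the file name is a placeholder, not a real path) that counts occurrences of the word stems “assum” and “estimat” in a document’s text.

```python
# Minimal word-stem count of the kind used in the content analysis above.
# The file name is a placeholder; point it at the text of the report of interest.
import re
from collections import Counter

STEMS = ("assum", "estimat")   # catches assume, assumption, estimated, estimates, etc.

def count_stems(path):
    text = open(path, encoding="utf-8").read().lower()
    words = re.findall(r"[a-z]+", text)
    return Counter(stem for word in words for stem in STEMS if word.startswith(stem))

if __name__ == "__main__":
    print(count_stems("imperial_college_report.txt"))  # hypothetical file name
```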

When there is little hard evidence, one’s cultural disposition to risk becomes all the more important in deciding what assumptions and estimates to adopt. The disagreements that lie at the heart of Popperian standards of science become all the more important as quality assurance checks.

A further problem lies with the nature of epidemiology as a discipline: many of its practitioners know little of human behaviour and rely heavily on mathematics, often having qualifications outside medicine. Neil Ferguson, the leader of the Imperial College team, for example, holds degrees in physics. Their models assume little or no voluntary action to mitigate risks, as if humanity will stand still, frozen in the headlights of the oncoming epidemic, and the model results are then used to justify compulsory action. Moreover, modellers are heavily incentivised to exaggerate their projections.

Michael Levitt, Professor of Structural Biology at Stanford and the winner of the Nobel Prize for Chemistry in 2013, has explained why their predictions tend to be so apocalyptic. If they underestimate the death toll likely to result, they face catastrophic reputational damage: if people die, they get the blame. On the other hand, if they overestimate the death toll, they face zero consequences. Indeed, Professor Ferguson has enhanced his reputation with his past exaggerated model results, receiving an OBE in the New Year’s Honours list in 2002 for his modelling on foot-and-mouth disease that resulted in the unnecessary slaughter of millions of animals—in advance of the critical reviews that drew attention to the errors of exaggeration that plagued that modelling. The leader of that effort, Professor Roy Anderson, received a knighthood in 2006.

I have elsewhere described the noble-cause corruption of science, but here the basis of corruption is more venal, involving rewards of kudos and career advancement. Even worse is possible: Ferguson no longer has any beneficial financial relationships with drug companies, though he once did, having ended them in order to play a more active role in official science. The financial relationship between modellers and those who stand to make a fortune from marketing the drugs and vaccines that will solve the virtual crises, enhanced by the modellers, deserves an essay in itself.

As an interesting aside, Sunetra Gupta, a professor of theoretical epidemiology who led an Oxford study critical of the Imperial College modelling, won a complaint against Professor Anderson in 1999 when he falsely accused her of having been appointed to a readership at Oxford because she was sleeping with a professor on the committee. After a unanimous vote of no confidence in him, Anderson left Oxford for Imperial College, with Ferguson. Together with Ferguson’s resignation from the Scientific Advisory Group for Emergencies (SAGE) early in May, after he broke lockdown rules to keep assignations with his married lover while preaching to the public how important it was to observe them, there is plenty of flesh here on the bones of the inevitable Hollywood movie.

On March 22, Ferguson admitted that Imperial College London’s model of COVID-19 was based on undocumented, thirteen-year-old computer code intended for a feared influenza pandemic, rather than a coronavirus. As a group at the University of Virginia led by Professor Stephen Eubank put it, “Despite the progress, one must ask: Why are we still using models developed fifteen to twenty years ago?”

 

Evaluating policy measures

The fact that the Imperial College model was developed for influenza is significant, because when we consider the evidence base for the policy measures taken by governments to address the COVID-19 crisis, many have questioned whether there is any evidence that any of them are efficacious. There is, however, such a body of evidence, though it relates to flu. But if a model of flu was a good enough basis upon which to close down economies, evidence about flu is arguably good enough, or as good as we are likely to get, as a base upon which to assess policy responses.

In October 2019 the WHO Global Influenza Program produced a report, Non-Pharmaceutical Public Health Measures for Mitigating the Risk and Impact of Epidemic and Pandemic Influenza, that systematically reviewed the evidence base for eighteen non-pharmaceutical interventions, covering: personal protective measures (such as hand hygiene, respiratory etiquette and face masks); environmental measures (surface and object cleaning, and other environmental measures); social distancing measures (contact tracing, isolation of sick individuals, quarantine of exposed individuals, school measures and closures, workplace measures and closures, and avoiding crowding); and travel-related measures (travel advice, entry and exit screening, internal travel restrictions and border closure).

The actions recommended in that report, for pandemic and epidemic, ranked by severity, are summarised in Table 1 (above), which is taken from the report. The report noted that each WHO member state and each local area would need to take into account the feasibility and acceptability of proposed interventions for their circumstances, as well as their anticipated effectiveness and impact.

What leaps out of the table is that “Quarantine of exposed individuals”, “Entry and exit screening”, “Border closure”, and “Contact tracing” are not recommended in any circumstances. This is bad news for the government’s attempt to sell their contact tracing app to a public that is at least somewhat sceptical.

The WHO has come under criticism for its soft treatment of China during this crisis, but one notable (and praiseworthy) feature of this WHO assessment of policy responses is that it considers factors that many governments have simply ignored: Quality of evidence; Values and preferences; Balance of benefits and harms; Resource implications; Ethical considerations; Acceptability; and Feasibility.

The negative assessment of contact tracing reflected the very low overall quality of the evidence and uncertainty about community values and preferences regarding contact tracing. Contact tracing on a large scale was seen as raising ethical issues, such as the leakage of information, and as an inefficient use of resources, including human resources, at high cost. There were privacy concerns, and the acceptability of contact tracing among the public was seen as uncertain. It was considered more feasible in higher-income countries, and likely to be hampered by the short incubation and infectious periods of influenza.

The longer incubation period with COVID-19 is one reason why contact tracing might be considered more favourably. The ethical, economic and political acceptability concerns have, however, been swept aside by the alarm created by Professor Ferguson’s modelling.

The report also recommended against border closures, which is somewhat surprising, since it appears that closure of, or severe restrictions on, border crossings have been the key to Australia’s success in restricting the spread of the virus. In contrast, Europe has suffered adverse effects from the lack of border controls, thanks especially to the Schengen Agreement, which essentially removed all borders within the European Union. The WHO report stated, “There is limited evidence for the effectiveness of border closures, and it has legal, ethical and economic implications.” It did, however, expect that strict border closures would be effective “within small island nations”.

Australia is geographically vast, and its low overall population density, despite high urbanisation, appears to have been beneficial. Being an island has undoubtedly helped, and if we net out the spread from cruise-liner passenger disembarkations and from nursing homes, there has been very little community spread.

The value of other measures remains questionable, especially lockdown measures. Most of the deaths globally have been in hospitals and nursing homes, on cruise liners, and even among those who have stayed at home. Recent data from New York found, much to the surprise of Governor Andrew Cuomo, that 66 per cent of hospital admissions for COVID-19 were people who had obeyed the order to lock down. There is no evidence that outdoor activity poses a great risk, and time in the sun can provide a little welcome vitamin D, which evidently has some prophylactic value.

 

What is to be done?

Policy-makers in Australia and elsewhere have been spooked by highly questionable modelling, which has been preferred to actual evidence produced by those in the “evidence-based medicine” school of epidemiology.

In doing so, they have simply ignored the consequences of their actions—not just the economic costs, but the costs in lives that will be lost as a result of their actions. Intensive care beds that were predicted to be overwhelmed have not been needed. The 2200 beds in Australia have not had more than 100 COVID-19 occupants. Even the alarmist premier of the moment, Daniel Andrews, abandoned plans to create more ICU beds. The temporary Nightingale Hospital, built in the ExCeL centre in London, has now been mothballed. Modelling in Australia suggests a human cost of 750 extra suicides as a result of the 10 per cent unemployment levels already reached, and double that if unemployment reaches 16 per cent.

The opportunity costs of medical procedures abandoned to make way for the tsunami of COVID-19 cases that exists only in Professor Ferguson’s model will be immense. Diagnostic procedures that would have identified cancers early have not been conducted, and one estimate put at one million the number of tuberculosis cases that would result from the abandonment of testing. Our leaders have fallen into the trap identified by Frédéric Bastiat in his Parable of the Broken Window in the middle of the nineteenth century: focusing on the seen while ignoring the unseen. Not only have they failed to see the problem in regarding model results as “evidence” while ignoring actual evidence (a common phenomenon in climate science), they have not looked at the consequences of their actions that are not immediately apparent. (Perhaps the 200 or so economists who wrote an open letter recommending that the economic costs should not be a concern should read some Bastiat while working at home, removed from their models, and recall that he developed the notion of opportunity cost in all but name.)

“Sue Denim” has suggested what should flow from this flawed and (both economically and socially) costly debacle:

all academic epidemiology [should] be defunded. This sort of work is best done by the insurance sector. Insurers employ modellers and data scientists, but also employ managers whose job is to decide whether a model is accurate enough for real world usage, and professional software engineers to ensure model software is properly tested, understandable and so on. Academic efforts don’t have these people, and the results speak for themselves.

We would probably exempt clinical epidemiologists from that, and it is a recommendation unlikely to be taken up. As the saying goes, all models are wrong, but some are useful. Reliance on modelling, and uncertainty amplified by the internet, have been important in fuelling the needless panic. All the evidence on the IFR of COVID-19 suggests it is about as lethal as seasonal flu, and we would do well to recall that the 1968-69 Hong Kong flu killed an estimated one million people worldwide and about 100,000 in the US, but did not even lead to the cancellation of the Woodstock festival.

As Jonathan Fuller also wrote (and as Popper would endorse), “Institutionalized scepticism is important in science and policymaking.” That is especially so with “virtual” science, where there is considerable uncertainty and worldviews play a role. “Official science” needs quality assurance, especially by the application of sceptical minds from other disciplines.

Unfortunately, in this case (to distort an aphorism often wrongly attributed to Churchill) a dodgy alarmist model went halfway around the world before Professor Ferguson had got his trousers on.

Aynsley Kellow is Professor Emeritus of Government at the University of Tasmania. He is the author of International Toxic Risk Management: Ideals, Interests and Implementation; Science and Public Policy: The Virtuous Corruption of Virtual Environmental Science; and Negotiating Climate Change.
