Was Michael Gove right? Have we had enough of experts?

Experts are finding it harder to be heard. But is that because of how they communicate? And how solid is their much-vaunted evidence base anyway?

Using evidence to assess the outcomes of policies is a vital part of good governance. Whether it is examining how a Budget will affect those on low incomes, or how well fishing quotas are managing stocks, no one but the most bumptious ideologue would deny it. The plastering of demonstrably dodgy statistics on the side of the Brexit battle bus last year stoked indignation on the part of many who think of themselves as rational and well-informed. The arrival of Donald Trump, an American president who feels no compunction about disseminating falsehood, has further darkened the mood among the liberal intelligentsia. There is a strong sense that the forces of reason must now rise up and see off the purveyors of the “post-truth” world.

We must, however, also grapple with one other contemporary reality. Underlying the great turmoil of politics at the moment is precisely the view that “the experts” are less trustworthy and objective than they purport to be. Rather, their considered opinions are seen as a self-reinforcing apparatus for putting themselves beyond challenge—to advance their holders’ status, their careers or, most damaging of all, their political views over those of the less-educated classes. The great popular suspicion is that an elite deploys its long years of schooling and “the evidence base” to make itself sound more knowledgeable as it rationalises the policies it was going to prefer all along.

Is that a fair charge? Well, that is an empirical question, and definitive evidence for answering it is in short supply. What we can usefully do, however, is interrogate where the “evidence base” comes from, and how solid it is.


“Agreeing to referee academic papers
yields neither monetary reward nor esteem,
but it subjects you to a range of human temptations”


Back in 2010 we wrote a piece arguing that an over-emphasis on empirical evidence in political rhetoric was alienating the public. The increasing reliance on the expert stamp of authority was eroding a sense of shared values between governors and the governed. Unless you were familiar with the latest nuance in academic evidence, we warned, you were automatically unqualified to have a valid opinion.

We thus see the current defenestration of experts as a reaction to long-term trends in public life. If it is true, as Michael Gove said during the European Union referendum debate, that people “have had enough of experts,” it is because empiricism locks non-experts out of discussions that impact on, but may not capture, their day-to-day experience. Last year, many members of the public formed an impression, whether fairly or not, of experts attempting to settle an important and emotive matter over their heads. A fault line between “the people” and those who think they know what’s good for them, which has been there for some time, became apparent. The June election was another reminder of this, as certain policies that many experts felt didn’t stack up—universal pensioner perks, free university education for students and costly nationalisations—turned out to be rather popular.

As paid-up members of the quantitative-expert class we share some of the current foreboding that a dystopian future awaits, where objective truth is not respected. There are many good examples of evidence influencing policy. But there are bad examples too, and if deference towards expert opinion goes too far, democracy ceases to operate as it should. Experts may see it as their role to uphold truth, facts and evidence, but they can only do so if they maintain public trust. That implies many things—better communication, for example. But before anything else it implies experts adopt a reflective approach to their own work, and open it up to outside scrutiny too.

There is a particular onus on social scientists here, because there is often more subjective judgment and interpretation in their fields than there is in measuring physical reality, leaving more scope for biases to entrench established views. Many social scientists are meticulous; but there are others who need to get their own house in order where “the evidence” is concerned. If it is going to be used to close down arguments it needs to be rock-solid, but how often is that the case?

Scarcely a day goes by without the press featuring some research, polished by a university PR team, purporting not only to establish that sausages cause cancer or that the people of Basingstoke are happier than the people of Burnley, but also that Something Must Be Done About It.

Academic papers from the social sciences and health are now an important foundation of what has come to be called the “evidence base.” Who could be against evidence? But this is a rather telling phrase. The “base,” when you stop and think about it, is logically superfluous; its function is purely rhetorical, suggesting that the evidence in question (unlike any other) rests on something that shores it up. But, as we shall see, there are often question marks around its solidity, especially in the social sciences.

The magic concept invoked to define it, and to separate the priesthood from the laity, has become that of “peer review.” Peer review is the process by which submissions to academic journals are scrutinised by the academic peers of the authors: the “referees.” Only papers deemed suitable by referees will be published.

The process of this scrutiny, of peer review, may conjure up images of scholars carefully examining the article line by line, checking every single piece of analysis and verifying its claims. Very occasionally, this Platonic ideal may exist. When, for example, Andrew Wiles claimed to have proved Fermat’s Last Theorem, his manuscript was subjected to the most thorough investigation imaginable by the world’s leading experts in the relevant areas of maths. An error was indeed discovered, one which Wiles was happily able to fix after months of wrestling with the problem. As a result of the peer review process in this case, we can be entirely confident that Wiles proved Fermat’s Last Theorem.

In almost all other cases, at least within the social sciences, the reality of peer review is rather different. We should think of a harassed academic, pressured by the need to do his or her own research and by the demands of both students and university administrators, and pestered by journal editors to submit the review.

Refereeing is both unpaid and anonymous. The referee receives neither monetary reward nor the esteem that comes with getting one’s name in print. The task is seen as a tedious chore, and procrastination is widespread. In the social sciences, there are frequently delays of a year, and more occasionally two, between submitting the manuscript and receiving the referee reports.

One might ask why academics agree to referee papers at all. In part, it is convention: it is simply part of the everyday life of being an academic. But once a year many journals will publish a list of the names of their referees. This incremental addition to your CV just might (perhaps, eventually) be part of the package that lands you a promotion or a job at a better university.

But serving as a referee under these conditions subjects you to a range of human temptations. Does the paper support or undermine one’s own work, for example, or does it appear to be written by a rival? Does it cite enough of the papers of the reviewer and his or her friends? The number of citations of your own work by other academics is, after all, an important metric by which you are judged. Here, at long last, is the chance to slap down, under the cloak of anonymity, the smartarse who slapped you down at that conference five years ago.

Then there is the question of who chooses the referee. Enter the editorial board, which is made up—once again—of academics typically paid little or nothing. Again, the human factor creeps in. Years ago, one of the present authors submitted a paper to a leading American economics journal, a critique of a published article that had gained a certain kudos. One of the authors of the criticised piece was an editor at that journal—and, as was discovered by chance a few years later, he gave it to his co-author to referee. Needless to say, the negative article wasn’t accepted.

Once a paper is published, the chances of it being subjected to further scrutiny are remote. A tiny handful of articles become famous, and are downloaded thousands of times. Many receive no attention, and most will be read by very few scholars. Yet the mere fact that a paper has gone through peer review confers on it an authority in debate, which the lay person cannot challenge. So, all too often, there is no post-publication challenge within the academy, and no licence for challenge from outside. Locked out by the experts, some laypeople may start to feel like they have had enough.

So how might we improve peer review, and build “the evidence” on a firmer foundation? Economics has rightly been subjected to many criticisms, especially since the financial crisis. But the discipline has one extremely powerful insight, perhaps the only general law in the whole of the social sciences: people react to incentives. They may not always do so with the complete rationality described in economics textbooks. But thinking through the rewards on offer in any given situation helps to understand why people behave as they do.

Ideally, the incentives around research should be structured so as to maximise constructive scrutiny of every claim that is made. Instead, the rising pressures on academics to publish have created a set of incentives that exacerbates the need to negotiate the peer review process and appear in academic journals. The rising demand to publish has been met by a large increase in the supply of academic journals. One recent estimate is that there were 35,000 peer-reviewed journals at the end of 2014, many of them of decidedly doubtful quality. Why? Because incentives are everywhere.

A paper in the 23rd March edition of Nature by a group of Polish academics mercilessly exposes the problem. The title neatly captures the content of the article: “Predatory journals recruit fake editor.” The authors begin in an uncompromising manner: “Thousands of academic journals do not aspire to quality. They exist primarily to extract fees from authors.” They go on: “These predatory journals follow lax or non-existent peer-review procedures… researchers eager to publish (lest they perish) may submit papers without verifying the journal’s reputability.”

They adopted the brilliant strategy of creating a profile for a fictitious academic, Anna O Szust, and applied on her behalf to be an editor of 360 journals. Oszust is the Polish word for “a fraud.” Her profile was “dismally inadequate for a role as editor,” yet 48 of the journals offered to make her one, often conditional on her recruiting paid submissions to the journals.

This new study follows on from a 2013 piece by the journalist John Bohannon in which his purposefully flawed article was accepted for publication by 157 of the 304 open-access journals to which it was submitted, contingent on payment of author fees. That was a warning sign, and things have got worse since. The Nature authors state that “the number of predatory journals has increased at an alarming rate. By 2015, more than half a million papers had been published in them.”
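To put the two sting operations on a common footing, the counts quoted above can be turned into acceptance rates. The sketch below is nothing more than arithmetic on the figures in the text; the function and variable names are illustrative assumptions, not anything taken from the studies themselves.

```python
def acceptance_rate(accepted: int, approached: int) -> float:
    """Share of approached journals that said yes, as a percentage."""
    return 100.0 * accepted / approached

# Figures as quoted in the text above.
szust_rate = acceptance_rate(48, 360)      # journals offering the fake editor a post
bohannon_rate = acceptance_rate(157, 304)  # journals accepting the flawed paper

print(f"Szust: {szust_rate:.1f}% of journals offered an editorship")
print(f"Bohannon: {bohannon_rate:.1f}% of journals accepted the paper")
```

On these numbers roughly 13 per cent of the journals approached were willing to appoint the fictitious editor, and a little over half of the open-access journals accepted the flawed article.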


“Once a paper is published,
the chances of it being subjected
to further scrutiny are remote”


None of this means that academic journals have moved into a post-truth world. There are clearly journals where high standards apply. The Polish academics approached 120 journals on the respected Journal Citation Reports directory as part of the 360 in their experiment. None of them accepted “Mrs Fraud” as an editor. And one can imagine specific reforms to get rid of those sorts of journals that are profiting through the equivalent of vanity publishing.

Even in serious journals, however, and even where referees do try their best, the scrutiny of just one or two people provides scant security. The mere fact that a paper has been peer reviewed is no guarantee of its quality or, indeed, its reliability.

The problem is nicely illustrated by a paper that appeared in Science in 2015, in which a team of no fewer than 270 authors and co-authors attempted to replicate the results of 100 experiments that had been published in leading psychology journals. The involvement of the original authors should have made it easier to reproduce the results. Only 36 per cent of the attempted replications led to results that were sizeable enough that one could be confident they had not arisen by chance. In other words, almost two-thirds of the attempts to replicate published, peer-reviewed results of papers in the top psychology journals failed.

The veneration of peer review has simply gone too far. The connected concept of “evidence based” has permeated policy discourse, and is sometimes used to lock out non-experts. But in psychology at least, as we have seen, there are papers whose findings could not be replicated that could have been flourished as part of an evidence base in support of one policy stance or another. The evidence is not “based” on any firm foundations; it rests on sand.

So conventional review is flawed; but fortunately, there are alternatives, some of them already in use. One other test of academic papers is their ability to make successful predictions. This is not infallible. Someone may strike lucky and carry out the scientific equivalent of successfully calling heads 10 times in a row. But consider, say, coronary heart disease. A tiny handful of the thousands of papers on the topic published each year may eventually lead to the development of drugs that successfully pass all the stringent tests set out by the authorities and are licensed for use. To get that far, their insights about what makes the condition better or worse have to be borne out in clinical testing in the case histories of real patients. They do real good, and we can be confident they have some validity.


“Experts need to show some humility:
they can’t diagnose and prescribe
for all of society’s ills”


Another recent alternative is to open up the peer review process, so that it actively invites challenge, by letting scientific merit be determined by the esteem of the peer group as a whole, not just by two or three selected referees. One example is the physics e-print archive arXiv.org (pronounced “archive”). Authors can post their papers here prior to publication in a journal if they like, though some feel no need. The site has grown to embrace not just physics but maths and computer science, and, in a small way, quantitative finance.

To post a paper, an author must merely be endorsed by someone who has already published on arXiv. Moderators refuse papers which are obviously not science at all. But scientific importance emerges from the process of downloading and citation. So peer review really is carried out collectively by the relevant scientific community. The more downloads and citations a paper attracts, the lower the chances of an error going undetected. The context is different, of course, but there is an echo here of the logic with which Google has conquered the world.

It is, however, only in the harder sciences where there has been a serious embrace of something approaching the marketplace—of the consumers, other academics in the field, deciding on the worth of a paper. In most disciplines the only model remains a monopoly supplier—the prestigious journal and its editorial board.

But it is in the social sciences that the suppression of challenge can have most political effect. A paper may be brandished purporting to show that all family structures are of equal merit, or that mass immigration does not reduce real wages, perhaps conflicting with religious convictions, personal experience or vernacular conceptions of how society functions. Whatever one’s views, the impression created that expert findings on such contentious political issues are immutable fact is bound to breed cynicism and “expert fatigue.”

An over-emphasis on expert opinion has already had insidious effects on democracy. One of these is a view among some in the intelligentsia, as described in Tom Clark’s Prospect piece (“Voting Out,” February), that the fundamental purpose of democracy is optimal, rational decision-making; if the electorate cannot manage this they—and by implication the democratic system—are at fault.

There are two obvious problems with this. Firstly, in order to rationally optimise society, someone would need to decide what the objectives are. And that is clearly a matter of political opinion. Secondly, it is flagrant mission creep. Democracy is, first and foremost, a mechanism for managing disagreement in society without bloodshed, chaos or repression. To boot, it allows the people to peacefully throw out those in power if they’re doing a bad job.

Expert analysis is of limited use in these tasks. Its recommendations cannot capture what the polymath Michael Polanyi called “tacit knowledge”—knowledge that is based on experience, which shapes people’s habits and beliefs without being codified. This doesn’t get a look-in.

Indeed, in modern social science it is very often only that which gets counted that is deemed to count. And who decides on that? It is, overwhelmingly, the “experts” who get to write the surveys that feed so much social science its raw material. If, for example, they are more interested in what someone’s ethnicity does to their views than they are in whether the respondent lives in the countryside rather than the city, then that is what scarce slots in the questionnaire will be used to find out. Through such means, priors and prejudices about what merits counting can colour the data, even before it has been crunched.

Democracy is a very crude system for giving decision-makers feedback about the quality of our lives. But this most basic process of consultation can never be replaced by data. For quantitative metrics are often very “lossy”—some things are not counted, and thus cease to count. Where experts imagine they can settle a fundamentally political argument through such empirical evidence, the consequences can fast become absurd.

In the UK, the Office for National Statistics has, encouraged by David Cameron in his early tieless phase, measured “well-being” and “happiness,” to guide public policy. This sort of data conflates a very great number of causal factors, which dilutes its value in guiding public policy decisions. And yet one commentator even suggested that, because well-being data showed high levels of contentment in Britain, the vote for Brexit need not have happened.

More generally, the result of putting empirical analysis on a pedestal can be intolerance towards others who start with different views. That was in evidence in some of last year’s sneering at “Leave” voters as dupes who couldn’t understand the arguments. Furthermore, if evidence is everything, but many don’t have the training to process it properly, then unscrupulous characters will spot a chance to make up the odd little, self-serving “fact” of their own—after all, only a minority will know the difference. The rise of empiricism in a world where we are bombarded with information might thus have actually contributed to the post-truth phenomenon.

Why did experts become so prominent in the decades before the crash? The narrowing of disagreement in politics after the end of the Cold War was surely important, as was the associated rise in managerialism. For example, central banks, which make hugely political decisions that shape the relative fortunes of borrowers and savers, were suddenly held to be above politics, and given independence. Huge faith was vested in their predictions, and those of associated technocrats at institutions like the International Monetary Fund, until the crash showed these could not be always depended on.

In the more austere times that have followed, the fundamental conflicts over resources and priorities—between natives and foreigners, between social classes—that probably never truly went away, are now back with a vengeance. The experts certainly have misgivings about all forms of populism and especially about Jeremy Corbyn’s Labour Party, with its cavalier assumptions about how much revenue it could easily raise.

But the backlash against “experts” is, nonetheless, still principally associated with the right. The more educated, liberal-leaning section of society needs to understand why this is. It is not because, as is commonly assumed, the right is simply the political wing of the dark side.

The right’s great insight is that the left can create a political apparatus with good intentions but the wrong incentives, and that this apparatus can become impervious to challenge. It argues that political choice is based on economic self-interest, and that this can apply, perhaps unconsciously, even to people apparently motivated by the public interest. These suspicions, articulated as “public choice theory” by the Nobel Prize winner James Buchanan, have most often been applied to bureaucracies with noble theoretical aims that go awry in practice, but the same analysis can be extended to universities and research institutes too—or indeed “the evidence base.” The Buchanan analysis can easily morph into an intransigent view that pursuing practically any collective goal will lead to empire-building bureaucracies, which also fall prey to “capture” by self-serving lobbyists. Taken to extremes, it promotes a profoundly destructive, atomistic worldview that leaves society paralysed in the face of the most serious moral questions. One only has to look across the Atlantic at the way the American right is responding to climate change and healthcare to see that.

Those who reasonably resist this worldview can counter it in two ways: either through bitter “with us or against us” polarisation, or by having the foresight to avoid the charges that public choice theory would lay at the academy’s door in the first place. That means at least examining the possibility that policies that come blessed with an expert stamp are serving the interests of those who put them forward, rather than dismissing it out of hand.

Truth and evidence must obviously be upheld. But there is a real danger in expert elites studying the electorate at arm’s length and seeking a kind of proxy influence without having to worry about gaining political support. We must not denigrate evidence-based thinking, a bad habit of thuggish regimes, but we must subject it to more “sense-checking,” and in communicating it we must pause and give thought to what the broader public will make of it. The alternative is a dialogue of the deaf between the know-all minority and a general populace which some may caricature as know-nothings. In such a stand-off, real evidence soon becomes devoid of all currency.

To avert it, the experts need to show some humility: we can’t diagnose and prescribe for all of society’s ills. We also need to recognise that to be persuasive we must actually persuade—and not simply hector. The great mass of voters are not, after all, under any obligation to accept expert authority. We need to reflect critically on the problems in academia that can block the testing of ideas on the inside, and dismiss all challenge from outside our walls. And we need to show self-awareness: deep intimacy with a subject can, on occasion, lapse into a tunnel vision that blanks out culturally-rooted perceptions and the lived experience of voters. Those things can’t be ignored. They are, after all, the lifeblood and raison d’être of politics, and can only be gauged by asking people, unschooled as well as schooled, for their opinions, and ultimately relying on their decisions.

As published in the August 2017 edition of Prospect Magazine
by Helen Jackson and Paul Ormerod

Image: Michael Gove by Policy Exchange is licensed under CC by 2.0

Beware the dysfunctional consequences of imposing misguided incentive systems

Following the disclosure of salaries at the BBC, it has hardly seemed possible to open a newspaper or switch on the television without being bombarded by stories about pay.

By pure coincidence, an academic paper entitled “Pay for Performance and Beyond” has just appeared. So what, you might ask? Except that it is one of the 2016 Nobel Prize lectures, by Bengt Holmstrom, a professor at MIT.

Holmstrom’s work began in the 1970s on the so-called principal-agent problem. This is of great practical importance. For example, how should the owners of companies (the “principals”, in economic jargon) design contracts so that the interests of the directors (the “agents”) are aligned as closely as possible with the interests of the shareholders?

Many aspects of economics have a lot of influence on policy making. But this is not yet one of them. We have only to think of the behaviour of many bankers in the run-up to the financial crisis. Stupendous bonuses were paid out to the employees, and, in examples such as Lehman Brothers, the owners lost almost everything.

It is not just at the top levels that scandals occur. Towards the end of last year, Wells Fargo had to pay $185m in penalties. Holmstrom cites this prominently in his lecture. The performance of branch managers was monitored daily. They discovered that one way of doing well was to open shell accounts for existing customers. These were accounts which the customers themselves did not know about, but they counted towards the managers’ bonuses.

A culture of pressure to perform against measured criteria can lead to problems even when the organisations involved are not strongly driven by money.

The education system in the UK has many examples. But the one given by Holmstrom is even more dramatic. The No Child Left Behind Act of 2001 in the US was very well intentioned. But the test-based incentives eventually led, around a decade later, to teachers in Atlanta being convicted of racketeering and serving jail sentences for fixing exam results.

Holmstrom is in many ways a very conventional economist – his Nobel lecture rapidly becomes full of dense mathematics. He believes that, given the right information and incentives, people will make rational decisions.

This is why his conclusion is so startling.

He writes: “one of the main lessons from working on incentive problems for 25 years is that, within firms, high-powered financial incentives can be very dysfunctional and attempts to bring the market inside the firm are generally misguided”.

The whole trend in recent years has been to bring even more market-type systems inside companies, from bonuses for meeting potentially counter-productive targets, to devolving budget authority away from the discretion of managers and handing it to specialised departments.

Holmstrom’s conclusion implies the need for a pretty radical rethink of the way incentives are structured, in both the public and private sectors.

As published in City AM Wednesday 26th July 2017

Image: Lehman Brothers Headquarters by Sachab is licensed under CC by 2.0

Believe it or not, Britain is getting happier

The dominant economic narrative in the UK is a pretty gloomy one just now.

True, employment is at a record high. But, counter the whingers and whiners, zero hours contracts and low pay proliferate.

The political discourse is full of the struggles of the JAMs – the Just About Managing. The public sector moans about its pay. During the election, Labour played ruthlessly on the fears and anxieties of the elderly about inheritance and the value of pensions.

All in all, the picture seems bleak. But a much more positive vision is given by the Office for National Statistics (ONS) in its measure of well-being.

The Measuring National Well-being (MNW) programme was established in November 2010 under David Cameron. It is not without its critics. But if we take it at face value, compared to a year ago the country is definitely happier.

As the ONS puts it: “the latest update provides a broadly positive picture of life in the UK, with the majority of indicators either improving or staying the same over the one year period”.

There seems to be a bit of a glitch. The ONS boasts of using no fewer than 43 separate indicators to measure well-being. But they go on to state, in the very same sentence, that of these 43 measures, “15 improved, 18 stayed the same and two deteriorated, compared with one year earlier”. Perhaps the relevant statistician here received his or her basic training at the Diane Abbott School of Arithmetic.

No matter, it could be that some of the series have simply not been updated at all. Certainly, many people might not be too concerned to learn that “on environmental sustainability, the proportion of waste from households that was recycled fell over a one-year period, while remaining unchanged over the three-year period”.
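The gap is easy to check. A minimal sketch of the arithmetic, using only the figures quoted from the ONS above (the variable names are illustrative):

```python
TOTAL_INDICATORS = 43  # number of indicators the ONS says it uses

# Counts quoted by the ONS for the one-year comparison.
reported = {"improved": 15, "stayed the same": 18, "deteriorated": 2}

accounted_for = sum(reported.values())
unaccounted = TOTAL_INDICATORS - accounted_for

print(f"Accounted for: {accounted_for} of {TOTAL_INDICATORS}")
print(f"Unaccounted for: {unaccounted}")
```

The three categories cover 35 indicators, leaving eight unaccounted for, which would be consistent with some series not having been updated.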

But compared to a year previously, on some key indicators, as a nation we were more satisfied with our jobs, felt our health was better, and enjoyed our leisure time more.

This does not fit readily with political discussion recently in the mainstream media.

One possible reason is that many of the ONS measures rely on conventional survey techniques. These can take some time to carry out. So the ONS only releases new data every six months, and the latest release was in April. The indicators could just be out of date.

However, a very similar story is told by a real-time analysis of Twitter data, which I have been carrying out with my UCL colleague Rickard Nyman since June 2016 (admittedly just for the London area).

We use advanced machine learning algorithms which essentially measure the sentiment level of a tweet as a whole, rather than relying on the now obsolete approach of looking for specific positive and negative words.
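For contrast, the older word-list approach can be sketched in a few lines. This is not the method used in the London analysis; the word lists and the function name are hypothetical, chosen only to show how counting individual words can misread a tweet that contains a negation.

```python
# Hypothetical word lists, chosen only for illustration.
POSITIVE = {"happy", "great", "love", "good"}
NEGATIVE = {"sad", "awful", "hate", "bad"}

def word_count_sentiment(tweet: str) -> int:
    """Score a tweet as (number of positive words) minus (number of negative words)."""
    words = [w.strip(".,!?").lower() for w in tweet.split()]
    positives = sum(w in POSITIVE for w in words)
    negatives = sum(w in NEGATIVE for w in words)
    return positives - negatives

print(word_count_sentiment("I love this sunny day"))      # scores +1
print(word_count_sentiment("I am not happy about this"))  # also scores +1
```

Both tweets receive a score of +1, although the second is plainly negative. That is the kind of error that scoring the tweet as a whole is meant to avoid.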

Sentiment in London started to rise quite sharply last autumn, dipped down slightly in April and May, but is now back up again.

Many conventional economic statistics are not really designed for the modern economy. So, despite all its faults, the ONS well-being measure may be a step in the right direction, and regardless of what the media tells you, Britain may indeed be getting happier.

As published in City AM Wednesday 19th July 2017

Image: Happiness by Geralt is licensed under CC by 2.0