The UK government’s Behavioural Insights Team has announced it has been experimenting with data science methods in school inspections. In partnership with the Office for Standards in Education (Ofsted), it has designed machine learning algorithms to predict and rate school performance.
Originally established as part of the Cabinet Office in 2010, the Behavioural Insights Team, or ‘Nudge Unit’ as it is informally known, became a ‘social purpose company’ in 2014, jointly owned by the UK Government, the innovation charity Nesta, and its employees. Its staff have academic grounding in economics or psychology, or a background in government policymaking, and it has expanded beyond its London office to Manchester, New York, Singapore and Sydney. It has always been closely associated with the ‘what works’ model of policy, employing randomized control trials (RCTs) to test out policy interventions. The BIT has also established a revenue-generating arm that uses ‘behavioural insights to design and build scalable products and services that have social impact.’ Its advisory panel includes Richard Thaler, the behavioural economist recently awarded the Nobel prize in economics for his work on the application of behavioural science and ‘nudge’ techniques in public policy.
The BIT’s Data Science Team published details of its experiments in a new report in December 2017. The team’s defined aims are to ‘make use of publicly available data, web scraped data, and textual data, to produce better predictive models to help government’; ‘to test the implications of these models using RCTs’; and ‘to begin developing tools that would allow us to put the implications of our data into the hands of policymakers and practitioners.’
The report, Using Data Science in Policy, detailed a number of projects the Data Science Team had undertaken to apply behavioural insights to diverse areas of public policy:
over the past year we have been working to conduct rapid exemplar projects in the use of data science, in a way that produces actionable intelligence or insight that can be used not simply as a tool for understanding the world, or for monitoring performance, but also to suggest practical interventions that can be put into place by governments.
The experiments were in policy areas including health, social care and education.
In its education project with Ofsted, the BIT described how it used ‘publically available datasets to predict which institutions are most likely to fail and thereby target their inspections accordingly. We showed that this data, married to machine learning techniques such as gradient boosted decision trees, can significantly outperform both random and systematic inspection targeting. … We are excited to be working with Ofsted to put the insights from this work into action.’
In order to apply data science and machine learning to school inspection, the BIT compiled publicly available data from the year before an inspection happened. These data, its report said, included workforce data, UK census and deprivation data from the local area, school type, financial data (sources of finance and spending), performance data (Key Stage 2 for primary schools and Key Stages 4 and 5 for secondary schools) and Ofsted Parent View answers to survey questions. Parent View is Ofsted’s online tool to allow parents to record their own views on their child’s school. These data are then considered in Ofsted inspections.
According to a report of the Ofsted experiment in Wired magazine, its ‘school-evaluating algorithm pulls together data from a large number of sources to decide whether a school is potentially performing inadequately.’ By matching statistical data to the Parent View data, which includes textual information that can be analysed for sentiment, BIT claims it can predict which schools are not performing well and are likely to fail an inspection. The system ‘can help to identify more schools that are inadequate, when compared to random inspections’ and may even be used to automate decisions made by Ofsted in the future.
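The general shape of such a pipeline can be sketched in code. The following is a minimal illustration only, not BIT's actual model: the features, data and 'failure' mechanism are all invented for the example, and only the broad technique named in the report, gradient boosted decision trees used to rank institutions by predicted risk so inspections can be targeted, is taken from the source.

```python
# Hypothetical sketch of risk-ranked inspection targeting with gradient
# boosted decision trees. All feature names and data here are invented.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000

# Invented proxies for the kinds of public data the report lists:
# workforce, local deprivation, finance and attainment figures.
X = np.column_stack([
    rng.normal(20, 5, n),    # pupil-teacher ratio
    rng.uniform(0, 1, n),    # local deprivation index
    rng.normal(0, 1, n),     # per-pupil spending (standardised)
    rng.normal(0, 1, n),     # Key Stage 2 attainment (standardised)
])
# Toy outcome: schools with high ratios and deprivation 'fail' more often.
p_fail = 1 / (1 + np.exp(-(0.1 * (X[:, 0] - 20) + 2 * (X[:, 1] - 0.5))))
y = (rng.uniform(size=n) < p_fail).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Rank unseen schools by predicted failure risk, so inspections can be
# targeted at the highest-risk institutions rather than chosen at random.
risk = model.predict_proba(X_test)[:, 1]
ranked = np.argsort(risk)[::-1]
top_decile = ranked[: len(ranked) // 10]
print(f"Failure rate among top-decile risk: {y_test[top_decile].mean():.2f}")
print(f"Failure rate overall: {y_test.mean():.2f}")
```

On synthetic data like this, the ranked top decile concentrates far more failing schools than a random draw would, which is the basic claim being made for targeting over random or systematic inspection.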
So far, the Nudge Unit’s trial with Ofsted has not been used to inform any real-world decisions, although the two organizations plan to extend their partnership in 2018, and are considering the use of further datasets, including data that are not open to the public.
An important aspect of the experiment with Ofsted is that the BIT doesn’t want schools to know how the algorithm works, as the project’s director told Wired. ‘The process is a little bit of a black box—that’s sort of the point of it,’ he said. In other words, schools are to be kept in the dark about the school-evaluating algorithm so that they don’t have the opportunity to ‘game’ their data in advance, which would result in skewing the predictive model.
It’s not the first time the Nudge Unit has been involved in education in the UK. Earlier in 2017 it was reported that the Department for Education was recruiting a permanent behavioural insights manager and an adviser. The aim was to change the culture of the department with psychology specialists applying behavioural science in strategic policymaking processes, and to commission research, trials and interventions drawing on behavioural insights to ‘improve our education and children’s services’.
The Nudge Unit’s experiment with Ofsted, and the DfE’s recruitment of behavioural scientists, exemplify the increasing role of behavioural science agencies in producing policy-relevant science for public education in recent years, as Alice Bradbury, Ian McGimpsey and Diego Santori have previously documented. This raises a number of issues.
The first issue is how public education is increasingly being influenced by arms-length agencies. As an entity co-owned by the Cabinet Office and Nesta, the Nudge Unit is not strictly independent of government, yet it now acts as an outsourced contractor within public policy. It was reported in Wired that if Ofsted rolls out the school evaluation algorithm developed by BIT, local authorities would be required to pay between £10,000 and £100,000 to implement it.
Nesta itself, the co-owner of BIT, is an often-overlooked organization in UK education. As a ‘policy innovation lab’ it has successfully campaigned for ‘coding’ to be included in the English National Curriculum and for data science to be applied in the analysis of public services. A core hub in a global network of policy labs, Nesta and similar organizations worldwide are seeking to innovate in public policy, often using technological innovations as models for government reforms.
As the US GovLab has reported, the application of data science in public policy by ‘data labs’ can help create a ‘smarter state.’ Indeed, Nesta and the Cabinet Office have previously collaborated to develop ideas about a ‘new operating system for government,’ using data science, predictive analytics, artificial intelligence, sensors, autonomous machines, and platforms to redefine the role of government.
As such, organizations such as Nesta and the Nudge Unit, which perceive data science as a new model for enacting government, are now seeking to locate data science methods within the institutions and processes of educational policymaking and school evaluation. The Ofsted project is part of their wider ambitions around digital governance using data science to drive policymaking. They are seeking to attach arms-length ‘data labs’ to centres of public policy, bringing new forms of technical and statistical expertise—as well as economic, behavioural and psychological science—into policy processes, including education. This exemplifies what I have elsewhere described as ‘digital education governance’—the use of digital data to make education visible for inspection and intervention.
Second, as part of this shift, the Nudge Unit is seeking to transform the way school inspections are performed. Rather than inspection through embodied expertise, school evaluation is now to be enacted predictively, before the inspector arrives. Jenny Ozga has previously written of how digitally recorded data increasingly surrounds the inspection process. The Nudge Unit is seeking to pre-empt the inspection process through the application of machine learning algorithms which have been trained to spot patterns and make predictions by pulling together a wide range of multimodal data sources about schools and their contexts.
These deliberately ‘black-boxed’ and opaque systems, which schools would be unable to understand, could become significant actors in practices of school accountability. If, as anticipated, some of Ofsted’s tasks are automated by the Nudge Unit’s intervention, then it may be unclear how certain decisions have been made in relation to a school’s overall evaluation. Although the BIT claims it doesn’t wish to replace the professional inspector, it is clear that school inspection will become a more distributed task involving both human and nonhuman decision-making and judgment, with data science methods perceived as more objective and impartial means of producing evidence than professional observation. In this sense, it is entirely consistent with behavioural science claims that human decision-making is less rational and evidence-based, and more emotionally charged, cognitively biased and subjective, than is commonly assumed.
At a time when there is increasing political, public and legal concern about machine learning opacity and its lack of ‘explainability’ or transparency, it seems ethically questionable to create systems that are deliberately black boxed, not least as their algorithms may well contain biases and potential for statistical discrimination. The cognitive bias of the school inspector is to be combated with systems that may have their own encoded biases. If a school is predicted to be inadequate by the algorithm, its stakeholders will expect and need to know what factors and calculations produced that evaluation.
It is notable, too, that the BIT claims ‘missing data’ are predictive of a failed inspection, presumably the consequence of human error in the data-inputting process, and that it is seeking other non-public data sources to improve its predictive models. It remains unclear how deeply the BIT intends to scrape schools for data, or which additional data would be included in its calculations, raising methodological questions about the reliability and commensurability of its analyses.
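Treating gaps in the record as signal is a standard data-science move, and a rough sense of how it works can be given in a few lines. This is an illustrative sketch on invented data, not BIT's method: missingness indicator columns are added alongside simple mean imputation, so that a downstream model can learn from the pattern of gaps as well as from the values themselves.

```python
# Illustrative sketch (invented data): making 'missing data' itself a
# feature by pairing missingness flags with simple mean imputation.
import pandas as pd

schools = pd.DataFrame({
    "pupil_teacher_ratio": [18.2, None, 22.5, 19.8],
    "per_pupil_spend": [4500, 5100, None, None],
})

features = schools.copy()
for col in schools.columns:
    # Flag whether the value was missing before imputing the column mean,
    # so a model can treat the gap itself as potentially predictive.
    features[col + "_missing"] = schools[col].isna().astype(int)
    features[col] = schools[col].fillna(schools[col].mean())

print(features)
```

If, as the report suggests, incomplete returns correlate with failed inspections, a model trained on such features would weight the indicator columns accordingly, which is precisely what raises the commensurability questions noted above: a school could be flagged partly for its paperwork rather than its practice.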
The third issue relates to the application of behavioural science within education. Mark Whitehead and coauthors describe how ‘behavioural government’ has proliferated across public policy in many countries in recent years—especially the UK and US—through the application of ‘nudge’ strategies. Nudging involves the design of ‘choice architectures’ that can shape and condition choices, decisions and behaviours, and is deeply informed by behavioural and psychological sciences. The Nudge Unit exemplifies behavioural government.
In its project with Ofsted, the BIT is seeking to use data science as a way of constructing choice architectures for inspectors. The results of the data analyses can identify particular areas for concern, as predicted by the algorithm, that may then be targeted by the inspectors, thus creating a more efficient and cost-effective machinery of inspection. The BIT is, in effect, nudging Ofsted to make strategically informed choices about how to conduct inspection. This, claims BIT, would reduce the number of inspections required and free up Ofsted staff to work on improvement interventions with schools. (Though it might also lead to Ofsted staff reduction and cost-savings.) In these ways, the Ofsted inspector is being reimagined as a nudge operative, intervening in schools by offering them targeted improvement frameworks. At the same time, it is seeking to supplement subjective human judgment, with all the flaws that behavioural science claims come with it, with algorithmic objectivity.
The Nudge Unit also makes extensive use of psychological insight. Perhaps the most obvious use of psychological data in the Nudge Unit’s project with Ofsted is the sentiment analysis it is performing on Parent View data, with aggregation of parents’ subjective feelings into patterns that can be used as objective indicators to supplement the statistical inspection.
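The basic logic of turning free-text comments into an aggregate indicator can be shown with a toy example. This is a minimal lexicon-based sketch, not BIT's actual sentiment method, and the word lists and comments are invented: each comment is scored by counting positive and negative lexicon words, and the scores are averaged per school.

```python
# Minimal lexicon-based sentiment sketch (invented lexicon and comments),
# illustrating how subjective text could be aggregated into an indicator.
from statistics import mean

POSITIVE = {"happy", "excellent", "supportive", "thriving"}
NEGATIVE = {"unhappy", "poor", "bullying", "chaotic"}

def sentiment(comment: str) -> int:
    """Score one comment: +1 per positive word, -1 per negative word."""
    words = comment.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def school_sentiment(comments: list[str]) -> float:
    """Aggregate individual parents' scores into a single school-level figure."""
    return mean(sentiment(c) for c in comments)

comments = ["My child is happy and thriving", "Poor communication, chaotic"]
print(school_sentiment(comments))  # → 0.0
```

Even this crude version makes the underlying move visible: parents' subjective feelings, expressed in free text, are converted into a number that can sit alongside the statistical indicators feeding an inspection model. Production systems would use far more sophisticated text analysis, but the aggregation step is the same.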
More innovative sources of psychological data, however, could be used by the Nudge Unit to undertake algorithmic school inspections in the future.
Behavioural sciences have already amalgamated with data science in relation to the policy area of ‘social-emotional learning.’ The logic of the social-emotional learning movement is that ‘non-academic’ qualities are strongly linked to academic outcomes; students need to be trained to be socially and emotionally resilient if they are to succeed in school. There are close ties between many of the major voices in the social-emotional learning movement and the behavioural sciences, and educational technologies have become central to efforts to monitor and nudge students’ non-academic learning.
As I’ve documented elsewhere, a range of technical innovations to support social-emotional learning has been proposed and developed, such as behaviour monitoring apps and wearable biometric monitors, that might be able to detect indicators of student emotions such as engagement, attention, frustration and anxiety. Data from these devices could then be fed back to the teacher, who would be able to prompt the student in ways that might generate a more positive response.
Real-time student data, it is possible to speculate, could well become part of the school inspection process under a Nudge Unit style of data-scientific experimentation. Student sentiment data, tracked against progress and attainment, would then become a measure, defined by behavioural economists, to be used for purposes of school accountability.
Real-time psychological data, as well as more mundane user data scraped from the web or captured by mobile smartphones, argue Mark Whitehead and colleagues, now appear to present rich opportunities for behavioural scientists to both record and nudge behaviours and emotions. They cite an article from Forbes claiming ‘the proliferation of connected devices—smartphones, wearables, thermostats, autos—combined with powerful and integrated software spells a golden age of behavioral science. Data will no longer reflect who we are—it will help determine it.’
Behavioural government is, then, informed by the testing culture of the tech sector, which constantly experiments on its users to see how they respond to small changes in design. As such, ‘behavioural design’ is the application of behavioural science to technological environments to influence and determine user behaviours. With increased behavioural science influence in education, twinned with a massive escalation in data-processing ed-tech applications, the culture of testing and behavioural design could significantly affect policy, schools and professional practitioners in years to come.
The education experiment
As with its work in other sectors, the Nudge Unit’s involvement with Ofsted and the Department for Education is bringing the methodological logic of data-driven experimentation and behavioural design into education. Increasingly, automated algorithms are being trusted to perform tasks previously undertaken by embodied professionals. Their opacity makes the decision-making these systems perform difficult to scrutinize. Ofsted has long been a source of concern for schools, of course. It is hard to see how transforming the inspector into an algorithm that is better at identifying inadequate schools will reduce teachers’ worries about performance measurement. Data and the culture of performativity in education have a long history.
More generally, the application of data science to education policy is indicative of how the education sector itself is becoming the subject of increasing levels of experimentation with data science methods. The Department for Education is currently seeking to reintroduce baseline testing into pre-school settings. A previous trial of early years baseline testing in 2015 collapsed amid concerns over the methodology of the original contractor. In the tender for the second version of baseline assessment, however, the DfE has more carefully specified the testing methods it expects the contracted assessment company to use. In an experimental education sector, schools, professionals and students look increasingly like laboratory specimens, repeatedly subjected to tests and trials, inspections and interventions, as part of the pursuit of identifying ‘what works’ in education policy.