By Lyndsay Grant
Jenny Ozga’s presentation at the second Code Acts in Education seminar showed how data is becoming an increasingly important actor in governing education, internationally, nationally and in schools. But what is it about data that makes it so powerful?
Following Ozga’s presentation, participants began discussing how and why quantitative data had come to be seen as providing a more objective and reliable way of knowing what was going on in schools and in children’s learning than other ways of understanding. Some of those in the discussion, with a natural science background, described how their training led them to be very sceptical about ‘raw’ quantitative data. Before drawing conclusions, they would consider the accuracy of their data collection instruments and processes, what was missed in their data sets, the subjective choices they had made about which statistical analysis procedures to use, and seek possible alternative interpretations for the data they produced. Data never just ‘spoke for itself’. Yet data in education often appears to be quite uncritically accepted. As a group of researchers used to the idea that quantitative and qualitative data could both provide useful ways of understanding education, we wondered why it is that large-scale quantitative data seems to be becoming the only sort of data that matters. In educational policy-making and regulation, it seems that ‘big data’ is able to lay claim to greater legitimacy and authority than other ways of knowing.
I gained some insight into this process at a recent conference on the use of the Department for Education’s National Pupil Database in England. Amanda Spielman, Chair of Ofqual and Advisor to the Ark chain of academies, discussing the use of data for target-setting, described what she called a strange process of “transubstantiation”. At the point when data was collected, teachers were well aware that it was affected by context, might contain errors, and provided only a single snapshot of a particular learner. But when the data was entered into the official RaiseOnline database, it seemed to undergo a mystical transformation, and was treated with such great authority that teachers would disregard their own judgement in favour of RaiseOnline when setting pupil targets.
The way that data gets decontextualized for entry into a database is one of the ways that it acquires this apparent objectivity and associated authority. Jenny Ozga had described how the Department for Education in England has a large screen in a central area called ‘the bridge’, showing at a glance performance data across local authorities, as a way of facilitating central performance management. Ozga described how, when inspectors challenged a school’s performance data, there would be opportunities for head teachers to re-contextualise their data, to provide the missing, unquantified information necessary for a fuller interpretation. At an aggregated national level, this level of specificity simply isn’t possible – the overview provided by ‘the bridge’ depends on the standardization and decontextualisation of data. To generate an overview, databases must exclude large amounts of information, forgetting far more than they remember. If everything were included we would only be able to see every case as individual and unique. But when interpreting this data, we would do well to acknowledge what has been excluded, and to consider how much weight to put on it and how much interpretation it can bear when it is used for performance management or policy formulation.
Tarleton Gillespie describes how data objectivity is regularly performed as a feature of algorithmic systems in the same way that statistical data boosts scientific claims, partly because ‘human hands’ seem to be removed from the system. Claims based on data are seen as more legitimate precisely because they are divorced from the messy, specific, local and contingent, distant from the biased or incompetent failings of human hands. Yet a lot of the work that goes into the processes of standardising, decontextualizing and cleaning remains the product of human judgements. Data must be worked on to fit smoothly into a database. It must be standardized and ‘cleaned’ to make it comparable, removing local idiosyncrasies and extraneous detail. A lot of work goes into this process; several presentations at the National Pupil Database conference focused entirely on research methodologies for cleaning data to make it easier to operate on. Cleaned and standardised data can be made to ‘talk’ to data in other archives; information about pupils in the National Pupil Database can be connected to information held about them in HEFCE, HESA, and BIS archives, with ambitions expressed to link to data held about them by the Department of Health, Ministry of Justice and data collected through Universal Credit (should it ever see the light of day). These links allow for an even greater level of analysis, tracking pupil data across more spheres of their lives.
The processes of standardising, decontextualizing and cleaning data mean that it can be more mobile and fluid; data can connect into other databases freed of its local, specific and contingent context. In The Rise of Data in Education Systems, Martin Lawn has described the work done by statisticians, NGOs, governments and educationalists to standardise educational data, in order to allow data to flow across national borders and create international databases informing the PISA study that systematically compare and evaluate diverse educational systems.
One of the ways that data has become so powerful is in the way it is presented as unassailable, legitimate and objective knowledge, and in the way that it is able to move across boundaries and articulate with other data sets unencumbered by messy local context. Yet a lot of work must be done to data to enable these characteristics, through selecting and excluding, standardizing, decontextualizing and cleaning. Data is translated through many processes into an authoritative knowledge object, such as DfE’s ‘the bridge’, a RaiseOnline report, or a data dashboard, imbued with characteristics of objectivity and legitimacy. As the quote with which Ozga opened her talk suggests, “to talk about knowledge is to talk about governing”: data-driven educational knowledge lends its legitimacy and objectivity to processes of governing and regulating educational performance.
Educational data produces objects by which education can be known, including data tables and graphs. It makes up education, learning and learners as things that can be known and governed in a particular way. But it also produces knowing subjects: the civil servants, policy makers, teachers and children who are invited to adopt a data-driven perspective from which to understand education and their own educational engagements. How do the subjectivities produced through these ways of knowing compete or cohere with other subjectivities? This could be seen as a further way in which data acquires its power in governing education: by inviting diverse groups of people to take up similar positions as knowing subjects in relation to understanding education.
Data is, in part, such a powerful form of governing knowledge because it is imbued with characteristics of objectivity and fluidity, seeming to be distant from messy, human, local idiosyncrasies. Yet it takes a lot of deliberate, political work to bestow these characteristics upon data sets.