Genetics, big data science, and postgenomic education research

Ben Williamson

A diagram visualizing the genetic variants associated with educational attainment. Image by Emily Willoughby.

An international consortium of genetics researchers has established a link between genes and educational attainment from a study of over a million people. One of the largest genetics studies ever published in a science journal, it represents a significant step forward for the emerging field of educational genetics. The growth of genetics expertise in education also, however, raises substantial concerns about biological determinism and new forms of eugenics, and reanimates long-standing debates about the genetic inheritance of intelligence and cognitive ability.

In this post I outline some key findings of the study, but primarily focus on the significant implications and issues it raises for education research more widely. The implications of the study are that it: (1) establishes genetics as a powerful new front in educational knowledge production; (2) positions big data science as a methodological apparatus for future educational studies; (3) surfaces extreme political polarization regarding genetic factors in education that will be difficult to reconcile as genetics enters education policy debates; (4) potentially opens up a new market for commercial educational genetics products; and (5) reveals the need for new social scientific forms of engagement with, and critique of, genetics research and postgenomic science in the education field.

Gene discovery
Published in Nature Genetics at the end of July 2018 by the international Social Science Genetic Association Consortium (SSGAC) in collaboration with the consumer genetics company 23andMe, the paper ‘Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals’ reports findings showing that genetic patterns across a large population are associated with years spent in school. According to its 80 authors, ‘educational attainment is moderately heritable and an important correlate of many social, economic and health outcomes,’ and is therefore an important focus in a number of educational genetics studies.

Specifically, the scientists identified over a thousand genetic variants linked with educational attainment, particularly those involved in brain-development processes and the formation of neuronal connections in foetuses and newborns. These biological factors, the scientists claim, influence psychological development, which in turn affects how far and for how long people continue at school.

The SSGAC has been careful in reporting the results. They do not claim to have identified any single genes for education, and the data don’t predict educational attainment for individuals. The research also found that genetic variants have a far weaker effect than environmental influences on educational attainment, and was restricted to analysis of a homogeneous sample of people in their 40s and 50s of white European descent (the study failed with a sample of African-Americans). The authors produced a massive Q&A document, longer than the paper itself, to help explain and clarify the results, methods and conclusions, while downplaying the policy and practical implications of its findings. As such, the paper has been carefully published in acknowledgement of the potential controversy it could cause, and to anticipate misinterpretation and misreporting of its findings.

Nonetheless, the paper has catalysed significant media interest and social media commentary. Three days after publication, the paper had been Tweeted 1000 times, blogged multiple times, and reported in news media around the world—picking up an enormous Altmetric score in the process. There is useful coverage in the New York Times, Atlantic and MIT Technology Review reporting the key findings.

Clearly the paper is a massive advance for genetics science, in education and beyond. For those education researchers and social scientists outside of the genetics field, however, it has major implications in terms of knowledge production, methods, policy influence, and the commercialization of educational genetics.

Powerful genetic knowledge
Along with other recent advances in genetics in education, the SSGAC study instantiates the emergence of a powerful new field of knowledge production. Such research is only possible now owing to the complete sequencing of the human genome–the entire genetic structure of human DNA–over a decade ago, and since then studies in human genomics have expanded rapidly. As a result, science studies researchers claim we are now in a postgenomic age.

As a research field, educational genomics seeks to unpack the genetic factors involved in individual differences in learning ability, behavior, motivation, and achievement. Importantly, researchers of educational genomics do not assume either that there is any single genetic factor that determines learning ability, cognition or intelligence, or that genetic factors entirely explain the complexity of learning. Identifying an individual’s genotype (the full heritable genetic identity of a person) and its relationship to learning, intelligence or educational outcomes remains complex. Practitioners of educational genomics and behavioural genetics look for patterns in huge numbers of genetic factors that might explain behaviours and achievements in individuals, by studying the interaction of genotypes and environmental influences on phenotypical behaviours and traits (such as intelligence).

The SSGAC has positioned itself as a leading consortium for such postgenomic education science with the publication of their paper, but another key figure bringing genomics research into education is the behavioural geneticist Robert Plomin, co-author of the controversial G is for Genes: The Impact of Genetics on Education and Achievement. Plomin has extensively studied the links between genes and attainment using ‘genome-wide polygenic scoring’ (GPS), a method also employed in the SSGAC study. A polygenic score is produced by analysing a huge number of genetic markers, and their interactions with environmental factors, in order to predict a particular behavioural or psychological trait. As computer processing power, data storage capacity, and data analytics technologies have advanced in recent years, it has become possible to correlate huge quantities of genotypical data with a host of phenotypical traits.
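For readers unfamiliar with the arithmetic, a rough sketch may help. In its simplest textbook form, a polygenic score is a weighted sum: for each genetic variant, the number of copies of a given allele a person carries (0, 1 or 2) is multiplied by an effect size estimated in a genome-wide association study, and the products are added up. The Python below illustrates only that basic idea. The function name, the three variants and all the numbers are hypothetical and are not taken from the SSGAC study or from Plomin's work, and real pipelines involve many further steps (quality control, adjustment for correlated variants, standardization against a reference population) that are left out here.

```python
# Illustrative only: every number below is made up for the example.

def polygenic_score(allele_counts, effect_sizes):
    """Return the weighted sum of allele counts (0, 1 or 2 per variant)."""
    if len(allele_counts) != len(effect_sizes):
        raise ValueError("one effect size is needed per variant")
    return sum(count * beta for count, beta in zip(allele_counts, effect_sizes))

# Hypothetical per-variant weights of the kind an association study estimates.
effect_sizes = [0.5, -0.25, 0.125]

# Hypothetical genotypes: copies of the counted allele at each variant.
person_a = [2, 0, 1]
person_b = [0, 2, 1]

score_a = polygenic_score(person_a, effect_sizes)
score_b = polygenic_score(person_b, effect_sizes)
print(score_a, score_b)
```

With a thousand or more variants, each carrying a tiny weight, the same sum yields a single number per person. That number is a population-level statistical summary, which is why the SSGAC cautions against reading it as a prediction about any one individual.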

Under the banner of a ‘new genetics of intelligence’, Plomin and colleagues have used polygenic scores to predict academic achievement in schools. The substantial increase in heritability they found ‘represents a turning point in the social and behavioural sciences because it makes it possible to predict educational achievement for individuals directly from their DNA,’ thereby ‘moving us closer to the possibility of early intervention and personalized learning.’

While the SSGAC avoids calling for interventions based on its data, the results open up possibilities for further studies and analyses. These include studies that control for genetic influences in order to generate credible estimates of how changes in school policy influence health outcomes; studies of why specific genetic variants predict educational attainment; and studies of how the effects of genes on education differ across environmental contexts. As such, the research itself is a catalyst for further educational genomics studies.

Although educational genomics remains in its infancy, it seems likely to advance considerably in coming years, linking genotypes to phenotypical traits, behaviours and other outcomes. It will link more closely with psychology and neuroscience as associations are further established between genes and neurons, personality traits and so on. As more findings emerge, further support will grow for evidence-based scientific perspectives on learning. New forms of genetic and genomic expertise in educational matters are already emerging, and challenging existing forms of social scientific and philosophical educational research which have challenged the biological determinism of genetics for decades.

Big data science
The methodological apparatus of the SSGAC study, and other research in educational genomics and behavioural genetics, is huge—it dwarfs the technical, methodological, financial and expert resources of other forms of educational research. The SSGAC study itself is the accomplishment of a well-funded international team of 80 scientists working in departments of psychology, sociology, behavioural genetics, behavioural science, neurogenomics, economics, biosciences, health sciences, and many others. A core part of the team included more than 20 scientists from the commercial organization 23andMe, the Silicon Valley company backed by Google. The research, then, was distributed across public universities and commercial labs at huge scale and significant cost.

Beyond the size of the team and its funding, the study is also typical of the big data methods of genetic science. The data on its sample of over a million people came from two sources. One was the UK Biobank, a huge open access health resource based on a living population of over 500,000 volunteer participants, which was established by the Medical Research Council and the Wellcome Trust and opened up to scientists in 2012. One of many biobanking projects worldwide, it opens up unprecedented access to large samples of genetic data for analysis. The rest of the data was sourced from 23andMe itself, the consumer genetics company offering health and ancestry services on a profit-making basis.

The methods described in the appendix to the SSGAC study demonstrate the quantitative and computational complexity of such large-scale genetics research. The study depends on a range of statistical methods, tests, mathematical formulae, algorithms, data visualizations, software platforms with names such as METAL and PLINK, and bioinformatics platforms called DEPICT, MTAG, PANTHER and MAGMA.

As such, the paper published in Nature Genetics is the end-result of the activities of a huge interdisciplinary science team, generous financial funding, enormous databanks from both the not-for-profit and private sectors, and highly sophisticated big data analytics methods, all powered by a vast infrastructure of bioinformatics technologies, statistical software analysis packages, data analytics and visualization. The scale of the scientific infrastructure of knowledge production is miles away from the norms of educational research.

Yet we may expect further education research to locate itself within such infrastructures of professional expertise, labs, databanks, analytics methods and software. Already, scientists are beginning to propose new multidisciplinary experimentation and intervention under the heading of ‘precision education’. Genetics and neuroscience are spectacular new fronts of big data-driven scientific research, and related subfields of educational genomics and educational neuroscience are growing fast, with the support of wealthy foundations and commercial partners. As a result, studies such as that by the SSGAC and other educational genetics teams position big data science as a new frontier of innovative and interdisciplinary education research.

Policy sciences
Researchers in the field of Science and Technology Studies (STS) have long maintained that science and politics are inseparable, and often focus their attention on scientific controversies. This is particularly the case when science enters into official policy, and is translated and manipulated to fit political agendas and policymakers’ requirements. The new genetics of education are an ideal illustration of an emerging scientific controversy in education.

The SSGAC research represents the potential for a significant shift in emphasis in education policy to embrace genetics expertise. Though the SSGAC reports no direct policy implications from its study, it is clear that policymakers seeking explanations for educational attainment would be interested in the results. As Kalervo Gulson and P. Taylor Webb have argued, new kinds of ‘bio-edu-policy-science actors’ may be emerging as authorities in educational policy, ‘not only experts on intervening on social bodies such as a school, but also in intervening in human bodies’. And science writer Antonio Regalado pointed out that one of the SSGAC authors had previously stated that once polygenic scores could be used to predict IQ, it would trigger a ‘serious policy debate’ about ‘personal eugenics’.

Commenting on the SSGAC study, John Warner cautions about how conservative economists might seek to translate the results into policy proposals. ‘How long before schools subject to performance funding as determined by graduation metrics begin to discriminate against students with low polygenic educational attainment scores?’ he asks. ‘When will automated human resources algorithms start weighing polygenic educational attainment scores when sorting through job applicants?’ These questions point to the possibility of students being grouped and clustered together by their polygenic scores, and the potential for enforcing new kinds of ‘biosocial collectivity’ within schools.

A significant problem with the potential translation of educational genomics into education policy is that genetics in education is extremely controversial and politicized. The publication in the mid-90s of The Bell Curve rekindled old debates about genetic determinism, eugenics and racialized discrimination in relation to IQ testing and the political uses of intelligence data. Concerns persist about this ‘new geneism’, and help account for the very careful, actively depoliticised packaging of the SSGAC study. A recent article in The New Statesman on the genetics of education identified deep polarization between right-wing advocates of genetics and left-wing critics, with the former preferring explanations based in biology and the latter seeking environmental explanations. A column reporting on the SSGAC study in the New York Times argued ‘progressives should embrace the genetics of education’, suggesting that ‘the power of the genomic revolution [can] be harnessed to create a more equal society’ while berating the ‘long tradition of left-wing thinkers who considered biological research inimical to the goal of social equality’.

Matters aren’t helped by the fact that some of the most outspoken advocates of genetic explanations for attainment, achievement and intelligence are divisive public figures such as Toby Young and Charles Murray (co-author of The Bell Curve). In a recent Spectator article titled ‘The left is heading for a reckoning with the new genetics’, Young attacked what he saw as liberal progressives’ ‘environmental determinism’ as ‘scientifically indefensible’. ‘Like Marx,’ he argued, ‘post-modernists believe that man’s true nature is reducible to the totality of social relations, that individuals are nothing more than the embodiments of particular class-relations and class-interests, and that everything comes down to the struggle for power. I wouldn’t expect an uncritical acceptance of the new genetics from that quarter’.

Drawing on an interview with Charles Murray, Young also speculated that left wing sociologists in particular would likely become irrelevant unless they embraced the new genetics by the mid-2020s. For Murray, this was even a source of deep concern, since he thought ‘once left-wing intellectuals finally let go of environmental determinism they may veer too far in the opposite direction and embrace gene editing technologies like CRISPR-Cas9 to try to create the perfect socialist citizen’.

Given Young’s proximity to education policymakers and politicians under the current UK Conservative government, his comments on genetics have caused widespread alarm among academics and educators. Generating policy proposals based on educational genomics in this tense environment, then, is likely to be a continuing source of deep controversy and irreconcilable political suspicions. It appears that education policy in coming years will have to engage in significant debate about genetics and even personal eugenics, requiring informed participation by social scientists whose views on the matter are currently subject to attack and ridicule by conservative commentators. Education policy studies of this scientific and political controversy will be essential.

Genetic exploitation
With growing awareness of the increasing power of genetic science in education, it is highly likely that commercial organizations will seek to exploit the opportunity to build an educational genetics market of services and products.

Consumer companies such as Google-backed 23andMe have already exploited the opportunities made available by the sequencing of the human genome to launch genetic testing services as commercial products. As 23andMe make up part of the team behind the SSGAC study, this commercial outfit has now not only positioned itself as part of the apparatus of education research, but could also stand to gain from extending into the provision of further educational genetics products. In the same week the SSGAC study was released, 23andMe also released details of a deal with big pharmaceutical company GlaxoSmithKline to use data from its 5 million customers of home genetics testing kits to design new drugs. The $300 million deal will see GSK and 23andMe applying artificial intelligence and machine learning to the medical discovery process, analysing genetic data from 23andMe and other sources such as UK Biobank. As a private company with vast genetic databanks, 23andMe is clearly positioning itself as a key part of the infrastructure of genetic science in pharmaceuticals and education.

Other companies are likely to see market potential in educational genetic testing products too. Already, concerns are emerging about startup companies seeking to exploit advances in human genomics research to produce genetic IQ tests. Cheap DNA kits for IQ testing in schools, in the shape of ‘intelligence apps’ or other genetic ed-tech products, may be feasible in the not-too-distant future, though considerable and understandable concern exists about their usefulness and ethics. Robert Plomin has proposed that DNA analysis devices such as ‘learning chips’ could make reliable genetic predictions of heritable differences in academic achievement, and it is easy to speculate how consumer-DNA companies could extend in this direction.

Major risks would emerge from the expansion of an educational genetics market. One is that as genetic predictions become accepted as forecasts of a child’s future ability, new approaches may emerge to ‘artificially select future generations’–a ‘eugenics 2.0’ for selecting ‘smarter kids’. While embryo screening programs probably remain unlikely in the West, large-scale efforts are already underway elsewhere to find the genetic code for high IQ. This raises the possibility of selection for intelligence becoming attractive to wealthy parents seeking genetic advantage for their children.

The merging of genetic science, big data and commercial speculation in education could lead to a new form of ‘platform scientism’, where the logics of capital accumulation and data analytics combine to push genetic testing and other profiling services in schools. The danger of such a scenario, as detailed in The Atlantic, is that obsession with these ‘slippery genetic predictions could turn people’s attention away from other things that influence how children do in school and beyond — things like their family’s wealth, the stress in their neighborhoods, the quality of the schools themselves’.

Critical postgenomic education research
The acceleration and expansion of educational genetics research as a big data science of attainment, achievement and even intelligence raises distinctive challenges for social scientific education research. Straightforward critique and rejection of genetics represents a possible form of resistance. However, within the wider field of sociology and STS research on postgenomics, researchers have begun to propose different forms of analysis and critique, with some educational researchers also working to get beyond simplistic critical reactions to new biological thinking in productive new ways.

Contemporary postgenomic science, with its emphasis on gene-environment interaction, offers an invitation for social scientists to explore how the biological and the social constitute each other. Biosocial studies, for example, acknowledge that the body, biology and brain are shaped by their social circumstances and environmental contexts. Commenting on contemporary postgenomic science, biosocial researchers argue that the social world gets ‘under the skin’ to impress upon the biological. They insist that bodies are influenced by power structures in society, becoming tangled with social, political and cultural structures and environments.

Biosocial work in education is just beginning to emerge. Developing a ‘biosocial education’ agenda, Deborah Youdell argues that learning may be best understood as the result of ‘social and biological entanglements.’ Biosocial education research therefore takes biology seriously, but also digs critically into the ways scientists have conceptualized the body and thereby made it amenable to experimentation and intervention.

A biosocial approach would seek to understand educational genetics in both biological and social scientific terms by appreciating that the social environments in which learning takes place do in fact inscribe themselves on bodies and brains. The genetic and neural data of contemporary postgenomics would have to be understood from a biosocial view as data about social processes, not only biological processes.

Since genetics is a highly data-intensive and software-saturated field of experimentation and knowledge production, a biosocial perspective would also address the implications of data processing of students’ genetic and neural details. Taking further cues from STS, it would acknowledge that data are always a partial selection, that their analysis through vast data infrastructures of methods and software packages matters a great deal to the results produced, and that the results can influence what happens in educational settings. Is the ‘quantified human’ held in a database and represented by a polygenic score really detailed enough to yield insights to intervene upon students? Additionally, biosocial research would be alive to the possible consequences of for-profit commercial companies building software platforms for collecting and analysing students’ genetic and neural information.

The million-sample SSGAC study is clearly a landmark in postgenomic education science. It is a field of experimentation and knowledge production requiring novel forms of social scientific and philosophical analysis. A biosocial approach may be one way forward, but it is clear that educationalists need to develop a range of concepts and methods in order to perform critical postgenomic education research as the genetic science of education expands and accelerates.


Edu-business as usual—market-making in higher education

Ben Williamson

The Nido Spitalfields Tower, the world’s tallest student accommodation, sits on the boundary of the financial district of London. Image by UggBoyUggGirl

The global education business Pearson has established itself as a major player in higher education around the world. With core business interests in digital online courses and alternative models of HE provision, Pearson is currently making significant in-roads into British universities and the HE sector more widely. From a critical perspective, Pearson’s ongoing business activities appear symptomatic of the further marketization and privatization of contemporary HE under current government policy and regulation. A ‘neoliberal takeover of higher education’–the subtitle of a tight little book by Lawrence Busch–means universities are increasingly focused on achieving market value through competition, performance metric ranking, consumer demand, and return on investment. However, we need a better understanding of the specific role of edu-businesses such as Pearson in remaking higher education as a market.

Newspaper coverage in The Telegraph in June 2018 reported on a new law passed by government permitting the Office for Students (the HE regulatory body) to share student data with Pearson, HMRC, the Student Loans Company, and the Competition and Markets Authority. Headlined ‘University students’ data to be shared with private companies’, the article focused on the risk of student data being exploited for profit, threats to student privacy, and potential re-use or sale of the data for undeclared purposes by Pearson. Response to the news demonstrated considerable concern about Pearson’s role in furthering business interests in HE through its use of student data. Pearson is a major, multi-billion dollar market actor with a huge global business, and is participating in the expanding data infrastructure of the UK’s higher education system too–but questions remain about exactly how Pearson participates in marketization of HE itself. This work-in-progress is an attempt to think these issues through–a working paper rather than a blog post–and the third part of a series on key actors in the expanding data infrastructure of higher education (the first was on the Higher Education Statistics Agency, the second on the Office for Students).

Market-making micro-processes
To understand Pearson’s role, I adopt a framework from Susan Robertson and Janja Komljenovic to analyse marketization in HE. They have adapted it from Çalışkan and Callon, who define ‘marketization as the entirety of efforts aimed at describing, analysing and making intelligible the shape, constitution and dynamics of a market’. For Çalışkan and Callon, markets ‘organize the conception, production and circulation of goods’. Importantly, though, markets depend on a complex arrangement of rules and conventions, technical devices, metric systems, calculating equipment, logistical infrastructures, texts, technical and scientific knowledge, and human competencies and skills—all of which are engaged in power struggles over the definition and valuation of goods. Their definition and approach to marketization—as effort and coordination among people, institutions and things—applies across a diversity of markets.

Following this approach to study the effort of marketization in the HE context, Robertson and Komljenovic therefore argue ‘markets do not simply appear’ as the outcome of market ideology, but instead ‘are both made and remade, as new products and services, frontiers and spaces, are imagined, invented, implemented, inventoried, vetted and vetoed.’ In particular, they focus on how the formerly non-market space of higher education has been reframed and re-made as an ‘education services market’, and subsequently how these HE markets work. The market-making process in HE involves considerable ‘investment’ at the macro-level by policymakers, politicians, investment advisors, education firms, and universities to imagine higher education as a market to be opened up and exploited. At the micro-level it also involves the ‘nuts and bolts’ of creating higher education products and services that can be exchanged in a range of marketplaces. As such, understanding HE marketization requires not just macro analysis of neoliberal political ideology, but micro analysis of the practical, material, technical and discursive effort of market-making and maintenance.

To better understand the micro-processes involved in HE market-making, Robertson and Komljenovic—via Çalışkan and Callon—identify processes of (1) pacifying goods, (2) marketizing agencies, (3) market encounters, (4) price-setting, and (5) market design and maintenance. These analytical categories provide a useful way to think about the emerging role of edu-businesses such as Pearson in the everyday practices and processes of contemporary universities.

Pacifying goods
The first micro-process of pacifying goods refers to how things and services are represented as describable and predictable ‘packages’ with fixed qualities to which value and price can be attached. Robertson and Komljenovic offer the examples of a university being packaged as an object for investment, ‘student experience’ as a product with distinctive elements for students to consume, or ‘business intelligence’ as information software worth purchasing to assist strategic decision-making by university managers.

Pearson’s core business model within the higher education sector depends on the production of packages of goods and services in which it hopes universities will invest. Behind the recent Telegraph coverage of Pearson’s data-sharing agreement with the Office for Students is a longer history of Pearson involvement in producing services and products for HE. As an alternative HE provider, it established Pearson College London in 2012, becoming the only FTSE 100 company in the UK to design and deliver degrees (validated by the universities of Kent and Bradford). As a higher education provider, Pearson has legitimate reasons, like other providers, to require data access from the OfS for these purposes (as it did previously via HEFCE).

Pearson also offers online degree programs, with several UK universities entering into long-term 10-year deals with the company to deliver courses (at present these are King’s College London, Leeds, Manchester Metropolitan, and Sussex, with others in the US too). Through its ‘full-service approach to creating online degree programs or individual learning solutions’, Pearson’s online learning services are presented as streamlined technical systems and standardized program management packages for universities to purchase in order to ‘help you expand access, reach each student, and improve achievement’.

The process of rendering its services and products as standardized packages within HE markets, however, has required significant investment of company effort to justify government registration and student fees as an HE provider. Pearson College London advertises itself as ‘powered by industry experience’ and, through ‘work with industry giants from Unilever, L’Oreal, WPP and IBM, to Framestore, Double Negative, MPC and The Mill’, it has established itself as a distinctive market provider which is ‘transforming higher education’. As such, these industry partners have become part of the package of Pearson’s HE market offer to fee-paying students.

In order to further expand the model as a viable and marketable package, Pearson also released Demand Driven Education: Merging work and learning to develop the human skills that matter, a report predicting a shift in ‘future skills’ requirements for students (based on data from the Future Skills project collaboration between Pearson, Nesta and the Oxford Martin School). Its authors concluded that a transformation in HE would be needed to achieve these future skills. If earlier HE reforms had focused on widening access and improving academic success, ‘demand driven education’ would ‘focus more strongly than ever on ensuring graduates are job-ready and have access to rewarding careers over the course of their lifetime’.

The Future Skills landscape, mapped by Pearson, Nesta & Oxford Martin School

As these examples indicate, Pearson has sought to ‘pacify’ its goods and justify them for investment—by universities and prospective students—both by appealing discursively to the ‘widening access’ priority of the university sector, and by actively prompting a shift toward industry-led, future-skills-focused, and demand-driven higher education through the material circulation of glossy reports and websites. It has produced technical systems and logistical infrastructure for program management to ease universities into the online learning market too.

In these important ways, Pearson is participating in making an increasingly competitive HE market in which it is itself a competitor, with an alternative provision that sets it apart from the conventional degree provision of most established universities. At the same time, the model of flexible, technology-infused provision it offers is also increasingly the model pursued by existing HE institutions, indicating how the commercial online learning model is becoming the focus of market competition among universities themselves. Along with its competitors, Pearson has standardized, stabilized and packaged online learning to create a market within UK higher education. Along the way, students have been packaged in terms of marketable ‘future skills’ whose development universities need to invest in as human capital, and universities have been reframed as market providers of ‘valuable’ demand-driven education services.

Marketizing agencies
Marketizing agencies refer to the actors competing to define what is a valuable good or service, which takes place among people, technologies, laws and forms of calculation. As such, marketizing agencies within HE include human actors such as market analysts, data managers and business intelligence officers, but also computer software, business strategies, and private company support.

Pearson has established itself as a powerful marketizing agency in HE, carefully defining through its glossy reports and brochures such as Demand Driven Education what are valuable goods and services for contemporary universities to offer. Through its ‘full service’ online learning packages, it offers its expertise as a global ed-tech courseware and platform provider in ways that have produced conviction in its offerings among university leaders, the Department for Education and the Office for Students. Indeed, the law itself has been changed through the statutory instrument signed off by the DfE’s HE minister Sam Gyimah to enable Pearson and the OfS to share data.

Pearson is also bringing novel kinds of practical ‘know-how’ and expertise into HE—both human experts who know how to engage with complex digital technologies and data, and nonhuman technologies of expertise that can enhance universities’ engagements with their data. It has, for example, positioned itself as a leading centre of expertise in digital data analytics for education (including performance metrics and comparative methodologies) at a global scale, across both the schools and universities sectors. It has developed specific technologies such as data dashboard software packages to allow university leaders and administrators to measure institutional performance through metrics and indicators. The development of these technologies positions it both as a market provider with product, services and expertise to sell and share, and as a market-maker, seeking to prompt universities to see themselves in quantitative terms as performance rivals and competitors with other providers.

Pearson puts higher education under the microscope in Demand Driven Education

As a marketizing agency, what Pearson can do depends on its computer and mathematical equipment, as well as on the cognitive activities of its experts—its software developers, data analysts, education advisers, courseware designers and so on. This hybrid of human expertise and nonhuman equipment enables Pearson to function as a marketizing agency.

However, Pearson is of course in a struggle with other agencies to define what counts as a valuable service—indeed, to define the value of higher education itself. Universities themselves are marketizing agencies, as are the Department for Education and the Office for Students. A key actor in the marketizing agencies involved in HE market-making is Sir Michael Barber, the Chief Education Adviser for Pearson from 2012 before taking up the post of Chair of the Office for Students in 2017. A former senior adviser in the Prime Minister’s Delivery Unit under Tony Blair and education adviser to David Blunkett, he was also a member of the review group for the Browne Review of university funding in 2009-10, and served as a partner of the consultancy McKinsey’s, heading its Global Education Practice programme. Barber physically embodies a meeting-point between agencies, rendering porous the boundaries between government agency, consultancy and private company. He represents effectively how the capacities of agencies across the private and public sectors, such as Pearson and the OfS, have begun to dominate HE institutions, imposing their market model of HE as a valuable consumer commodity upon the sector. This exercise of power is at the core of contemporary struggles by many university employees over the purposes and practices of the university.

Market encounters
Market encounters then refer to how agencies and goods meet one another, such as at higher education fairs, conferences, seminars and other events, as well as through social media, web pages and other online and material arrangements. One might say that the Pearson online learning environment is a key site of market encounter. It brings together the commercial provider, the university, students, and staff into a shared space where diverse investments are made in each other and value is produced for each agency through relations made possible by the software service.

More straightforwardly, Pearson invests considerable effort in staging market encounters with the HE sector. Barber himself, in his prior Pearson role, contributed to various events and material publications promoting a transformative model of HE. His co-authored report An Avalanche is Coming (published by the IPPR think tank) made the argument that:

University leaders need to take control of their own destiny and seize the opportunities open to them through technology – Massive Open Online Courses (MOOCs) for example – to provide broader, deeper and more exciting education. Leaders will need to have a keen eye toward creating value for their students. Each university needs to be clear which niches or market segments it wants to serve and how. The traditional multipurpose university with a combination of a range of degrees and a modestly effective research programme has had its day. The traditional university is being unbundled.

The report particularly emphasized competition between universities and online providers, ensuring education for employment, supporting alternative providers and the future of work, and recognition that the ‘new student consumer is king’. Universities not adapting to these challenges and opportunities risked being swept away by the avalanche of change brought by technology—or, in other terms, market failure. Five years on, the report reads like a template for HE market reform under the Higher Education and Research Act 2017 and the regulatory strategy of the OfS under Barber. As a result, consensus is growing among UK government departments and agencies for the model of HE promoted and offered by Pearson for consumption in the HE market, as its growing presence in the sector demonstrates.

This consensus and market-consolidation is also demonstrated by the Department for Education announcement of an open data competition allowing software developers access to longitudinal student employment and earnings outcomes data in order to create apps or online services to help prospective students choose courses and institutions. (On its launch, HE Minister Sam Gyimah tweeted: ‘We want students to be better informed about degree choices & the returns–today, we’re officially launching a competition for tech companies to take graduate data & create a MoneySuperMarket for students, giving them real power to make the right choice’). The logic of the competition is that student choice is best made on the basis of future earnings, in ways highly similar to Pearson’s own emphasis on career-readiness courses and demand-driven education. But an additional feature of the competition is that it forces prospective students to think of HE as a marketplace, and to see themselves as future ‘human capital’ whose choices about which universities to attend and courses to study are a form of self-investment which will affect their future prospects and value in labour markets.

The eventual products of the competition—whether they are apps or other types of MoneySuperMarket-style online price-comparison services—will themselves become mediating sites of market encounter between students and universities. They will act as sites where the value of a degree, as a pacified good, becomes a matter of calculation. Universities will have to calculate about how best to present the value of their service, algorithms will calculate the data for national comparison and visualization, and students will have to calculate about how to choose in their best interests. As such, these apps and services will be key market-making devices.

The price of a good or service is established through struggles between the different agencies that encounter each other, for example over how much to charge or pay for a service or product. Pearson is an important actor in price-setting in HE because it is offering alternative degree pathways and full service online provision; as such, it is itself in a competitive market among other online HE providers for university customers. It is also interested in how students, seen by its CEO John Fallon as ‘the Spotify generation’, may themselves ‘pay for use. They don’t want to buy to own, and they only want to pay to use things that are directly relevant to their course and their outcomes’. So its price-setting model is adapting to the market logics of online streaming services, and treating students as a direct-to-consumer market.

Deciding how much to pay for a service is a key aspect of market-making in the university sector (this of course is the heart of government disappointment in the failure of universities to differentiate fees in England). Within universities, though, as Robertson and Komljenovic observe, administrators routinely have to make budgetary decisions regarding the purchase of goods or services, and price is sometimes secondary to personal relations or trust in certain suppliers. Through high-profile partnerships with UK universities that speak the same language as Pearson (emphasizing ‘flexible online study’ and meeting ‘the demands of the evolving labour markets’), the company is establishing itself as a value-for-money provider in an emerging marketplace where traditional universities are hybridizing with alternative providers. It is not the only provider in this space, and is in a struggle with competitors through formal processes such as tendering and procurement.

A selection of recent Pearson reports focusing on digital transformations in education

The hybrid model of the traditional university partnering with a private company to offer online courses is an interesting example of a particular kind of market. Robertson and Komljenovic note that many universities are involved in ‘inside-out and for-profit’ activities where they are involved in market exchanges by selling services to others for profit—such as selling ‘student experience’ in the shape of study programmes to overseas students. Universities are also involved in ‘outside-in and for-profit’ activities where they act as buyers and contribute to the profits of other actors—such as software vendors, data suppliers, and other outsourced providers of services. The hybridized partnership model that Pearson is establishing creates a market that is both inside-out and outside-in at the same time, with the university gaining advantage from investing in Pearson (in terms of profit-turning distance student fees), and Pearson gaining an advantage through returns on its investment in the shape of paying student customers.

Universities, driven by the imperative to increase revenue, are increasingly seeking ways of recruiting international students to their online offerings, thus opening up the market to multiple players and catalysing a price-setting competition between different online providers. At the same time, the price-setting is mirrored by the revenue generation promises of the online service for the university, and by the return on investment projections for the provider. Reputation for all the agencies involved is also a factor–Pearson gains reputational advantage from being embedded in elite institutions, while institutions gain reputational advantage from appearing innovative in digital delivery of future-focused, demand-driven services.

Market design and maintenance
The last micro-process of market-making—design, implementation, management, and maintenance—describes how various elements are brought into being and reproduced to enable ongoing stability, continued extraction of profits, and efficient value-for-money use of resources.

In order to maintain its own market position, Pearson has established a ten-year partnership model with a number of universities to provide online learning services, which is establishing its longevity in UK HE as well as scaling up its online learning services, one of the fastest growing parts of the business. It has built long-term relations of trust with its partner institutions. Through the provision of well-packaged products, providing expertise, staging encounters with the sector, and establishing agreements over price and value of its provision, Pearson is seeking to build and maintain the market for its products to ensure its long-term stability and profitability. In this sense—as Çalışkan and Callon observe in relation to markets more generally—Pearson and its partners don’t just trust each other, but invest considerable hope in the market relationship they are developing. There are market emotions at play in these efforts to implement, manage, and maintain a market of online learning services.

Through reports such as An Avalanche is Coming and Demand Driven Education, Pearson is also involved in market design. It is establishing a discourse and an imaginary of the reformed, transformative future of HE in ways which are closely aligned with the governmental objectives of the Department for Education and the Office for Students as a market regulator. For example, as one of the most powerful figures in British HE, Michael Barber is highly involved in shaping the sector, and is pursuing the same vision as chair of the OfS as he did in his role as chief education adviser at Pearson.

Here it is important to return to the data-sharing issue raised in The Telegraph. While Pearson may well have legitimate reasons to access OfS data as an HE provider, it also has ambitious plans around the use of digital data analytics in HE in ways that reinforce the data-led reforms represented by the OfS. The two organizations share an imaginary for the future design of HE. One obvious point of congruence is that Pearson’s online learning services will be able to provide the kind of fine-grained student data that conventional universities cannot. These data will be available on dashboards for university teachers and administrators to inspect in order to assess and evaluate the performance of courses, staff and students, in ways which reflect the OfS emphasis on performance metrics.

Pearson’s data-led ambitions go beyond performance dashboards, however. Demand Driven Education, for example, highlights the potential of using AI and ‘predictive talent analytics’ to match students to career paths. This idea is highly congruent with the DfE’s software competition linking students and courses to earnings potential. Additionally, Pearson has invested considerably in data-driven digital technologies for use in the HE sector, including learning analytics and adaptive learning platforms that require access to huge quantities of past student data and real-time data from student activities on digital courses. It even has a partnership with IBM Watson to embed ‘AI tutors’ in digital courseware that can constantly track a student’s actions and progress, and then ‘interact’ to ‘improve student performance’.

Clearly, these kinds of data analytics and AI technologies will require access to vast databases of student information. Their rollout would create a new market of student data that would be valuable to AI systems in a market exchange where students surrender their information in exchange for personalized learning support. As such, a clear and shared imaginary of a technology-intensive, demand-driven, skills-focused HE infuses both governmental and commercial ambitions to design and maintain a highly marketized higher education sector.

Edu-business as usual
The marketization of contemporary higher education has been brought into being and sustained through a range of processes, many of which Pearson is involved in. Of course, Pearson is not alone in making HE into a market, but it is a significant actor as a private company and a provider of digital technologies required by universities to compete in the imagined HE landscape of the future. Contemporary universities are increasingly involved in different kinds of markets and market exchanges, all of which involve considerable social activity, technical involvement, and effort to make, manage and maintain. Pearson is moving its business considerably into the making of HE markets, and establishing ‘edu-business as usual’ as the reformatory model for the future of higher education in the UK and beyond.


Comments on ClassDojo controversy

Ben Williamson

The educational app ClassDojo has been the target of articles in several British newspapers. The Times reported on data privacy risks raised by the offshoring of UK student data to the US company–a story The Daily Mail re-reported. The Guardian then focused on ClassDojo promoting competition in classrooms. All three pieces have generated a stream of public comments. At the time of writing, there are 56 comments on the Mail piece, 78 at The Times, and 162 on The Guardian. I’ve been researching and writing about ClassDojo for a couple of years, on and off, and was asked some questions by The Times and The Guardian. So the content of the articles and the comments and tweets about them raise issues and questions worth their own commentary–a response to key points of controversy that also speaks to wider issues with the current expansion of educational technology across public education, policy and practice. ClassDojo has also now released its own response and reaffirmation of its privacy policy.

ClassDojo is highly divisive. Online newspaper comments often degenerate into polarized hectoring, but it is apparent (from both the comments and Twitter reactions) that the expansion of ClassDojo has both enthused some teachers and appalled others. More subtly, some teachers dislike the reward app but like the social media aspects of it, which allow them to streamline messaging to parents and upload photos, videos and examples of student work. Other teachers appear to find the parent messaging a burden, as it makes them available to parents on-demand at all times. These tensions in themselves are reason for some caution regarding ClassDojo marketing claims that the product creates ‘happier classrooms’ and ‘connects teachers with students and parents to build amazing classroom communities.’ More pressingly, they point to real tensions over ed-tech apps among the teaching profession, and the potential of substantial non-use and resistance, as education becomes increasingly digitized.

Teachers’ views about ClassDojo have not been sought. Some comments pointed out that while the newspapers consulted experts and pundits (and ‘PC snowflakes’), none asked teachers about ClassDojo. As I pointed out to The Guardian, there simply is not a body of evidence of how ClassDojo is being used in practice (unless I’ve missed it). This is going to be a large research task since, as many comments pointed out, ClassDojo is used in very different ways as teachers adapt it to their own practices. It’s also in use around the world, in multiple languages. Nonetheless, detailed studies of the situated and contextualized uses of ClassDojo need to be undertaken to listen to teachers’ voices, observe how the app slips into classroom practices, and trace out the effects on children. While I would welcome more teachers’ voices about ClassDojo in the press, too, it’s important to be aware that ClassDojo recruits its own teacher ‘mentors’ and has a ‘Mentor Community’ of early-adopters. The mentors act as advocates for the app, with support from the company, to spread the word to other teachers (as explained in this interview from 28 minutes in). Although it appears ClassDojo has benefited from grassroots momentum, it has choreographed its bottom-up growth too. So selecting teacher voices to cut past its arm’s-length marketing community would be important.

Is adequate informed consent being sought and secured? As noted in The Times, the privacy policy for ClassDojo is 12,000 words, raising concerns that neither teachers nor parents are likely to fully understand the implications of signing children up to it. With the introduction of GDPR, this could raise problems: probably not for ClassDojo, which has a dedicated team of privacy consultants to ensure its compliance, but for schools if found to be breaching data laws. Ultimately, it is schools and teachers that collect and use the data, that are responsible for gaining informed consent from parents, that opt children in to ClassDojo or agree to parents’ opt-out wishes–again, we have too little evidence of school procedures to know the risks here. One comment at The Guardian reported resentment at the claim that the app had extended into teachers’ hands before awareness of the risks it raised had been considered. But this point was not an attack on teachers. It reflects a concern that teachers are being positioned as data privacy, security and consent experts when it is highly unlikely these are part of their initial professional education or continuing development. Nor, really, should teachers be expected to shoulder such responsibilities, especially if they carry legal consequences. Nonetheless, I think the lack of clarity here should trigger efforts to define what kind of ‘data literacy’ teachers, school leaders and governors may need in order to decide whether to use a free online ed-tech app or service, and what paperwork needs to be completed to ensure its use is ethical and legal. ClassDojo isn’t alone in raising difficult issues about consent. Pearson came under fire recently for experimental uses of student data without seeking their consent too.

Data privacy and protection concerns remain. ClassDojo has been dealing with privacy concerns since its inception, and it has well-rehearsed responses. Its reply to The Times was: ‘No part of our mission requires the collection of sensitive information, so we don’t collect any. … We don’t ask for or receive any other information [such as] gender, no email, no phone number, no home address.’ But this possibly misses the point. The ‘sensitive information’ contained in ClassDojo is the behavioural record built up from teachers tapping reward points into the app. ClassDojo has a TrendSpotter feature to allow analysis of those points over time. School leaders can view it. The behavioural points can follow children from one class to the next. Parent email addresses are required and are stored. While there is currently no indication of any kind of leak or breach from ClassDojo, there has been a steady increase in school cybersecurity incidents which raise wider questions regarding the security of student data. Even the well-resourced education platform Edmodo was hacked recently, with the theft of 77 million users’ details. As reported in The Times, just like the commercial, financial and health sectors, ed-tech is not impervious to data security and privacy breaches.

Is ClassDojo monetizing student data? ClassDojo’s founders have stated clearly they will never sell student data for advertising. How it intends to make a profit and secure return on investment for its generous funders, however, remains unclear, giving rise to concerns about its monetization of student data. It has in the past suggested it could use those data to sell behavioural reports back to schools or even local authorities. It has also suggested it could sell ‘Education Bundles’ to parents (see from 51mins here). Its response to issues raised by the press confirmed it was seeking to produce saleable premium features. These are business proposals at present, and easily give rise to concerns about how the data may in future be used to make profit. As one commenter to The Guardian pointed out, ClassDojo needs to reassure teachers and parents by issuing clear and unambiguous statements about how it uses or intends to use the vast database of student behaviours it holds. It is not hard to imagine behaviourally-targeted premium content becoming feasible as it seeks to monetize the platform. Such fears may be unfounded. But it has to provide a return on investment for its investors at some point. It seems unlikely it will do so through sales of cuddly branded toys alone. Another way of securing a return on investment might be to sell the company, which would mean all ClassDojo data coming under its new owner’s privacy policy. Parents would be given 30 days to delete their child’s data in the event of a sale.

ClassDojo is Big Brother with a jolly green face. Not only does ClassDojo capture student behavioural information through the reward app; it also gathers photos, videos, digital portfolios of work, and permits messaging between teachers and parents. The company has slowly shifted from the behaviour app to become more like a social media platform for schools–even the rewards mechanism is similar to social media ‘liking.’ Just as Facebook presents itself as a platform for communities, ClassDojo’s founders and funders see it as the platform for building ‘amazing communities’ of children, teachers, schools and parents. The addition of ‘school-wide’ functionality makes it into the main communication mechanism for many schools, and a way for school leaders to have oversight of class data. Whether ClassDojo is really building ‘amazing communities’ is an empirical question. Researchers of social media have identified the commercial imperatives and surveillance mechanisms behind their ‘community’ ideals. ClassDojo has subtly worked its way into the central systems of schooling, shaping how teachers think about and monitor student behaviour, reconfiguring how teachers and parents communicate, giving headteachers new ways of observing behavioural trends, and giving parents ‘real-time’ ability to track and watch their children in the classroom. It is shaping what a school community should (ideally) be and how it can connect, with student behaviour metrics at its core. Many commenters on the newspaper stories raised fears about the effects of constantly monitoring and quantifying children. Studies of ClassDojo as a platform would help to reveal its community-building effects, and interrogate to what extent it extends surveillance in schools.

ClassDojo is offshoring student data. Both The Times and Mail reported that ClassDojo offshores sensitive student information to the US. My understanding from the ClassDojo website is that all information it collects is stored by Amazon Web Services, so it could be in Dublin, somewhere in mainland Europe, or in the US. Amazon currently has no cloud storage facility in the UK. But AWS is now part of the backbone of the web (as well as government intelligence), so ClassDojo offshoring data is not unique. AWS has also made it extremely cheap to set up social media sites as it drastically reduces costs of data storage and access. In this sense, ClassDojo is part of the massive expansion of Amazon power across the internet and world wide web, and emblematic of how individuals’ personal information is increasingly distributed, offshored and scattered in cloud computing centres. It does raise the question of just how much influence and commercial gain Amazon may be developing in public education though.

Third party data use. AWS is just one of many third party services employed to help run ClassDojo. The Times latched on to DataDog due to a data breach a couple of years ago, and noted Google and Facebook too. As I understand, Google supplies web analytics—the kind of data that permits ClassDojo to monitor user numbers, visitors to the site, frequency of use of the service and so on. The newspaper coverage may have led readers to understand sensitive student data was being shared with these third parties—or even sold to them. Some commenters immediately presumed the data was being sold to Facebook and Google for targeted advertising (the phrase ‘if the product is free, you’re the product’ was repeated in a lot of the more critical comments). ClassDojo have constantly reiterated that selling student data for advertising is not their business model, and The Guardian reported that too.

ClassDojo is just a digital ‘sticker chart’ or ‘house points’. There are, of course, continuities between ClassDojo and older practices of rewarding and disciplining students. The difference between sticker charts and ClassDojo is that the awarding or deduction of points can be viewed by parents, that the points become a persistent behavioural timeline that can be viewed for trends by teachers and/or school leaders, and that records can be carried across as children move from one class to another. It is much more sticky than sticker charts, which is why, as The Guardian reported, it raises concerns about labelling students in behavioural terms.

ClassDojo is behaviourist & promotes competition. As The Times and Mail reported, ClassDojo promotes ‘gamification’ by ranking students by number of points, which potentially incentivizes students to seek further points through actions they know the teacher will reward—rather than out of interest in the topic of study itself. The Guardian suggested this could make classrooms overly competitive. Of course, there are issues here of the reproduction of existing inequalities. Is the awarding of dojo points equally distributed across socio-economic, ethnic and gender categories? It also raises issues about the central behaviourist mechanism of ClassDojo, which is based on theories of positive reinforcement of ‘correct behaviours’ through issuing rewards and punishments. But who says what’s ‘correct’ behaviour, and on what basis? Apps like ClassDojo appear to be ‘nudging’ students to conform to the behavioural ideals that their designers have programmed in to the software.

ClassDojo exemplifies the growth of positive psychology education. The ClassDojo company is quite clear what ‘correct’ behaviour looks like—it’s behaviour that indicates a student is developing a growth mindset, grit and character. Its founders always talk about these ideas in media interviews, and cite as their major influences the psychologists Carol Dweck (growth mindset) and Angela Duckworth (grit). ClassDojo even ran a ‘Big Ideas’ series of animations teaching children and teachers about growth mindset and how it can be observed in students’ behaviours. Growth mindset in particular is now a hugely popular idea in education, but it’s not uncontested. A recent meta-analysis of growth mindset studies showed very small effects on student achievement, which seems to suggest that claims about the benefits have been overblown and oversold. One Twitter comment likened ClassDojo to ‘corporate Buddhism.’

Ed-tech is taking over the classroom. The ClassDojo controversy exemplifies wider recognition of the influence and impact of the ed-tech industry in shaping what happens in schools, as some comments noted. The ed-tech industry has circulated the idea that public schooling is broken—too much one-size-fits-all teaching and high-stakes testing leads to disengaged and stressed kids—and that their apps and analytics can fix it by ‘personalizing’ learning and thereby support the development of students’ resilient growth mindsets. Such a view has helped the ed-tech industry promote itself as the solution to public problems, and to begin inserting itself actively within the daily routines of schools. ClassDojo has expanded through social media network effects as a free app into the hands of teachers in schools all over the world, ultimately transmitting its company vision of what classroom behaviours should be like into the actions of teachers and students. In many ways, this appears profoundly undemocratic, as responsibility for defining the aims and purposes of public education around the world is assumed by tech-sector entrepreneurs according to their own readings of popular psychological and behavioural theory.

ClassDojo promotes neoliberal, individualized responsibility. From an overtly sociological perspective, ClassDojo is part of a movement in education policy, technology and practice to hold individuals responsible for their behaviours while completely ignoring all the contextual, cultural, socio-economic and political factors that shape students’ behaviours. For sociologists of ‘character education,’ for example, the idealized student under contemporary neoliberal austerity is an entrepreneurial, resilient and self-transforming individual who can take personal responsibility for dealing with chronic hardship and worsening insecurity. As part of the movement to enhance student character and mindset, ClassDojo may be reproducing this ideal, inciting teachers to issue positive reinforcement rewards for behaviours that indicate the development of entrepreneurial characteristics and individual self-responsibility.

Data-danger is a new media genre. The risks of ‘data-danger’ for children reported in the articles about ClassDojo doubtless need to be viewed through the wider lens of media interest in social media data misuses following the Facebook/Cambridge Analytica scandal. This presents opportunities and challenges. It’s an opportunity to raise awareness and perhaps prompt efforts to tighten up student privacy and data protection, where necessary, as GDPR comes into force. ClassDojo’s response to the controversy raised by the press confirmed it was working on GDPR compliance and would update its privacy policy accordingly. Certainly 2018 is shaping up as a year of public awareness about uses and misuses of personal data. It’s a challenge too, though, as media coverage tends to stir up overblown fears that risk obscuring the reality, and that may then easily be dismissed as paranoid conspiracy theorizing. It’s important to approach ed-tech apps like ClassDojo–and all the rest–cautiously and critically, but to be careful not to get swept up in media-enhanced public outrage.

Image: ClassDojo

The Office for Students as the data scientist of the higher education sector

Ben Williamson


Data play a huge role in British higher education. The new regulator for the sector, the Office for Students, will escalate data collection and use in HE in years to come. Improving student information and data to ‘help students make informed decisions’ is one of its four key strategic priorities, but it has also raised concerns about its use of student data to increase competition and market pressure. The Labour Party has recently tried to block it in a bid to prevent the further entrenchment of market-oriented higher education policy in the system. However, there remains a need to focus in close detail on how the OfS will use data in its remit as a market regulator.

The cover of the recently published OfS regulatory framework for higher education in England gives some indication of how the new regulator sees itself as a data-centred site of sectoral expertise. It features a scientist peering into a microscope with apparent satisfaction about what she sees. The scientist, naturally, is the OfS, performing experiments and observing the results; the microscope is the technical and methodological apparatus that allows it to see the sector; and (out of shot) is the university, flattened on to glass for inspection–made legible as data to be zoomed in on, scrolled across, examined and compared with other samples from the sector.

The idea of the OfS as a scientist of the sector–or more specifically, as a data scientist of the sector–is intriguing. It smacks of assumptions of scientific rigor, objectivity, and innovation. This form of metric realism, which assumes data tell the truth, is the central epistemology of trends in datafication. The reality of ‘laboratory life’ inside the OfS, like all labs, is doubtless more fraught with disagreement, negotiation and compromise, as STS studies of science practices might note. Nonetheless, the OfS regulatory framework document is a key inscription device that, for the time being at least, gives us the best clues of its planned data activities over the coming years.

As part of ongoing work into the data infrastructure of higher education, I’ve spent some time with the regulatory document, trying to figure out how student data are likely to be used in future years (on uses of research data see the Real-time REF Review project). The OfS is just one of many actors involved in a project to upgrade the core infrastructure for student data collection–a decade-long project that’s been going on 7 years already and is due for national rollout in 2019/2020.

In these notes, I lay out some of the key things the OfS says about ‘data’ in the document. There are 87 uses of the word data in it, so through light-touch discourse analysis I’ve attempted to categorize the various ways the OfS approaches data. I’ve deliberately kept a lot of quotations intact with the addition of a few annotations.
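
The counting step behind this kind of light-touch analysis can be sketched in a few lines of Python. This is only an illustrative sketch, not the method actually used for these notes: the function names `count_term` and `keyword_in_context` are hypothetical, and the sample text simply stitches together two phrases quoted from the framework below rather than the full document.

```python
import re

def count_term(text, term="data"):
    """Count whole-word, case-insensitive occurrences of `term` in `text`."""
    pattern = rf"\b{re.escape(term)}\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

def keyword_in_context(text, term="data", width=40):
    """Return a short snippet of surrounding text for each occurrence of `term`."""
    pattern = rf"\b{re.escape(term)}\b"
    snippets = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        snippets.append(text[start:end].strip())
    return snippets

# Assumption: a tiny stand-in for the framework text, built from two quoted phrases.
sample = (
    "The OfS will develop a data strategy in 2018. "
    "The information and data the OfS requires will be wide-ranging."
)

print(count_term(sample))
for snippet in keyword_in_context(sample):
    print("...", snippet, "...")
```

Whole-word matching means that compounds such as ‘datasets’ are not counted, so a tally produced this way depends on how the term is defined. The snippets are the raw material for categorizing each use by hand.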

Data as strategy
The first point is that ‘The OfS will develop a data strategy in 2018’ and ‘The information and data the OfS requires to fulfil its functions will be wide-ranging’ (20). This is both mundane and not. The fact that it is developing a data strategy at all–and a wide-ranging one at that–is indicative of how the OfS will make data into a central aspect of HE regulation. As Andy Youell of HESA (Higher Education Statistics Agency) has written, the framework represents a shift from ‘data informed to data led’ regulation with data analysts playing an increasingly influential role in HE policy.

The chair of the OfS, of course, is Sir Michael Barber, a long-standing advocate of metrics and performance delivery models in different aspects of government. His most recent role was as education adviser to the global educational company Pearson, where he oversaw its organizational pivot toward big data, predictive learning analytics and adaptive learning. Under his leadership, the OfS too approaches data as a core strategy for fulfilling its mandate.

Regulatory data
The primary remit of the OfS is to regulate HE, and it is positioning data as a core component of that work:

The use of information, including data and qualitative intelligence, will underpin how the OfS undertakes its regulatory functions. The OfS will take an information-led and proportionate approach to monitoring individual providers, ensuring that students can access reliable information to inform their decisions. (19)

Key terms here include ‘monitoring’, which confirms concerns that the OfS will possess powers of data-led performance measurement. As well as monitoring individual institutions, the OfS will ‘Monitor the sector as a whole, to understand trends and emerging risks at a sector level and work with the sector to address them’ (20). However:

This regulatory framework does not … set out numerical performance targets, or lists of detailed requirements for providers to meet. Instead it sets out the approach that the OfS will take as it makes judgements about individual providers on the basis of data and contextual evidence. (15)

From ‘monitoring’, then, comes ‘judgement’ from data and other evidence. The OfS comes across as a suspicious actor of evidence-based policy.

Improvement data
Another key use of data by the OfS is to ‘Target, evaluate and improve access and participation, and equality and diversity activities’ (20). As such, monitoring and judgement become the basis for targeted improvement plans, with HE institutions specifically singled out if underperformance is detected from the data in specific areas. As is well-known, the OfS will also ‘Operate the TEF’ (20) and take the outcomes of the 2018 statutory TEF review ‘into account as it considers the future scope and shape of the TEF’ (24). As such, it will be the main data-led judge of teaching quality and improvement in the sector.

Student choice data
For the Office for Students, driven by the political rhetoric of ‘putting students at the heart of the system’, a key ambition is to put students themselves in touch with sector data. This includes efforts to ‘improve the quality of information available to students’ (25). Two key quotes from the framework stand out:

Prospective students will be equipped with the means, underpinned by innovative and meaningful datasets and high quality information, to enable them to make informed choices about the courses that are right for them. (10)

[OfS will] ensure students can access reliable and appropriate information to inform their decisions about whether to study for a higher education qualification and, if so, identify which provider and course is most likely to meet their needs and aspirations. (20)

Here the OfS mirrors recently announced plans by Universities Minister Sam Gyimah to support software developers in building student-facing apps for price-comparison of university courses. It’s a controversial idea, announced as part of the renewed Teaching Excellence Framework (TEF), which requires the use of Longitudinal Educational Outcomes (LEO) datasets linking courses to earnings. It’s also controversial because it makes student choice conform with the MoneySupermarket model of product comparison based on value-for-money calculations, and further solidifies the idea of students as consumers and courses as products in a marketplace of comparable choices.

Data alignment
Alongside choice comes constraint. The OfS will ‘publish student outcomes and current and future employer needs as a way of informing student choice’ (17). This short sentence appears to carry two main messages: first, that access to outcomes data from institutions will help shape the choices of prospective students; and second that those choices should also be made through reference to ‘employer needs’.

Indeed, the OfS is actively seeking to align HE outcomes to industry requirements, and will ‘Work with employers and with regional and national industry representatives to ensure that student choices are aligned with current and future needs for higher level skills’ (20). This is a very instrumentalized view of HE as part of the employment pipeline for high-skills jobs. Of course, students can choose to ignore this information. But presenting HE data in this way may itself shape the choice environment for students, with certain choices made more attractive than others.

Nudge data
Talking of choice environments, the OfS is ‘taking the latest thinking on behavioural science into account, to consider how best to present this data in a consistent and helpful way to ensure that students have access to an authoritative source of information about higher education’ (25).

Clearly, the idea of intervening in students’ choice through subtle behavioural means is not an accident; the OfS is actively engaging with the psychology of choicemaking to shape and nudge how students decide on their courses. In this sense, the OfS is seeking to instantiate the experimental methods of behavioural public policy in HE, using data to prompt or even persuade students to make ‘desirable’ choices.

But the OfS may over time extend the logic of behavioural science to sectoral nudging at scale. According to one commentary on the regulatory framework, ‘the OfS should be encouraged to further consider behavioural theory and its various insights, such as those contained in “nudge” theory, and thus design interventions that incentivise compliance from the outset.’

Government data
Though it is notionally an arm’s-length agency–geographically, it’s located in the south west, along with all the other HE agencies–the OfS appears to enjoy a remarkably close and mutually reinforcing relationship with government. Not only did it emerge from BIS (now BEIS), but it will also use its expertise in HE data to:

Support the Department for Education, given its overall responsibility for the policy and funding framework in which the sector operates, and other public bodies such as UKRI in the delivery of their prescribed functions.

In contrast to the role of a ‘broker’ between government and the sector performed by HEFCE, the OfS appears to have a much more hand-holding relationship with government–despite being at arm’s length, a government minister has the power to give it directions and demand advice or reports–while simultaneously strong-arming the sector into compliance.

Designated data body
In my longer project, I have focused on the work of HESA as a central agency for delivery of the new student data infrastructure for HE. HESA is part of the family of ‘official statistics’ agencies in the UK, and in 2017 applied for the position of ‘designated data body’ (DDB) to work with the OfS, a position conferred on it early in 2018 by central government ‘on the recommendation of the OfS’ (19). As such, the OfS will ‘Work with, and have oversight of, the designated data body (DDB) to coordinate, collect and disseminate information’ (17).

The DDB will collect, make available, and publish appropriate information on behalf of the OfS, and the OfS will be responsible for holding the DDB to account for the performance of those functions. (19)

As this makes clear, HESA is now subordinate to the OfS, acting on its behalf and held to account for its own performance in the statistical delivery of the data required by the OfS. As such, the work of HESA has shifted from statistical reporting to a much more politicized position, ‘play[ing] a key role in supporting and enhancing the competitive strength of the sector.’

Indicator data
Indicators are the principal power source in the OfS machinery. Through indicators, the OfS will ultimately receive regular signals of institutional performance which can then be used to assess risk or to identify need for intervention:

All providers will be monitored using lead indicators, reportable events and other intelligence…. These will be used to identify early, and close to real-time warnings that a provider risks not meeting each of its ongoing conditions of registration. (18)

The OfS will identify a small number of lead indicators that will provide signals of change in a provider’s circumstances or performance. Such change may signal that the OfS needs to consider whether the provider is at increased risk of a breach of one or more of its ongoing conditions of registration. These indicators will be based on regular flows of reliable data and information from providers and additional data sources. (49)

The mention of ‘close to real-time warnings’ is especially important, as it signals a significant acceleration in the temporality of HE data reporting, analysis and action. Under the OfS, universities are to be monitored for performance fluctuations and changes that, like economic spikes and dips, may be presented as informational flows on data dashboards to affect prompt and timely decision-making.
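
To make the idea of an indicator ‘signal’ concrete, here is a minimal sketch of how a change in a single indicator might be flagged. The framework does not specify any such calculation: the function `flag_change`, the 10 per cent tolerance, and the continuation-rate figures are all hypothetical, chosen only to illustrate the general logic of comparing a new value against a recent baseline.

```python
from statistics import mean

def flag_change(history, latest, tolerance=0.10):
    """Return True if `latest` differs from the mean of `history`
    by more than `tolerance`, expressed as a fraction of that mean."""
    baseline = mean(history)
    if baseline == 0:
        # Avoid dividing by zero: any non-zero value counts as a change.
        return latest != 0
    return abs(latest - baseline) / abs(baseline) > tolerance

# Hypothetical continuation rates for one provider over four reporting periods.
continuation = [0.91, 0.90, 0.92, 0.91]

print(flag_change(continuation, 0.90))  # a small fluctuation around the baseline
print(flag_change(continuation, 0.78))  # a much larger drop
```

Even in a toy version like this, the choice of baseline window and tolerance determines which providers get flagged, which is why the design of indicators is never a purely technical matter.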

Longitudinal data
In addition to ‘close to real-time’ data, the OfS is seeking to expand and improve the use of longitudinal datasets and analyses:

The OfS will draw on the longitudinal education outcomes (LEO) dataset as an important source of information about graduate outcomes. Its further development will be a priority for the OfS, taking into account both its limitations and its significant potential. (25)

LEO consists of experimental statistics on employment and earnings of higher education graduates using matched data from different government departments, which has controversially been used to suggest that students can choose courses based on future earnings potential. It is also a significant methodological accomplishment, linking datasets about education, personal characteristics, employment and income, and benefits gathered from the departments of education, work and pensions, HESA and HMRC.

Comparative data
The data will also be used comparatively to assess different institutions against each other:

It is anticipated that this data will be largely quantitative and generated as a result of a provider’s existing management functions … allowing for greater consistency, comparability and objectivity when looking across a range of providers. (50-51)

Data-led comparison and benchmarking is of course at the heart of rows over HE marketization, as universities are incited to compete for prospective students and income. It drives institutions to showcase themselves as competitive, high-performing organizations, and is visible in all kinds of HE rankings such as UniStats and Complete University Guide tables.

Anticipatory data
Furthermore, the data used by the OfS will not be merely historical, real-time and comparative–it will be anticipatory too. The OfS will undertake ‘horizon scanning to understand and evaluate the health of the sector’ (17) and will use indicator data ‘to anticipate future events’ (161). In this sense, the OfS is simply mirroring the increasing use of predictive analytics in HE, with institutions in the UK already using data to forecast student progress or identify students at risk of drop-out or non-completion. The use of predictive data practices by the OfS, however, will be applied to institutions and the sector as a whole–to predict, for example, providers at risk of underperformance or financial difficulty.

Data burden
All this data collection and analysis activity sounds like it will be a heavy burden on institutions, and the OfS admits:

The implementation of the OfS’s data strategy may initially increase regulatory burden, but the long term aim is to use data to reduce regulatory burden. Such data requirements are not therefore intended as a regulatory burden on providers but to provide the information that allows the OfS to be an effective and proportionate regulator.

Perhaps, however, the heaviest burden will be the threat of punitive action based on constant OfS investigation of institutional data.

Data auditing & investigation
Regimes of audit and inspection are of course familiar across many sectors, and the OfS will make ‘data audits’ a part of the HE landscape:

The OfS will assess, as part of its routine monitoring activities, the quality, reliability and timeliness of information supplied by a provider including through scheduled or ad hoc data audit activity. If the OfS has reason to believe that information received is not reliable, it may choose to investigate the matter. (131)

It may even, in certain cases, ‘require information to be re-audited by a specified auditor, where the OfS has reasonable concern that the audit opinion does not provide the necessary assurance’ (56). It therefore appears that the OfS will demand new forms of meta-auditing of existing audit data.

Targeted action
Finally, the OfS proposes to use data as the basis for taking targeted action on institutions and the sector:

The OfS may also take targeted action if it needs to establish the facts before reaching a judgement about whether there is, or is likely to be, a breach of one or more ongoing conditions of registration.

May require the provider to take particular co-operative action by a specified deadline – these actions may include access to, information (including data), records or people, to enable the OfS to investigate any concerns effectively and efficiently. (56)

All in all, the OfS will instantiate a new regime of data in HE, emphasizing an empiricist faith in the ‘truth-telling’ capacities of digitally generated information. It is positioning itself as a source of data scientific expertise in the sector, treating universities as samples to be observed, students as specimens to be nudged to make choices based on data, and the sector as a whole as a laboratory for its experiment in data-led regulation.

Image: Office for Students

Personalized precision education and intimate data analytics

Ben Williamson

Precision_Pascal Volk

The word ‘precision’ has become a synonym for the application of data to the analysis and treatment of a wide range of phenomena. ‘Precision medicine’ describes the use of detailed patient information to individualize treatment and prevention based on genes, environment and lifestyle, while ‘precision agriculture’ has become an entire field of R&D focused on ‘engineering technology, sensor systems, computational techniques, positioning systems and control systems for site-specific application’ in the farming sector.

Precision medicine and precision farming approaches share a commitment to the collection and analysis of diverse data and scientific expertise for the purposes of highly targeted intervention. This may seem to make sense when it comes to medical diagnosis or optimizing crop production. But the production of precision may have more worrying consequences in other domains. Cambridge Analytica’s involvement in voter microtargeting through psychographic profiles, for example, has been termed ‘precision electioneering.’ Data-driven precision is therefore both a source of scientific certainty and of controversy and contestation.

Emerging interests in ‘precision education’ foresee the concerted use of learner data for purposes of implementing individualized educational practices and ‘targeted learning.’ As precision education has been described on the Blog on Learning and Development (BOLD):

Scientists who investigate the genetic, brain-based, psychological, or environmental components of learning … aim to find out as much as possible about learning, in order to accommodate successful learning tailored to an individual’s needs.

As this indicates, precision education is based on enormous ambitions. It assumes that the sciences of genes, neurology, behaviour and psychology can be combined in order to provide insights into learning processes, and to define how learning inputs and materials can be organized in ways best suited to each individual student. Advocates of precision education also suggest that complex computer programmes may be required to process these vast troves of data in order to personalize the learning experience for the individual.

The task of precision education requires the generation of ‘intimate’ data from individuals, and the constant processing of genetic, psychological, and neurological information about the interior details of their bodies and minds.

Unpacking precision education
It’s worth trying to think through what is involved in precision education, what it might look like in practice, and its implications for education policy.

In some ways, precision education looks a lot like a raft of other personalized learning practices and platform developments that have taken shape over the past few years. Driven by developments in learning analytics and adaptive learning technologies, personalized learning has become the dominant focus of the educational technology industry and the main priority for philanthropic funders such as Bill Gates and Mark Zuckerberg.

For example, the private non-profit National University (which runs concentrated online courses) has a ‘Precision Institute’ dedicated to precision education through ‘adaptive, machine learning instruction’ and ‘individualized course navigation’ using ‘real-time data generated from multiple sources of assessment tools.’ It is creating a Precision Education Platform for Personalized Learning to gather data from students in order to analyse relationships between ‘student characteristics and learning outcomes.’

A particularly important aspect of precision education as it is being advocated by others, however, is its scientific basis. Whereas most personalized learning platforms tend to focus on analysing student progress and assessment outcomes, precision education requires much more intimate data to be collected from students. Precision education represents a shift from the collection of assessment-type data about educational outcomes, to the generation of data about the intimate interior details of students’ genetic make-up, their psychological characteristics, and their neural functioning.

A key example is the Precision Learning Center, a partnership established in 2017 between a number of labs across the University of California and Stanford Graduate School of Education, which is dedicated to ‘improving the science of learning and education using cognitive, psychological, biomedical and environmental information.’ One of the key partners is Neuroscape, a lab at UCSF with a stated mission to ‘use modern technology … to harness the brain’s inherent plasticity to enhance our cognition, refine our behavior, and ultimately to improve our minds.’ Another key Precision Learning Center partner, BrainLENS, also based at UCSF, integrates ‘the latest brain imaging techniques, genetic analysis, and computational approaches to examine processes of learning.’ BrainLENS focuses especially on the neurobiological underpinnings of ‘grit,’ ‘growth mindset’ and ‘social-emotional processing,’ the neural inheritance of cognitive and character traits, the genetics of cognition, and ‘personalized education’ based on predictive learner profiling.

As such, precision education is part of a surge in interest in educational neuroscience and educational genomics to ‘enable educational organisations to create tailor-made curriculum programmes based on a pupil’s DNA profile.’ Researchers are already undertaking studies of the links between genes and attainment, and have proposed DNA analysis devices such as ‘learning chips’ to make reliable genetic predictions of heritable differences between children in terms of their cognitive ability and academic achievement. Cheap DNA kits for IQ testing in schools may not be far away, driven by the ‘new genetics of intelligence.’ Psychology, too, has begun ‘advancing the science and practice of precision education to enhance student outcomes.’

Two articles on the BOLD blog have made a particularly strong case for a scientific approach to precision education and personalized learning. BOLD is itself an initiative funded by the Jacobs Foundation. It is ‘dedicated to spreading the word about how children and young people develop and learn’, with a pronounced emphasis on ‘the science of learning,’ neuroscience, developmental psychology and genetic factors in learning, along with considerations of the technologies and programs required to ‘tailor education to children’s individual needs, taking into account biological, social and economic differences as well as differences in their upbringing. … A wide variety of disciplines – psychology, neurobiology, evolutionary biology, pediatrics, education, behavioral genetics, computer science and human-computer interaction – need to be involved.’ It is in this context that BOLD has begun to address the potential and challenges of precision education.

The precision education articles by Annie Brookman-Byrne are thoughtful and cautious, but also clearly angled towards the development of an interdisciplinary field of research. In the first post, Brookman-Byrne acknowledges that ‘We are currently a long way off from having the kinds of information needed to realise precision education’ but argues that ‘the groundwork has started’:

  • Educational neuroscience is building an understanding of the science behind learning and teaching through the convergence of multiple disciplines and collaborations with educators.
  • Evidence is being gathered from a diverse set of fields, which will eventually lead to a deeper understanding of the mechanisms involved in learning.
  • The study of genetics is part of this investigation. Rather than something to be feared, our understanding of genes is simply another part of the puzzle in the science of learning.
  • As the appetite for evidence-based practice increases, the future of teaching and learning may well be personalised education that takes into account a host of factors about the individual.

In the follow-up post, Brookman-Byrne in particular highlights how it will be ‘necessary to gather vast amounts of data’ to make precision education possible:

  • This process of data collection has already begun, in the form of the many studies that aim to uncover the psychological and neurological processes that underpin learning.
  • If precision education is to come to fruition, each individual learner will need to provide their own data in order to establish which type of learning materials best suit them.
  • Precision education would draw on the best available evidence from a host of factors which might include test scores, genetic data, the learner’s own interests, and environmental factors.
  • Precision education may also lead to greater choice for the learner – in particular, adolescents choosing which subjects to focus on later in school.
  • A very strong scientific understanding of the mechanisms that influence learning will be the first step towards the realisation of precision education.

Brookman-Byrne acknowledges that it is too early to say how precision education will appear in practice, if at all. But BOLD itself has already begun to propose that neuroscience provides ‘ever-advancing technologies that allow us to image the thinking brain,’ thus enabling educational neuroscientists to ‘know more than ever before about how students learn,’ although it also cautions that ‘it’s not easy to translate these findings to the classroom.’ It has also supported researchers examining the links between genetics and educational success.

Regardless of the cautions and caveats, the sciences of the brain and the gene, as well as psychology and behavioural science, are already becoming lodged in education policy. It is easy to see the potential appeal of precision education to policymakers eager to find ‘scientific’ evidence-based solutions to educational problems. A new combination of education policy and the human sciences is currently emerging in the context of policy preoccupations with ‘what works.’ As Kalervo Gulson and P. Taylor Webb have argued, new kinds of ‘bio-edu-policy-science actors’ may be emerging as authorities in educational policy, ‘not only experts on intervening on social bodies such as a school, but also in intervening in human bodies.’

Critical approaches to precision education
Many people will find the ideas behind precision education seriously concerning. For a start, there appear to be some alarming symmetries between the logics of targeted learning and targeted advertising that have generated heated public and media attention already in 2018. Data protection and privacy are obvious risks when data are collected about people’s private, intimate and interior lives, bodies and brains. The ethical stakes in using genetics, neural information and psychological profiles to target students with differentiated learning inputs are significant too. Such concerns will be especially acute as politics press for greater emphasis on the biological determinants of learning, or as precision education approaches are developed by startup companies with dubious credentials.

Precision education also needs to be examined in considerable detail to understand the feasibility of its promises and claims.

The technical machinery alone required for precision education would be vast. It would have to include neurotechnologies for gathering brain data, such as neuroheadsets for EEG monitoring. It would require new kinds of tests, such as those of personality and noncognitive skills, as well as real-time analytics programs of the kind promoted by personalized-learning enthusiasts. Gathering intimate data might also require genetics testing technologies, and perhaps wearable-enhanced learning devices for capturing real-time psychophysiological data from students’ bodies as proxy psychometric measures of their responses to learning inputs and materials. By combining neurological, genetic, psychological, and behavioural data along with environmental factors and test scores, precision education is an outgrowth of current enthusiasms to ‘quantify the human condition’ while reducing human beings to ‘databodies’ of informational patterns.

Each of the technologies for the production of intimate data about students relies on complex combinations of scientific knowledge, technical innovation, business plans and social or political motivations. Some of them are likely not to be interoperable, either technically or intellectually. Just as software platforms do not always plug into each other effectively, there remain significant disciplinary cleavages between psychology, neuroscience and genetics which would need bridging for precision education to become possible. There are already concerns that precision medicine can reproduce bias and discrimination through its datasets and outcomes. Precision education data could be a similarly risky exercise in data collection and use.

In addition, brain science, genetics and psychology have all been subjected to considerable critique. Contemporary science often appears to treat the brain, the body and the mind as malleable and manipulable, able to be ‘recoded’ and ‘debugged’ in the same ways as software, as distinctions between the computational and the biological have begun to dissolve. Concerns have also been raised about ‘the new geneism’ and the potential for genetic data to reproduce ‘dangerous ideas about the genetic heritability of intelligence.’ Both old controversies in the use of genetics, neuroscience and psychology in the governing of bodies and behaviours, and new concerns about treating the body as if it were silicon, have potential for reproduction through precision education.

One productive way forward might be to approach precision education from a ‘biosocial’ perspective. As Deborah Youdell argues, learning may be best understood as the result of ‘social and biological entanglements.’ She advocates collaborative, inter-disciplinary research across social and biological sciences to understand learning processes as the dynamic outcomes of biological, genetic and neural factors combined with socially and culturally embedded interactions and meaning-making processes. A variety of biological and neuroscientific ideas are being developed in education, too, making policy and practice more bio-inspired.

Other biosocial studies also acknowledge that ‘the body bears the inscriptions of its socially and materially situated milieu,’ being ‘influenced by power structures in society,’ and that ‘the brain is a multiply connected device profoundly shaped by social influences.’ The social gets ‘under the skin’ to impress upon the biological. As such, a biosocial approach would seek to understand precision education in both biological and social scientific terms by appreciating that the social environments in which learning takes place do in fact inscribe themselves on bodies and brains. Such an approach would view precision education as a source of power, reshaping the social environment of the school or the university in order to intervene in the biological, neurological and psychological correlates of learning.

Intimate data analytics
The intimate data analytics of precision education raise a few key themes for future interrogation:

  • The emergence of bio-evidence-based education policy, as data captured about the biological (genetic, neural and psychophysiological) details of students’ bodies are turned into policy-relevant knowledge and targets of intervention
  • The translation of students into bioinformational flows of numbers and scientific categories, bringing about new ways of understanding learning processes as biologically-centred, and erasing other perspectives
  • The accumulation of biocapital by companies that are able to market products, collect, analyse, and then exchange and sell students’ biodata, whether directly to schools and parents or by less direct means
  • The development of bioeconomies of educational data as genetic, neural and psychophysiological technologies and assessment tools become new competitive marketplaces (including scam outfits looking to exploit interest in student biodata)
  • The sculpting of new student biosubjectivities, as students are addressed and begin to address themselves in quantified, biological terms, and are incited to undertake activities to improve themselves in response to their genetic, neural and psychophysiological data

Whether or not precision education ever really takes off as an interdisciplinary field of R&D, let alone influences policy and practice, may itself matter very little if we recognize that many of the technologies and priorities captured in this emerging category already exist or are coming online. Developments in neurotechnology, psychoinformatics and genetics technologies are either already available or in the development pipeline for mining intimate data from the interior of bodies and brains. And with newer developments such as neurofeedback, gene-editing and behaviour-change apps, technologies stand poised not just to mine biology, cognition and behaviour, but to tweak and modify them too. As a biosocial perspective would see it, intimate data analytics get under the skin.

Image by Pascal Volk

Learning from psychographic personality profiling

Ben Williamson


Contemporary education policy and practice is increasingly influenced by developments in data analytics. The big data analytics story of the year so far concerns the alleged ‘psychographic’ profiling techniques of data analytics firm Cambridge Analytica and its use of the personal data of millions of Facebook users. A torrent of writing has appeared looking at it through various lenses. Among the best commentaries are by Jamie Bartlett, who argued that the Cambridge Analytica/Facebook controversy reflects more generally how politics is drifting into a behavioural science of algorithm-based triggers and nudges which are tuned to personality and mood. From a different perspective, David Beer cast the enterprise through a cultural lens, arguing that Cambridge Analytica’s efforts reflect the wider aspirations of the data analytics industry, which ‘is aiming to turn anyone into a data analyst … to speed us up, make us smarter, allow us to see into the hidden depths of organisations, allow us to act in real-time or enable us to predict the future’.

Both these pieces resonate with developments in education that I’m currently tracking–that is, the weaving together of the sciences of the brain and psychology with data-processing educational technologies and education policy. The way education policy is becoming a kind of behavioural science, supported by intimate data collected about psychological characteristics or even neural information about students, is the central focus of this ongoing work. Expert knowledge about students is increasingly being mediated through an edu-data analytics industry, which is bringing new powers to see into the hidden and submerged depths of students’ cognition, brains and emotions, while also allowing ed-tech companies and policymakers to act ‘smarter’, in real-time and predictively, to intervene in and shape students’ futures.

A more direct line can be drawn between the psychographic personality profiling of Cambridge Analytica and education, however. Although the science of psychographic personality profiling that Cambridge Analytica has boasted it has perfected may well be highly dubious, it is based on an underlying body of psychological knowledge about how to measure and classify people by personality that has a long history. At the core of the Facebook dataset it allegedly used for psychographic profiling and micro-targeting of US voters is a psychological model called ‘The Big Five’, and in particular instruments such as the Big Five Inventory originally created by Oliver John of the Berkeley Personality Lab at the University of California, Berkeley. When Cambridge Analytica contracted Aleksandr Kogan of Cambridge University for its psychographics project, it was to implement a digital survey on Facebook that would, like the Big Five Inventory but adapted into the form of an online quiz, capture intimate personal data on users’ ‘openness’, ‘conscientiousness’, ‘extroversion’, ‘agreeableness’ and ‘neuroticism’ (OCEAN). These categories are believed by personality theorists to be suitable for capturing and classifying the full range of human personalities. OCEAN is treated, in other words, as a universal, culture-free psychological classification for assessing and categorizing human characteristics.

Having possession of a vast quantified personality database would clearly grant power to any organization wishing to find ways to engage, coerce, trigger or nudge people to think or behave in certain ways–advertisers, say, or propagandists. Whether it worked in Cambridge Analytica’s case remains open to debate–though I think Jamie Bartlett is right to understand this as just one example of a shift to new forms of behavioural government in the wider field of politics. Mark Whitehead and colleagues call it ‘neuroliberalism’–a style of behavioural governance that applies psychology, neuroscience and behavioural sciences methods and expertise to public policy and government action–and convincingly show how it has been installed in governments and businesses around the world. In education we have already seen how organizations such as the Behavioural Insights Team (‘Nudge Unit’) are being contracted to provide policy-relevant insights based on psychological and behavioural expertise and knowledge.

The more direct connection between the Big Five personality profiling and education, however, comes in the shape of the OECD’s planned Study on Social and Emotional Skills. A computer-based international assessment scheduled for implementation in 2019, at its core the test is a modified version of the Big Five Inventory. I previously called it ‘PISA for personality testing’, and detailed how the OECD had drawn explicitly on the expertise of both personality psychologists and econometrists to plan and devise the test. Indeed, the architect of the Big Five Inventory, Oliver John, presented his work at the OECD meeting where its application to social-emotional skills testing was agreed. When it is implemented in 2019, the social and emotional skills test will assess 19 skills which fit into each of the Big Five categories. Moreover, it will collect metadata from test-takers which might also be used to support the assessment.

To be clear, the connection I am trying to make here is that personality profiling–the production of psychographic renderings of human characteristics–is not just confined to Cambridge Analytica, or to Facebook, or to the wider data analytics and advertising industries. Instead, the science of personality testing is slowly entering into education as a form of behavioural governance.

The OECD test is not that dissimilar to the personality quiz at the heart of the Cambridge Analytica/Facebook scandal. The same psychological assumptions and personality assessment methods underpin both. And while Cambridge Analytica appears to have been an unofficial instrument of a potential government, the OECD assessment is supposed to be a policy instrument of global governance–encouraging national departments of education to focus on calculated levels of student personality. The OECD assessment of social-emotional skills shares personality testing approaches with the Cambridge Analytica personality quiz, and its results are intended to support political decision-making.

That said, the OECD test itself will not produce individual psychographic profiles of students. Its emphasis is on aggregating the data in order to assess, at macro-scale, whether countries have the right stock of social-emotional skills to deliver future socio-economic outcomes. Large-scale personality data is presumed to be predictive of potential productivity.

However, the OECD is a powerful influence on national education policies at a global scale. The impact of PISA is well known–it has reshaped school curricula, assessments and whole systems in a ‘global education race’. Could its emphasis on personality testing similarly reshape schooling practices and education policy priorities? Already, a commercial market of ed-tech apps and products–such as ClassDojo–has emerged to support and measure the development of students’ social-emotional skills in schools, while educational ‘psycho-policies’ and government interventions have begun to focus on social-emotional categories of learning, such as grit, growth mindset and character, too. In the UK, for example, the Department for Education supports the development of character skills in schools.

While the OECD is only measuring student personality, the inevitable outcome for any countries with disappointing results is that they will want to improve students’ personalities and character to ensure their competitiveness in the global race. Just as PISA has catalysed a global market in products to support the skills tested by the assessment, the same is already occurring around social-emotional learning, character skills and personality development. While ClassDojo is currently popular as a classroom app for supporting growth mindset and character development, it is certainly conceivable that it could be used to promote and reward the Big Five (its website says it is also compatible with Positive Behavioural Interventions and Support, a US Department of Education program, for example–it’s flexible to market demands). It’s not a huge leap to link ClassDojo to psychographic personality profiling–ClassDojo’s founders have openly described being inspired by economist James Heckman, and Heckman helped shape the OECD’s views on the links between personality and economic productivity.

Just as ClassDojo can already be used to produce visualizations and reports based on teachers’ observations of individual students’ behaviours, future iterations or other products could be used to produce psychographic educational profiles of individuals based on personality categories. It’s not hard to imagine teachers awarding ClassDojo points for behaviours that correlate with the Big Five. Educational applications of wearable biometrics, affective computing and even neuroheadsets to monitor attentional levels and emotional arousal are sitting at the edges of ed-tech implementation, ready to render students in psychographic detail.

Given current developments in personality testing, character development and social-emotional skills modification through ed-tech, maybe we can paraphrase Jamie Bartlett to suggest that not only is politics drifting to behavioural government, but education policy and practice too are beginning to embrace a behavioural science of algorithm-based triggers and nudges which are tuned to personality and mood. Education appears to be generating more intimate data from students, mining beneath the surface of their measurable knowledge to capture interior details about their personality, character and emotions. Policymakers, test developers and ed-tech producers may not openly say so, but just like Cambridge Analytica they are seeking to learn from psychographic personality profiling.

Image by Mark Vegas

10 definitions of datafication (in education)

Ben Williamson


What is datafication? And how does it affect education? These questions were put to me ahead of a conference discussion panel recently. While writing a few notes, it quickly became apparent I needed some categories to sort out my thinking. In simple terms, datafication can be said to refer to ways of seeing, understanding and engaging with the world through digital data. This definition draws attention to how data makes things visible, knowable, and explainable, and thus amenable to some form of action or intervention. However, to be a bit more specific, there are at least ten ways of defining datafication.

1 Historically
Datafication as we know it today has a long history, going back at least as far as the industrial revolution and efforts then to capture statistical knowledge of the state, society and its population, and then to use that knowledge to come up with better institutions and practices of management and intervention. David Beer offers a really good view of the historical evolution of ‘metric power’.

In terms of education, Michel Foucault of course articulated how children could be counted in terms of their development, knowledge, behaviour, progress, worth, cleanliness, age, social class and character, in order that they could then be ranked, supervised and disciplined more effectively. He called schools and classrooms ‘learning machines’. Within education policy, Martin Lawn and others have charted the historical rise of data in education systems. These authors have shown, for example, how the nineteenth century Great Expositions became carefully stage-managed presentations of different states’ educational performance rates, so allowing different national systems to be compared for their effectiveness in producing the labour required for social and economic progress. These early historical developments in the datafication of education have slowly given rise to the ‘global race’ that we still see in education policy today, driven by comparative analysis of performance in large scale assessments (LSAs).

Although there are clear continuities from the past to the present, the current version of datafication through ‘big data’ also represents a bit of a rupture with the past. The assessment data that dominates LSAs is sampled, collected at long temporal intervals, and slow to collect. New digital datafication technologies such as ‘learning analytics,’ by contrast, harvest data in real-time as students complete tasks, enable high-speed automated analysis and feedback or adaptivity, and can capture data from all participants rather than a sample. They also allow individuals to be compared against each other and with aggregated norms calculated in massive datasets, rather than the broad-brush comparison of national systems enabled by LSAs.
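To make the contrast a little more concrete, here is a minimal sketch of the two logics. Everything in it is an assumption for illustration: the cohort of scores is randomly generated, the sample size of 50 is arbitrary, and `z_score` is just a hypothetical helper name. It is not how any particular LSA or learning analytics platform actually works.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical cohort of 1,000 task scores; the numbers are made up.
cohort = [random.gauss(60, 12) for _ in range(1000)]

# LSA-style logic: estimate system performance from a sample of students.
sample = random.sample(cohort, 50)
sample_estimate = mean(sample)

# Analytics-style logic: use every participant to calculate the norm.
norm_mean = mean(cohort)
norm_sd = pstdev(cohort)


def z_score(score, mu, sd):
    """How many standard deviations a score sits from the aggregated norm."""
    return (score - mu) / sd


# Compare one individual against the norm of the whole dataset.
student_z = z_score(cohort[0], norm_mean, norm_sd)
```

The sampled estimate says something about the system as a whole, whereas the z-score positions one individual against a norm calculated from everyone in the dataset.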

2 Technically
In technical terms, datafication is a process of transforming diverse processes, qualities, actions and phenomena into forms that are machine-readable by digital technologies. Datafication allows things, relationships, events, processes to be examined for patterns and insights, often today using technical processes such as data analytics and machine learning which rely on complex algorithms to join up and make sense out of thousands or millions of individual data points. The technical language of datafication can get quite bewildering, proliferating to include technical concepts and methods which are even being modelled to some degree on human processes–so-called ‘cognitive’ computing, deep ‘learning’, and ‘neural’ networks.

Thinking educationally, it’s intriguing that much of the language associated with digital datafication refers to learning, training and neural processes of cognition. Datafication relies to a significant technical degree on ‘learning machines’. Algorithms have to be ‘taught’, using ‘training sets’ of past data to determine how to act when put ‘into the wild’ to process live and less structured data. This can be done through ‘supervised learning’, which sounds rather like direct instruction, or through ‘unsupervised learning,’ which is more like autodidactic learning through experience. DeepMind’s AlphaGo Zero, for example, is a highly advanced AI program that is given no human examples to copy: it learns purely from its own experience of playing against itself, and from a ‘self-reinforcement learning algorithm’ that rewards it for every ‘success’ it experiences. BF Skinner’s famous behaviourist ‘teaching machines’ have been encoded in algorithmic form.
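The difference between being ‘taught’ from a labelled training set and finding patterns unaided can be sketched in a few lines. This is only a toy illustration under stated assumptions: the data (minutes spent on a task) and the labels ‘engaged’ and ‘disengaged’ are invented, the function names are hypothetical, and the nearest-centroid and two-cluster k-means methods are simple stand-ins I have chosen, not the techniques used by any real product.

```python
from statistics import mean

# Hypothetical training set: (minutes on task, human-provided label).
TRAINING_SET = [
    (2.0, "disengaged"), (3.0, "disengaged"), (4.0, "disengaged"),
    (18.0, "engaged"), (20.0, "engaged"), (22.0, "engaged"),
]


def train_supervised(training_set):
    """Supervised: learn one centroid per human-provided label."""
    by_label = {}
    for value, label in training_set:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, value):
    """Assign a new, unlabelled value to the label with the nearest centroid."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def cluster_unsupervised(values, rounds=10):
    """Unsupervised: 1-D k-means with two clusters; no labels are ever seen."""
    centres = [min(values), max(values)]  # simple deterministic starting point
    for _ in range(rounds):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            groups[nearest].append(v)
        # Keep the old centre if a group ends up empty.
        centres = [mean(g) if g else c for g, c in zip(groups, centres)]
    return sorted(centres)


# 'Taught' on past data, then put 'into the wild' on a new observation.
centroids = train_supervised(TRAINING_SET)
live_label = predict(centroids, 15.0)

# The same numbers with the labels stripped away.
found_centres = cluster_unsupervised([v for v, _ in TRAINING_SET])
```

In the supervised case the categories come from whoever labelled the training set; in the unsupervised case the algorithm only finds two groupings in the numbers, and it is still left to a person to decide what, if anything, those groupings mean.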

Also, in the technical sense datafication relies on the material infrastructure of hardware, software, servers, cables, connectors, microprocessors—all of the ‘stuff of bits’, as Paul Dourish has argued, that has to be assembled together in order to generate data. The materialities of datafication significantly shape how data are generated and how they can be put to use.

3 Epistemologically
Thinking epistemologically about datafication concerns what we can know from data. For some, datafication rests on the assumption that the patterns and relationships contained within datasets inherently produce meaningful, objective and insightful knowledge about complex phenomena. As Rob Kitchin has shown, this empiricist epistemology assumes that through the ‘application of agnostic data analytics the data can speak for themselves free of human bias or framing.’

For critics, however, this empiricist epistemology is flawed because all data are always framed and sampled; data are not simply natural and essential elements that are abstracted from the world in neutral and objective ways to be accepted at face value. As Nathan Jurgenson has put it, data do not provide a ‘view from nowhere’ because factors such as algorithms, databases, and venture capital pre-format data and so shape what may be seen or known. Data don’t tell the unbiased ‘truth’ because the data points captured and analysed are always affected by the choices of the original designers. Making sense of data is also always framed–data are examined through a particular lens that influences how they are interpreted. Jose van Dijck has described an epistemological ‘data-ist’ trust in the numbers provided through datafication.

Epistemology in this sense extends to include a methodological definition of datafication. Datafication is a process of employing certain data scientific methods to produce, analyse and circulate data. These methods have their own social origins–or ‘social lives’ as John Law claims–and derive from and reproduce the particular epistemological assumptions of the expert groups that created them. Datafication, in other words, is epistemological and methodological.

4 Ontologically
Datafication also raises ontological questions about what data really are. One view is that data are simply ‘out there’ waiting for collection, as supposed in the term ‘raw data’ and in the view that ‘data speak for themselves’. The other, more common in contemporary social science, is that data are inseparable from the software and knowledge employed to produce them. Data do not simply represent the reality of the world independent from human thought but are constructions about the world. These insights into the ontology of data are often associated with sociological theories of science, technology, statistics and economics–Sheila Jasanoff pulls these strands together in a recent article on ‘data assemblages’.

Moreover, data have consequences and shape individual actions, experiences, decisions and choices. In that sense, they shape and change reality; they have ontological consequences and partake in making up reality. So, ontologically, datafication is a product of the social world and of specific practices, but it also acts upon the world and on other practices, changing them in various ways. For example, Marion Fourcade has shown how the statistical practices of economists, of ‘ever-finer precision in measurement and mathematics … have constructed a wholly separate and artificial reality,’ a ‘make believe substitution’ that is entirely made out of historical and disciplinary conventions, ‘nothing more’. Yet, as Fourcade adds, if you change the statistical convention, ‘the picture of economic reality changes too’, sometimes with dramatic real-world results. As such, datafication is ontological because it has the potential to produce or perform different versions of reality–what actor-network theorists call ‘ontological politics’.

5 Socially
Datafication is accomplished by social actors, organizations, institutions and practices. So today we have data scientists, data analysts, algorithm designers, analytics engineers and so on all bringing their expertise to the examination of data of all kinds. These people or experts are housed in businesses, governments, philanthropies, social media firms, financial institutions, which have their own objectives, business plans, projects and so on, which frame how and why digital data are captured and processed. In this sense, datafication can be defined socially because it is always socially situated in specific settings and framed by socially-located viewpoints.

In education, we have ‘education data scientists’ and learning analytics practitioners, engineers and vendors of personalized learning platforms, even entrepreneurs of artificial intelligence in education, all now bringing their own particular forms of expertise to the examination and understanding of learning processes, teaching practices, schools, universities and educational systems. They are supported by funding streams from venture capital firms, philanthropic donations from wealthy technology entrepreneurs, and impact investment programs, all of which direct financial resources to the datafication of education. Putting it super-simply, datafication exists because people and institutions of society make it so.

Moreover, datafication needs to be defined socially because much data is captured from the social world: people, institutions, behaviours and the full range of societal phenomena are the stuff of data. As Geoffrey Bowker has memorably put it, ‘if you are not data, you do not exist’! People are data; societies are data. Even more consequentially, these social data can be used to reshape social behaviours. Bowker adds that as data about people are stored in thousands of virtual locations, reworked and processed by algorithms, their ‘possibilities for action are being shaped’.

6 Politically
The new actors undertaking datafication are invested with a certain form of data power. Expert authority, as William Davies argues, increasingly resides with those who can work with complex data systems to generate analyses, and then narrate the results to the public, the media and policymakers. This is why governments are increasingly interested in capturing the digital traces and datastreams of citizens’ activities. By knowing much more about what people do, how they behave, how they respond to events or to policies, it becomes possible to generate predictions and forecasts about best possible courses of action, and then to intervene to either pre-empt how people behave or prompt them to behave in a certain way. For example, there’s a whole ‘Data for Policy’ movement and new funding streams for ‘GovTech’ applications in the UK to realize the potential of ‘Government by Algorithm’. Evelyn Ruppert and colleagues have termed this ‘data politics’ and note that power over data no longer only belongs to bureaucracies of state, but to a constellation of new actors in different sectoral positions.

Something of an arms race is underway by those organizations that want to attain data power in education. Education businesses like Pearson are putting large financial, material and human resources into technologies of datafication, and are seeking both to make it commercially profitable and also attractive to policymakers as a source of intelligence into learning processes. Dorothea Anagnostopoulos and colleagues have written about the ‘informatic power’ possessed by the organizations and technologies involved in processing test-based data. But some of that power is now being assumed by those actors, organizations and analytics technologies that process digital learning data and turn it into actionable intelligence and adaptive, personalized prescriptions for pedagogic intervention.

7 Culturally
This definition draws attention to datafication as a cultural phenomenon and as a concept that has attained a privileged position in the view of the public, businesses, governments and the media. Increasingly, it seems, data and algorithms are invested with promises of objectivity and impartiality, at a time when human experts are not necessarily to be trusted because they’re too clouded by subjective opinion, bias and partiality. An article in the Silicon Valley ed-tech magazine EdSurge effectively represented how the objectivity of data has been culturally adopted and accepted in some parts of the education sector. It claimed teachers are unable to recognize how well students are engaging with their own learning because the teachers are too subjectively biased. This speaks to a cultural narrative which frames datafication in terms of mechanical objectivity, certainty, impartiality.

But the cultural acceptance or otherwise of datafication is of course context-specific. In some European countries such as Germany the cultural narrative of datafication and algorithms is more contested, and perhaps legally and politically inflected. It would be interesting to tease out how datafication in general and datafication of education in particular becomes culturally embedded or not in different geographical, political and social locations. So for example, datafication in education may appear to be a largely Anglophone phenomenon. Recently, however, a new report on ‘Learning Analytics for the Global South’ appeared which considered ‘how the collection, analysis, and use of data about learners and their contexts have the potential to broaden access to quality education and improve the efficiency of educational processes and systems in developing countries around the world’. Datafication of education is becoming culturally sensitive.

8 Imaginatively
Datafication is the subject of breathless utopian fantasies of real-time responsive smart cities, global Internet of Things, human-machine symbiosis, algorithmic certainty, hyperpersonalized services, driverless cars and so on—a world plastered with a new shiny surface of machine-readable data, which acts as a fuel for an automated, responsive, personalized environment which constantly moulds itself around us.

Education is affected by the same fantasies and utopian imaginaries. At last year’s British Science Festival, Sir Anthony Seldon, Master of Wellington College and VC of the University of Buckingham, presented a picture of a robotized future of schools, with ‘extraordinarily inspirational’ machines completely personalizing the education journey, ‘adaptive machines that adapt to the individual,’ that ‘listen to the voices of the learners, read their faces and study them in the way gifted teachers study their students,’ know what ‘excites’ learners and can ‘light up the brain’ through ‘intellectual excitement.’

Datafication is in this sense the subject of imagination–but imaginary visions can sometimes catalyse real-world applications, with powerful visionaries gathering coalitions of support to make reality conform with their utopian ideals. Silicon Valley entrepreneurs have animated their visions of data-driven education through the capture or donation of funding and engineering teams, for example.

9 Dystopically
In contrast to the utopian imagination, datafication is also a great source of anxiety. Concerns circulate about gross privacy invasion, panoptic dataveillance, data bias against ethnic groups, the manipulation of behaviours through persuasive design, viral spread of computational propaganda powered by data-driven profiling and targeting, information war, data breaches, hacking and cyberterrorism, and about how datafication reduces people to their data points, as if we are our data, perfectly knowable through our digital traces.

In education, the children’s writer Michael Rosen recently posted a tweet along these lines, writing that: ‘First they said they needed data about the children to find out what they’re learning. Then they said they needed data about the children to make sure they are learning. Then the children only learnt what could be turned into data. Then the children became data.’

Recently a lot of commentary has emerged about the social and emotional anxiety experienced by students in both schools and universities. Many of these psychological frailties are at least partly blamed on social media and other technologies that harvest data from young people and then target and manipulate them for commercial profit. Richard Freed calls it the ‘tech industry’s psychological war on kids’. These kinds of stories are now part of a cultural narrative about the dystopian, nightmarish effects of datafication on children, which is happening both in their own time and in an increasingly data-driven education.

10 Legally & ethically
Finally, there are legal, ethical and regulatory mechanisms shaping datafication. Europe is much more privacy-focused than the US, for example, as the incoming EU General Data Protection Regulation shows. So how datafication plays out—what datafication is—is itself shaped by law, ethics and politics.

In the US, for example, specific federal acts such as COPPA and FERPA exist to protect children’s privacy, and organizations like the Internet Keep Safe Coalition enforce them. Other organizations such as the Future of Privacy Forum exist to produce ‘policy guidance and scholarship about finding the balance between protecting student privacy and allowing for the important use of data and technology in education’. The US also has the 2015 Every Student Succeeds Act (ESSA), which has made it possible for states and schools to apply for additional funding for personalized learning technologies. So there’s a new federal act in place which performs the double task of stimulating market growth in adaptive personalized learning software and incentivizing schools to invest in such technologies in the absence (or at least shortage) of public funding for state schooling.

Of course, the ethical issues of datafication of education are considerable and fairly well rehearsed. An interesting one is the ethics of data quality–a topic discussed by Neil Selwyn at the recent Learning Analytics and Knowledge (LAK) conference. There are significant potential consequences of poor data in learning analytics platforms. In other spaces, such as healthcare and military drones, poor data quality can have disastrous, even fatal, effects. Poor quality datafication of education may not be quite so drastic, but it has the potential to significantly disrupt students’ education by leading to mismeasurement of their progress, misdiagnosis of their problems, or by diverting them on to the ‘wrong’ personalized pathways.

I’m sure datafication could be cut in different ways. But hopefully these categories capture some of its complexity.

Image by Kevin Steinhardt