In the book A Vast Machine, Paul Edwards refers to knowledge infrastructures as the “robust networks of people, artifacts, and institutions that generate, share and maintain specific knowledge about the human and natural worlds”. In another report, knowledge infrastructures are defined as:
ecologies, or complex adaptive systems; they consist of numerous systems, each with unique origins and goals, which are made to interoperate by means of standards, socket layers, social practices, norms, and individual behaviours that smooth out the connections among them.
Both the development of such infrastructures and their effects raise important questions for education. We can think of all forms of education as knowledge infrastructures, but in this post I am particularly concerned with the work of digital technologies in the knowledge infrastructures of education.
Recent developments in digital technology, and growing research on the work of software, code and algorithms in daily life, provide significant resources through which to explore the embedded work of digital technologies and knowledge infrastructures in the practices of education: what I elsewhere describe as the hidden curriculum of software. This research points us towards the knowledge infrastructures, software and associated practices through which digital education is enacted. Knowledge infrastructures do not simply represent data; they select, translate and transform them. It is the ontologies, codes and algorithms, the linking of data, the application of technical standards, and the ways in which decision-making and reasoning are articulated in digital technologies that make things perform in particular ways and become specific actors in particular educational practices.
In relation to this, I shall here focus on the role played by forms of classification, standards and ontology-building associated with the development of digital databases, and the ways in which complex knowledge is represented. To classify requires the removal of ambiguity from representation, even though many knowledge claims are, of course, ambiguous and contested. As Paul Edwards and colleagues argue, the digitalization of data raises three main issues:
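The point about classification can be made concrete with a small sketch. Assuming a hypothetical course database with an invented controlled vocabulary of subjects, the schema admits exactly one label per record, so a contested or borderline case must be forced into a single category before it can be stored at all:

```python
# Hypothetical controlled vocabulary: the categories are invented for
# illustration, not drawn from any real classification scheme.
ALLOWED_SUBJECTS = {"mathematics", "history", "computer_science"}

def classify(record: dict) -> dict:
    """Admit a record only if it carries exactly one sanctioned subject.

    The schema has no way to say 'contested between history and
    computer_science': any ambiguity must be resolved before storage.
    """
    subject = record.get("subject")
    if subject not in ALLOWED_SUBJECTS:
        raise ValueError(f"unclassifiable subject: {subject!r}")
    return record

# A course on the history of computing sits between two categories,
# but the database demands a single label:
course = {"title": "History of Computing", "subject": "history"}
classify(course)  # accepted only once the contest is settled one way
```

The ambiguity does not disappear; it is simply resolved out of view at the moment of data entry, which is precisely what makes the resulting representation look cleaner than the knowledge it encodes.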
first, a plethora of “dirty” data, whose quality may be impossible for other investigators to evaluate; second, weak or non-existent guarantees of long-term persistence for many data sources; and finally, inconsistent metadata practices that may render reuse of data impossible – despite their intent to do the opposite.
In a similar vein, Susan Halford, Catherine Pope and Mark Weal explore the implications of the development of the semantic web and the promises made for the internet as a linked database. The big promises of open data concern transparency, the transcending of knowledge silos, and the potential to make greater advances in knowledge. These are not unimportant. However, they also point to some of the challenges associated with such promises: the naming of data entities, the structuring of data and the processing of data. To name and categorize an entity in a consistent way across space and time is no simple matter, not least because the categories themselves might be subject to challenge. These challenges are commonly identified by those who research knowledge infrastructures. As a result, Halford and colleagues argue that in the development of such infrastructures “making some things ‘known’ tends to obscure other things and, indeed, ways of knowing” and that “ontology building is not a simple or solely technical matter.”
In this respect, ontology-building, the naming and structuring of digital data in the enactment of knowledge infrastructures, has itself become a subject of increasing research. For example, in their paper ‘Between meaning and machine,’ David Ribes and Geoffrey Bowker argue that:
Ontologies are an information technology for representing specialized knowledge in order to facilitate communication across disciplines, share data or enable collaboration. In a nutshell, they describe the sets of entities that make up the world-in-a-computer, and circumscribe the sets of relationships they can have with each other. They are a complex and ambitious technical approach to address the problem of diverse languages, heterogeneous categorizations and varied methods for organizing information. In the wake of ontologies the information of a domain is substantially reorganized, facilitating data exchange and reuse.
Ontologies are fundamental to the work of digital technologies in knowledge infrastructures, but how they are developed, and the extent to which that process is taken for granted once they are in place, is critically important in relation to the digital representation of knowledge in education. In their study of the development of an ontology, Ribes and Bowker found that for the scientists involved “the primary orientation… was to complete a working ontology rather than coming to a definitive resolution.” This was because the outcome was determined by the pragmatic requirement that the data be machine-readable. Thus, it is arguable that the representation of data for the purposes of digitalization requires different qualities from those associated with existing practices in research and pedagogy. In this process, one critical dimension was not achieved in practice: the representation of disagreements, uncertainties, ambiguities and ambivalences. These are qualities that, it might be argued, are critically important to a worthwhile education.
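Ribes and Bowker’s description of an ontology as something that “circumscribes the sets of relationships” entities can have might be sketched as follows. All class and relation names here are hypothetical illustrations; the point is only that once the schema is fixed, statements outside it are not merely discouraged but inexpressible:

```python
# Hypothetical entity classes for an imagined educational ontology.
CLASSES = {"Learner", "Course", "Assessment"}

# Only these (subject-class, relation, object-class) triples are
# admissible; the ontology circumscribes what can be said at all.
ALLOWED_RELATIONS = {
    ("Learner", "enrolled_in", "Course"),
    ("Course", "assessed_by", "Assessment"),
}

def assert_triple(subject_class: str, relation: str, object_class: str):
    """Admit a statement only if the ontology circumscribes it."""
    triple = (subject_class, relation, object_class)
    if triple not in ALLOWED_RELATIONS:
        raise ValueError(f"not expressible in this ontology: {triple}")
    return triple

assert_triple("Learner", "enrolled_in", "Course")  # an expressible claim

# A claim such as 'this Assessment is contested' has no admissible
# triple: disagreement about the categories cannot be recorded.
```

This is the sense in which a “working ontology” forecloses: every statement must resolve to one admissible triple, and there is no slot in the schema for uncertainty or dispute about the schema itself.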
The scientists involved in Ribes and Bowker’s study became “lay ontologists”, but for those involved in using rather than developing the ontology, the decisions necessary to enable the information to be machine-readable are hidden. These practices of inclusion and exclusion in developing ontologies have been identified in similar studies. A close ethnographic study of ontology-building by Dave Randall and colleagues, for example, concluded that “much time and effort is spent reaching agreement about what should be in a given ontology and what should be left out.” Thus, Edwards and his co-authors argue that in their development, knowledge infrastructures “not only provide new maps to known territories – they reshape the geography itself.” We therefore see how digital technologies raise important questions about the politics of knowledge, as “turning everything into data, and using algorithms to analyze it changes what it means to know something,” as Lev Manovich puts it in his recent book Software Takes Command.
Ribes and Bowker also point to the importance of temporality in relation to ontologies: “as knowledge, terminology or concepts change within the scientific community, a once-accurate ontology could become obsolete”. With the passing of time and the incorporation of such data into new knowledge infrastructures, the pre-history of data, the selections and applications of ontologies and standards, and the application of rules can disappear from view. As Edwards and colleagues suggest, “the presentation of datasets as complete, interchangeable products in readily exchanged formats… may encourage misinterpretation, over reliance on weak or suspect data sources, and ‘data arbitrage’ based more on availability than quality”.
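The problem of temporality can be illustrated with a final sketch. Suppose, hypothetically, that version 1 of a subject scheme has a single “science” category and version 2 splits it into “physics” and “biology”. A record encoded under version 1 carries no note of which version named it, so once the scheme moves on, the record’s pre-history has disappeared from view:

```python
# Hypothetical subject schemes at two moments in time.
ONTOLOGY_V1 = {"science", "history"}
ONTOLOGY_V2 = {"physics", "biology", "history"}  # 'science' has been split

# A record encoded under v1, with no trace of which version was applied.
legacy_record = {"title": "Cell Structure", "subject": "science"}

def interpret(record: dict, ontology: set) -> str:
    """Read a record's subject against the currently held ontology."""
    subject = record["subject"]
    if subject not in ontology:
        raise KeyError(f"category {subject!r} unknown to current ontology")
    return subject

interpret(legacy_record, ONTOLOGY_V1)  # readable at the time of encoding
# Under ONTOLOGY_V2 the same call raises: the once-accurate category
# has become obsolete, and nothing in the record says why.
```

Unless the version of the ontology is itself recorded as metadata, later users inherit data whose selections and standards can no longer be reconstructed, which is exactly the misinterpretation risk Edwards and colleagues describe.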
This points to the increasing complexity of the work of digital technologies in knowledge infrastructures within education, the tracing of which is already difficult and is likely to become ever more so. As Edwards and co-authors suggest, in articulating what is known, we also need to engage with the “accidental and systematic means by which non-knowledge is produced and maintained”.