Dr. C. Annemieke Romein, Editor History
*I would like to thank Richard Deswarte and Becky Taylor (University of East Anglia) for their input on this blog.
The British Historical Association’s journal History welcomes a very wide range of articles touching upon all aspects of historical research. Digital Humanities (DH) has gained ground within historical research and is expected to continue doing so in the years to come. DH is much more than studying big data: it includes, for instance, opening up sources, searching them (both text and image) and making them available to the public. Hence, despite the challenges that data storage, proper referencing and source criticism may pose, a whole new (digital) world opens up, awaiting a new generation of historical research. Journals are still quite hesitant to accept publications with a DH component, as their novelty is perceived to be ‘scary’, ‘daunting’ or ‘too modern’. I will argue that the use of DH technologies is having, and will continue to have, an impact on humanities publishing, and that awareness of the technical challenges therefore needs to grow among editors, reviewers and readers. In this blog post, I would like to raise several issues, as recognition is the first step in the long process of figuring out solutions that will suit all parties involved in publishing digital-history articles. Where possible, I offer some thoughts on potential solutions.
Let me state explicitly at the outset that handling so-called big data is not better than traditional research methods; it is merely different. Using massively digitised sources still requires researchers who ask proper research questions and take into account the limitations of, and choices made during, digitisation – including what was not digitised. Tim Hitchcock (Sussex, keynote at DHBenelux 2019) has warned that historians should be wary of phrasing research questions to suit currently available DH techniques, and should keep challenging the techniques too. Authors, editors and reviewers all still need to focus on a well-phrased, challenging research question backed up by a rigorous methodology. If you can do that, I, as an ECR editor, would wholeheartedly welcome your submission in the field of DH and history!
The researcher-historian is the one managing and crunching the big data. They embody the – often unspoken – academic values of transparency and reproducibility. Here, DH poses challenges at different stages of the research cycle, such as writing Data Management Plans that specify where data will be stored. This is key for born-digital archives: how and where are these stored, and are they (still) accessible? When studying versions of (draft) documents of poets, authors or perhaps even researchers, it also raises the question of where to store the physical hard drives. For instance, the Dutch Royal Academy of Sciences has a service called Data Archiving and Networked Services (DANS), which guarantees ‘permanent access to digital research resources’. Referring to datasets stored within DANS is easy, which is great from a journal’s perspective. In the UK, the Data Archive hosted at the University of Essex is an equivalent. Such storage is a fantastic opportunity for researchers, as datasets should be regarded as an intrinsic part of their research output – yes, there is more than publications! Moreover, such a repository makes it easier to track down data without having to find a researcher who may have changed jobs. Here I want you to consider two issues: 1) when do you deposit data – at the end, or at several points in time? Either way, the references to the data in your journal articles should remain accessible for verification; 2) when you refer to data that are password-protected for copyright reasons, consider providing the journal with access upon submission, to pass on to reviewers in a way that preserves reviewer anonymity. A further issue concerns the programming involved in your research and publication: if you have used a programming language, it is important to include your scripts.
Scripts can look very messy – a researcher’s main concern may simply have been that they did what they were designed to do – so you may not want to include them in your article, but at least include a link to the repository (e.g. GitHub) where you have stored them. Make sure you do store your scripts along with an explanation of what they do! Additionally, when Artificial Intelligence (AI) is involved, you need to be aware that traditional historians hear alarm bells. Why? The question of reproducibility. The ‘big black box’ that AI represents learns from studying the data you have presented it with; hence, running the same query through the same system can lead to different results. I would advise you to save the data first on a separate server, so that, if requested, the analysis can be rerun on another server.
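To make the reproducibility point concrete, here is a minimal sketch (my own illustration, not drawn from any project mentioned in this post) of one common mitigation: fixing a random seed so that a stochastic sampling step over a digitised corpus yields the same result on every run:

```python
import random

def sample_paragraphs(corpus, k, seed=42):
    """Draw k paragraphs from a digitised corpus, reproducibly.

    A dedicated, seeded generator means the same call always returns
    the same sample, which a reviewer can verify independently.
    """
    rng = random.Random(seed)  # fixed seed -> deterministic draws
    return rng.sample(corpus, k)

# Hypothetical corpus of transcribed paragraphs
corpus = [f"paragraph-{i}" for i in range(100)]

first = sample_paragraphs(corpus, 5)
second = sample_paragraphs(corpus, 5)
assert first == second  # same seed, same sample
```

Note that seeding only controls this kind of deliberate randomness; it does not open the ‘black box’ of a trained model itself, which is why depositing the data and scripts remains essential.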
Journals provide guidelines regarding how they like to see the annotations being formulated. But, here, challenges lay around the corner. Do you – and if yes, how, – credit a language model that you used on your digitised sources when you applied them through the HTR-tool (Transkribus) that was created within the READ-project? In other words, how do you attribute credits to all involved in the various stages of the chain of research? Do you (wilfully) stick to referring to the traditional paper versions of archival sources or their digitised counterparts? How do you refer to the efforts of collection holding institutes making them digitally available – and if you do, whom do you include? For now, there is a lack of standards. Here journals, collection holding institutes, and researchers should join hands and come up with proper guidelines. Helle Strandgaard Jensen (Keynote, DHBenelux 2019) flagged this as a crucial shift in thinking, so that collection holding institutes become increasingly aware that their contribution to the digitisation process is an intrinsic part of the research. Indeed, it makes them a crucial contributor and they should be credited for their part. Journals, too, play a role here as they are gatekeepers for publishing proper research, including proper referencing.
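Pending such guidelines, one lightweight convention (an assumption on my part, not something this post prescribes) is to deposit a machine-readable CITATION.cff file alongside your scripts, so that anyone reusing them knows whom to credit; all names and identifiers below are placeholders:

```yaml
# CITATION.cff – Citation File Format (placeholder values throughout)
cff-version: 1.2.0
message: "If you use these scripts, please cite them as below."
title: "HTR post-processing scripts for an example archive"  # hypothetical project
authors:
  - family-names: "Doe"
    given-names: "Jane"
    affiliation: "Example University"
version: 1.0.0
doi: "10.5281/zenodo.0000000"  # placeholder DOI from the deposit
date-released: "2019-09-01"
```

A file like this costs minutes to write, and platforms such as GitHub and Zenodo can read it to generate citations automatically.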
As easy as it is to refer to the original paper publications, we – the academic community – need to take into account that even paper versions have reprints and new editions. If you want to refer properly to a particular version, an ISBN could offer a solution, but as we enter the digital age, Digital Object Identifiers (DOIs) are becoming more and more common. I want to stress that their use should become compulsory, as it makes it far easier to refer to publications and find them again! Moreover, I would go as far as to say that digitisation projects – such as the massive Google Books project – should adopt DOIs, as hyperlinks have already proven too unstable.
Last, but certainly not least, DH is very interesting from the perspective of cOAlition S, which aims for full Open Access by 2021. DH has, as mentioned above, the opportunity to link to digitised sources and techniques and make these available as an underlying layer to contributions.
If it has not been made obvious in this blog post, I want to stress that journals need to start accepting historical research based upon DH methodologies! The issues flagged in this post are meant to raise awareness and should serve as starting points for guidelines that researchers can follow to get their studies published. There should be room for a novel methodological approach. DH is not better, it is different!