Research Data Spring - a case study for collaboration

Digital preservation is not a problem that any single institution can realistically solve on its own. Collaborating with others is a more effective way of working towards sustainable solutions. This post is a case study of how we have benefited from collaboration whilst working on the "Filling the Digital Preservation Gap" project.

In late 2014 Jisc announced a new collaborative initiative called Research Data Spring. The project model specifically aimed to create innovative partnerships and collaborations between stakeholders at different HE institutions working within the field of Research Data Management. Project teams were asked to work in short sprints of between three and six months and were funded for a maximum of three phases of work. One of the projects lucky enough to be funded as part of this initiative was the “Filling the Digital Preservation Gap” project, a collaboration between the Universities of Hull and York. This was a valuable opportunity for teams at the two universities to work together on a shared problem and come up with a solution that might also benefit others.

Image: The project team from Hull and York

The aim of the project was to address a perceived gap in existing research data management infrastructures around the active preservation of the data. Both Hull and York had existing digital repositories and sufficient storage provision but were lacking systems and workflows for fully addressing preservation. The project aimed to investigate the open source tool Archivematica and establish whether this would be a suitable solution to fill this gap.

As well as the collaboration between Hull and York, further collaborations emerged as the project progressed. 

Artefactual Systems support and develop Archivematica, and the project team worked closely with them throughout the project. Having concluded that Archivematica has great potential for helping to preserve research data, the project team highlighted several areas where they felt additional development was required to enhance existing functionality. Artefactual Systems were consulted in detail as the project team scoped out priorities for further work. They were able to offer many useful insights about the best way of tackling the problems we described. Their extensive knowledge of the system put them in a good position to look at the issues from various angles and find a solution that would meet our needs as well as the needs of the wider community of users. Artefactual Systems were also able to help us with one of our outreach activities, joining us (virtually) to give a presentation about our work.

The UK Archivematica group was kept informed about the project and invited to help shape the priorities for development (you can read a bit about this in a previous blog post). Experienced and established Archivematica users from the international community were also consulted to discuss the new features and to review how the proposed features would impact on their workflows. Ultimately, none of us wanted to create bespoke developments that were only going to be of use to Hull and York.

Collaboration with another Research Data Spring project, being carried out at Lancaster University, was also necessary to enable future join-up of the two initiatives. One of the areas highlighted for further work was improved reporting within Archivematica. The project team sponsored a development to enable data to be more easily exposed to third-party applications, and worked closely with the DMAOnline project team at Lancaster to ensure the data would be made available in a form that their tool could work with.

File format identification was another area of work that called for additional collaboration. This is very much an area that the digital preservation community as a whole needs to work together on. For research data in particular, there are many types of file that are not identified by current identification tools and are not present within the Pronom registry of file types. We wanted to get greater representation of research data file formats within Pronom and also enhance Archivematica to enable better workflows for non-identified files (see my previous post for more about file identification workflows). This is why we have also been collaborating with the team at The National Archives who develop new file signatures for Pronom.

The collaborative nature of this project brought several benefits. Despite the short timescales at play (or perhaps because of them), there was strength in working together on a new and innovative solution to preserve research data.

The Universities of Hull and York were similar enough to share the same problem and see the need to fill the digital preservation gap, but different enough to introduce interesting variations in workflows and implementation strategies. This demonstrated that there is often more than one way to implement a solution, depending on institutional differences.

By collaborating and consulting widely, the project hoped to create a better final outcome and produce a set of enhancements and case studies that would benefit a wide community of users.

Jenny Mitcham, Digital Archivist

