Friday, 15 May 2015

Jisc Archivematica project update - making progress with the 'how?' and the 'what?'

I mentioned in a previous post that we have funding through the Jisc Research Data Spring initiative for phase 1 of a project to look at the potential of using Archivematica to help manage research data for the longer term. Here is the second of our project updates showing what progress we have made over the last few weeks.

I cannot quite believe we are already halfway through the first three-month phase of this project. Where has all the time gone?

The most exciting moment of the last few weeks came during a Skype call between the project teams in York and Hull when, within minutes of each other, both institutions managed to get their first test archives transferred into their institutional implementations of Archivematica! A few technical hiccups after the initial setting up of Archivematica had stopped this momentous occasion happening earlier. This does, to my mind, highlight one of the resourcing implications of Archivematica: a certain amount of technical skill is required to understand and troubleshoot the error messages and to configure the system so that everything works smoothly. With my limited technical abilities, this is not something I would have been able to do myself!

We are now continuing to test the capabilities of Archivematica and, alongside this, I have read the Archivematica documentation from virtual cover to virtual cover, followed the mailing list posts with interest and chatted to other users. I am hugely grateful to other institutions that have been happy to share information about their infrastructure and workflows.

The project team have a brainstorming meeting planned for next week to discuss the project 'how?'...


  • How? How would we incorporate Archivematica into a wider technical infrastructure for research data management and what workflows would we put in place? Where would it sit and what other systems would it need to talk to?


If all goes well, expect diagrams next time!

Following up from my previous blog post which talked about the project 'what?'...


  • What? What are the characteristics of research data and how might it differ from other born digital data that memory institutions are establishing digital archives to manage and preserve? What types of files are our researchers producing and how would Archivematica handle these?


...I've been having some interesting conversations with real researchers.

Though most of my time is spent hidden away in my office within the archives, it is a real bonus being involved with the wider Research Data Management project at the University of York and helping deliver data management training courses to researchers. Getting out and having the opportunity to talk to researchers about their data is invaluable in helping to keep an eye on the longer term goals of this project.

In a recent training session I encouraged researchers to complete a simple questionnaire, which tells us a bit more about the software packages and file formats they use.

Helping to answer the project 'what?'

Some researchers, on providing this basic level of information, have also agreed to be contacted by me and have subsequently provided some samples of their data. Nothing sensitive or confidential, but files that they have agreed I can share with The National Archives to create file signatures within PRONOM. I hope this will lead to more types of research data being identifiable within digital preservation systems (Archivematica included).
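For readers unfamiliar with the idea of a file signature, the sketch below shows the basic principle in Python: a format is recognised by a characteristic byte sequence rather than by its file extension. This is an illustration only. The `SIGNATURES` table and the `identify` function are hypothetical names made up for this example, the three byte sequences are simply well-known ones, and real PRONOM signatures are richer than this (they can, for instance, describe sequences at different positions within a file).

```python
from typing import Optional

# Illustrative table only: a few well-known leading byte sequences.
# This is NOT how PRONOM stores or expresses its signatures.
SIGNATURES = {
    b"%PDF-": "PDF document",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"PK\x03\x04": "ZIP container",
}


def identify(data: bytes) -> Optional[str]:
    """Return a format name if the data starts with a known signature."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    # No signature matched: the file is unidentified, which is the
    # situation sample files from researchers are meant to reduce.
    return None
```

Calling `identify` on the first few bytes of a PDF returns "PDF document", while a file from a research software package with no entry in the table returns `None`.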

I won't reach a huge number of researchers through this and subsequent training sessions over the next few weeks so, with help from colleagues, we've also sent an e-mail out requesting sample files from the top 20 software packages used by researchers at York. Sample files are coming in at a slow trickle rather than a deluge, but hopefully we will soon have a suitable test set to share with The National Archives.


The most popular applications and software used by researchers at the University of York (from Software and Training Questionnaire report by Emma Barnes and Andrew Smith, 2014)




Jenny Mitcham, Digital Archivist
