This is a guest post from Simon Wilson, University Archivist at the University of Hull, based within the Hull History Centre. Simon has been working with me on the "Filling the Digital Preservation Gap" project and agreed to provide a short write-up of the UK Archivematica group meeting in my absence.
With Jen presenting at iPRES in North Carolina, Julie Allinson and I attended the UK Archivematica user group meeting at the Laidlaw Library in Leeds. After the round-table introductions from the 11 institutions that were represented, Julie began proceedings with a presentation on our Jisc "Filling the Digital Preservation Gap" project. She updated the group on the progress within this project since the last user group meeting five months previously and focused in particular on the development work and enhancements to Archivematica that are being undertaken in Phase 2.
A presentation from Fergus O'Connor and Claudia Roeck at the Tate highlighted their use of Archivematica for video art, with an estimated 500 items including video, film and slide material, the largest file being some 20GB in size. It was interesting to hear how digital content had impacted on their video format migration policies and practices. As they were looking at just one particular format at this stage, they had been able to identify some of the micro-services that weren't appropriate (for example OCR tools and bulk extractor). This was a timely reminder of the value of adjusting the workflow within Archivematica as necessary, and it is something we will look at when developing the workflows at Hull for research data and born-digital archives.
One question raised was that of scalability, and Matthew Addis from Arkivum reported that Archivematica had been successfully tested with 100,000 files. There was an interesting discussion about whether availability of IT support was a barrier to take-up in institutions.
John Beaman from Leeds gave a thought-provoking session about data security, the issue of personally identifiable information, and the impact this had on the processing of content. This is an issue we are familiar with for paper material, but we haven't spent a lot of time translating these experiences to digital material. There was lots of note taking in the discussion about anonymisation (removing references to personally identifiable information) and pseudonymisation (changing the personally identifiable information across the dataset) and their respective impact on security and data re-use (in summary, anonymisation is best for security and pseudonymisation best for re-use). The pointer to the ISO Code of practice on this has been added to my reading list. John also discussed encryption, which seems to be an important consideration for some data. These are important issues for anyone working with born-digital data, regardless of the system they are using.
Jonathan Ainsworth, also from Leeds, talked us through their work with their collections management system KE Emu and ePrints, and the challenges of fitting Archivematica into an existing workflow. He also highlighted the impossibility of trying to predict every possible scenario for receiving or processing digital content. There was an interesting discussion about providing evidence to support a business case, what might be considered useful measures, and cost models.
The day concluded with Sarah Romkey from Artefactual Systems joining us via Skype and bringing us up to speed with developments for v1.5, due later this year, and v1.6, due in 2016. I am especially looking forward to getting my hands on the arrangement and appraisal tab being developed by colleagues at the Bentley Historical Library.