Thursday 10 April 2014

Hydra: multiple heads are better than one

Trinity College, Dublin
I spent a couple of days this week in Dublin at the Hydra Europe Symposium. Hydra has been on my radar for a little while but these two days really gave me an opportunity to focus on what it is and what it can do for us. This is timely for us as we are currently looking at systems for fulfilling our repository and digital archiving functions. 

At York we currently use Fedora for our digital library so developments within the Hydra community are of particular interest because of its relationship to Fedora.

Chris Awre from the University of Hull stated that the fundamental assumptions on which Hydra was built were that:

1. No single system can provide the full range of repository-based solutions for an institution's needs
2. No single institution can resource development of a full range of solutions on its own

This chimes well with our recent work at York trying to propose a technical architecture that could support deposit, storage, curation and access to research data (among other things). There is no one solution for this and building our own bespoke system from scratch or based purely on Fedora would clearly not be the best use of our resources.

The solution that Hydra provides is a technical framework that multiple institutions can share, with each adopting institution building custom elements on top of it that are tailored to local workflows. Hydra has one body but many heads supporting many different workflows.

We were told pretty early on within the proceedings that for Hydra, the community is key. Hydra is as much about knowledge sharing as sharing bits of code.

“If you want to go fast, go alone; if you want to go far, go together” – This African proverb was used to help explain the Hydra concept of community. In working together you can achieve more and go further. However, some of the case studies presented during the Symposium clearly showed that, for some, it is possible to go both far and fast using Hydra, with very little development required. Both Trinity College Dublin and the Royal Library of Denmark commented on the speed with which a repository solution based on Hydra could be up and running. Speed is of course largely dependent on the complexity or uniqueness of the workflows you need to put in place. Hydra does not provide a one-size-fits-all solution but should be seen more as a toolkit with building blocks that can be put together in different ways.

Dermot Frost from Trinity College Dublin summed up their reasons for joining the Hydra community, saying that they had had experience with both Fedora and DSpace and neither suited their needs. Fedora is highly configurable and in theory does everything you need it to do, but you need a team of rocket scientists to work it out. DSpace is a more out-of-the-box solution, but you cannot configure it in the way you need to make it conform to local needs. Hydra sits between the two, providing a solution that is highly configurable but easier to work with than Fedora.

Anders Conrad from the Royal Library of Denmark told us that for their repository solution, 10-20% of material is deemed worthy of proper long term preservation and is pushed to the national repository. The important thing here is that Hydra can support these different workflows, allowing an organisation to put one repository in place that holds different types of material, with different values placed on the content and thus different workflows going on within it. The 'one repository - multiple workflows' model is very much the approach that the University of Hull have taken with their Hydra implementation. Richard Green described how data comes into the repository through different routes, and how different types of data are treated and displayed in different ways depending on the content type.

And what about digital preservation? This is of course my main interest in all of this. One thing that is worth watching is Archivesphere, a Hydra head being created by Penn State and designed to "create services for preserving, managing, and providing access to digital objects, in a way that is informed by archival thinking and practices", including support for both PREMIS and EAD metadata. This is currently being tested by Hydra partners and it will be interesting to see how it develops.

Another thing to think about is how Hydra could meet the digital preservation requirements that I published last year (note they have changed a little bit since then). I think the answer to this is that it probably could meet most of them if we wanted to develop the solutions on top of existing Hydra components. Archivesphere is already starting to introduce some of the required functionality to Hydra, for example file characterisation, normalisation and fixity checking. I guess the bigger question for me is whether this is the best approach for us or whether it would be preferable to make use of existing digital archiving software (Archivematica for example) and ensure the systems can talk to each other effectively.



Jenny Mitcham, Digital Archivist

1 comment:

  1. Hi Jenny,

    I'd be interested to hear how you get on with Hydra, and your thoughts on Archivematica too, which I've looked at and which seems promising from a preservation workflow perspective.

    Kirsty

