Tuesday, 15 November 2016

AtoM harvesting (part 1) - it works!

When we first started using Access to Memory (AtoM) to create the Borthwick Catalogue we were keen to enable our data to be harvested via OAI-PMH (more about this feature of AtoM is available in the documentation). Indeed the ability to do this was one of our requirements when we were looking to select a new Archival Management System (read about our system requirements here).

Look! Archives now available in Library Catalogue search
So it is with great pleasure that I can announce that we are now exposing some of our data from AtoM through our University Library catalogue YorSearch. Dublin Core metadata is automatically harvested nightly from our production AtoM instance - so we don't need to worry about manual updates or old versions of our data hanging around.

Our hope is that doing this will allow users of the Library Catalogue (primarily staff and students at the University of York) to happen upon relevant information about the archives that we hold here at the Borthwick whilst they are carrying out searches for other information resources.

We believe that enabling serendipitous discovery in this way will benefit those users of the Library Catalogue who may have no idea of the extent and breadth of our holdings and who may not know that we hold archives of relevance to their research interests. Increasing the visibility of the archives within the University of York is an useful way of signposting our holdings and we think this should bring benefits both to us and our potential user base.

A fair bit of thought (and a certain amount of tweaking within YorSearch) went into getting this set up. From the archives perspective, the main decision was around exactly what should be harvested. It was agreed that only top level records from the Borthwick Catalogue should be made available in this way. If we had enabled the harvesting of all levels of records, there was a risk that search results would have been swamped by hundreds of lower level records from those archives that have been fully catalogued. This would have made the search results difficult to understand, particularly given the fact that these results could not have been displayed in a hierarchical way so the relationships between the different levels would be unclear. We would still encourage users to go direct to the Borthwick Catalogue itself to search and browse lower levels of description.

It should also be noted that only a subset of the metadata within the Borthwick Catalogue will be available through the Library Catalogue. The metadata we create within AtoM is compliant with ISAD(G): General International Standard Archival Description which contains 26 different data elements. In order to facilitate harvesting using OAI-PMH, data within AtoM is mapped to simple Dublin Core and this information is available for search and retrieval via YorSearch. As you can see from the screen shot below, Dublin Core does allow a useful level of information to be harvested, but it is not as detailed as the original record.

An example of one of our archival descriptions converted to Dublin Core within YorSearch

Further work was necessary to change the default behaviour within Primo (the software that YorSearch runs on) which displayed results from the Borthwick Catalogue with the label Electronic resource. This is what it calls anything that is harvested as Dublin Core. We didn't think this would be helpful to users because even though the finding aid itself (within AtoM) is indeed an electronic resource, the actual archive that it refers to isn't. We were keen that users didn't come to us expecting everything to be digitised! Fortunately it was possible to change this label to Borthwick Finding Aid, a term that we think will be more helpful to users.
Searches within our library catalogue (YorSearch) now surface Borthwick finding aids, harvested from AtoM.
These are clearly labelled as Borthwick Finding Aids.


Click through to a Borthwick Finding Aid and you can see the full archival description in AtoM in an iFrame

Now this development has gone live we will be able to monitor the impact. It will be interesting to see whether traffic to the Borthwick Catalogue increases and whether a greater number of University of York staff and students engage with the archives as a result.

However, note that I called this blog post AtoM harvesting (part 1).

Of course that means we would like to do more.

Specifically we would like to move beyond just harvesting our top level records as Dublin Core and enable harvesting of all of our archival descriptions in full in Encoded Archival Description (EAD) - an XML standard that is closely modelled on ISAD(G).  This is currently not possible within AtoM but we are hoping to change this in the future.

Part 2 of this blog post will follow once we get further along with this aim...






Jenny Mitcham, Digital Archivist

2 comments:

  1. Thank you for sharing this! This is extremely helpful. I envision doing something similar here at the University of Toronto once we have finished imputing all our legacy finding aids into our new instance of AtoM. But it's still very much a work in progress at the moment.

    Jenny Mitcham, Digital Archivist

    ReplyDelete
  2. Great - glad it was helpful to you and best of luck with your work in this area. Note we still have *lots* of legacy finding aids to input - that is very much an ongoing task!

    Jenny Mitcham, Digital Archivist

    ReplyDelete

The sustainability of a digital preservation blog...

So this is a topic pretty close to home for me. Oh the irony of spending much of the last couple of months fretting about the future prese...