Born Digital Cataloguing (some thoughts from the ARA SAT event)

First day back at work after my holiday and I am straight back into the fray – no quiet day catching up on e-mails and getting my head slowly back into work mode for me! On Monday I attended an ARA Section for Archives and Technology event on born digital cataloguing and also had the opportunity to talk about some of our current work in this area.

It was great to see the event so well attended (the organisers had to find a bigger room due to the huge amount of interest!). This is clearly an important and interesting subject for many archives professionals and it was evident throughout the day that many of us are grappling with very similar issues. Here are some of the main points that I latched on to from the morning’s presentations:

  • It is important to preserve the directory structure of digital files as submitted into the archive – even if you subsequently move the files into a different structure. This is the equivalent of original order and can give context to the files. End users of the digital archive should also have access to this information so it is included in the description within the Discovery interface (Anthea Seles, TNA).
  • Users don’t really know what they want or need with regard to born digital material – it is too early to say and too new a field. We need to try and predict what they will require and also need to learn from our experiences as we go along (Anthea Seles, TNA).
  • “It’s all just stuff” – born digital archives should be treated the same as paper as far as possible (Chris Hilton, Wellcome Library).
  • Interesting case study from the Wellcome Library about how an archival management system (Calm) and a digital preservation system (Preservica) can work together. It is important to establish which data is duplicated between the two systems (there may be some overlap) and, where it is, which copy is the master and how the information is synchronised between them. In this case study, digital data starts off in Preservica and catalogue records are copied over into Calm overnight. Calm then becomes the master for resource discovery metadata and any subsequent edits need to be made in Calm before syncing back to the digital archive (Chris Hilton, Wellcome Library).
  • Original order – in the Wellcome Library’s case study, the method the creator or donor used to store and order his digital files was different to the system of arrangement used for paper. Digital files were arranged chronologically but the paper archive was arranged according to themes. This results in a hybrid archive that is ordered or arranged inconsistently depending on the media and leaves the archivists with a decision to make (Victoria Sloyan, Wellcome Library).
  • Workflow is crucially important. It matters what happens when. Once data is ingested into a digital archive such as Preservica (I believe Archivematica is the same), it becomes difficult to remove individual items from the Archival Information Package. This becomes more of a problem when that information has also been replicated into an archival management system. Selection and appraisal therefore need to happen at an early stage in the workflow... and we also need to accept that our digital archives may not be perfect – we are unlikely to be able to weed out all redundant files on a first pass so we may end up with items in the digital archive that are not needed (Victoria Sloyan, Wellcome Library).
  • Should we stop using the word ‘cataloguing’ and instead talk about ‘enabling discovery’? This is really what we are trying to do. We may end up moving away from the traditional archival catalogue (particularly for digital data) but we still need to ensure that we can enable our users to find the information they require. Digital collections may lead to alternative (less labour intensive) ways of enabling resource discovery (Jessica Womack and Rebecca Webster, Institute of Education).
  • We should be working with donors and depositors to get them to structure and label their data appropriately (and thus help with born digital cataloguing). It is very hard for archivists to deal with large quantities of digital data that has been created with little order or structure (Jessica Womack and Rebecca Webster, Institute of Education).
  • Digital is different to paper in that it requires more immediate action once it has been accepted into an archive and we need to ensure our processes, procedures and workflows can cope with this (Jessica Womack and Rebecca Webster, Institute of Education).
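
As a rough illustration of the first point above (recording the directory structure as submitted, so that it survives any later rearrangement), here is a minimal Python sketch. It was not shown at the event and is not how TNA or any of the systems mentioned actually do this; the function names and the fields in each record are my own assumptions for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Checksum a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_original_order_manifest(deposit_root):
    """Return one record per file, keyed by its path as submitted.

    The record layout (original_path, size_bytes, sha256) is hypothetical,
    chosen only to show the idea of capturing original order.
    """
    root = Path(deposit_root)
    manifest = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue  # directories are implied by the relative paths
        manifest.append({
            "original_path": path.relative_to(root).as_posix(),
            "size_bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        })
    return manifest
```

Because each record carries a checksum as well as the original path, a file that is later moved into a different structure can still be matched back to where the depositor kept it, and the original path can be surfaced in the description for end users.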

The last scheduled presentation of the morning was from me, in which I gave a non-archivist's perspective on born digital cataloguing. I'll try and summarise some of my points in a separate post later this week.

And here are some of the main messages I took away from the day as a whole:

  • Try things out – it is better to do something now than to wait until we have a perfect solution. This is the best way of learning what works and what doesn't.
  • Accept that the solutions you put in place may be temporary. We are all learning, and born digital cataloguing is not a solved problem (particularly with regard to hybrid archives).
  • Be honest about failures as well as successes – others can learn as much from finding out what didn't work and why as they can from finding out what did.
  • Think about which approaches are scalable in the longer term. Digital archives are going to increase in size and volume and we need to explore different ways of enabling discovery.

Despite the fact that there were more problems than solutions highlighted during the course of the day, it was comforting (as always) to discover that we are not alone!

Jenny Mitcham, Digital Archivist

