Thursday, 29 January 2015

Reacquainting myself with OAIS

Hands up if you have read ISO 14721:2012 (otherwise known as the Reference Model for an Open Archival Information System)... I mean properly read it... yes, I suspected there wouldn’t be many of you. It seems like such a key document to us digital archivists – we use the terminology, the concepts within it, even the diagrams on a regular basis, but I'll be the first to confess I have never read it in full.

Standards such as this become so familiar to those of us working in this field that it is possible to get a little complacent about keeping our knowledge of them up to date as they undergo review.

Hats off to the Digital Preservation Coalition (DPC) for updating their Technology Watch Report on the OAIS Reference Model last year. It was published in October 2014 and I admit I have only just managed to read it. Digital preservation reading material typically comes out on long train journeys and this report kept me company all the way from Birmingham to Coventry and then back home as far as Sheffield (I am a slow reader!). Imagine how far I would have had to travel to read the 135 pages of the full standard!

This is the 2nd edition of the first in the DPC’s series of Technology Watch reports. I remember reading the original report about 10 years ago and trying to map the active digital preservation we were doing at the Archaeology Data Service to the model. 

Ten years is quite a long time in a developing field such as digital preservation and the standard has now been updated (but as mentioned in the Technology Watch Report, the updates haven’t been extensive – the authors largely got it right first time). 

Now reading this updated report in a different job setting I can think about OAIS in a slightly different way. We don't currently do much digital preservation at the Borthwick Institute, but we do do a lot of thinking about how we would like the digital archive to look. Going back to the basics of the OAIS standard at this point in the process encourages fresh thinking about how OAIS could be implemented in practice. It was really encouraging to read the OCLC Digital Archive example cited within the Technology Watch Report (pg 14) which neatly demonstrates a modular approach to fulfilling all the necessary functions across different departments and systems. This ties in with our current thinking at the University of York about how we can build a digital archive using different systems and expertise across the Information Directorate.

Brian Lavoie mentions in his conclusion that "This Technology Watch Report has sought to re-introduce digital preservation practitioners to OAIS, by recounting its development and recognition as a standard; its key revisions; and the many channels through which its influence has been felt." This aim has certainly been met. I feel thoroughly reacquainted with OAIS, have learnt some things about the changes to the standard and have even reminded myself of some things that I had forgotten. As I said, 10 years is a long time.


Friday, 16 January 2015

The first meeting of Archivematica UK users (or explorers)

Last week I was happy to be able to host the first meeting of a group of UK users (or potential users) of Archivematica here in York.

There are several organisations within the UK that are exploring Archivematica and thinking about how they could use it within existing data management workflows to help preserve their digital holdings. I thought it would be good to get us in a room together and talking about our ideas and experiences. 

Of the institutions who attended the meeting, most were at a similar stage. Perhaps we would not yet call ourselves 'Archivematica users', but having recognised its potential, we are now in the process of testing and exploring the system to evaluate exactly how we could use it and what systems it would integrate with. 

Each of the nine institutions attending the meeting was able to give a short presentation with an overview of their experience with Archivematica and intended use of it. I asked each speaker to think about the following points:

  • Where are you with Archivematica (investigating, testing, using)?
  • What do you intend to use it for - eg: research data, born digital archives, digitised content?
  • What do you like about it / what works?
  • What don't you like about it / what doesn't work?
  • How do you see Archivematica fitting in with your wider technical infrastructure - eg: what other systems will you use for storage, access, pre-ingest?
  • What unanswered questions do you have?
By getting an overview of where we all are, we could not only learn from each other, but also see areas where we might be able to put our heads together or collaborate. Exploring new territory always seems easier when you have others to keep you company.

Over the course of the afternoon I took down pages of notes - a clear sign of how useful I found it. I can't report on everything in this blog post but I'll just summarise a couple of the elements that the presentations touched on - our likes and dislikes.

What do you like about Archivematica? What works well for you?

  • It meets our requirements (or most of them)
  • It uses METS to package and structure the metadata
  • It creates an Archival Information Package (AIP) that can be stored anywhere you like
  • You can capture the 'original order' of the deposit and then appraise and structure your package as required (this keeps us in line with standard archival theory and practice)
  • There is a strong user-community and this makes it more sustainable and attractive
  • Artefactual publish their development roadmap and wishlist so we can see the direction they are hoping to take it in
  • It is flexible and can talk to other systems
  • It doesn't tie you in to proprietary systems
  • It connects with AtoM
  • It is supported by Artefactual Systems (who are good to work with)
  • It is freely available and open source - for those of us who don't have a budget for digital preservation this is a big selling point
  • It is managed in collaboration with UNESCO
  • It has an evolving UK user base
  • The interface mirrors the Open Archival Information System (OAIS) functional entities - this is good if this is the language you speak
  • It allows for customisable workflows
  • It has come a long way since the first version was released
  • It fills a gap that lots of us seem to have within our existing data curation infrastructures
  • As well as offering a migration-based strategy to digital preservation, it also stores technical metadata which should allow for future emulation strategies
  • It is configurable - you can add your own tools and put your own preservation policies in place
  • It isn't a finished product but is continually developing - new releases with more functionality are always on the horizon
  • We can influence its future development

What don't you like about Archivematica? What doesn't work for you?

  • It can be time consuming - you can automate a lot of the decision points but not all of them
  • Rights metadata can only be applied to a whole package (AIP) not at a more granular level (eg: per individual file)
  • Storage and storage functionality (such as integrity checking and data loss reporting) isn't included
  • Normalisation/migrations of files only happens on initial ingest but we are likely to need to carry out further migrations at a later date
  • Upgrading is always an adventure!
  • Documentation isn't always complete and up-to-date
  • It doesn't store enough information about the normalised preservation version of the files (more is stored about the original files)
  • It is not just a question of installing it and running with it - lots of thought has to go into how we really want to use it
  • You can't delete files from within an AIP
  • There is no reporting facility to enable you to check on what files of each type/version you have within your archive and use this to inform your preservation planning

A Q&A session with Artefactual Systems via WebEx later in the afternoon was really helpful in answering our unanswered questions and describing some interesting new functionality we can expect to see in the next few versions of Archivematica.

All in all a very worthwhile session and I hope this will be just the first of many meetings of the Archivematica UK group. Please do contact me if you are a UK Archivematica user (or explorer) and want to share experiences with other UK users.

Thursday, 18 December 2014

Plugging the gaps: Linking Arkivum with Archivematica

In September this year, Arkivum and Artefactual Systems (who develop and support the Open Source software Archivematica) announced that they were collaborating on a digital preservation system. This is a piece of work that my colleagues and I at the University of York were very pleased to be able to fund.

We don't currently have a digital archive for the University of York but we are in the process of planning how we can best implement one. My colleagues and I have been thinking about requirements and assessing systems and in particular looking at ways we might create a digital archive that interfaces with existing systems, automates as much of the digital preservation process as possible... and is affordable.

My first port of call is normally the Open Archival Information System (OAIS) reference model. I regularly wheel out the image below in presentations and meetings because I always think it helps in focusing the mind and summarising what we are trying to achieve.

OAIS Functional Entities (CCSDS 2002)

From the start we have favoured a modular approach to a technical infrastructure to support digital archiving. There doesn't appear to be any single solution that "just does it all for us" and we are not keen to sweep away established systems that already carry out some of the required functionality.

We need to keep in mind the range of data management scenarios we have to support. As a university we have a huge quantity of digital data to manage and formal digital archiving of the type described in the OAIS reference model is not always necessary. We need an architecture that has the flexibility to support a range of different workflows depending on the retention periods or perceived value of the data that we are working with. All data is not born equal so it does not make sense to treat it all in the same way.

How we've approached this challenge is to look at the systems we have currently, find the gaps and work out how best to fill them. We also need to think about how we can get different systems to talk to each other in order to create the automated workflows that are so crucial to all of this working effectively.

Looking at the OAIS model, we already have a system to provide access to data in York Digital Library, which is built using Fedora. We also have some of the data management functionality ticked off, with various systems to store descriptive metadata about our assets (both digital and physical). We have various ingest workflows in place to get content to us from the data producers. What we don't have currently is a system that manages the preservation planning side of digital archiving or a robust and secure method of providing archival storage for the long term.

This is where Archivematica and Arkivum could come in.

Archivematica is an open source digital archiving solution. In a nutshell, it takes a microservices approach to digital archiving, running several different tools as part of the ingest process to characterise and validate the files, extracting metadata, normalising data and packing the data up into an AIP which contains both the original files (unchanged), any derived or normalised versions of these files as appropriate and technical and preservation metadata to help people make sense of that data in the future. The metadata are captured as PREMIS and METS XML, two established standards for digital preservation that ensure the AIPs are self-documenting and system-independent. Archivematica is agnostic to the storage service that is used. It merely produces the AIP which can then be stored anywhere.

Arkivum is a bit-perfect preservation solution. If you store your data with Arkivum they can guarantee that you will get that data back in the same condition it was in when you deposited it. They keep multiple copies of the data and carry out regular integrity checks to ensure they can fulfil this promise. Files are not characterised or migrated through different formats. This is all about archival storage. Arkivum is agnostic to the content. It will store any file that you wish to deposit.

There does seem to be a natural partnership between Archivematica and Arkivum - there is no overlap in functionality, and they both perform a key role within the OAIS model. In actual fact, even without integration, Archivematica and Arkivum can work together. Archivematica will happily pass AIPs through to Arkivum, but with the integration we can make this work much better.

So, the new functionality includes the following features:

  • Archivematica will let Arkivum know when there is an Archival Information Package (AIP) to ingest
  • Once the Arkivum storage service receives the data from Archivematica it will check the size of the file received matches the expected file size
  • A checksum will be calculated for the AIP in Arkivum and will be automatically compared against the checksum supplied by Archivematica. Using this, the system can accurately verify whether transfer has been successful
  • Using the Archivematica dashboard it is possible to ascertain the status of the AIP within Arkivum to ensure that all required copies of the files have been created and it has been fully archived
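
The size and checksum checks described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not the code used in the integration: the function names are hypothetical, and SHA-256 is an assumption (the announcement does not say which checksum algorithm is used).

```python
import hashlib
import os


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(path, expected_size, expected_checksum):
    """Compare a received file against the size and checksum supplied by the sender.

    Hypothetical helper: returns "ok", "size mismatch" or "checksum mismatch".
    """
    # Cheap check first: a wrong size means the transfer is incomplete or corrupt.
    if os.path.getsize(path) != expected_size:
        return "size mismatch"
    # Full check: recompute the checksum on the received copy and compare.
    if sha256_of(path) != expected_checksum:
        return "checksum mismatch"
    return "ok"
```

Checking the size first is a cheap way to catch a truncated transfer before spending time hashing a large AIP. The checksum comparison is what actually confirms that the bits arrived intact.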

I'm still testing this work and had to work hard to manage my own expectations. The integration doesn't actually do anything particularly visual or exciting, it is the sort of back-end stuff that you don't even notice when everything is working as it should. It is however good to know that these sorts of checks are going on behind the scenes, automating tasks that would otherwise have to be done by hand. It is the functionality that you don't see that is all important!

Getting systems such as these to work together well is key to building up a digital archiving solution and we hope that this is of interest and use to others within the digital preservation community.

Friday, 7 November 2014

A non-archivist's perspective on cataloguing born digital archives

As blogged in my previous post, earlier this week I attended an ARA Section for Archives and Technology event on Born Digital Cataloguing and also had the opportunity to talk about some of the Borthwick's current work in this area.

I gave a non-archivist's perspective on born digital cataloguing. These were the main points I tried to put across, though some of the points below were also informed by discussions on the day:

  • Born digital cataloguing within a purely digital archive is reasonably straightforward. The real complexity comes when working with hybrid archives where content is both physical and digital
  • The Archaeology Data Service are good at born digital cataloguing. This is partly because they only have digital material to worry about, but also down to the fact that they have many years of experience and the necessary systems in place. Their new ADS Easy system allows depositors to submit data for archiving along with the required metadata (which they can enter both at project level and individual file level). A web interface for disseminating this data can then be created in a largely automated fashion. It makes sense to ask the person who knows the most about the data to catalogue it, freeing up the digital archivists' time to focus on checking the received data and metadata and more specialist digital preservation work.
  • Communication can be a problem between traditional archivists and digital archivists. We may use different metadata standards and we may not always know what the other is talking about. I was at the Borthwick Institute for approximately a year before I worked out that when my colleagues talked about describing archives at file level (which may cover multiple physical documents within the same physical file), they didn't mean the same as my perception of 'file level metadata' (which would apply to a single digital item). It is important to recognise these differences and try and work around them so that we understand each other better when working with hybrid archives.
A digital archivist may speak a different language to traditional archivists,
but we can work around this

  • At the Borthwick we are in the process of implementing a new system for accessioning and cataloguing archives (both physical and digital archives). We have installed a version of AtoM (Access to Memory) and have imported one of our more complex catalogues into it. We now need to build on this proof of concept and fully establish and populate this system. As well as holding information about our physical holdings, this system will provide a means of cataloguing born digital data and also the foundations on which a digital archiving system can be built. It will also provide the means by which we can disseminate digital objects to our users.
  • There are other types of metadata that are required for digital material and these are outside the scope of AtoM which is primarily for resource discovery metadata. More technical metadata relating to digital objects and any transformations they undergo needs to reside within a digital archiving system. This is where Archivematica comes in. We are currently testing this digital preservation system to establish whether it meets our digital archiving needs.
  • I worry about the identifiers we use within archival catalogues. The traditional archival identifier is performing two jobs – firstly acting as a unique identifier or reference for an item or group of items, and secondly showing where those items sit within the archival hierarchy. This can lead to problems...
    • ...if the arrangement of the archive changes – this may lead to the identifier changing – never a good thing once it has been published and made available, or if that identifier is being used to link between different systems.
    • ...if we want to start describing objects before we know where they sit within the hierarchy. This may be the case in particular for digital material where we may want to start working with it with greater urgency than the physical element of the archive.*
  • We can argue that digital isn't different, but with digital we do tend to think more at item level. Digital preservation activities and the technical and preservation metadata that this generates are all at file (item) level, so perhaps it makes sense for the resource discovery metadata to follow this pattern. Unlike physical archives, for digital archives we can pretty easily generate a title (or file name) for every item. If we are to deal with digital archives at file level would this cause confusion when cataloguing a hybrid archive?
  • Before we incorporate digital material into a digital archive, some selection and appraisal needs to be carried out  - depending on the digital archiving system in use, it can be non-trivial to remove files from an AIP (archival information package) once they have been transferred, so we really do need to have a good idea of what is and isn't included before we carry out our preservation activities. In order to carry out this selection we may wish to start putting together a skeletal description of each item. Wouldn't it be nice if we could start to do this in a way which could be easily transferred into an archival management system? At the moment I have been doing this in a separate spreadsheet but we need strategies that are more sustainable and scalable.
  • Workflows are crucially important. Who does the born digital cataloguing where hybrid archives are concerned? Its place within the archive as a whole is key so it should be catalogued in tandem with the physical, but if we want to archive the digital material more rapidly than the physical how do we ensure we have the right workflows and procedures in place? Much of this will come down to institutional policies and procedures and the capabilities of the technologies you are using. These are still issues we are grappling with here at the Borthwick as we try and establish a framework for carrying out born digital cataloguing.
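
As a rough illustration of the 'skeletal description' idea mentioned in the points above, the sketch below walks a deposit and writes one row per file to a CSV file, keeping the original directory path alongside a title taken from the file name. It is only a starting point under stated assumptions: the column names (including the blank "keep" and "notes" columns for appraisal decisions) and the function names are my own invention and are not taken from AtoM, Archivematica or any other system.

```python
import csv
import os
from datetime import datetime, timezone

# Hypothetical column names for a skeletal, item-level description.
FIELDS = ["original_path", "title", "size_bytes", "last_modified", "keep", "notes"]


def skeletal_inventory(root):
    """Return one dictionary per file under root, in a stable order."""
    rows = []
    for folder, subfolders, files in os.walk(root):
        subfolders.sort()  # sorting in place makes the walk order repeatable
        for name in sorted(files):
            full_path = os.path.join(folder, name)
            info = os.stat(full_path)
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            rows.append({
                # The path relative to the deposit records the 'original order'.
                "original_path": os.path.relpath(full_path, root).replace(os.sep, "/"),
                "title": name,
                "size_bytes": info.st_size,
                "last_modified": modified.strftime("%Y-%m-%d"),
                "keep": "",   # to be filled in during selection and appraisal
                "notes": "",
            })
    return rows


def write_inventory(rows, csv_path):
    """Write the inventory rows to a CSV file with a header row."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

A CSV like this is still 'a separate spreadsheet', but because it is generated rather than typed it can be re-run as a deposit changes, and a structured file is easier to map into an archival management system later.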

* as an aside (and a bit off-topic), my other bugbear with archival identifiers is that they contain slashes (which means we can’t use them in directory or file names) and that they don’t order correctly in a spreadsheet or database as they are a mixture of numeric and alphabetical characters.
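
Both halves of this bugbear can be worked around in code. The sketch below is one possible approach, not an established convention: the identifiers (such as "ABC/1/10") are made up, the function names are hypothetical, and swapping slashes for underscores assumes your real identifiers never contain underscores themselves (otherwise two identifiers could end up with the same file name).

```python
import re


def natural_key(identifier):
    """Sort key that compares number runs as numbers, so 2 sorts before 10."""
    # Splitting on a captured group gives alternating text and digit runs;
    # the digit runs always land at the odd positions.
    parts = re.split(r"([0-9]+)", identifier)
    return [
        (0, int(part)) if index % 2 else (1, part.lower())
        for index, part in enumerate(parts)
    ]


def safe_name(identifier):
    """Make an identifier usable as a file or directory name.

    Assumption: underscores do not already occur in the identifiers.
    """
    return identifier.replace("/", "_")


# Hypothetical identifiers, purely for illustration.
references = ["ABC/1/10", "ABC/1/2", "ABC/2", "ABC/1"]
ordered = sorted(references, key=natural_key)
```

A plain alphabetical sort would put "ABC/1/10" before "ABC/1/2". Sorting with natural_key gives ABC/1, ABC/1/2, ABC/1/10, ABC/2, which is the order you would expect to see in the catalogue.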

Wednesday, 5 November 2014

Born Digital Cataloguing (some thoughts from the ARA SAT event)

First day back to work after my holiday and I am straight back into the fray – no quiet day catching up on e-mails and getting my head slowly back into work mode for me! On Monday I attended an ARA Section for Archives and Technology event on Born Digital Cataloguing and also had the opportunity to talk about some of our current work in this area.

It was great to see the event so well attended (the organisers had to find a bigger room due to the huge amount of interest!). This is clearly an important and interesting subject for many archives professionals and it was clear throughout the day that many of us are grappling with very similar issues. Here are some of the main points that I latched on to from the morning’s presentations:

  •  It is important to preserve the directory structure of digital files as submitted into the archive – even if you subsequently move the files into a different structure. This is the equivalent of original order and can give context to the files. End users of the digital archive should also have access to this information so it is included in the description within the Discovery interface (Anthea Seles, TNA).
  •  Users don’t really know what they want or need with regard to born digital material – it is too early to say and too new a field. We need to try and predict what they will require and also need to learn from our experiences as we go along (Anthea Seles, TNA).
  • “It’s all just stuff” – born digital archives should be treated the same as paper as far as possible (Chris Hilton, Wellcome Library).
  • Interesting case study from the Wellcome Library about how an archival management system (Calm) and a digital preservation system (Preservica) can work together. It is important to establish which data is duplicated between the 2 systems (there may be some overlap) and if this is the case, which is the master data and how the information is synchronised between systems. In this case study, digital data starts off in Preservica and overnight catalogue records are copied over into Calm. Calm then becomes the master for resource discovery metadata and any subsequent edits need to be made in Calm before syncing back to the digital archive (Chris Hilton, Wellcome Library).
  • Original order – in the Wellcome Library’s case study, the method the creator or donor used to store and order his digital files was different to the system of arrangement used for paper. Digital files were arranged chronologically but the paper archive was arranged according to themes. This results in a hybrid archive that is ordered or arranged inconsistently depending on the media and leaves the archivists with a decision to make (Victoria Sloyan, Wellcome Library).
  • Workflow is crucially important. It matters what happens when. Once data is ingested into a digital archive such as Preservica (I believe Archivematica is the same), it becomes difficult to remove individual items from the Archival Information Package. This becomes more of a problem when that information has also been replicated into an archival management system. Selection and appraisal therefore need to happen at an early stage in the workflow... and we also need to accept that our digital archives may not be perfect – we are unlikely to be able to weed out all redundant files on a first pass so we may end up with items in the digital archive that are not needed (Victoria Sloyan, Wellcome Library).
  • Should we stop using the word cataloguing and instead talk about ‘enabling discovery’ – this is really what we are trying to do? We may end up moving away from the traditional archival catalogue (particularly for digital data) but we still need to ensure that we can enable our users to find the information they require. Digital collections may lead to alternative (less labour intensive) ways of enabling resource discovery (Jessica Womack and Rebecca Webster, Institute of Education).
  • We should be working with donors and depositors to get them to structure and label their data appropriately (and thus help with born digital cataloguing). It is very hard for archivists to deal with large quantities of digital data that has been created with little order or structure (Jessica Womack and Rebecca Webster, Institute of Education).
  • Digital is different to paper in that it requires more immediate action once it has been accepted into an archive and we need to ensure our processes, procedures and workflows can cope with this (Jessica Womack and Rebecca Webster, Institute of Education).

The last scheduled presentation of the morning was mine, in which I gave a non-archivist's perspective on born digital cataloguing. I'll try and summarise some of my points in a separate post later this week.

And here are some of the main messages I took away from the day as a whole:

  • Try things out – it is better to do something now than to wait until we have a perfect solution. This is the best way of learning what works and what doesn't.
  • Accept that the solutions you put in place may be temporary. We are all learning, and born digital cataloguing is not a solved problem (particularly with regard to hybrid archives).
  • Be honest about failures as well as successes – others can learn as much from finding out what didn't work and why as they can from finding out what did.
  • Think about which approaches are scalable in the longer term. Digital archives are going to increase in size and volume and we need to explore different ways of enabling discovery.

Despite the fact that there were more problems than solutions highlighted during the course of the day, it was comforting (as always) to discover that we are not alone!

Wednesday, 24 September 2014

To crop or not to crop? Preparing images for page turning applications

How do you prepare digital images of physical archive volumes for display within a web-based page turning application?

I thought this was going to be a fairly straightforward question when I was faced with it a couple of months ago.

Over the summer I have been supervising an internship project with the goal of finalising a set of existing digital images for display within a page turning application. The images were digital surrogates of the visitation records for the Archdeaconry of York between 1598 and 1690 (for more information about these records see our project page on the Borthwick website).

I soon realised that there are many ways of approaching this problem and few standard answers.

Google is normally my friend but googling the problem surfaced only guidelines geared towards particular tools and technologies - not the generic guides to good practice in this area that I was hoping for.

Page turning for digital versions of modern books is fairly straightforward. They will be uniform in size and shape, with few idiosyncrasies. The images will be cropped right down to the edges of the page resulting in a crisp and consistent presentation. 

However, we have slightly different expectations of digital surrogates of an archival volume. When photographing material from the archives it is good practice to leave a clear border around the edge of the physical document. This makes it explicit that the whole page has been captured and helps people to make a judgement on the authenticity of the digital surrogate. 

For archival volumes we have decided the best strategy is to leave a thin border around the edges of the page as shown on the left. The problem with the right image is that it is not clear that the whole page has been captured.

Volumes that we find in the archives are unique and idiosyncratic and often refuse to conform to the standard that we see in modern books. Exceptions are the norm in archives and this can make digitisation and display slightly more challenging. Page turning can work in this context, but it does require a little more thought:

Volumes within the archives do not
always have straight edges!

  • Bound volumes within the archives are not always uniform. Straight edges are rare. Damage is sometimes present, and pages may even have holes in them, allowing other pages to be visible underneath. Should such pages be imaged as is, or should we insert a sheet underneath the page so we can see only the page that is being imaged?
  • Page size may not be consistent. A volume may contain pages of all different shapes and sizes. Fold outs may be present - meaning that a page may be larger than the size of the cover. Fold outs may have writing on both sides.
  • Inserts may be present and can occur in all shapes and sizes. They may be scattered throughout the volume or may be all inserted at the start or the end of the volume. Is their current position in the volume indicative of where they should appear within the page turning application? Should they be photographed in situ (difficult if they are folded and are larger than the volume) or removed from the volume for photography? Should they be displayed as part of the page turning application or as separate (but related) items within the interface?
  • Archival volumes may not all be in one piece. The original cover for the volume may have been separated from the pages. The pages may be loose. Should the page turning application display these volumes as they exist today, or attempt to reconstruct the volume as it once was?

There are lots of different ways we could address these challenges. Here is a summary of some of the lessons we have learnt:

  • Thoroughly assess the physical copies before digitisation commences - having an idea of what challenges you will encounter will help. It is best to work out a strategy for the volume as a whole at the start of the process and have to image the volume only once, rather than have to go back and re-image specific pages (bearing in mind you will need to try and ensure any new images are consistent with the previous ones to ensure a good page turning experience for the end user). If you come across fold outs, inserts or holes, decide how you are going to image them.
  • As part of this assessment process, seek the help of a conservator if there are pages for which a good image could not be easily captured (for example if a corner of the page is folded over and obscuring text). A conservator may be able to treat the document prior to digitisation to enable a better image to be captured.
  • Choose a background that will be suitable for the whole volume and stick to it.
  • Crop images to the spine of the book but with a small border around the other edges of the page. Try to keep a consistent crop size for the resulting images, but accept the fact that where there are fold outs or large inserts, the image will have to be larger. A good page turning application should be able to handle this.
  • Different page turning technologies will be able to support different things. Work out what technology you are using and know its capabilities.
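
The cropping rule in the list above (tight to the spine, a small border on the other three edges) comes down to a little arithmetic. The sketch below assumes you already know where the page sits in the photograph, for example from marking it by hand. The function name and parameters are hypothetical, and the numbers in the example are made up. Boxes are written as (left, top, right, bottom) in pixels, the same ordering that Pillow's Image.crop accepts.

```python
def crop_box(page, border, spine_side, image_width, image_height):
    """Return a crop box tight to the spine with a border on the other edges.

    page        -- (left, top, right, bottom) of the page within the photograph
    border      -- border width in pixels for the three non-spine edges
    spine_side  -- "left" or "right", the edge of the page at the spine
    """
    if spine_side not in ("left", "right"):
        raise ValueError("spine_side must be 'left' or 'right'")
    left, top, right, bottom = page
    # No border is added on the spine edge; the other edges grow outwards.
    new_left = left if spine_side == "left" else left - border
    new_right = right if spine_side == "right" else right + border
    new_top = top - border
    new_bottom = bottom + border
    # Never ask for pixels that lie outside the photograph itself.
    return (
        max(0, new_left),
        max(0, new_top),
        min(image_width, new_right),
        min(image_height, new_bottom),
    )


# Example: a page with its spine on the left edge, with a 20 pixel border.
example = crop_box((100, 50, 900, 1250), 20, "left", 1000, 1300)
```

Using the same border value for every page keeps the crop consistent across a volume. A fold out or large insert simply produces a bigger page box and therefore a bigger crop.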

The last point to make is that we should not focus solely on dissemination. Image dissemination strategies, tools and applications will come and go but ultimately when you are taking high quality digital images of archives you will need to maintain a high resolution preservation version of those images within a digital archive.

An insert found within Visitations Court Book 2 - should this be
photographed within the volume or separate from the volume?
These preservation images will be around for the long term and can be used to make further dissemination copies where necessary. Think carefully about what is required here and remember to save your preservation originals at the right point within the workflow (for example once the images have been checked and a sensible file naming strategy implemented, but before any loss of information or degradation in image quality occurs). 

Also think about what other images may be needed to fully record the physical object for preservation purposes. It may be necessary to take some images that would not be used within the page turning application but that record valuable information about the physical volume. For example, the spine of the book, or a small detail on the cover that needs to be captured at a higher resolution. 

Monday, 8 September 2014

Physical diaries versus digital calendars - a digital perspective

This summer as part of our annual staff festival I had the chance to play at being a ‘real’ archivist. Coming to work at a traditional archive through a digital route with no formal archives training means that there are many traditional archives activities that I have not had any experience of. It was great to have the chance to handle some physical archives as Borthwick staff embarked on a ‘mass list in’ of the Alan Ayckbourn archive.

Given a couple of heavy brown archive boxes and a pencil (no pens please!) and paper I was tasked with creating a box list (essentially just a brief description of what the boxes contained) for a selection of Ayckbourn’s diaries. This proved to be an interesting way to spend a morning.

My job doesn't take me into the strongrooms or searchroom very often and opportunities to handle physical archives are rare. Opening a box from the archives and lifting out the contents was reminiscent of my past career in archaeological fieldwork, in particular the excitement of not quite knowing what you may find.

The diaries I was looking at were appointments diaries rather than personal journals. The more recent diaries were used by Ayckbourn in a fairly standard way (as I use my physical appointments diary today). They were brief and factual, recording events happening on a particular day, be it the dress rehearsal of a particular performance, dinner with friends, Christmas parties or a reminder to take the cat to the cattery.

Earlier diaries from the late eighties were used in a slightly different way by Ayckbourn. These are A4 diaries with a page devoted to each day of the year. This format provided more space and allowed for uses beyond the simple appointments format. The diaries were used for to-do lists (with lots of crossings out as tasks were completed), names and addresses, notes and thoughts and thus had more points of interest as I looked through them. Much of the content I couldn’t make sense of – the handwriting was often a challenge (particularly when crossed out), and notes were often present without relevant contextual information required to fully understand them. These diaries were very much a personal tool and not created with future access in mind but this does not mean they could never be a valuable resource for research.

Whilst looking at these diaries it occurred to me to think about the modern day digital equivalent of these hard backed physical diaries and how they might be preserved and re-used into the future.

I am a keen user of a digital calendar in my professional life. At York University we have embraced the Google suite of tools and this includes Google calendar. It is an incredibly valuable tool with benefits far above anything that could easily be achieved with its paper equivalent. I can share the calendar with colleagues to enable them to see where I am when, check multiple people's calendars at the same time and invite colleagues to meetings. Of course it also helps me manage my time in a more immediate way by popping reminders up 10 minutes before I am meant to be at a particular meeting or appointment.

Will we be archiving Google calendars in the future instead of (or alongside – I certainly use both at the moment) their paper equivalents? I think so. In December last year Google announced a new (and long awaited) feature which enables users of the calendar to download their appointments to a file. This of course would enable donors and depositors to hand their digital calendar over to a digital archive for longer term curation and access just as they would with their physical diaries and no doubt this is something we might expect to see delivered to us in the future.

This is the message Google sends once your calendar
has been prepared for export and archiving

Information from a Google calendar can be downloaded as described in the Gmail blog post. The export delivers the calendar data in iCalendar format (.ics), which is an independent format for the exchange of calendar information (rather than something that is specific to Google). The fact that it is essentially a plain text file is great news for us digital archivists. It means we can open it up in a simple text editor and make some sense of the content without any specialist software.

After downloading my calendar from Google I had a look at it to see what level of detail was included within the iCalendar file and whether all the significant properties of my online calendar were preserved. Initial inspection shows that this is a pretty good version, though of course not as easy to read or understand as it is in its creating application. All the information appears to be there:
  • the date and time of each event
  • the date and time the event was created and last modified
  • whether my attendance is confirmed or not
  • the location of the meeting
  • who created the calendar event (including e-mail address)
  • who else is invited (including e-mail addresses)
  • any further details of the meeting that have been included in the entry
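
Because the file is plain text, a few lines of script are enough to pull these properties out for a quick look. The following is only a rough sketch in Python and not a full iCalendar parser: the sample event is invented for illustration, the function names are my own, and I am assuming the export uses the standard iCalendar property names (DTSTART, CREATED, LAST-MODIFIED, STATUS, LOCATION, ORGANIZER, ATTENDEE, DESCRIPTION). It also takes a shortcut by splitting each line at the first colon, which would trip over quoted parameter values that themselves contain a colon.

```python
def unfold(text):
    """Join folded lines: in iCalendar, a line that starts with a
    space or tab is a continuation of the line before it."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]  # drop the single fold character
        else:
            lines.append(raw)
    return lines


def parse_events(text):
    """Return one dict per VEVENT, mapping property name -> list of values.
    Simplification: parameters such as CN=... are discarded."""
    events, current = [], None
    for line in unfold(text):
        if line == "BEGIN:VEVENT":
            current = {}
        elif line == "END:VEVENT":
            if current is not None:
                events.append(current)
            current = None
        elif current is not None and ":" in line:
            name_and_params, value = line.split(":", 1)
            name = name_and_params.split(";", 1)[0].upper()
            current.setdefault(name, []).append(value)
    return events


# Invented sample data (not taken from a real calendar export).
SAMPLE = "\n".join([
    "BEGIN:VCALENDAR",
    "VERSION:2.0",
    "BEGIN:VEVENT",
    "DTSTART:20140908T100000Z",
    "DTEND:20140908T110000Z",
    "CREATED:20140901T090000Z",
    "LAST-MODIFIED:20140902T120000Z",
    "STATUS:CONFIRMED",
    "LOCATION:Meeting room 1",
    "ORGANIZER;CN=A Person:mailto:organiser@example.org",
    "ATTENDEE;CN=B Person:mailto:attendee@example.org",
    "DESCRIPTION:Discuss the digital",
    "  archive plans",  # folded line: first space is the fold marker
    "SUMMARY:Planning meeting",
    "END:VEVENT",
    "END:VCALENDAR",
])

events = parse_events(SAMPLE)
for event in events:
    for name in ("DTSTART", "CREATED", "LAST-MODIFIED", "STATUS",
                 "LOCATION", "ORGANIZER", "ATTENDEE", "DESCRIPTION"):
        print(name, event.get(name, []))
```

To try this on a real export, the SAMPLE text could be swapped for the contents of the downloaded .ics file, read in as text. For anything beyond a quick inspection a proper iCalendar library would be the safer choice.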

So although this is the modern equivalent (and even the future) of the physical appointments diaries in the Alan Ayckbourn archive, it is a very different beast. In some ways the data within it is better - more consistent and more detailed - than the physical diary and this can be one of the key benefits of working in a digital sphere. In other ways it is far less rich - there are no crossings out, no scribbles within the margin, no coffee stains and very little personality. The very things that are good about the digital calendar are the things which make it harder to get a sense of the real person behind the appointments.

Musings on value aside, it is good to know that when I'm faced with this question in the future I am in a better position to understand how we might preserve a digital calendar for the long term within our archive.