Tuesday, 4 December 2012

DPC’s Digital Preservation Awards 2012

I was really pleased to be able to attend the Digital Preservation Awards at the Wellcome Collection in London last night. In its 10th year, the DPC is celebrating in style with three separate awards:

1. The DPC Decennial Award for an outstanding contribution to digital preservation
2. The DPC Award for Teaching and Communications
3. The DPC Award for Research and Innovation

Participants were encouraged by the chair of the ceremony, Richard Ovenden, to switch on their mobiles and tweet as the results of each category were announced. The excitement in the room was almost tangible!

I was very pleased and not at all surprised to see that the Digital Preservation Training Programme (DPTP) was the winner of the award for Teaching and Communications. DPTP was quite ground-breaking when it ran its first residential course in digital preservation back in 2005. I was there to help deliver presentations and case studies on the Open Archival Information System (OAIS) and preservation metadata and how we made digital preservation work in practice at the Archaeology Data Service. DPTP has gone from strength to strength ever since and has run courses all over the UK. A worthy winner!

Most exciting of all was the Decennial Award for an outstanding contribution to digital preservation. I couldn't be happier to see my ex-colleagues at the Archaeology Data Service pick up this most prestigious of awards. The award was a great honour given the exceptionally high standard of the other candidates nominated in this category. It is an impressive accolade, awarded for the ADS's excellent track record of research and innovation in digital preservation over the last ten years and its innovative business model. I am very pleased to be able to say that I was a part of this.

Congratulations also go to the excellent PLANETS project which won the award for Research and Innovation.

A big thank-you to William and Carol at the DPC and our hosts at the Wellcome Collection for making the event a night to remember.

Thursday, 22 November 2012

The first accession!

I am pleased to report that last week I accessioned the first files into the digital archive here at the Borthwick Institute!

This may sound like a rather grand claim at the moment. I will admit that we do not have a 'digital archive' infrastructure in place yet and we are still in the very early stages of considering how best to treat digital material. 'Accessioning' of digital data is not the formal process that I would like it to be, but I am setting up some basic procedures to tide us over until a more cohesive system of managing digital archives alongside their analogue friends and relations is established.

It has been said many times that with digital preservation there is no point waiting for the perfect solution because this may be a long time coming. If we keep on waiting, the problem will get bigger and, crucially, data loss may occur. This is the methodology I have established so far.

Once I have checked that the media is readable and free from viruses, my first priority is to ensure that new digital data deposited with us is copied on to our digital archive server storage space (and securely backed up). This is of utmost importance and is the first step to ensuring longevity of digital data. If data exists only on one device (whether it is a floppy disc, a DVD or a hard drive) we cannot assume it will still be readable or usable next time we need to access it. Ensuring we have more than one copy of the digital data is a key step towards preserving that data.
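
As a rough illustration of the copying step, here is a minimal Python sketch that copies a file to a second location and confirms that the copy matches the original by comparing checksums. It is not the procedure used at the Borthwick; the function names `sha256_of` and `copy_and_verify` are made up for this example, and SHA-256 is simply an assumed choice of checksum.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination_dir: Path) -> Path:
    """Copy one file into destination_dir and check the copy is identical."""
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / source.name
    shutil.copy2(source, target)  # copy2 also tries to preserve timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Copy of {source} does not match the original")
    return target
```

Calling `copy_and_verify(Path("deposit/report.doc"), Path("archive/accession-001"))` (both paths hypothetical) would return the path of the new copy, or raise an error if the two files differ.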

The next step is to find out exactly what we have. File identification and characterisation tools such as DROID are really helpful here. Running DROID over the files will produce a list of technical metadata about each file. This will include the file name and file size alongside a checksum (we can use this over time to check that a file hasn’t become corrupted or been accidentally altered). DROID also tries to identify the exact file type and version of your files. This is very useful information, as it can provide a starting point for making decisions about future file migrations.
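
To show how a checksum can be used over time, here is a small Python sketch that records the name, size and checksum of every file in a folder and later reports any file that has gone missing or changed. This is a stand-in for illustration only: it is not DROID and does no file format identification, and the function names `build_manifest` and `changed_files` are invented for this example.

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, dict]:
    """Record size and SHA-256 checksum for every file under root."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            data = path.read_bytes()  # fine for small files; chunk large ones
            manifest[path.relative_to(root).as_posix()] = {
                "size": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            }
    return manifest


def changed_files(root: Path, manifest: dict[str, dict]) -> list[str]:
    """List files that are missing or whose checksum no longer matches."""
    current = build_manifest(root)
    return [
        name
        for name, entry in manifest.items()
        if current.get(name, {}).get("sha256") != entry["sha256"]
    ]
```

The manifest would be created once at accession and stored with the deposit documentation. Running `changed_files` against it at a later date gives an empty list if nothing has been altered.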

It is also important to maintain a record of the digital deposit process and the provenance of the data. Keeping copies of relevant correspondence about the process and any other documentation submitted which describes the data is crucial as it is hard to recreate this if not captured at the time of deposit.

This is just the starting point for me - the first steps towards preserving the material. However, the small steps we can take now should ensure that the files can be more easily incorporated into a fuller digital archiving solution in the future.

Friday, 19 October 2012

Requirements for software and systems

When I started my job in June, one of the first things I wanted to do was get an idea of what existing systems and software were in place for recording analogue archives and see how I could record and manage digital material alongside this. It is essential that systems for analogue and digital archives are integrated to allow us to catalogue digital material alongside its analogue equivalents. If an organisation has been archiving minutes of their meetings with us in paper form for several decades but has now moved to digital deposition, we need a media-blind system which records the archival material regardless of format and upholds the relationships between these objects.

In talking to other archivists at the Borthwick over my first few months it was clear that existing systems for accessioning, cataloguing and providing access to metadata about archives did not meet all of our needs. We decided that this would be a great opportunity to list our requirements and analyse a range of software and systems against these requirements.

I set to work creating a list of system requirements for the Borthwick Institute. Feedback from other staff within the Institute was essential to ensure that we all agreed on what was needed. One of our key requirements from the start was about resource discovery – we want the information about our collections to be easy to search on-line. However, the list grew as I realised that there were many different but related bits of information that we wanted to record. Discussions with the conservation department at the Borthwick have revealed an interest in logging their preservation and conservation actions digitally. The staff on the desk in the Borthwick searchroom would benefit from a more integrated database and interface to log enquiries and orders and keep records of which documents are accessed most frequently. On top of this, I have my own digital archiving agenda. I need to ensure that we can effectively manage the digital material that is coming our way.

My colleagues and I have now finalised the document and have created a fairly comprehensive list of 31 requirements. These requirements are fairly generic in some ways. We specify that we need a system that we can use to log accessions and create a receipt for an accession but we do not specify the exact fields that we need to record. The requirements for digital archiving are similarly brief – we will need to tease out the specifics once we get an idea of what solutions look promising. We have deliberately excluded requirements to do with cost or technical expertise. If we find a solution that suits our needs, we will need to make a case for the expenditure and of course the cost of any development time it will take to implement and configure the system.

I am not aware of any one single solution that will tick all of the boxes for us and I think that we may need to think about configuring a number of different solutions in such a way that they can work together to perform different bits of our workflow. Of course we will also test our current situation against this matrix just to make sure – there is a small chance that the best possible option for the Borthwick Institute will be to maintain and enhance the systems that are already in place. I will keep you posted with the results!

Thursday, 11 October 2012

Ten years of the Digital Preservation Coalition

Photo owned by the DPC. Taken by Megan Taylor
On Monday this week I was lucky enough to be able to attend the ten year anniversary celebration of the Digital Preservation Coalition. This was held at the House of Lords, Westminster and was a great event. Great because of the fabulous location but also because of the opportunity to get all the members of the DPC in one room together.

The last ten years have certainly been an interesting time from a digital preservation perspective. There have been major changes and developments in this field and the DPC has played an important part in facilitating many of these. I can also map my own involvement with digital preservation into this ten year time frame!

Knowing pretty much nothing about digital preservation apart from the fact it was 'A Good Thing', I applied for a job at the Archaeology Data Service at the end of 2002. I was lucky enough to get my first job there as a Curatorial Assistant (a strange job title but essentially I was working on the curation of digital files that archaeologists produce). I gradually learnt what I had to about digital preservation 'on the job' with no formal training.

It is fabulous to see that times have changed and there are now digital preservation training courses (such as The Digital Preservation Training Programme and Digital Futures) available for newbies starting work in this area.

At the Archaeology Data Service I worked my way up to Curatorial Officer, and eventually my colleagues and I decided to change our name to Digital Archivists (because we were fed up of people not really knowing what we actually did!). Policies, procedures and systems at the Archaeology Data Service developed over time as new standards and good practice emerged in these areas. Much of this was informed by workshops organised by the DPC and by the excellent Technology Watch Reports, which I still find myself regularly referring to and which I recommended to colleagues in my new job at the University of York earlier this week.

One thing to come out of the speech given on Monday by the DPC chairperson, Richard Ovenden, was that we haven't got digital preservation completely sussed yet. Digital preservation is still a little bit scary and an organisation like the DPC acts as a kind of 'Digital Preservation Anonymous' for those of us that simply need to talk to others that are facing similar problems. Monday's event was an excellent opportunity to do just that. My only complaint was that it was all over too quickly!

Tuesday, 2 October 2012

Testing out Google Forms

Over the last couple of weeks I have been creating an on-line survey to find out about data management practices amongst researchers at the University of York. The aim is to find out what sorts of digital data are being produced and what plans researchers have in place to manage their data, both for the lifetime of their project and for the longer term.

It was suggested that rather than purchasing a licence for a survey tool such as Bristol Online Surveys we use Google Forms. The University of York has moved to Google for e-mail, calendars and document sharing, so this seemed like a logical step.

The survey is quite long with 40+ questions and several redirects or dependencies required based on which way an earlier question has been answered. My previous experience of creating on-line forms has primarily been to hand-code them using ColdFusion and a database back end, so this was a new venture for me. Here are my first impressions of Google Forms.

  • Easy to use. Can quickly jump into it and set a survey up without any training.
  • Allows for a certain amount of interactivity - skipping certain questions if a previous question has been answered a particular way.
  • Not had a chance to properly test this yet, but it looks like visualisation of survey results is good. The form allows you to quickly and easily see the survey results as they are collected in the form of pie charts, graphs and summary statistics. See the example below which is created based on a small amount of sample data.

  • Allows you to create as many pages as you like and add title and text to each page
  • Allows you to add help text to each question
  • Some nice features that I didn't expect such as the ability to create a grid question such as the one illustrated below:
  • A nice way to integrate with University of York's existing Google Applications. Can set up the form to automatically collect respondents' name and e-mail address based on their Google login.

  • No progress bar - this would be a useful addition. In order to allow people to see their progress through the survey, I have added a note (e.g. page 3 of 14) to the top of each page that describes where they are and how far they have to go.
  • No way to import drop down lists. For a number of questions I had long lists of possible responses for the respondent to select from (e.g. a list of all of the departments in the university or funders for the project). Despite the fact that I had these lists in digital form, there didn't seem to be any way of importing them into the form. Each had to be copied and pasted individually... quite a time-consuming exercise!
  • No way of changing the order of the drop down list. I would have hoped that after entering the list of drop down options, if I then decided that the order should be slightly different or I wanted to add a new option near the top of the list, this would be a simple task. Unfortunately, it was not possible to simply click and drag the options to re-order them; they needed to be re-typed in the order that was required!
  • There were cases where I would have liked to configure the form more, for example to make a certain question mandatory depending on how a previous one was answered. This is not possible in Google Forms.

This looks to be a good tool, especially for setting up simple surveys. There are definitely ways that it could be improved but I do think this is going to be fit for purpose for our Research Data Management survey. It will be interesting to start looking at the results once it goes live.

Thursday, 13 September 2012

Installing archivematica ...and running out of memory

For a couple of months now I have been intending to install archivematica and test it out. Described as a "free and open-source digital preservation system that is designed to maintain standards-based, long-term access to collections of digital objects", this sounds like it could be a great starting point for establishing a digital archive here at the Borthwick Institute. Also strongly in its favour from my perspective is the fact that it is compatible with the Open Archival Information System (OAIS) reference model and supports metadata standards such as PREMIS and Dublin Core.

Though I have had a passing interest in this software for some time, in my previous job at the Archaeology Data Service we had a long established digital archive with preservation systems and migration pathways already in place. Setting up archivematica in such a way as to interface with our existing processes and procedures there may not have been easy. Here in the Borthwick Institute we are starting from scratch with digital preservation, which in some ways gives us more freedom to make use of some of the new and exciting tools that are being developed in this area. What we need to be able to interface with here are existing systems for cataloguing and providing access to traditional analogue archives. One thing that attracted me to archivematica was the way in which it can work alongside ICA's AtoM. This is a system that supports archival description and can be used by traditional archives to catalogue their holdings and make their catalogues searchable on-line. This is another tool we will be investigating in the not too distant future.

So, to cut a long story short, this morning I set about downloading and installing Oracle VirtualBox and setting up archivematica as a virtual appliance. This all seemed to go quite smoothly despite all the warning messages about non-verified software (all above board according to the installation guide). The only problem was, when I tried to run archivematica and test it out I was disappointed to find I couldn't get past the login screen as there was not enough available memory to continue.

I will try again once my PC has been upgraded so watch this space.

In the meantime I would be really interested to hear from anyone else in the UK who is using both archivematica and ICA AtoM. It would be great to be able to see a real life example of how they work together.

Wednesday, 12 September 2012

First steps for securing digital media within an analogue archive

Isn't it a happy day when a new report on digital preservation appears in your in-box at just the right time? The following report from OCLC is just one of these. I have been in the new post as Digital Archivist for the Borthwick Institute for Archives for 3 months now and had been thinking about what to do with the digital material that is buried deep in the strongrooms of our building.

You've Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media by Ricky Erway, OCLC Research

This report describes a simple and straightforward set of steps for locating and securing digital media that exist in a traditional analogue archive. It works on the premise that doing something now is far better than waiting until a more complete digital preservation solution is available. By simply locating the digital media, copying it to a more secure storage area and establishing what we have, we can instantly gain some level of control over our digital holdings. This will put us in a far stronger position for preserving these into the future.

I am currently creating a methodology based on this report and with reference also to the University of Hull's accessioning workflows described in the AIMS project white paper. Once this methodology has been agreed internally we can start work on this digital 'rescue' mission. I for one am very much looking forward to finding out what we have!