Monday, 11 April 2016

Responding to the results of user testing

Did you notice that we launched our new AtoM catalogue last week? I hope so!

In the month leading up to launch we wanted to take the time to find out what a sample of users thought about our new catalogue. Here I will summarise some of the findings and the steps that we have taken to react to this feedback.

We had 14 people test the catalogue for us off-site and fill out an online questionnaire which was put together using Google Forms. Testing was carried out on AtoM version 2.2.0. The volunteers for user testing were found by putting out a call on Twitter and the results were helpful and constructive (though one user could not access the site so was not able to answer the questions in any meaningful way). Despite the small sample size there were several themes that were mentioned more than once. Interestingly these weren't necessarily the themes that we thought would be mentioned more than once!

Let's start with the positives....

The good things

It's always nice to receive positive feedback and we were encouraged to see that there was plenty of this to come out of the user testing - things that were praised fell into the following categories:

Look and feel - The vast majority of users found the catalogue visually appealing. A couple of people mentioned that they liked the colour scheme and one appreciated the fact that it flowed nicely from our website. The image on the home page was also praised. Others commented on the fact that it was well set out with a clean and clear appearance. One respondent compared it very favourably with other leading archival catalogues.

Our home page image

Functionality - The search functionality of the catalogue was praised as was the faceted classification that allows you to filter your search results. The browse by subject feature had several positive mentions and one person liked the ability to download XML files. Navigation within the catalogue was praised, including a specific comment about the tree-view feature on the left side of the interface.

The data - We were pleased to hear people saying good things about the quality of the data that we have in the catalogue. The information was described as being 'full' and 'comprehensive'. The level of detail held in the Conditions of Access and Use field was mentioned specifically, as was the fact that you could see when each description was last updated. One respondent stated that they liked the fact the catalogue conformed to recognised archival standards and that it was clear from the interface which rules had been used to create the data.

Digital objects - Several of the testers mentioned specifically that they liked the inclusion of digital objects within the catalogue. We have not utilised this feature to full effect just yet, but for some of our descriptions a finding aid or an image is available. Users liked the way that AtoM displays the thumbnails in the results list. An archival catalogue can be quite text-heavy so using digital objects to break the text up was seen as a good thing.

The help pages - Our glossary page had a positive mention. We put this together as we recognised that archival terminology can be a bit of a mystery to non-archivists (myself included) so being able to define some of the key terms we use was a priority for us.

My favourite comment under the question "What did you like about the catalogue?" was "Almost everything". This highlights to me that we have pretty much got it right but of course we shouldn't put our feet up - there is always room for improvement!

The not-so-good things

We also received comments about the things which weren't working so well in our new catalogue:

Look and feel - Of the users who did not think the catalogue was visually appealing, one comment was that it was 'bland' and that too much space on the front page was taken up by the image. The same person didn't like the fact that all the navigation was on the left and they couldn't find the search box. Another respondent thought that the links on the left hand side were too small and their eye wasn't drawn to them because of the large image on the front page. It was thought by one person that the location of the main image on the front page looked odd because it wasn't central.

Our response: We wondered about trying to increase the size of the text in the left hand navigation bar in order to make these links stand out a bit more but concluded that this may well upset the balance of the current design. Given that the majority of respondents were very happy with the visual appearance of the site, we decided that no changes were needed at this point in time.

Search box - The visibility of the search box was an issue that was raised a couple of times. We are using a slightly customised version of the default Dominion theme within AtoM and this puts the search box at the top of the screen. One person didn't find the search box at all whilst testing the catalogue. Another found it but wasn't immediately sure of its purpose as its location and proximity to the University of York logo suggested it would search our website rather than our catalogue. This may have been a direct result of our decision to style the catalogue to mirror the look and feel of our website as we do have a similar sized website search box in the top bar of our website.

Our response: We have given some serious thought to how to make the search box more prominent within AtoM but I'm not convinced there is an obvious solution to this. Prior to the user testing we had already changed the colour of the search box from dark grey to white to make it more visible. We have since made another minor tweak to the default theme to turn the 'Search' text within the search box from grey to black to make it stand out more. We considered making the search box bigger (longer) but our top bar is already getting quite crowded and filling it up any more than necessary does have knock-on effects on the responsive design when viewed on smaller screens.

While I can see a benefit to having the main search box taking centre stage on the catalogue front page, I also see it is useful having it up in the top bar so it is always accessible wherever a user is within the catalogue. We don't intend to make any further changes for the time being.

Search results - Several people mentioned that there were simply too many results when you carry out a search ...and the results that come up are not always relevant. We had already been discussing this very issue on the AtoM mailing list and were not surprised that our users were struggling with this. 

Our response: We are hoping that this is something that will be resolved in future versions of AtoM, but in the meantime we are focusing on educating our users by giving them the information they need in order to run more effective and precise searches (even just using the powerful functionality that is available within the basic search box). 

We think that a change to AtoM's default behaviour which currently searches for multiple words by default with an 'OR' operator rather than an 'AND' would produce search results that were more in line with what our users were expecting. Also, although users of Google will happily run a search that produces many thousands of results and feel comfortable not moving beyond the first ten 'hits', users of archival catalogues do not necessarily take the same approach. There seems to be more of an assumption that the list of results will be relevant and each should be worked through in turn. This is something we are definitely hoping for a solution to in the future.
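To make the difference between the two operators concrete, here is a minimal sketch of how the same two-word query behaves under 'OR' and under 'AND'. It is not AtoM's search code: the `search` function and the record titles are invented purely for illustration.

```python
def tokenize(text):
    """Lower-case a string and split it into a set of words."""
    return set(text.lower().split())


def search(records, query, operator="OR"):
    """Return the titles that match the query.

    operator="OR":  a title matches if it contains ANY of the query words.
    operator="AND": a title matches only if it contains ALL of the query words.
    """
    terms = tokenize(query)
    results = []
    for title in records:
        words = tokenize(title)
        if operator == "AND":
            matched = terms <= words       # every query word is present
        else:
            matched = bool(terms & words)  # at least one query word is present
        if matched:
            results.append(title)
    return results


# Hypothetical record titles, invented for this example only.
RECORDS = [
    "York parish registers",
    "York railway records",
    "Parish council minutes",
    "Hospital registers",
]

or_hits = search(RECORDS, "parish registers", operator="OR")
and_hits = search(RECORDS, "parish registers", operator="AND")
print(len(or_hits), "results with OR")
print(len(and_hits), "results with AND")
```

Even on four records the 'OR' search returns three hits where 'AND' returns one, which gives a feel for how quickly the result list grows across a whole catalogue.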

Filtering the search results - One person expressed a desire to be able to filter a search by date.

Our response: We agree that this would be a really useful feature and we were pleased to hear from Artefactual Systems that this will be possible within the next version of AtoM (2.3) which is due out soon. This will also introduce the ability to search within the date field in the advanced search and order results by start date in the results list. I think these features are going to be really valuable to our users.

Navigation - One person reported that the catalogue was hard to navigate but didn't give further details. Another struggled with navigation and described a scenario in which they had got lost within the catalogue. 

Our response: I can easily understand how someone could get lost within our catalogue - it has happened to me too! In some respects this problem is directly related to the powerful functionality of the AtoM interface and relational nature of the underlying data structure. Searching and browsing AtoM isn't a linear journey but rather an opportunity to follow links between one record and another based on shared subject terms or creators. Getting lost is a fairly inevitable consequence of this functionality and I struggle to think of an effective solution (apart from encouraging repeated use of the browser 'back' button to get back to where you started!)

The data - One user reported that there is "not much material yet" and another asked for more digitised documents. It was also mentioned that there were "not enough categories for searching" (we speculate that this might relate to the subject terms we have entered). Another comment received was about the term 'accrual' which is used as a field name within AtoM and also within the data that we enter in that field. It was suggested that this word might be a bit off-putting for some users. It was also mentioned that the lists within the Scope and Content field were "pretty hard reading" and a suggestion was made that this would be more user-friendly if presented as a bulleted list rather than a paragraph of text.

Our response: We did expect to get comments about our data. Just because we have launched our catalogue we do not consider it to be a finished piece of work. Further work on populating the catalogue and a fuller exploration of the functionality around digital objects will follow over the next couple of years. It was interesting to get the feedback about the word 'accrual' - we had actually anticipated much more feedback about the terminology that we use but hadn't considered this word in particular. I do agree that this word is a tricky one for non-archivists and I'm pretty sure I had not encountered it before I came to work at the Borthwick Institute. We don't want to change it on the basis of one comment but did decide to add the term to our glossary (one of the help pages we have created within AtoM) and hope that this helps our users.

The help pages - In our questionnaire we asked people specifically whether they used the catalogue help pages. The majority of users surveyed didn't use the help pages and this was not a surprising result. One person's reason for not using the help was because they "should not need to in a well designed information system". Another person stated that they preferred to "just see if I could use the catalogue instinctively". A couple of people mentioned that the page was too text heavy and someone else reported that they didn't know there were any help pages. Someone also suggested that the help pages should open in a new window.

Our response: As a result of the user testing we have made several changes to our help pages. We have updated the text (specifically to explain how to reduce the number of search results) and added a number of screenshots to help convey the information in a more visual way. 

Our help pages are now more visual and include screenshots - the first graphic simply shows how to access the search box. We have also created some printed and laminated copies of these for use in the searchroom.

Of course we can put a lot of effort into putting the right level of information into our help pages but we cannot force people to use them! So, over the last couple of weeks we have been ensuring that our searchroom assistants (the people who will be providing front line support to our users as they grapple with our new catalogue) are aware of the different search options within AtoM and understand how they can be used to best effect.

There are also things we can do to make it clearer to users where the help pages are so that they can easily find them if they want to. By default the help pages in AtoM appear under an 'i' icon alongside other static pages. Replacing this 'i' icon with a '?' seemed to be a sensible step to take in order to make it clearer where help could be found. Artefactual Systems were able to point us to the relevant icon in Font Awesome which was just what we needed to implement this little change. 

We agreed that it may be useful for the help pages to open in a new tab so that someone could access them without losing their place within the catalogue (particularly given that 'getting lost' was also an issue that had been reported). Our help pages now open within a separate tab. We will monitor how users respond to this and whether the potential proliferation of tabs becomes a problem.

It has been a useful exercise reviewing this initial sample of responses and giving some thought to how AtoM and our own implementation of it can be improved. We will be continuing to gather user feedback through further more detailed testing with a smaller sample of users and by pulling together the ad hoc comments we are likely to receive now our catalogue is live. 

Thursday, 7 April 2016

Our catalogue is now live!

Was it really 3.5 years ago that I first blogged about requirements for a new archival management system?

My main aim in getting involved in this project was to create a stable base to build a digital archive on.

If you build a digital archive on wobbly foundations there is a strong chance that it will fall over.

Much safer to build it on top of a system established as the single point of truth for all accessions information your organisation holds. A system which will become the means by which you disseminate information about your digital holdings (alongside the physical ones) and enable users to access copies of born digital and digitised material.

Finally we have such a solution in place!

We chose Access to Memory (AtoM) as our new archival management system, and over the last few years there has been a huge amount of work going on behind the scenes getting it up and running. I'm so pleased that today we are in a position to unveil the results of all of that hard work.

Our new catalogue can be viewed at

In a previous blog post "A is for AtoM" I talked about some of the tasks that have been going on and decisions that have been made to get us up and running, so I won't repeat all of that here.

Suffice to say that a considerable amount of work has gone into getting AtoM installed, configured and styled. While this has been going on, Project Genesis has been key to getting the catalogue populated with archival descriptions. The task of populating our catalogue will continue via Project Genesis until April 2017 and by other channels beyond that.

While our initial focus has been to get a collection level description for each of our archives into the catalogue, further work is required on the wider task of retroconversion - getting a variety of finding aids in a range of different formats into the system. We have managed to tackle some of this in an ad hoc way but there is still much to do.

Our AtoM catalogue is live, but our work is not yet done. I need to start thinking about how we can build digital preservation functionality on top of this (via Archivematica) and of course how we can start to provide more access to our digital holdings through the catalogue interface. Watch this space!

In the meantime, we'd be happy to hear any feedback about our catalogue so do get in touch.

Friday, 1 April 2016

Kicking off phase 3 of "Filling the Digital Preservation Gap"

I realise I've gone a bit quiet on "Filling the Digital Preservation Gap" since the release of our phase 2 project report. I am pleased to pass on the news that we have been funded by Jisc to continue some of our work into Phase 3.

Our Research Data Spring phase 3 kick off meeting was held yesterday at the Hull History Centre and we celebrated with a suitably spring-themed cake!

Our Research Data Spring chicken cake

So here is a run down of what we are planning to do in phase 3:

The big one at the top of the list is Archivematica implementation. Both York and Hull are going to be working on their own proof of concept implementations of Archivematica integrated with their existing repositories (and potentially other systems within the RDM workflow). We may not be able to follow the implementation plans from our phase 2 report in full (as we have not been funded in full) but both institutions plan to get an implementation up and running with a focus on a single use case.

I for one am very excited about this implementation phase. This is what our work over the previous two phases has been leading up to. The ground work laid in phases 1 and 2 has been incredibly valuable, but it will be great to move from talking about Archivematica to actually working with it!

We are also going to continue to look at the issue of unidentified file formats. This has been a recurring theme during phases 1 and 2 and is particularly pertinent for research data which comes in such a huge variety of formats. We are going to work with The National Archives to ensure a few more research data file formats are represented in PRONOM. We will also give further thought to our workflows for handling unidentified files and how tools such as Archivematica can help.

We will of course be continuing our dissemination and outreach work. Some of this has already happened over the last couple of months.
  • I gave a presentation at the IDCC16 conference in Amsterdam in February and discussed why active digital preservation is often left out of RDM workflows - the slides can be viewed here
  • Julie Allinson presented a case study about our project at a workshop entitled 'Digital Preservation: Strategic Issues' at the National Library of Wales in February
  • Simon Wilson from Hull and I produced a poster for the UK Archives Discovery Forum last month to promote some of the themes of our project so far and make sure the wider archives community is aware of our work

Our UKAD 2016 poster
  • At the UK Archivematica meeting last month I gave a presentation which summarised the outcomes of the development work we funded in phase 2. This can be found here
Watch out for us at 'Research Data, Records and Archives: Breaking the Boundaries' in Edinburgh later this month and Open Repositories in Dublin in June.

Of course we will also be keeping you posted on this blog as phase 3 of our project progresses, so watch this space!

Friday, 18 March 2016

'A' is for AtoM

Over the past couple of years we have been busy working away behind the scenes on our implementation of Access to Memory (AtoM) at the Borthwick Institute for Archives and very soon we will be launching our new catalogue to the public.

I haven’t said much about AtoM on this blog thus far but it has been a huge preoccupation over the last couple of years. Here I attempt to redress that balance.

It turns out that deciding to adopt a system is relatively simple; working out exactly how you are going to use it is far more complex!

What follows is a list of just some of the things we have been thinking about and working on over the last couple of years as we move towards launch. I present you with the A to Z* of implementing AtoM….

A is for Accession Mask

We were very keen to use AtoM for accessioning…in fact the need to urgently find a new system for recording our accessions was the key driver for getting us moving with AtoM in the first place.

As we started using AtoM for recording new accessions we realised we needed to get the accessions mask right. This is just one of the configuration options within AtoM and it enables you to create unique references for your accessions. We wanted to ensure that our new accession numbers were in the same format as our previous ones so with a little bit of help and advice from the AtoM user forum settled on the mask “%Y/%iii” which creates numbers in our preferred format of [yyyy]/[no]. Now we just need to remember to reset the accession number to ‘0’ at the start of each new year so that our running number sequence starts again. This is just one of the ways that an institution can configure AtoM to suit local preferences.
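As a rough illustration of the number sequence described above, the sketch below builds references in the [yyyy]/[no] format and restarts the running number when the year changes. It does not interpret AtoM's mask syntax and is not how AtoM generates numbers. The function and class names are hypothetical, the zero-padding to three digits is an assumption, and the automatic reset stands in for the manual reset described in the text.

```python
def format_accession_number(year, counter, width=3):
    """Build a reference such as '2016/001' from a year and a running number.

    The padding width is an assumption made for this example.
    """
    return f"{year}/{counter:0{width}d}"


class AccessionCounter:
    """Running number that starts again whenever the year changes.

    In AtoM the reset is a manual step; it is automated here only to
    show the intended sequence of numbers.
    """

    def __init__(self):
        self.year = None
        self.counter = 0

    def next_number(self, year):
        if year != self.year:
            # New year: restart the running sequence from zero.
            self.year = year
            self.counter = 0
        self.counter += 1
        return format_accession_number(year, self.counter)


counter = AccessionCounter()
numbers = [
    counter.next_number(2015),
    counter.next_number(2015),
    counter.next_number(2016),
]
print(numbers)
```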

B is for Business as Usual

Any organisation adopting a new and complex system like AtoM needs to think beyond initial implementation and consider how the solution can be embedded into their workflows for the longer term. The ultimate goal for us is getting AtoM seen as 'business as usual' at the Borthwick. We are not there yet (though perhaps we almost are when it comes to working with accessions data). Getting us to the point where AtoM is not a standing item on our meeting agendas is something to aim for in the future!

C is for Customising the look and feel

AtoM gives you some options for customising the look and feel of the front end. Given that the AtoM interface is going to be the primary means through which our users will browse and view information about our holdings, we want the interface to look consistent with our other communications. It needs to be clear that it belongs to us. Using our brand colours was a quick win and we also put some additional effort into creating an attractive image for the home page to make it look more visually appealing.

Note that there is a limit to the level of customisation that can be done without developer support. Within the admin interface of AtoM some basic changes to theme colours can be made, but I quickly found that changing the background colour to our Borthwick orange did not look pretty! Much better to call in our local technical experts to tweak the CSS behind the scenes.

D is for Drop Down Lists

AtoM comes ready populated with wordlists (called taxonomies) that populate the drop down lists to support data entry; however, institutions can change these to meet their own local needs. We have had to tweak a few of the taxonomies within AtoM, for example the deposit types in the accessions section and the levels of description (after much internal debate!).

E is for Experimenting

In order to understand AtoM we knew we really needed to get some of our data into it. We experimented with some structured finding aids that already existed in EAD format and had a go at importing them. We discovered that data may not always import in the way you expect.

One of the key problem areas for us has been the way AtoM handles the <bioghist> element in our EAD files. The issue is documented here. Essentially what it tends to mean for us is that we end up with lots of untitled authority records when we import an EAD finding aid. This has been a bit of a barrier for us in getting more of our existing catalogues into AtoM. Experimenting and carrying out tests to check the behaviour, however, does allow us to consider how we can tackle the issue and work towards a solution for future data imports.
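One simple test of this kind is to count how many <bioghist> elements a finding aid contains before importing it, to get a rough sense of how much tidying up might follow. The sketch below does only that counting. It does not reproduce AtoM's import behaviour, and the sample EAD fragment and function names are invented for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix so namespaced EAD files also work."""
    return tag.rsplit("}", 1)[-1]


def count_elements(ead_xml, name):
    """Count the elements with the given name anywhere in an EAD document."""
    root = ET.fromstring(ead_xml)
    return sum(1 for element in root.iter() if local_name(element.tag) == name)


# Hypothetical, heavily simplified EAD fragment for illustration only.
SAMPLE_EAD = """
<ead>
  <archdesc level="fonds">
    <did><unittitle>Hypothetical family papers</unittitle></did>
    <bioghist><p>History of the family.</p></bioghist>
    <dsc>
      <c level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <bioghist><p>History of the correspondent.</p></bioghist>
      </c>
    </dsc>
  </archdesc>
</ead>
"""

bioghist_count = count_elements(SAMPLE_EAD, "bioghist")
print(bioghist_count, "bioghist elements found")
```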

F is for Friendly Advice

Though there is much detail in the AtoM documentation, anyone starting to use a new system such as AtoM will inevitably get to the point where they need to speak to someone, or see another implementation. The AtoM mailing list and the staff at Artefactual Systems are friendly and helpful and it is easy to get quick answers to specific questions. It is also incredibly valuable to have a local AtoM user to talk to, to bounce random questions off (particularly ones that may sound too silly or trivial for the mailing list!).

G is for Give it a Name

In the last few weeks before AtoM launch it occurred to us that we needed to decide what to call it. Internally we have simply been calling it ‘AtoM’ but we realised that this label is of little use to our users. As we started to finalise the interface and prepare the publicity for launch date we agreed that we would call it the ‘Borthwick Catalogue’. Perhaps not very imaginative but it is at the very least a concise description of its content and purpose!

H is for Help Pages

An online archival catalogue is quite a complex thing and we are aware that some of our users may be a bit daunted by it. Help pages are therefore really important to describe how to search and filter the results.

AtoM comes with some standard static pages that can be very easily edited. We've been working on our help pages and expect we will be editing these further once we have completed our user testing. We have also created another static page to act as a glossary of archival terms. Although one of AtoM's big selling points for us was the fact it was aligned with archival standards and terminology, we are concerned that our users may struggle with some of the language used.

I is for ISDIAH

Within AtoM the archival descriptions from an institution all link back to an ISDIAH record that describes the archival institution. This record is useful for users of our data, whether browsing within the AtoM interface directly or through aggregators.

We have had some internal debate on the extent to which we should replicate information that is on our website, but have decided that providing links to the relevant content would be better in many cases. For information about access and opening hours and the extent of our holdings, we want to ensure that the information is accurate and up to date, and having another place where this information would need to be edited adds an extra overhead.

J is for Just Start!

For a while we were stuck in a chicken and egg situation. Not sure how to use AtoM until it was set up properly and ready to go, and not sure how to set it up until we had started using it and fully understood the issues we would encounter.

Reading the documentation is essential but testing and experimenting with AtoM are really the best ways of working it out. Only by importing different datasets into AtoM or by creating new ones direct into the web form did we really understand how it worked and how this impacted on our own internal workflows. Learn by doing!

K is for Kittens (because they are never really free)

AtoM is open source and freely available for all. However, Artefactual Systems, who support it, stress that it is “free as in free kittens”. In other words, you can have AtoM for free but it isn’t cost neutral - you need someone to install it, manage the server, configure it, and administer it. Populating it is also going to require a huge outlay of staff time.

On top of this, there will undoubtedly be things that you want AtoM to do that it doesn't yet do. If you are implementing AtoM, have a budget for funding further developments. Sponsored developments will then benefit the wider AtoM community and together we can make AtoM better and better. Quite early on in our AtoM implementation project we funded a small piece of work to include covering dates within the accessions module of AtoM as we felt that this was important information to record during the accessioning process and we did not want to lose this data from our existing accessions records when we imported them into the system. Of course we are hoping this feature will also be valuable for other AtoM users. There will undoubtedly be other feature developments we will sponsor in the future.

L is for Local Guidance

One of AtoM’s key selling points to us was the fact that it was created in association with the International Council on Archives (ICA) and is closely aligned with their metadata standards. There is however still a need for local guidance on how we intend to use some of these metadata fields.

In response to this we have created our own AtoM handbook to sit alongside the documentation that Artefactual provides. The handbook doesn't duplicate the official documentation, but describes our local procedures and requirements for data entry. This is all the more necessary given the fact that the majority of the data fields within AtoM are free text fields. With multiple users entering data into AtoM, it is important to have local guidance to ensure we maintain some consistency in the way we describe our archives.

M is for MySQL access

When we initially assessed AtoM against our requirements for an archival management system, it performed well but it didn't do everything we needed it to do. Searching and reporting functionality within AtoM does not currently meet all of our needs. It was considered essential then that we had another method of querying the data within AtoM and producing reports and statistics. To do this, we need access to the MySQL database that sits behind AtoM.

Access to the data via a free tool (I use Squirrel but there are other options out there) and a working knowledge of Structured Query Language allows you to pull out exactly the data you require.

AtoM has quite a complex and involved data structure so getting to grips with this was a bit of a learning curve, but having now got a working query to enable me to extract an annual summary of all accessions we have received over a given year I feel ready for the next challenge that is thrown my way!
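To give a flavour of the kind of query involved, here is a minimal sketch of an annual accessions summary. It deliberately does not use AtoM's real schema: the `accession_demo` table, its columns and its rows are all hypothetical, and an in-memory SQLite database stands in for MySQL so that the example runs on its own.

```python
import sqlite3

# In-memory SQLite database standing in for the real MySQL database.
conn = sqlite3.connect(":memory:")

# Hypothetical table and columns, invented for this example only.
conn.execute(
    "CREATE TABLE accession_demo (identifier TEXT, title TEXT, received_date TEXT)"
)
conn.executemany(
    "INSERT INTO accession_demo VALUES (?, ?, ?)",
    [
        ("2015/001", "Hypothetical deposit A", "2015-03-14"),
        ("2015/002", "Hypothetical deposit B", "2015-11-02"),
        ("2016/001", "Hypothetical deposit C", "2016-01-20"),
    ],
)


def accessions_for_year(connection, year):
    """Return (identifier, title, received_date) rows for one calendar year."""
    query = """
        SELECT identifier, title, received_date
        FROM accession_demo
        WHERE received_date >= ? AND received_date < ?
        ORDER BY received_date
    """
    start = f"{year}-01-01"
    end = f"{year + 1}-01-01"
    return connection.execute(query, (start, end)).fetchall()


summary_2015 = accessions_for_year(conn, 2015)
for identifier, title, received_date in summary_2015:
    print(identifier, title, received_date)
print(len(summary_2015), "accessions received in 2015")
```

A real query against AtoM would need joins across several tables, which is where the learning curve mentioned above comes in.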

N is for Not Perfect

AtoM (like all complex systems) has its limitations. It ticks many boxes for us but it does not tick them all. There are several areas where we think it could improve and we have been discussing these with the user community and developers and hope to influence its roadmap. As with all open source solutions, rather than complaining about what it doesn't do well, the user community should be working together to solve problems and support improvements. AtoM is not perfect but we are confident that it is moving in the right direction and getting better all the time.

O is for Objects (digital ones!)

One of the main reasons I got involved with AtoM implementation was because I wanted a stable base to build a digital archive on – a single point of truth about our holdings and a single system through which our users could access information about our holdings. Being able to expose access copies of our born digital archives and digitised content via AtoM is something we haven’t yet explored in full but this work will become a priority over the next couple of years. Once AtoM is launched I will be turning my attention back to Archivematica in order to help get this moving.

P is for Populating AtoM

This is undoubtedly the biggest challenge we have. Over the course of the 60 years we have been in existence, the Borthwick has created a wealth of catalogues and finding aids. Of course, these are in a range of different formats and states of completion. Some are digital, some are not. Of the digital ones, some are structured data and some are not. Some comply with modern archival standards and some don’t. Some are complete but some do not include information about more recent accruals to the archives. Just working out the current state of play is a challenge in itself.

Being both pragmatic and realistic about what is achievable is a good place to start. Getting all of this information into AtoM is a huge task and not something we can do quickly. While we have managed to enter some full finding aids into AtoM, we have not had the staff time to do as much as we would have liked. What we have prioritised though, is the creation of a collection level description for each archive that we hold and this is being achieved through Project Genesis.

Populating AtoM with our accessions data was also not without its problems but now this has been achieved we are able to browse and search all of our accessions data in one place for the first time - a really important step for us!

Q is for Quality

In an ideal world, all our data within AtoM would be of a high quality.

...but we do not live in an ideal world.

Accepting that legacy data will not always meet current standards or be as accurate as we would like is key to moving forward with a system such as this.

We are striving for a full range of high quality and standards compliant finding aids within AtoM but difficult decisions have to be made. Is it better to expose a small number of perfect catalogues or a larger number of catalogues that don’t contain all the mandatory ISAD(G) fields? The second option gets my vote.

R is for Reference Codes

Quite early on, we had to make a decision about whether or not to inherit reference codes. This is a setting you can change within the admin section of AtoM and a very important one to give some thought to before you go too far down the data entry or import route.

AtoM can either be set up so that you enter the full reference code for each level of the hierarchy of archival description, or it can be set up to inherit previous levels of its reference code depending on its position within the hierarchy.
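To make the difference between the two options concrete, here is a minimal sketch in Python. It is not AtoM's own code: the `Description` class, the `reference_code` function, the example identifiers and the `/` separator are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Description:
    """One level of an archival hierarchy (hypothetical model, not AtoM's)."""
    identifier: str
    parent: Optional["Description"] = None


def reference_code(desc: Description, inherit: bool, separator: str = "/") -> str:
    """Return the reference code that would be displayed for a description.

    inherit=False: the identifier is shown exactly as it was typed in.
    inherit=True:  the identifiers of all ancestors are prepended.
    """
    if not inherit:
        return desc.identifier
    parts = []
    node = desc
    while node is not None:
        parts.append(node.identifier)
        node = node.parent
    return separator.join(reversed(parts))


# Hypothetical three-level hierarchy: archive > series > item
archive = Description("ABC")
series = Description("1", parent=archive)
item = Description("2", parent=series)

print(reference_code(item, inherit=True))   # ABC/1/2
print(reference_code(item, inherit=False))  # 2
```

With inheritance switched off, the full code has to be typed in at every level (for example `Description("ABC/1/2", parent=series)`) if you want it displayed in full, which is the trade-off between data entry effort and display that we had to weigh up.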

There is no right or wrong answer here and each institution will need to work out what will suit them best. It can be hard to make a decision like this at the point where you are just starting out. Until you start to use AtoM in earnest you may not understand the full implications of your decision. Having initially agreed internally that we were going to inherit the reference code to save time with data entry and help guard against human errors, we subsequently changed our minds and decided not to inherit. This decision was influenced heavily by the way AtoM displays the reference numbers to the end user and how the archival hierarchy appears on the left side of the interface. We wanted the full reference to be displayed alongside each element of the hierarchy to help our users interpret the data and more easily see how the different levels relate to each other.

Time will tell whether we've made the right decision or not, but I imagine that once we have a substantial quantity of data within AtoM, this will become a harder decision to change!

S is for Session Timeout

Beware the inactive session timeout! AtoM times out by default after 30 minutes of inactivity. This has caused us problems when creating detailed descriptions within AtoM. If completing the Scope and Content field for a large and complex archive, it is necessary to spend some time consulting the physical archives and composing a description. Colleagues sometimes found that by the time they came to save their record the session had timed out. Naturally this was the source of great frustration.

We experimented with trying to extend the inactive session timeout period but these efforts were not successful. To avoid data loss we do encourage staff to regularly save their work. A text editor can also be used to compose descriptions. With an autosave function and no timeout, data is safer here and can be pasted into AtoM once it is complete.

T is for Training

Artefactual Systems offer introductory training sessions in AtoM and delivered one of these to Borthwick staff via WebEx at the start of our implementation project. This was well worth the expense, ensuring that staff understood the capabilities of the system and had a basic grounding in how to use it. I had my reservations about how well a training session via WebEx would work, but needn't have worried on that score. We heard Sarah Romkey from Artefactual Systems in Canada loud and clear and she was able to maintain a high level of enthusiasm throughout the session despite the fact that we had got her out of bed very early in the morning.

Training is not just a one-off exercise. Now that we are further along in our AtoM implementation we will be arranging further staff training to focus more on our local use of AtoM and internal processes and workflows.

U is for User Profiles and Roles

We have been giving some thought to who needs to do what within AtoM.

  • Who should have access to the import and export functions?
  • Who will be able to add new users to the system?
  • Who needs the ability to edit the static pages?
  • Who can publish and delete archival descriptions?
  • Who can change the accessions counter?

We are keen that AtoM is widely used by our staff and want to ensure that everyone has the necessary permissions to be able to carry out their work. User roles may evolve over time but some initial decisions do need to be made in the early stages of implementation.

V is for Volunteers

Prior to release of AtoM we have been calling for volunteers from our user base to help us test AtoM and give us their feedback.

We have put a lot of work into getting our AtoM instance ready to release and we have had our users in mind at many stages of the process. We now need to find out whether we have got it right. User testing is ongoing and we envisage we will be making some changes to AtoM once the feedback is collated.

We are really looking forward to seeing what people think.

W is for Web Address

We have made some decisions about the web address we will use for our production version of AtoM. The default URL had ‘atom’ in it, but we wanted to change this to something more meaningful. AtoM means something to us and perhaps to other archives professionals but not to our users.

So, we have replaced ‘atom’ with something more descriptive and meaningful to our users. We will be plastering this URL over the bookmarks and other publicity we are creating for our scheduled launch date so we want to get it right!

X is for XML

We do not want people to have to come to us to find out what we hold; we want our data to be signposted as widely as possible via other portals and aggregators both nationally and internationally. By doing so we facilitate serendipitous discovery and attract new users.

To this end we have been talking to external aggregators such as the Archives Hub to find out whether our AtoM data can be incorporated into their portal. We have been exporting sample data as EAD XML files so that the Archives Hub can assess it. A few initial problems with the EAD that AtoM creates have been ironed out and we are moving closer to being able to make this a reality over the next few months.

Y is for YorSearch

One of the features of AtoM we have been looking at before launch is the OAI-PMH functionality. We have used this to enable our AtoM data to be surfaced as simple Dublin Core metadata via our University Library Catalogue, YorSearch. It will be interesting to see whether students and staff members from the University (who may not have thought to consult our catalogue directly) will be approaching us in the future to consult our archives.
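For anyone unfamiliar with OAI-PMH, the rough shape of a harvest can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the base URL is hypothetical, the sample response is made up (a real harvester would fetch it over HTTP), and this is not how YorSearch itself is configured. The `ListRecords` verb and the `oai_dc` metadata prefix are the standard OAI-PMH way of asking for simple Dublin Core.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlencode

# Hypothetical endpoint: substitute the OAI-PMH base URL of your own repository.
BASE_URL = "https://archives.example.org/oai"

# Standard namespaces used in OAI-PMH responses carrying simple Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up response, trimmed to a single record for illustration.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:archives.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example family papers</dc:title>
          <dc:identifier>EX/1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build a ListRecords request asking for simple Dublin Core records."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def titles(response_xml: str) -> List[str]:
    """Pull every dc:title out of an OAI-PMH response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//dc:title", NS)]


print(list_records_url(BASE_URL))
print(titles(SAMPLE_RESPONSE))
```

A library catalogue harvesting in this way only sees the handful of Dublin Core fields in each record, which is why the metadata surfaced there is much simpler than the full archival description.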

So, these are some of the things we have been thinking about and working on over the last year or so whilst moving our AtoM implementation from idea to reality. Hopefully it is of use to others who are embarking on the same process.

And of course, watch this space for news of launch!

* Actually an A-Y ...did anyone notice that there was no letter 'Z'?

Wednesday, 10 February 2016

I'll show you my research data if you show me yours...

My research data

A few months ago I was having a clear out at home and came across a bunch of floppy disks in the drawer of my bedside table.

This is my research data...

Actually, that is not strictly true. I did a taught masters course and my research consisted of just a short dissertation at the end of the course. Most of these disks contain files from the taught element of my course and the subsequent dissemination of results. 

I published a paper at the end of the masters on the findings of my dissertation. 

If you are interested in the placement of Iron Age hillforts in the landscape then this is the book to look for.

No-one has since approached me and asked if they can see the data that underlies this publication

...but this was the 1990s!

Times are different now. We expect our researchers to be able to produce the data and share it (where appropriate) so that others can build on their research. 

I'm now involved in teaching researchers here at York about Research Data Management (RDM) and how they should look after their data for future re-use.

When I created and stored this data I was not a digital archivist. I had no idea I would become a digital archivist. I like to think I would have managed my data differently if I had known more. 

Let's start with documentation. Much of the documentation for this data is what is actually written on the disk labels. I gave myself a little pat on the back for having recorded what was on the floppies so well on *most* of the disks. This of course was particularly useful in those days. File names were restricted to 8.3 characters so very little detail about the files could be incorporated into the name. Documenting things on disk labels helps add a bit of context. All well and good until you notice the disk on the far right with no label at all. This one remains a mystery!

So what are the issues here? First and most obviously, as a student in the 90s I was using cutting edge storage technology - the floppy disk! Can we read these today? Yes and no. Floppy disks fall firmly into the category of 'obsolete media' which is a topic that we digital archivists like to talk about. I found I could read about a quarter of these using the USB floppy reader that is attached to my PC. For the others I saw a lot of error messages like this:

The answer is "No"!

Fortunately I had more success using an old PC I keep in my office for the very purpose of reading old floppy disks - all but two of the floppies could be read and copied using this PC. On one disk I could view the list of files on it but couldn't copy all of them off the disk so I considered this to be a partial success. The one disk which I couldn't access at all was interestingly the one with no label. Perhaps this mystery disk was in fact never formatted or put into active use. 

Not too bad a result so far?

So what about the contents of the disks?

The contents of one of the floppy disks. Windows Explorer identifies the DOC files as Microsoft Word 97-2003 but they are likely to be an earlier version of Word than this.

As mentioned above the file and folder naming is noticeably brief (as is the way with media from this period). Today we talk to our researchers about the importance of naming files in such a way that you know what it is before you double click on it. This was near on impossible when faced with only 8 characters. I created this data but have no idea what I might expect to find in a directory called 'DISTEX' (though the label on the disk does help give a clue).
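To show just how tight the 8.3 limit was, here is a small Python sketch that checks whether a name would have fitted. It is deliberately simplified: the function name `fits_8_3` is made up, only letters, digits, underscores and hyphens are accepted (DOS allowed a few more punctuation characters), and apart from 'DISTEX' the example file names are invented.

```python
import re

# Up to 8 characters for the name, optionally followed by a dot and
# up to 3 characters for the extension. Simplified character set.
EIGHT_DOT_THREE = re.compile(r"[A-Za-z0-9_-]{1,8}(\.[A-Za-z0-9_-]{1,3})?")


def fits_8_3(filename: str) -> bool:
    """Return True if the name fits the (simplified) 8.3 pattern."""
    return EIGHT_DOT_THREE.fullmatch(filename) is not None


for name in ["DISTEX", "CV.DOC", "hillfort_distances_1998.doc"]:
    print(name, fits_8_3(name))
```

The kind of descriptive name we encourage researchers to use today (the third example) would simply not have been possible.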

Note too the lack of organisation of the contents. At the end of my masters degree whilst finishing off my papers and publications I was also clearly focusing on what my next steps would be. Personal data (my CV for job applications) is stored alongside data relating to my research*. This again is something we discourage when we talk to researchers about data management. It is much easier when working with filestore to organise and categorise data more effectively, keeping personal data separate from research data. We have come a long way since the days when we were squashing any files that would fit on to a floppy disk regardless of content or context.

Here is some data on another of the disks (viewed in Windows Explorer as tiles). I have no idea what possessed me to store scanned photographs as GIF images. They look terrible! Did they always look this bad? Choosing the right file format is something we also cover in our RDM training and though file size is still a consideration for today's research students, at least they don't have to try and fit numerous images for one presentation on a single floppy disk.

More coded file names - this was a necessity when you had so few characters available. I still remember what these mean but very much doubt anyone else would.

Some of my files are fairly easy to read, others less so (more detective work is required to find the right software). The Word documents are OK but come up in 'Protected View' (which means I'm not allowed to edit them). The default settings here are to treat a Word 6 or 95 document with suspicion but this can be easily resolved by editing these settings.

These old MS Word docs are still readable (and editable if I change the policy settings)

So, digging out my old research data has been an interesting diversion. I now use this as an example at the beginning of RDM teaching sessions and ask the students to imagine how their research data might look 20 years from now. 

An added bonus from this exercise is that I now have even more files that I can play with as I test Archivematica and file identification tools.

*Interesting to note that a first (unsuccessful) attempt to get a job in York occurred in 1998. I got here 5 years later!

Friday, 5 February 2016

New "Filling the Digital Preservation Gap" report released

I am pleased to announce that we have just published a new report on the "Filling the Digital Preservation Gap" project.

Filling the Digital Preservation Gap. A Jisc Research Data Spring project. Phase Two report - February 2016 - Jenny Mitcham, Chris Awre, Julie Allinson, Richard Green, Simon Wilson.

This phase 2 report, funded through Jisc's Research Data Spring initiative, details the work the project team have carried out with Archivematica over the last few months of the project. 

Our phase 2 work had the following aims:
  • Work with Artefactual Systems to develop Archivematica in a number of areas in order to make the system more suitable for fitting into our infrastructures for research data management
  • Develop our own detailed implementation plans for Hull and York to establish how Archivematica will be incorporated into our local infrastructures for research data
  • Consider how Archivematica could work as an above campus installation
  • Continue to spread the word, both nationally and internationally, about the ongoing work of our project

Our work in all of these areas is detailed in full in the report. Please do download it and let us know what you think.

We very much hope that the new features we have sponsored within Archivematica will be of interest to other Archivematica users (both current and future) and that these features will continue to evolve and improve over time.

Tuesday, 5 January 2016

When digital preservation really matters...

Of course digital preservation always matters* but recent events in York and beyond over the festive period really do highlight the importance of looking after your stuff - both physical and digital.

Not everyone is lucky enough to get much warning before a disaster of any type strikes but in some situations (such as that which I found myself in just after Christmas) we have some time to prepare.

Hang on...there isn't normally a lake near my house

Beyond relocating important things such as the hamster and photo albums upstairs and moving the Christmas decorations higher up the tree, it is also important to remember the digital....

Digital is robust in some respects but perhaps more at risk in others. Robust in that it is possible to very quickly make as many additional copies as you like and store them in different places (perfect for a disaster scenario such as this), but the risk is that it is more easily forgotten.

Of course I back up my personal data (digital photographs mostly) regularly, but with the chaos of the build-up to Christmas I had not done so for a few weeks, so was prompted to do so before unplugging the PC and moving it to higher ground.

We were some of the lucky ones in York - the water levels didn't reach us so the preparations were not necessary but others were not so lucky. Many houses and businesses in York and in other areas of the country were flooded and many did not have the luxury of time to prepare for the worst. The very basics of digital preservation (maintaining a regular backup strategy and storing copies of the data in different locations) really are something that should happen in a proactive way, not just in response to specific threats.

* I have to say that - it is in my job description