Tuesday 24 January 2017

Creating an annual accessions report using AtoM

So, it is that time of year where we need to complete our annual report on accessions for the National Archives. Along with lots of other archives across the UK we send The National Archives summary information about all the accessions we have received over the course of the previous year. This information is collated and provided online on the Accessions to Repositories website for all to see.

The creation of this report has always been a bit time consuming for our archivists, involving a lot of manual steps and some re-typing but since we have started using AtoM as our Archival Management System the process has become much more straightforward.

As I've reported in a previous blog post, AtoM does not do all that we want to do in the way of reporting via it's front end.

However, AtoM has an underlying MySQL database and there is nothing to stop you bypassing the interface, looking at the data behind the scenes and pulling out all the information you need.

One of the things we got set up fairly early in our AtoM implementation project was a free MySQL client called Squirrel. Using Squirrel or another similar tool, you can view the database that stores all your AtoM data, browse the data and run queries to pull out the information you need. It is also possible to update the data using these SQL clients (very handy if you need to make any global changes to your data). All you need initially is a basic knowledge of SQL and you can start pulling some interesting reports from AtoM.

The downside of playing with the AtoM database is of course that it isn't nearly as user friendly as the front end.

It is always a bit of an adventure navigating the database structure and trying to work out how the tables are linked. Even with the help of an Entity Relationship Diagram from Artefactual creating more complex queries is ...well ....complex!

AtoM's database tables - there are a lot of them!


However, on a positive note, the AtoM user forum is always a good place to ask stupid questions and Artefactual staff are happy to dive in and offer advice on how to formulate queries. I'm also lucky to have help from more technical colleagues here in Information Services (who were able to help me get Squirrel set up and talking to the right database and can troubleshoot my queries) so what follows is very much a joint effort.

So for those AtoM users in the UK who are wrestling with their annual accessions report, here is a query that will pull out the information you need:

SELECT accession.identifier, accession.date, accession_i18n.title, accession_i18n.scope_and_content, accession_i18n.received_extent_units, 
accession_i18n.location_information, case when cast(event.start_date as char) like '%-00-00' then left(cast(event.start_date as char),4) 
else cast(event.start_date as char)
end as start_date,
case when cast(event.end_date as char) like '%-00-00' then left(cast(event.end_date as char),4) 
else cast(event.end_date as char)
end as end_date, 
event_i18n.date
from accession
LEFT JOIN event on event.object_id=accession.id
LEFT JOIN event_i18n on event.id=event_i18n.id
JOIN accession_i18n ON accession.id=accession_i18n.id
where accession.date like '2016%'
order by identifier

A couple of points to make here:

  • In a previous version of the query, we included some other tables so we could also capture information about the creator of the archive. The addition of the relation, actor and actor_i18n tables made the query much more complicated and for some reason it didn't work this year. I have not attempted to troubleshoot this in any great depth for the time being as it turns out we are no longer recording creator information in our accessions records. Adding a creator record to an accessions entry creates an authority record for the creator that is automatically made public within the AtoM interface and this ends up looking a bit messy (as we rarely have time at this point in the process to work this into a full authority record that is worthy of publication). Thus as we leave this field blank in our accession record there is no benefit in trying to extract this bit of the database.
  • In an earlier version of this query there was something strange going on with the dates that were being pulled out of the event table. This seemed to be a quirk that was specific to Squirrel. A clever colleague solved this by casting the date to char format and including a case statement that will list the year when there's only a year and the full date when fuller information has been entered. This is useful because in our accession records we enter dates to different levels. 
So, once I've exported the results of this query, put them in an Excel spreadsheet and sent them to one of our archivists, all that remains for her to do is to check through the data, do a bit of tidying up, ensure the column headings match what is required by The National Archives and the spreadsheet is ready to go!


Jenny Mitcham, Digital Archivist

No comments:

Post a Comment

The sustainability of a digital preservation blog...

So this is a topic pretty close to home for me. Oh the irony of spending much of the last couple of months fretting about the future prese...