Showing posts with label depositor. Show all posts

Thursday, 29 March 2018

Digital preservation begins at home

A couple of things happened recently to remind me of the fact that I sometimes need to step out of my little bubble of digital preservation expertise.

It is a bubble in which I assume that everyone knows what language I'm speaking, in which everyone knows how important it is to back up your data, knows where their digital assets are stored, how big they might be and even what file formats they hold.

But in order to communicate with donors and depositors I need to move outside that bubble, otherwise opportunities may be missed.

A disaster story

Firstly a relative of mine lost their laptop...along with all their digital photographs, documents etc.

I won't tell you who they are or how they lost it for fear of embarrassing them...

It wasn’t backed up...or at least not in a consistent way.

How can this have happened?

I am such a vocal advocate of digital preservation and do try and communicate outside my echo chamber (see for example my blog for International Digital Preservation Day "Save your digital stuff!") but perhaps I should take this message closer to home.

Lesson #1:

Digital preservation advocacy should definitely begin at home

When a back up is not a back up...

In a slightly delayed response to this sad event I resolved to help another family member ensure that their data was 'safe'. I was directed to their computer and a portable hard drive that is used as their back up. They confessed that they didn’t back up their digital photographs very often...and couldn’t remember the last time they had actually done so.

I asked where their files were stored on the computer and they didn’t know (well at least, they couldn’t explain it to me verbally).

They could however show me how they get to them, so from that point I could work it out. Essentially everything was in ‘My Documents’ or ‘My Pictures’.

Lesson #2:

Don’t assume anything. Just because someone uses a computer regularly it doesn’t mean they know where they put things.

Having looked firstly at what was on the computer and then what was on the hard drive it became apparent that the hard drive was not actually a ‘back up’ of the PC at all, but contained copies of data from a previous PC.

Nothing on the current PC was backed up and nothing on the hard drive was backed up.

There were however multiple copies of the same thing on the portable hard drive. I guess some people might consider that a back up of sorts but certainly not a very robust one.

So I spent a bit of time ensuring that there were 2 copies of everything (one on the PC and one on the portable hard drive) and promised to come back and do it again in a few months' time.
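For anyone wanting to repeat this sort of check without comparing folders by eye, it could be scripted. The following is only a minimal sketch in Python, not a record of what was actually done here. The function names and the folder locations in the usage note are hypothetical, and it assumes the backup mirrors the folder layout of the PC.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_missing_from_backup(source: Path, backup: Path) -> list[Path]:
    """List files under `source` that have no identical copy under `backup`.

    A file counts as backed up only if a file exists at the same relative
    path in `backup` and its contents produce the same checksum.
    """
    missing = []
    for item in sorted(source.rglob("*")):
        if not item.is_file():
            continue
        relative = item.relative_to(source)
        copy = backup / relative
        if not copy.is_file() or sha256_of(copy) != sha256_of(item):
            missing.append(relative)
    return missing
```

Calling `files_missing_from_backup(Path("C:/Users/someone/Pictures"), Path("E:/Backup/Pictures"))` (both paths invented for illustration) would return the relative paths of anything that still needs copying across. It reports only and never changes either folder.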

Lesson #3:

Just because someone says they have 'a back up' it does not mean it actually is a back up.

Talking to donors and depositors

All of this made me re-evaluate my communication with potential donors and depositors.

Not everyone is confident in communicating about digital archives. Not everyone speaks the same language or uses the same words to mean the same thing.

In a recent example of this, someone who was discussing the transfer of a digital archive to the Borthwick talked about a 'database'. I prepared myself to receive a set of related tables of structured data alongside accompanying documentation to describe field names and table relationships. However, as the conversation evolved it became apparent that there was actually no database at all. The term database had simply been used to describe a collection of unstructured documents and images.

I'm taking this as a timely reminder that I should try and leave my assumptions behind me when communicating about digital archives or digital housekeeping practices from this point forth.











Jenny Mitcham, Digital Archivist

Friday, 16 June 2017

A typical week as a digital archivist?

Sometimes (admittedly not very often) I'm asked what I actually do all day. So at the end of a busy week being a digital archivist I've decided to blog about what I've been up to.

Monday

Today I had a couple of meetings. One specifically to talk about digital preservation of electronic theses submissions. I've also had a work experience placement in this week so have set up a metadata creation task which he has been busy working on.

When I had a spare moment I did a little more testing work on the EAD harvesting feature the University of York is jointly sponsoring Artefactual Systems to develop in AtoM. Testing this feature from my perspective involves logging into the test site that Artefactual has created for us and tweaking some of the archival descriptions. Once those descriptions are saved, I can take a peek at the job scheduler and make sure that new EAD files are being created behind the scenes for the Archives Hub to attempt to harvest at a later date.

This piece of development work has been going on for a few months now and communications have been technically quite complex so I'm also trying to ensure all the organisations involved are happy with what has been achieved and will be arranging a virtual meeting so we can all get together and talk through any remaining issues.

I was slightly surprised today to have a couple of requests to talk to the media. This has sprung from the news that the Queen's Speech will be delayed. One of the reasons for the delay relates to the fact that the speech has to be written on goat's skin parchment, which takes a few days to dry. I had previously been interviewed for an article entitled Why is the UK still printing its laws on vellum? and am now mistaken for someone who knows about vellum. I explained to potential interviewers that this is not my specialist subject!

Tuesday

In the morning I went to visit a researcher at the University of York. I wanted to talk to him about how he uses Google Drive in relation to his research. This is a really interesting topic to me right now as I consider how best we might be able to preserve current research datasets. Seeing how exactly Google Drive is used and what features the researcher considers to be significant (and necessary for reuse) is really helpful when thinking about a suitable approach to this problem. I sometimes think I work a little bit too much in my own echo chamber, so getting out and hearing different perspectives is incredibly valuable.

Later that afternoon I had an unexpected meeting with one of our depositors (well, there were two of them actually). I've not met them before but have been working with their data for a little while. In our brief meeting it was really interesting to chat and see the data from a fresh perspective. I was able to reunite them with some digital files that they had created in the mid-1980s, had saved on to floppy disk and had not been able to access for a long time.

Digital preservation can be quite a behind-the-scenes sort of job - we always give a nod to the reason why we do what we do (i.e. we preserve for future reuse), but actually seeing the results of that work unfold in front of your eyes is genuinely rewarding. I had rescued something from the jaws of digital obsolescence so it could now be reused and revitalised!

At the end of the day I presented a joint webinar for the Open Preservation Foundation called 'PRONOM in practice'. Alongside David Clipsham (The National Archives) and Justin Simpson (Artefactual Systems), I talked about my own experiences with PRONOM, particularly relating to file signature creation, and ending with a call to arms "Do try this at home!". It would be great if more of the community could get involved!

I was really pleased that the webinar platform worked OK for me this time round (always a bit stressful when it doesn't) and that I got to use the yellow highlighter pen on my slides.

In my spare moments (which were few and far between), I put together a PowerPoint presentation for the following day...

Wednesday

I spent the day at the British Library in Boston Spa. I'd been invited to speak at a training event they regularly hold for members of staff who want to find out a bit more about digital preservation and the work of the team.

I was asked specifically to talk through some of the challenges and issues that I face in my work. I found this pretty easy - there are lots of challenges - and I eventually realised I had too many slides so had to cut it short! I suppose that is better than not having enough to say!

Visiting Boston Spa meant that I could also chat to the team over lunch and visit their lab. They had a very impressive range of old computers and were able to give me a demonstration of Kryoflux (which I've never seen in action before) and talk a little about emulation. This was a good warm up for the DPC event about emulation I'm attending next week: Halcyon On and On: Emulating to Preserve.

Still left on my to-do list from my trip is to download Teracopy. I currently use Foldermatch for checking that files I have copied have remained unchanged. From the quick demo I saw at the British Library I think that Teracopy would be a simpler one-step solution. I need to have a play with this and then think about incorporating it into the digital ingest workflow.
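The general idea of copying a file and then confirming that the copy is unchanged can be illustrated independently of any particular tool. The sketch below is a minimal Python example of that idea. It makes no claim about how Teracopy or Foldermatch work internally, and the function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> str:
    """Copy one file, then check that the copy has the same checksum.

    Returns the checksum on success so it can be recorded alongside the
    file; raises OSError if the copy does not match the original.
    """
    before = file_digest(source)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copy2 also tries to keep timestamps
    after = file_digest(destination)
    if before != after:
        raise OSError(f"Checksum mismatch after copying {source} to {destination}")
    return after
```

Returning the checksum rather than just `True` is a deliberate choice in this sketch: the value can be written to a manifest at ingest and used for later fixity checks.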

Sharing information and collaborating with others working in the digital preservation field really is directly beneficial to the day to day work that we do!

Thursday

Back in the office today and a much quieter day.

I extracted some reports from our AtoM catalogue for a colleague and did a bit of work with our test version of Research Data York. I also met with another colleague to talk about storing and providing access to digitised images.

In the afternoon I wrote another PowerPoint presentation, this time for a forthcoming DPC event: From Planning to Deployment: Digital Preservation and Organizational Change.

I'm going to be talking about our experiences of moving our Research Data York application from proof of concept to production. We are not yet in production and some of the reasons why will be explored in the presentation! Again I was asked to talk about barriers and challenges and again, this brief is fairly easy to fit! The event itself is over a week away so this is unprecedentedly well organised. Long may it continue!


Friday

On Fridays I try to catch up on the week just gone and plan for the week ahead as well as reading the relevant blogs that have appeared over the week. It is also a good chance to catch up with some admin tasks and emails.

Lunch time reading today was provided by William Kilbride's latest blog post. Some of it went over my head but the final messages around value and reuse and the need to "do more with less" rang very true.

Sometimes I even blog myself - as I am today!




Was this a typical week? Perhaps not, but in this job there is probably no such thing! Every week brings new ideas, challenges and surprises!

I would say the only real constant is that I've always got lots of things to keep me busy.

Jenny Mitcham, Digital Archivist

Monday, 17 March 2014

'Routine encounters with the unexpected' (or what we should tell our digital depositors)


I was very interested a few months back to hear about the release of a new and much-needed report on acquiring born-digital archives: Born Digital: Guidance for Donors, Dealers, and Archival Repositories published by the Council on Library and Information Resources. I read it soon after it was published and have been mulling over its content ever since.

The quote within the title of this post "routine encounters with the unexpected" is taken from the concluding section of the report and describes the stewardship of born-digital archival collections. The report intends to describe good practices that can help reduce these archival surprises.

The publication takes an interesting and inclusive approach, being aimed both at archivists who will be taking in born-digital material, and also at those individuals and organisations involved with offering born-digital material to an archive or repository.

It appeared at a time when I was developing new content for our new website aimed specifically at donors and depositors and also a couple of weeks before I went on my first trip to collect someone's digital legacy for inclusion in our archive. Over the last few months, alongside archivist colleagues, I have also been planning and documenting our own digital accessions workflow. This report has been a rich source of information and advice and has helped inform all of these activities.

There is lots of food for thought within the publication but what I like best are the checklists at the end which neatly summarise many of the key issues highlighted within the report and provide a handy quick reference guide.

Much as I find this a very useful and interesting publication it got me thinking about the alternative and apparently conflicting advice that I give depositors and how the two relate.

I have always thought that one of the most important things that anyone can do to ensure that their digital legacy survives into the future is to put into practice good data management strategies. These strategies are often just simple common sense rules, things like weeding out duplicate or unnecessary files, organising your data into sensible and logical directory structures and naming them well.
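As an illustration of the first of those rules, duplicate files can be found by comparing checksums rather than file names. This is a minimal Python sketch under stated assumptions: the function name is hypothetical, and it only lists groups of identical files, leaving any decision about what to weed out to a person.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under `root` whose contents are byte-for-byte identical.

    Each returned group holds two or more paths with the same SHA-256
    checksum. Nothing is deleted or moved.
    """
    by_digest = defaultdict(list)
    for item in sorted(root.rglob("*")):
        if item.is_file():
            # read_bytes loads the whole file into memory, which is fine
            # for documents and photos but not for very large files
            digest = hashlib.sha256(item.read_bytes()).hexdigest()
            by_digest[digest].append(item)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Because the comparison is on content, two copies of the same photograph are grouped together even if one has been renamed.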

Where we have depositors who wish to give us born-digital material for our archive, I would like to encourage them to follow rules like these to help ensure that we can make better sense of their data when it comes our way. This also helps fulfil the OAIS responsibility to ensure the independent utility of data - the more we know about data from the original source, the greater the likelihood that others will be able to make sense of it in the future. I have put guidance to this effect on our new website which is based on an advice sheet from the Archaeology Data Service.

Screenshot of the donor and depositor FAQ page on the Borthwick Institute's new website

However, this goes against the advice in the 'Born Digital' report which states that "...donors and dealers should not manipulate, rearrange, extract, or copy files from their original sources in anticipation of offering the material for gift or purchase."

In a blog post last year I talked about a digital rescue project I had been working on, looking at the data on some 5 1/4 inch floppy disks from the Marks and Gran archive. This project would not have been nearly as interesting if someone had cleaned up the data before deposit - rationalising and re-naming files and deleting earlier versions. There would have been no detective story and information about the creative process would have been lost. However, if all digital deposits came to us like this would we be able to resource the amount of work required to make sense of them?

So, my question is as follows. What do we tell our depositors? Is there room for both sets of advice - the 'organise your data before deposit' approach aimed at those organisations who regularly deposit their administrative information with us, and the 'leave well alone' approach for the digital legacies of individuals? This is the route I have tried to take on our new website. However, I have concerns as to whether it will be clear enough to donors and depositors which advice they should follow, especially where there are areas of cross-over. I'm interested to hear how other archives handle this question.




Jenny Mitcham, Digital Archivist

Friday, 25 October 2013

Advice for our donors and depositors


Anyone who knows anything about digital archiving knows that one of the best ways to ensure the longevity of your digital data is to plan for it at the point of creation.

If data is created with long term archiving in mind and following a few simple and common sense data management rules, then the files that are created are not only much easier for the digital archivist to manage in the future, but also easier for the creator to work with. How much easier is it to locate and retrieve files that are ordered in a sensible and logical hierarchy of folders and named in a way that is helpful? We are producing more and more data over time and as the quantity of data increases, so does the size of our problems in managing it.

We do not have many donors and depositors at the Borthwick who regularly put digital archives into our care but this picture will no doubt change over time. For those who do deposit digital archives, it is important that we encourage them to put good data management into practice and the earlier we speak to them about this the better.

'Bitrot', from the Digital Preservation Business Case Toolkit (http://wiki.dpconline.org/)

Last week I was fortunate enough to be invited to speak to a group of people from one of our depositor organisations who are likely to start giving us digital data to archive in the future. They were from a large organisation with no central IT infrastructure and many people working from home on their own computers. Good data management is particularly important in this sort of scenario. This was a great opportunity for me to test out what could become the basis of a standard presentation on digital data management techniques that could be delivered to our donors and depositors.

I started off talking about what digital preservation is and why we really need to do it. It is always handy to throw in a few cautionary tales at this point as to what happens when we don't look after our data. I think these sorts of stories resonate with people more than just hearing the dry facts about obsolescence and corruption. I made a good plug for the 'Sorry for your data loss' cards put together by the Library of Congress earlier this year as this is something that any of us who have experienced data loss can relate to.

I then moved on to my own recent tale of digital rescue, using the 5 1/4 inch floppy disks from the Marks and Gran archive as my example (discussed in a previous blog post). This was partly because this is my current pet project, but also because it is a good way to cement and describe the real issues of hardware and software obsolescence and how we can work around these.

In the last section of the presentation I gave out my top tips on data management. I wanted the audience to go home with a positive sense of what they can start to do immediately in order to help protect their data from corruption, loss or misinterpretation.

Much of what I discussed in this section was common sense stuff really. Topics covered included:
  • how to name files sensibly
  • how to organise files well within a directory structure
  • how to document files
  • the importance of back up
  • the importance of anti-virus software
The presentation went well and sparked lots of interesting questions and debate and it was encouraging to see just how accessible this topic is to a non-specialist audience. Some of the questions raised related to current 'hot topics' in the digital archiving world which I hadn't had time to mention in any depth in my presentation:

  • How do you archive e-mails?
  • Is cloud storage safe?
  • What is wrong with pdf files?
  • What is the life span of a memory stick?
I had an answer to all of these apart from the last one, for which I have since found out the answer is 'it depends'. I have recently been told on Twitter that most digital preservation questions can be answered in this way!


Jenny Mitcham, Digital Archivist

The sustainability of a digital preservation blog...

So this is a topic pretty close to home for me. Oh the irony of spending much of the last couple of months fretting about the future prese...