Tuesday, 7 March 2017

Thumbs.db – what are they for and why should I care?

Recent work I’ve been doing on the digital archive has made me think a bit more about those seemingly innocuous files that Windows (XP, Vista, 7 and 8) puts into any directory that has images in – Thumbs.db.

Getting your folder options right helps!
Windows uses a file called Thumbs.db to store little thumbnail versions of any images within a directory. It puts one of these files in each directory that contains images and it is amazing how quickly they proliferate. Until recently I wasn’t aware I had any in my digital archive at all. This is because although my preferences in Windows Explorer were set to display hidden files, the "Hide protected operating system files" option also needs to be disabled in order to see files such as these.

The reason I knew I had all these Thumbs.db files was through a piece of DROID analysis work published last month. Thumbs.db ranked at number 12 in my list of the most frequently occurring file formats in the digital archive. I had 210 of these files in total. I mentioned at the time that I could write a whole blog post about this, so here it is!

Do I really want these in the digital archive? In my mind, what is in the ‘original’ folders within the digital archive should be what OAIS would call the Submission Information Package (SIP). Just those files that were given to us by a donor or depositor. Not files that were created subsequently by my own operating system.

Though they are harmless enough they can be a bit irritating. Firstly, when I’m trying to run reports on the contents of the archive, the number of files for each archive is skewed by the Thumbs.db files that are not really a part of the archive. Secondly, and perhaps more importantly, I was trying to create a profile of the dates of files within the digital archive (admittedly not an exact science when using last modified dates) and the span of dates for each individual archive that we hold. The presence of Thumbs.db files in each archive that contained images gave the false impression that all of the archives had had content added relatively recently, when in fact all that had happened was that a Thumbs.db file had automatically been added when I had transferred the data to the digital archive filestore. It took me a while to realise this - gah!
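For anyone wanting to run this kind of report without the skew, here is a minimal Python sketch of the idea. It is not how my own reports were produced; the function name `profile_archive` and the `IGNORED_NAMES` set are made up for illustration, and it assumes that matching the file name case-insensitively is enough to identify a Thumbs.db file.

```python
import os
from datetime import datetime, timezone

# Assumption: any file with this name (in any letter case) is a Windows thumbnail cache.
IGNORED_NAMES = frozenset({"thumbs.db"})


def profile_archive(root, ignored_names=IGNORED_NAMES):
    """Return (file_count, earliest, latest) for the files under root.

    Dates are last modified times as UTC datetimes, or None if no files
    were counted. Files whose lower-cased name is in ignored_names are skipped.
    """
    count = 0
    earliest = None
    latest = None
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower() in ignored_names:
                continue  # leave system-generated files out of the report
            mtime = os.path.getmtime(os.path.join(dirpath, name))
            count += 1
            earliest = mtime if earliest is None else min(earliest, mtime)
            latest = mtime if latest is None else max(latest, mtime)

    def to_datetime(timestamp):
        if timestamp is None:
            return None
        return datetime.fromtimestamp(timestamp, tz=timezone.utc)

    return count, to_datetime(earliest), to_datetime(latest)
```

Calling `profile_archive(path)` on one archive's folder gives a file count and date span for the deposited files only; passing `ignored_names=frozenset()` shows what the same report looks like with the Thumbs.db files left in.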

So, what to do? First I needed to work out how to stop them being created.

After a bit of googling I quickly established that I didn’t have the necessary permissions to disable this default behaviour within Windows, so I called in the help of IT Services.

IT clearly thought this was a slightly unusual request, but made a change to my account which now stops these thumbnail files being created by me. Since I am the only person who has direct access to the born-digital material within the archive, this should solve that problem.

Now I can systematically remove the files. This means that they won’t skew any future reports I run on numbers of files and last modified dates.
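As a rough illustration of what a systematic removal could look like, here is a short Python sketch. The function name `remove_thumbs` is hypothetical and this is only one possible approach. It defaults to a dry run that just lists what it finds, which seems a sensible precaution before deleting anything from an archive filestore. It makes no attempt to handle a file that Windows currently has open or locked; in that case `os.remove` would be expected to raise an error.

```python
import os


def remove_thumbs(root, dry_run=True):
    """Find every Thumbs.db file under root and return their paths.

    With dry_run=True (the default) nothing is deleted; the list is for review.
    With dry_run=False each file found is deleted as well as listed.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Assumption: the name alone (case-insensitive) identifies the file.
            if name.lower() == "thumbs.db":
                path = os.path.join(dirpath, name)
                found.append(path)
                if not dry_run:
                    os.remove(path)
    return found
```

Running `remove_thumbs(path)` first and checking the list it returns, then `remove_thumbs(path, dry_run=False)`, keeps a record of exactly which files were taken out.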

Perhaps once we get a proper digital archiving system in place here at the Borthwick we won’t need to worry about these issues as we won’t directly interact with the archive filestore? Archivematica will package up the data into an AIP and put it on the filestore for me.

However, I will say that now IT have stopped the use of Thumbs.db from my account I am starting to miss them. This setting applies to my own working filestore as well as the digital archive. It turns out that it is actually incredibly useful to be able to see thumbnails of your image files before double clicking on them! Perhaps I need to get better at practising what I preach and make some improvements to how I name my own image files – without a preview thumbnail, an image file *really* does benefit from a descriptive filename!

As always, I'm interested to hear how other people tackle Thumbs.db and any other system files within their digital archives.



Jenny Mitcham, Digital Archivist

1 comment:

  1. Thumbs.db files are a pain! You'll probably find that trying to delete one in Explorer fails with:

    "The action can't be completed because the file is open in Windows Explorer"

    One way to get rid of them is from the command line with:

    DEL /F /S /A thumbs.db

    You could run this command across an entire drive, perhaps on a regular schedule. Unfortunately this won't guarantee your drive is free of them at any given time; it only takes one user to view a folder for another thumbs.db to be generated.


    Jenny Mitcham, Digital Archivist

