Tuesday, 13 August 2013

A short detective story involving 5 ¼ inch floppy disks

Earlier this year my colleague encountered two small boxes of 5 ¼ inch floppy disks buried within the Marks and Gran archive in the strongrooms of the Borthwick Institute. He had been performing an audit of audio visual material in our care and came across these in an unlisted archive.

This was exciting to me as I had not worked with this kind of media before. As a digital archivist I had often encountered 3 ½ inch floppies but not their larger (and floppier) precursors. The story and detective work that follows took us firmly into the realm of ‘digital archaeology’.

Digital archaeology: 
“The process of reclaiming digital information that has been damaged or become unusable due to technological obsolescence of formats and/or media” (definition from Glossaurus)

Marks and Gran were a writing duo who wrote the scripts of many TV sitcoms from the late 1970s onwards. ‘Birds of a Feather’ was the one that I remember watching myself in the 80s and 90s but their credits include many others such as ‘Goodnight Sweetheart’ and ‘The New Statesman’. Their archive had been entrusted to us and was awaiting cataloguing.

Clues on the labels

There were some clues on the physical disks themselves about what they might contain. All the disks were labelled and many of the labels referred to the TV shows they had written, sometimes with other information such as an episode or series number. Some disks had dates written on them (1986 and 1987). One disk was intriguingly labelled 'IDEAS'. WordStar was also mentioned on several labels, for example 'WordStar 2000' and 'Copy of Master WordStar disk'. WordStar was a popular and quite pioneering word processing package of the early 80s.

However, clues on labels must always be taken with a pinch of salt. I remember being guilty of not keeping the labels on floppy disks up to date, of replacing or deleting files and not recording the fact. The information on these labels will be stored in some form in the digital archive but the files may have a different story to tell.

Reading the disks

The first challenge was to see if the data could be rescued from the obsolete media it was on. A fortuitous set of circumstances led me to a nice chap in IT who is something of an enthusiast in this area. I was really pleased to learn that he had a working 5 ¼ inch drive on one of his old PCs at home. He very kindly agreed to have a go at copying the data off the disks for me and did so with very few problems. Data from 18 of the 19 disks was recovered. The only disk that was no longer readable appeared from its label to be a backup disk of some accounting software - this is a level of loss I am happy to accept.

Looking at file names

Looking at the files that were recovered is like stepping back in time. Many of us remember the days when file names looked like this - capital letters, very short, rather cryptic, missing file extensions. WordStar, like many of the software packages from this era, was not a stickler for enforcing use of file extensions! File extensions were also not always used correctly to define the file type but were sometimes used to hold additional information about the file.

Looking back at files from 30 years ago really does present a challenge. Modern operating systems allow long and descriptive file names to be created. When used well, file names often provide an invaluable source of metadata about the file. 30 years ago computer users had only 8 characters at their disposal (plus an optional extension of up to 3 characters). Creating file names that were both unique and descriptive was difficult. The file names in this collection do not always give many clues as to what is contained within them.

Missing clues

For a digital archivist, the file extension of a file is a really valuable piece of information. It gives us an idea of what software the file might have been created in and from this we can start to look for a suitable tool that we could use to work with these files. In a similar way, a lack of file extension confuses modern operating systems. Double click on a file and Windows 7 has no idea what it is and what software to fire up to open it. File characterisation tools such as Droid, used on a day-to-day basis by digital archivists, also rely heavily on file extensions to help identify the file type. Running Droid on this collection (not surprisingly) produced lots of blank rows and inaccurate results*.
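A quick way to see how widespread the problem is would be to tally the files by extension. The following is a minimal Python sketch rather than the method actually used here; the function name and the sample file names are hypothetical and are not taken from the real disk listing.

```python
from collections import Counter
from pathlib import PurePath


def tally_extensions(file_names):
    """Count file names by upper-cased extension.

    Names with no extension are counted under the empty string ''.
    """
    counts = Counter()
    for name in file_names:
        counts[PurePath(name).suffix.upper()] += 1
    return counts


# Hypothetical names in the style of the disks, NOT the real listing.
sample = ["PHONE", "IDEAS", "SCRIPT1.DOC", "CAST.BAK", "EP3.2"]
counts = tally_extensions(sample)

for extension, total in sorted(counts.items()):
    print(extension or "(none)", total)
```

An entry such as '.2' in the tally would be a hint that the extension is carrying extra information (a draft or episode number, say) rather than describing the file type.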

Another observation on initial inspection of this set of files is that the creation dates associated with them are very misleading. It is really useful to know the creation date of a file and this is the sort of information that digital archivists put some effort into recording as accurately as they can. The creation dates on this set of files were rather strange. The vast majority of files appeared to have been created on 1st January 1980 but there were a handful of files with creation dates between 1984 and 1987. It does seem unlikely that Marks and Gran produced the main body of their work on a bank holiday in 1980, so it would seem that this date is not very accurate. My contact in IT pointed out that on old DOS computers it was up to the user to enter the correct date each time they used the PC. If no date was entered the PC defaulted to 1/1/1980. Not a great system clearly and we should be thankful that technology has moved on in this regard!
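Given this, any timestamp that falls on 1 January 1980 is probably better recorded as 'unknown' than as a real date. A minimal Python sketch of that check follows; the constant and function names are my own invention, and it assumes the timestamps have already been read into datetime values.

```python
from datetime import date, datetime

# The date an old DOS PC fell back to when the user did not set the clock.
DOS_DEFAULT_DATE = date(1980, 1, 1)


def date_is_plausible(timestamp):
    """Return False when the timestamp falls on the DOS default date."""
    return timestamp.date() != DOS_DEFAULT_DATE


# Hypothetical timestamps for illustration only.
examples = [datetime(1980, 1, 1, 0, 5), datetime(1986, 3, 14, 10, 30)]
for stamp in examples:
    status = "keep" if date_is_plausible(stamp) else "treat as unknown"
    print(stamp.isoformat(), status)
```

A date that passes this check is not guaranteed to be right, of course; it only means the clock had been set to something.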

So, we are missing important metadata that will help us understand the files, but all is not lost. The next step is to see whether we can read and make sense of them with our modern software.

Reading the files

I have previously blogged about one of my favourite little software programs, Quick View Plus – a useful addition to any digital archivist’s toolkit. True to form, Quick View Plus quickly identified the majority of these files as WordStar version 4.0 and was able to display them as well-formatted text documents. The vast majority of files appear to be sections of scripts and cast lists for various sitcoms from the 1980s but there are other documents of a more administrative nature such as PHONE, which looks to be a list of speed dial shortcuts to individuals and organisations that Marks and Gran were working with (including a number of famous names).
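A plain text editor would likely be less forgiving. As far as I am aware, WordStar document files of this period used the eighth bit of some bytes for formatting, so the last letter of many words can show up as a strange character. The sketch below is only an illustration under that assumption, and is not how Quick View Plus works: it clears the top bit of every byte to recover readable text. The function name and the sample bytes are hypothetical.

```python
def strip_high_bit(data: bytes) -> str:
    """Clear bit 7 of every byte and decode the result as ASCII.

    Assumption: the high bit carries formatting only, so masking it
    off leaves the underlying 7-bit character.
    """
    return bytes(b & 0x7F for b in data).decode("ascii")


# Made-up bytes imitating high-bit-marked text, not a real file.
raw = b"Birds\xa0of\xa0a Feathe\xf2"
text = strip_high_bit(raw)
print(text)
```

This is lossy, so it should only ever be run on a copy of a file, and any control codes embedded in the text (for bold, underline and so on) are left in place.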

Unanswered questions

I have not finished investigating this collection yet and still have many questions in my head:

  • How do these digital files relate to what we hold in our physical archive? The majority of the Marks and Gran archive is in paper form. Do we also have this same data as printouts?
  • Do the digital versions of these files give us useful information that the physical versions do not (or vice versa)?
  • Many of the scripts are split into a number of separate files, some are just small snippets of dialogue. How do all of these relate to each other? 
  • What can these files tell us about the creative process and about early word processing practices?

I am hoping that when I have a chance to investigate further I will come up with some answers.

* I will be providing some sample files to the Droid techies at The National Archives to see if they can tackle this issue.


  1. Hi Jenny,
    this is really interesting. I guess working through these 18 (19) data carriers is really time-consuming. From a historical point of view these insights are just too exciting. I did not even know that the default date on old DOS machines used to be the 1st of January 1980.
    Is there any news on this topic?

  2. Thanks Yvonne - I am glad it is not just me that finds this sort of thing exciting! No more news yet as I've been busy with other things but I do hope to be able to blog about this again before the summer.