
Wednesday, 10 February 2016

I'll show you my research data if you show me yours...

My research data
A few months ago I was having a clear out at home and came across a bunch of floppy disks in the drawer of my bedside table.

This is my research data...

Actually, that is not strictly true. I did a taught master's course and my research consisted of just a short dissertation at the end of it. Most of these disks contain files from the taught element of my course and the subsequent dissemination of results.

I published a paper on the findings of my dissertation at the end of the master's.

If you are interested in the placement of Iron Age hillforts in the landscape then this is the book to look for.
No-one has since approached me to ask if they can see the data that underlies this publication...

...but this was the 1990s!

Times are different now. We expect our researchers to be able to produce the data and share it (where appropriate) so that others can build on their research. 

I'm now involved in teaching researchers here at York about Research Data Management (RDM) and how they should look after their data for future re-use.

When I created and stored this data I was not a digital archivist. I had no idea I would become a digital archivist. I like to think I would have managed my data differently if I had known more. 

Let's start with documentation. Much of the documentation for this data is what is actually written on the disk labels. I gave myself a little pat on the back for having recorded the contents so thoroughly on *most* of the disks. This was particularly useful in those days: file names were restricted to the 8.3 format (eight characters plus a three-character extension), so very little detail about a file could be incorporated into its name. Documenting things on disk labels helps add a bit of context. All well and good until you notice the disk on the far right with no label at all. That one remains a mystery!

So what are the issues here? First and most obviously, as a student in the 1990s I was using cutting-edge storage technology: the floppy disk! Can we read these today? Yes and no. Floppy disks fall firmly into the category of 'obsolete media', a topic that we digital archivists like to talk about. I found I could read about a quarter of them using the USB floppy reader attached to my PC. For the others I saw a lot of error messages like this:

The answer is "No"!

Fortunately I had more success using an old PC I keep in my office for the very purpose of reading old floppy disks - all but two of the floppies could be read and copied using this PC. On one disk I could view the list of files but couldn't copy all of them off, so I counted that as a partial success. The one disk which I couldn't access at all was, interestingly, the one with no label. Perhaps this mystery disk was never formatted or put into active use.

Not too bad a result so far?

So what about the contents of the disks?

The contents of one of the floppy disks. Windows Explorer identifies the DOC files as Microsoft Word 97-2003 but they are likely to be from an earlier version of Word than this.

As mentioned above, the file and folder naming is noticeably brief (as is the way with media from this period). Today we talk to our researchers about the importance of naming files in such a way that you know what a file is before you double-click on it. This was nigh on impossible when faced with only eight characters. I created this data but have no idea what I might expect to find in a directory called 'DISTEX' (though the label on the disk does help give a clue).
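
Today the equivalent of those disk labels might be a simple manifest file deposited alongside the data. Purely as an illustration (the code is my own sketch, and the directory name is borrowed from the example above), here is a minimal Python script that records each file's path, size and checksum, with space for a human-written description:

```python
import csv
import hashlib
from pathlib import Path

def sha256(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(data_dir: Path, manifest: Path) -> None:
    """Walk a data directory and record each file's path, size and checksum."""
    with manifest.open("w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["path", "size_bytes", "sha256", "description"])
        for f in sorted(data_dir.rglob("*")):
            if f.is_file():
                # The description column is left for a human to fill in -
                # the modern equivalent of writing on the disk label.
                writer.writerow([f.relative_to(data_dir), f.stat().st_size, sha256(f), ""])

write_manifest(Path("DISTEX"), Path("manifest.csv"))  # directory name purely illustrative
```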

Note too the lack of organisation of the contents. At the end of my master's degree, whilst finishing off my papers and publications, I was also clearly focusing on what my next steps would be. Personal data (my CV for job applications) is stored alongside data relating to my research*. This again is something we discourage when we talk to researchers about data management. It is much easier when working with filestore to organise and categorise data effectively, keeping personal data separate from research data. We have come a long way since the days when we squashed any files that would fit onto a floppy disk regardless of content or context.

Here is some data on another of the disks (viewed in Windows Explorer as tiles). I have no idea what possessed me to store scanned photographs as GIF images. They look terrible! Did they always look this bad? Choosing the right file format is something we also cover in our RDM training and though file size is still a consideration for today's research students, at least they don't have to try and fit numerous images for one presentation on a single floppy disk.
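
On the plus side, migrating images like these to a more suitable format is now trivial. Here is a minimal sketch, assuming the Pillow imaging library and with purely illustrative file paths, of batch-converting legacy GIFs to TIFF preservation copies (note that no conversion can bring back the detail lost when the scans were first reduced to GIF):

```python
from pathlib import Path

from PIL import Image  # pip install Pillow

def migrate_gif_to_tiff(src: Path, dest_dir: Path) -> Path:
    """Save a TIFF preservation copy of a legacy GIF.

    Migration preserves what is there - it cannot restore detail lost
    when a scan was first reduced to GIF's 256-colour palette.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / (src.stem + ".tif")
    with Image.open(src) as im:
        im.convert("RGB").save(dest, format="TIFF")
    return dest

for gif in Path("scans").glob("*.gif"):  # path and pattern purely illustrative
    migrate_gif_to_tiff(gif, Path("preservation_masters"))
```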

More coded file names - a necessity when you had so few characters available. I still remember what these mean but very much doubt anyone else would.


Some of my files are fairly easy to read, others less so (more detective work is required to find the right software). The Word documents are OK but come up in 'Protected View' (which means I'm not allowed to edit them). The default settings here are to treat a Word 6 or Word 95 document with suspicion, but this can easily be resolved by editing those settings.

These old MS Word docs are still readable (and editable if I change the policy settings)

So, digging out my old research data has been an interesting diversion. I now use this as an example at the beginning of RDM teaching sessions and ask the students to imagine how their research data might look 20 years from now. 

Another bonus from this exercise is that I now have even more files to play with as I test Archivematica and file identification tools.




*Interesting to note that my first (unsuccessful) attempt to get a job in York was in 1998. I got here five years later!




Friday, 5 February 2016

New "Filling the Digital Preservation Gap" report released

I am pleased to announce that we have just published a new report on the "Filling the Digital Preservation Gap" project.



Filling the Digital Preservation Gap. A Jisc Research Data Spring project. Phase Two report - February 2016 - Jenny Mitcham, Chris Awre, Julie Allinson, Richard Green, Simon Wilson. https://dx.doi.org/10.6084/m9.figshare.2073220.v1



This phase 2 report, funded through Jisc's Research Data Spring initiative, details the work the project team have carried out with Archivematica over the last few months of the project. 

Our phase 2 work had the following aims:
  • Work with Artefactual Systems to develop Archivematica in a number of areas in order to make the system more suitable for fitting into our infrastructures for research data management
  • Develop our own detailed implementation plans for Hull and York to establish how Archivematica will be incorporated into our local infrastructures for research data
  • Consider how Archivematica could work as an above campus installation
  • Continue to spread the word, both nationally and internationally, about the ongoing work of our project

Our work in all of these areas is detailed in full in the report. Please do download it and let us know what you think.

We very much hope that the new features we have sponsored within Archivematica will be of interest to other Archivematica users (both current and future) and that these features will continue to evolve and improve over time.

Tuesday, 5 January 2016

When digital preservation really matters...

Of course digital preservation always matters* but recent events in York and beyond over the festive period really do highlight the importance of looking after your stuff - both physical and digital.

Not everyone is lucky enough to get much warning before a disaster of any type strikes but in some situations (such as that which I found myself in just after Christmas) we have some time to prepare.

Hang on...there isn't normally a lake near my house

Beyond relocating important things such as the hamster and the photo albums upstairs, and moving the Christmas decorations higher up the tree, it is also important to remember the digital...



Digital material is robust in some respects but more at risk in others: robust in that it is possible to very quickly make as many additional copies as you like and store them in different places (perfect for a disaster scenario such as this), but at risk in that it is more easily forgotten.

Of course I back up my personal data (digital photographs mostly) regularly, but with the chaos of the build-up to Christmas I had not done so for a few weeks, so I was prompted to do so before unplugging the PC and moving it to higher ground.

We were some of the lucky ones in York - the water levels didn't reach us so the preparations were not necessary, but others were not so lucky. Many houses and businesses in York and in other areas of the country were flooded, and many did not have the luxury of time to prepare for the worst. The very basics of digital preservation (maintaining a regular backup strategy and storing copies of the data in different locations) really are something that should happen proactively, not just in response to specific threats.
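
For the digital, those very basics need little more than this. Below is a minimal sketch of a backup with fixity verification: copy everything to a second location and confirm each copy by checksum. The paths are hypothetical, and a real strategy would add an off-site copy and a schedule:

```python
import hashlib
import shutil
from pathlib import Path

def sha256(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def backup_and_verify(source: Path, backup: Path) -> None:
    """Copy every file to a second location and confirm each copy by checksum."""
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        dest = backup / src.relative_to(source)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        if sha256(src) != sha256(dest):
            raise RuntimeError(f"Checksum mismatch for {src} - do not trust this copy!")

# e.g. an external drive kept somewhere the flood water can't reach
backup_and_verify(Path("~/photos").expanduser(), Path("/mnt/backup/photos"))
```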



* I have to say that - it is in my job description

Tuesday, 8 December 2015

Addressing digital preservation challenges through Research Data Spring

With the short time scales at play in the Jisc Research Data Spring initiative it is very easy to find yourself so focussed on your own project that you don’t have time to look around and see what everyone else is doing. As phase 2 of Research Data Spring comes to an end we are taking time to reflect, to think about digital preservation for research data management, to look at the other projects and think about how all the different pieces of the puzzle fit together.

Our “Filling the Digital Preservation Gap” project is very specifically about digital preservation and we are focusing primarily on what happens once the researchers have handed over their data to us for long term safekeeping. However, ‘digital preservation’ is not a thing that exists in isolation. It is very much a part of the wider ecosystem for managing data. Different projects within Research Data Spring are working on specific elements of this infrastructure and this blog post will try and unpick who is doing what and how this work contributes to helping the community address the bigger challenges of digital preservation.

The series of podcast interviews that Jisc produced for each project was a great starting point for finding out about the projects, and this has been complemented by some follow-up questions and discussions with project teams. Any errors or misinterpretations are my own. A follow-up session on digital preservation is planned for the next Research Data Spring sandpit later this week, so an update may follow next week in the light of that.

So here is a bit of a synthesis of the projects and how they relate to digital preservation and more specifically the Open Archival Information System (OAIS) reference model. If you are new to OAIS, this DPC technology watch report is a great introduction.

OAIS Functional Model (taken from the DPC Technology Watch report: http://dx.doi.org/10.7207/twr14-02)


So, starting at the left of the diagram, at the point at which researchers (producers) are creating their data and preparing it for submission to a digital archive, the CREAM project (or “Collaboration for Research Enhancement by Active Metadata”) led by the University of Southampton hopes to change the way researchers use metadata. It is looking at how different disciplines capture metadata and how this enhances the data in the long run. They are encouraging dynamic capture of metadata at the point of data creation which is the point at which researchers know most about their data. The project is investigating the use of lab notebooks (not just for scientists) and also looking at templates for metadata to help streamline the research process and enable future reuse of data.

Whilst the key aims of this project do fall within the active data creation phase and thus outside of the OAIS model, they are still fundamental to the success of a digital archive and the value of working in this area is clear. One of the mandatory responsibilities of an OAIS is to ensure the independent utility of the data that it holds. In simple terms this means that the digital archive should ensure that as well as preserving the data itself, it also preserves enough contextual information and documentation to make that data re-usable for its designated community. This sounds simple enough but, speaking from experience as a digital archivist, this is the area that often causes frustration - going back to ask a data producer for documentation after the point of submission, at a time when they have moved on to a new project, can be a less than fruitful exercise. A methodology for encouraging metadata generation at the point of data creation, and for enabling it to be seamlessly submitted to the archive along with the data itself, would be most welcome.
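
To make the idea of metadata capture at the point of creation concrete, here is an illustrative sketch (my own, not CREAM's design) of the kind of minimal template a researcher might complete and deposit alongside a data file; all field names and values are hypothetical:

```python
import json
from datetime import date

# All field names and values are illustrative - a real template would be
# tailored to the discipline and agreed with the archive in advance.
metadata = {
    "title": "Survey of Iron Age hillfort locations",
    "creator": "J. Bloggs",
    "created": date.today().isoformat(),
    "description": "What the data is, how it was collected, any codes used",
    "methodology": "Instruments, software and settings used to generate the data",
    "related_publication": "DOI of the associated paper, if any",
    "licence": "e.g. CC-BY 4.0",
}

# Written alongside the data file itself, so the link between data and
# documentation is never lost on the way to the archive.
with open("survey_data.metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)
```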

Another project that sits cleanly outside of the OAIS model but impacts on it in a similar way is “Artivity” from the University of the Arts London. This is again about capturing metadata but with a slightly different angle. This project is looking at metadata to capture the creative process as an artist or designer creates a piece of digital art. They are looking at tools to capture both the context and the methodology so that in the future we can ask questions such as ‘how were the software tools actually used to create this artwork?’. As above, this project is enabling an institution to fulfil the OAIS responsibility of ensuring the independent utility of the data, but the documentation and metadata it captures is quite specific to the artistic process.

For both of these projects we would need to ensure that this rich metadata and documentation was deposited in the digital archive or repository alongside the data itself in a format that could be re-used in the future. As well as thinking about the longevity of file formats used for research data we clearly also need to think about file formats for documentation and metadata. Of course, when incorporating this data and metadata into a data archive or repository, finding a way of ensuring the link between the data and associated documentation is retained is also a key consideration.

The Clipper project (“Clipper: Enhancing Time-based Media for Research”) from City of Glasgow College provides another way of generating metadata - this time specifically aimed at time-based media (audio and video). Clipper is a simple tool that allows a researcher to cite a specific clip of a digital audio or video file. This solves a real problem in the citation and re-use of time-based media. The project doesn't relate directly to digital preservation but it could interface with the OAIS model at either end. Data produced from Clipper could be deposited in a digital archive (either alongside the audio or video file itself, or referencing a file held elsewhere). This scenario could occur when a researcher needs to reference or highlight a particular section to back up their research. On the other end of the spectrum, Clipper could also be a tool that the OAIS Access system encourages data consumers to utilise, for example, by highlighting it as a way of citing a particular section of a video that they are enabling access to. The good news is that the Clipper team have already been thinking about how metadata from Clipper could be stored for the long term within a digital archive alongside the media itself. The choice of html as the native file format for metadata should ensure that this data can be fairly easily managed into the future.

Still on the edges of the OAIS model (and perhaps sitting most comfortably within the Producer-Archive Interface) is a project called “Streamlining deposit: OJS to Repository Plugin” from City University London, which aims to streamline the submission of papers to journals and of associated datasets to repositories. The team are developing a plugin to send data directly from a journal to a data repository, streamlining the submission process for authors who need to make additional data available alongside their publications. This will ensure that the appropriate data gets deposited and linked to a publication, ultimately enabling access for others.

Along a similar theme is “Giving Researchers Credit for their Data” from the University of Oxford. This project is also looking at more streamlined ways of linking data in repositories with publisher platforms and avoiding retyping of metadata by researchers. They are working on practical prototypes with Ubiquity, Elsevier and Figshare and looking specifically at the communication between the repository platform and publication platform.

Ultimately these two projects are all about giving researchers the tools to make depositing data easier and, in doing so, ensuring that the repository also gets the information it needs to manage the data in the long term. This impacts on digital preservation in two ways. First, easier deposit processes will encourage more data to be deposited in repositories where it can then be preserved. Secondly, data submitted in this way should include better metadata (with a direct link to a related publication), which will make the repository's job of providing access to this data easier and will ultimately encourage re-use.

Other projects explore the stages of the research lifecycle that occur once the active research phase is over, addressing what happens when data is handed over to an archive or repository for longer term storage.

The “DataVault” project at the Universities of Edinburgh and Manchester is primarily addressing the Archival Storage entity of the OAIS model. They are establishing a DataVault - a safe place to store research data arising from research that has been completed. This facility will ensure that data is stored unchanged for an appropriate period of time. Researchers will be encouraged to use it for data that isn't suitable for deposit via the repository but that they wish to keep copies of, enabling them to fulfil funder requirements around retention periods. The DataVault, whilst primarily a storage facility, will also carry out other digital preservation functions: data will be packaged using the BagIt specification, an initial stab at file identification will be carried out using Apache Tika, and fixity checks will be run periodically to monitor the file store and ensure files remain unchanged. The project team have highlighted the fact that file identification is problematic in the sphere of research data because you work with so many data types across disciplines. This is certainly a concern that the “Filling the Digital Preservation Gap” project has shared.
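
For anyone unfamiliar with BagIt, the packaging step can be as simple as the sketch below, which uses the bagit-python library. This is my illustration rather than the DataVault team's actual code, and the directory and bag-info values are hypothetical; the checksum manifests the bag records are what later fixity checks validate against:

```python
import bagit  # pip install bagit

# Package a completed dataset in place: the files are moved into a data/
# subdirectory and checksum manifests are written alongside them.
bag = bagit.make_bag(
    "finished_project_data",  # directory name is hypothetical
    {"Contact-Name": "J. Bloggs", "External-Description": "Project X survey data"},
    checksums=["sha256"],
)

# A later fixity check re-reads every file and compares it to the manifest.
if bag.is_valid():
    print("All files present and unchanged")
else:
    print("Fixity failure - investigate!")
```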

Our own “Filling the Digital Preservation Gap” project focuses on some of the more hidden elements of a digital preservation system. We are not looking at digital preservation software or tools that a researcher will interact with, but, with the help of Archivematica, are looking at, among other things, the OAIS Ingest entity (how we process the data as it arrives in the digital archive) and the Preservation Planning entity (how we monitor preservation risks and react to them). In phase 3 we plan to address OAIS more holistically with our proofs of concept. I won’t go into any further detail here as our project already gets so much air space on this blog!

Another project looking more holistically at OAIS is “A Consortial Approach to Building an Integrated RDM System - Small and Specialist” led by the University for the Creative Arts. This project is looking at the whole technical infrastructure for RDM and in particular at how this infrastructure can be made achievable for small and specialist research institutes with limited resources. In a phase 1 project report by Matthew Addis from Arkivum a full range of workflows is described, covering many of the different elements of an OAIS. To give a few examples, there are workflows around data deposit (Producer-Archive Interface), research data archiving using Arkivum (Archival Storage), access using EPrints (Access), gathering and reporting usage metrics (Data Management) and, last but not least, a workflow for research data preservation using Archivematica which has parallels with some of the work we are doing in “Filling the Digital Preservation Gap”.

“DMAOnline” sits firmly within the Data Management entity of the OAIS, running queries on the functions of the other entities and producing reports. This tool, being created by Lancaster University, will report on the administrative data around research data management systems (including statistics around access, storage and the preservation of that data). Using a tool like this, institutions will be able to monitor their RDM activities at a high level, drill down to see some of the detail, and use this information to monitor the uptake of their RDM services or to assess their level of compliance with funder mandates. From the perspective of the “Filling the Digital Preservation Gap” project we are pleased that the DMAOnline team have agreed to include reporting on statistics from Archivematica in their phase 3 plans. One of the limitations of Archivematica highlighted in the requirements section of our own phase 1 report was the lack of reporting options within the system. A development we have been sponsoring during phase 2 of our project will enable third party systems such as DMAOnline to extract information from Archivematica for reporting purposes.
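
To give a flavour of what exposing data to third party systems might look like, here is a heavily hedged sketch of the sort of call a reporting tool could make. The host and credentials are placeholders, and the endpoint is modelled loosely on the style of the existing Archivematica Storage Service REST API rather than on the new development itself, which was still in progress at the time of writing:

```python
import requests

BASE_URL = "https://archivematica.example.ac.uk:8000"  # hypothetical host
HEADERS = {"Authorization": "ApiKey demo_user:demo_api_key"}  # placeholder credentials

# Ask the Storage Service which packages it holds, then summarise them -
# the kind of figure a dashboard such as DMAOnline might want to report.
response = requests.get(f"{BASE_URL}/api/v2/file/", headers=HEADERS, timeout=30)
response.raise_for_status()
packages = response.json().get("objects", [])
# (a real client would also follow the pagination links in the 'meta' block)

total_bytes = sum(p.get("size", 0) for p in packages)
print(f"{len(packages)} stored packages, {total_bytes / 1e9:.1f} GB in total")
```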

Much focus in RDM activities typically goes into the Access functional entity, which naturally follows on from viewing a summary of activity through DMAOnline. This is one of the more visible parts of the model - the end product if you like of all the work that goes on behind the scenes. A project with a key focus on access is “Software Reuse, Repurposing and Reproducibility” from the University of St Andrews. However, as is the case for many of these projects, it also touches on other areas of the model. At the end of the day, access isn't sustainable without preservation so the project team are also thinking more broadly about these issues.

This project is looking at software that is created through research (the software that much research data actually depends on). What happens to software written by researchers, or created through projects when the person who was maintaining it leaves? How do people who want to reuse the data get hold of the right software? The project team have been looking at how you assign identifiers to software, how you capture software in such a way to make it usable in the future and how you then make that software accessible. Versioning is also a key concern in this area - different versions of software may need to be maintained with their own unique identifiers in order to allow future users of the data to replicate the results of a particular study. Issues around the preservation of and access to software are a bit of a hot topic in the digital preservation world so it is great to see an RDS project looking specifically at this.

The Administration entity of an OAIS coordinates the other high-level functional entities, oversees their operation and serves as a central hub for internal and external interactions. The “Extending OPD to cover RDM” project from the University of Edinburgh could be one of these external interactions. It has put in place a framework for recording what facilities and services your institution has in place for managing research data, covering technical infrastructure, policy and training. It allows an institution to make information about its infrastructure and facilities visible and to compare or benchmark it against others. The level of detail in this profile goes far above and beyond OAIS, but it allows an organisation to report on how it is meeting, for example, the ‘Data repository for longer term access and preservation’ component.

In summary, it has been a useful exercise thinking about the OAIS model and how the different RDS projects in phase 2 fit within this framework. It is good to see how they all impact on and address digital preservation in some way - some by helping get the necessary metadata into the system or enabling a more streamlined deposit process, others by helping to monitor or assess the performance of the systems in place, and some by more directly addressing key entities within the model. The outputs from these projects complement each other - designed to solve different problems and addressing discrete elements of the complex puzzle that is research data management.

Wednesday, 2 December 2015

Research Data Spring - a case study for collaboration

Digital preservation is not a problem that any single institution can realistically solve on its own. Collaboration with others is a great way of working towards sustainable solutions more effectively. This post is a case study of how we have benefited from collaboration whilst working on the "Filling the Digital Preservation Gap" project.

In late 2014 Jisc announced a new collaborative initiative called Research Data Spring. The project model specifically aimed to create innovative partnerships and collaborations between stakeholders at different HE institutions working within the field of Research Data Management. Project teams were asked to work in short sprints of between three and six months and were funded for a maximum of three phases of work. One of the projects lucky enough to be funded as part of this initiative was the “Filling the Digital Preservation Gap” project, a collaboration between the Universities of Hull and York. This was a valuable opportunity for teams at the two universities to work together on a shared problem and come up with a solution that might also benefit others.


The project team from Hull and York
The aim of the project was to address a perceived gap in existing research data management infrastructures around the active preservation of the data. Both Hull and York had existing digital repositories and sufficient storage provision but were lacking systems and workflows for fully addressing preservation. The project aimed to investigate the open source tool Archivematica and establish whether this would be a suitable solution to fill this gap.


As well as the collaboration between Hull and York, further collaborations emerged as the project progressed. 

Artefactual Systems are the organisation who support and develop Archivematica and the project team worked closely with them throughout the project. Having concluded that Archivematica has great potential for helping to preserve research data, the project team highlighted several areas where they felt additional development was required in order to enhance existing functionality. Artefactual Systems were consulted in detail as the project team scoped out priorities for further work. They were able to offer many useful insights about the best way of tackling the problems we described. Their extensive knowledge of the system put them in a good place to look at the issues from various angles to find a solution which would meet our needs as well as the needs of the wider community of users. Artefactual Systems were also able to help us with one of our outreach activities, joining us (virtually) to give a presentation about our work.

The UK Archivematica group was kept informed about the project and invited to help shape the priorities for development (you can read a bit about this in a previous blog post). Experienced and established Archivematica users from the international community were also consulted to discuss the new features and to review how the proposed features would impact on their workflows. Ultimately, none of us wanted to create bespoke developments that were only going to be of use to Hull and York.

Collaboration with another Research Data Spring project, being carried out at Lancaster University, was also necessary to enable a future join-up of these two initiatives. One of the areas highlighted for further work was improved reporting within Archivematica. By sponsoring a development to enable data to be more easily exposed to third party applications, the project team worked closely with the DMAOnline project team at Lancaster to ensure the data would be made available in a manner suitable for their tool to work with.

Another area of work that called for additional collaboration was in the area of file format identification. This is very much an area that the digital preservation community as a whole needs to work together on. For research data in particular, there are many types of file that are not identified by current identification tools and are not present within the Pronom registry of file types. We wanted to get greater representation of research data file formats within Pronom and also enhance Archivematica to enable better workflows for non-identified files (see my previous post for more about file identification workflows). This is why we have also been collaborating with the team at The National Archives who develop new file signatures for Pronom.

The collaborative nature of this project brought several benefits. Despite the short time scales at play (or perhaps because of them) there was a strength in working together on a new and innovative solution to preserve research data.

The universities of Hull and York were similar enough to share the same problem and see the need to fill the digital preservation gap, but different enough to introduce interesting variations in workflows and implementation strategies. This demonstrated that there is often more than one way to implement a solution depending on institutional differences.  

By collaborating and consulting widely, the project hoped to create a better final outcome and produce a set of enhancements and case studies that would benefit a wide community of users.

Friday, 27 November 2015

File identification ...let's talk about the workflows

When receiving any new batch of files to add to the digital archive there are lots of things I want to know about them but "What file formats have we got here?" is often my first question.

Knowing what you've got is of great importance to digital archivists because...
  • It enables you to find the right software to open the file and view the contents (all being well)
  • It can trigger a dialog with your donor or depositor about alternative formats you might wish to receive the data in (...all not being well)
  • It allows you to consider the risks that relate to that format and if appropriate define a migration pathway for preservation and/or access
We've come a long way in the last few years and we now have lots of tools to choose from to identify files. This could be seen as both a blessing and a curse. Each tool has strengths and weaknesses and it is not easy to decide which one to use (or indeed which combination of tools would give the best results) ...and once we've started using a tool, in what way do we actually use it?

So currently I have more questions about workflows - how do we use these tools and at what points do we interact with them or take manual steps?

Where file format identification tools are used in isolation, we can do what we want with the results. Where multiple identifications are given, we may be able to gather further evidence to convince us what the file actually is. Where there is no identification given, we may decide we can assign an identification manually. However, where file identification tools are incorporated into larger digital preservation systems, the workflow will be handled by the system and the digital archivist will only be able to interact in ways that have been configured by the developers.
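
To illustrate what working with the results might look like when a tool is used in isolation, here is a rough sketch using Siegfried via its command line. It assumes the sf binary is installed and that files with no match are reported with a PUID of 'UNKNOWN' (Siegfried's behaviour at the time of writing); the directory name is illustrative:

```python
import json
import subprocess

def identify(path: str) -> dict:
    """Run Siegfried over a directory and return its JSON report."""
    result = subprocess.run(["sf", "-json", path],
                            capture_output=True, text=True, check=True)
    return json.loads(result.stdout)

report = identify("ingest_folder")  # directory name is illustrative
unidentified = [
    f["filename"]
    for f in report["files"]
    if all(m.get("id") == "UNKNOWN" for m in f["matches"])
]

# These are the files needing manual detective work (or a new Pronom signature).
for name in unidentified:
    print("No identification:", name)
```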

As part of our Jisc funded "Filling the Digital Preservation Gap" project, one of the areas of development we are working on is around file identification within Archivematica. This was seen to be a development priority because our project is looking specifically at research data and research data comes in a huge array of file formats, many of which will not currently be recognised by file format identification tools.

The project team...discussing file identification workflows...probably


Here are some of the questions we've been exploring:
  • What should happen if you ingest data that can't be identified? Should you get notification of this? Should you be offered the option to try other file id methods/tools for those non-identified files?
  • Should we allow the curator/digital archivist to override file identifications - e.g. "I know this isn't really format xxxx so I'm going to record this fact" (and record this manual intervention in the metadata)? Can you envisage ever wanting to do this?
  • Where a tool gives more than one possible identification should you be allowed to select which identification you trust or should the metadata just keep a record of all the possible identifications?
  • Where a file is not identified at all, should you have the option to add a manual identification? If there is no Pronom id for a file (because it isn't yet in Pronom) how would you record the identification? Would it simply be a case of writing "MATLAB file" for example? How sustainable is this?
  • How should you share information about file formats and file identifications with the wider digital preservation community? What is the best way to contribute to file format registries such as Pronom?

We've been talking to people but don't necessarily have all the answers just yet. Thanks to everyone who has been feeding into our discussions so far! The key point to make here is that perhaps there isn't really a right answer - our systems need to be configurable enough that different institutions can work in different ways depending on local policies. It seems fairly obvious that this is quite a big nut to crack and it isn't something that we can fully resolve within our current project.

For the time being our Archivematica development work is focusing on allowing the digital curator to see a report of the files that have not been identified, as a prompt to working out how to handle them. This will be an important step towards helping us understand the problem. Watch this space for further information.

Wednesday, 25 November 2015

Sharing the load: Jisc RDM Shared Services events

This is a guest post from Chris Awre, Head of Information Services, Library and Learning Innovation at the University of Hull. Chris has been working with me on the "Filling the Digital Preservation Gap" project.

On 18th/19th November, Jenny and I attended two events held by Jisc at Aston University looking at shared services for research data management. This initiative has come about because many, if not all, institutions have struggled to identify a concrete way forward for managing research data, and there is widespread acknowledgement that some form of shared service provision would be of benefit. To this end, the first day was about refining requirements for this provision, and saw over 70 representatives from across Higher Education feed in their ideas and views. The day took an initial requirements list and refined, extended and clarified it extensively. Jisc has provided its own write-up of the day that usefully describes the process undertaken.



Jenny and I were kindly invited to the event to contribute our experience of analysing digital preservation requirements for research data management. The brief presentation we gave highlighted the importance of digital preservation as part of a full RDM service, stressing how a lack of digital preservation planning has led to data loss over time, and how our consideration of requirements has been based on the long-established principles of the OAIS Reference Model and earlier work at York. Essentially the message was: make sure that any RDM shared service encompasses digital preservation, even if institutions have different policies about what does and does not get pushed through it.

Thankfully, it seems that Jisc has indeed taken this on board as part of the planning process, and the key message was re-iterated on a number of occasions during the day. Digital preservation is also built into the procurement process that Jisc is putting together (of which more below). It was great to have discussions about research data management during the day in which digital preservation was an assumed component. The group was broken up to discuss different elements of the requirements for the latter half of the morning, and by chance I was on the table discussing digital preservation. This table confirmed most of the draft requirements as mandatory, but also split up some of the others and expanded most of them. Context is everything when defining digital preservation workflows, and the challenge was to identify requirements that could work across many different institutions. We await the final list to see how successful we have all been.

The second day was focused on suppliers who may have an interest in bidding for the tender that Jisc will be issuing shortly. A range of companies were represented, covering the different areas that could be bid for. What became apparent during Day 1 was the need to provide a suite of shared services, not a single entity. The tender process acknowledges this, and there are eight Lots covering different aspects. These are to be confirmed and will be presented in the tender itself. However, suffice to say that digital preservation is central to two of them: one to provide a shared service platform for digital preservation, and one to provide digital preservation tools that can be used independently by institutions wishing to build them in outside of a platform. This separation offers flexibility in how digital preservation is embedded, and it will be interesting to see what options emerge from the procurement process.

Jenny and I have been invited to sit on the Advisory Group for the development of the RDM shared service(s), so will have an ongoing opportunity to raise digital preservation as a key component of the RDM service. Jisc is also looking for institutions to act as pilots for the service over the next two years. This provides a good opportunity to work with service providers to establish what works locally, and the experiences will serve the wider sector well as we continue to tackle the issues of managing research data.