Wednesday, 30 March 2016

Post 78: Research Data Management Plans -- the Future of Data Curation

About a month ago, Wednesday, 24 February, found me sitting in the School of Environmental Studies' Dry Lab, a large meeting room in the David Turpin Building on campus. It was a sunny afternoon, and I was waiting with anticipation for my colleagues and the presenters from the on-campus library who were going to tell us about research data management plans.

The presentation seemed timely to me: for my own thesis project I had asked participants to allow me to keep their interview data (the recordings and the transcripts) for 10 years after the interviews took place (until 2024), and now that I'm close to finishing my thesis, I'm starting to think about where to keep them once I'm done. Some of my participants opted for full confidentiality to protect their identities, and I take that quite seriously. Once I'm out of academia, how do I best keep, store, and move these protected data?

Chaenomeles species, or quince flowers! These ones are quite orange; others are a deeper pink in colour.
Research data management plans (RDMPs) are supposed to help with those questions and, as I learned, a whole lot more. Our two wonderful presenters, who delivered a phenomenal talk on research data management plans, were Daniel Brendle-Moczuk, the subject librarian for the School of Environmental Studies (and other departments), from the Library References Services, and Kathleen Matthews, from Library Collections Management. Daniel delivered the presentation with input from Kathleen, and both answered questions at the end. The presentation covered the what, why, and why now of RDMPs, the research data life cycle, and the components of a data management plan, and I'll give a brief overview of it here. (In case I don't make it sufficiently clear in the post, all of the material I share is from the presentation.)

As Brendle-Moczuk and Matthews characterized it, Canada is quite behind in its thinking about RDMPs. This became quite apparent throughout the presentation, as Daniel's reference and source documents were all either American or from the UK. For example, "Data Management for Researchers" is written by Kristin Briney, a US-based data management specialist; Mark Allen and Dalton Cervo, authors of "Multi-Domain Master Data Management", are also US-based; and "Data Protection and the Cloud: Are the Risks too Great?" is by UK author Paul Ticher. Brendle-Moczuk also frequently cited documents and resources from the UK Data Service and from DataONE, the Data Observation Network for Earth (US-based).

Backyard Camellia flowers that have fallen. They're beautiful splashes of colour, but they don't last long!

All of this is to say that RDMPs are coming to Canada, and they're overdue. Research data management plans make you think through all aspects of your data: how to gather them, what kinds you want to gather (photos, audio, surveys, questionnaires, interviews, numerical data, geo-spatial data?), how you are going to name your files and organize them (a filename of "Sam's data" is probably only useful to Sam... unless detailed metadata^^^ exist to tell us how to use and interact with the data in the file), what the best formats for them are (remember floppy disks? Or will [proprietary software of your choice] be available in 10 years?), how you are going to treat confidential data and transport them, and how you are going to store and share them. Other questions are also important to consider: Who has access to my data? How are they protected, if they need to be?*** What is the researcher's responsibility over the lifetime of the data? What are the ethical and legal considerations when working with data? RDMPs make you think about all of these things, and it's very likely that your RDMP is going to need to accompany your grant applications in the future.

***Some of these questions you need to think about when you (if you) complete a Research Ethics Application, which I will cover in an upcoming post.

^^^Metadata are the notes and data that tell you about a different set of data. For example, my thesis project is a qualitative research project that collected interviews as the primary data. I also had academic and grey literature that informed my research. The metadata would be a set of notes that are key to explaining my data: where the interviews are kept, how they are kept, which files they are located in, what the different acronyms mean, where my literature is kept, how that's organized, which program was used to analyze my data, etc. The metadata would be necessary for someone else to understand my data and how they all fit together for my project. Are there organizational responsibilities when it comes to metadata? You bet! Be neat. Be organized. And be consistent with your file names and organizing. And if you ever have the thought "I'll remember this later..." WRITE IT DOWN! I guarantee you, you won't remember.
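For anyone who likes to see this kind of thing spelled out, here is a minimal Python sketch of consistent file naming plus a small metadata record. It is only an illustration and was not part of the presentation: the function name `interview_filename`, the naming pattern, the participant code `P07`, the date, and the fields in the record are all hypothetical choices of mine.

```python
import json
from datetime import date


def interview_filename(recorded_on: date, participant_code: str, kind: str, ext: str) -> str:
    """Build a consistent, sortable file name from a date, a code, and a file type."""
    return f"{recorded_on.isoformat()}_{participant_code}_{kind}.{ext}"


# Hypothetical example values; a participant code stands in for a real name.
name = interview_filename(date(2014, 5, 12), "P07", "transcript", "txt")

# A small metadata record describing the file, meant to be saved next to the data.
record = {
    "file": name,
    "description": "Interview transcript, identifying details removed",
    "confidential": True,
    "retain_until": "2024",
    "acronyms": {"RDMP": "research data management plan"},
}

print(name)  # 2014-05-12_P07_transcript.txt
print(json.dumps(record, indent=2))
```

Putting the date first in year-month-day order means the files sort chronologically on their own, and using a code instead of a name keeps confidential participants out of the file names.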

A bouquet of neighbourhood daffodils I picked for my landlady Kathryn, before her trip!
Currently, there are strong indications that RDMPs are going to become required for researchers in Canada. The Tri-Council of Canada issued a draft statement of principles on digital data management in July of last year, outlining researchers' responsibilities when it comes to data management. These responsibilities include the collection, formatting, preservation, and sharing of their data throughout the lifetime of a project and beyond. There's no telling when that statement will stop being a draft and become a requirement for researchers, but it is highly relevant to both the hard sciences and the social sciences, so I think it's wise to begin thinking through research data management now.

For a very cute video that added some humour to the presentation, check out "Data Sharing and Management Snafu in 3 Short Acts" (I'm so glad that Daniel found this!).

Gorgeous Lower Thetis Lake reflection. Where does the water start and stop?
A quick and flashy reminder: always make backups of your data. Always. Follow the 3-2-1 Rule:
3 copies, on 2 different media, with 1 backup located offsite.
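If it helps to see the rule as a checklist, here is a minimal Python sketch that tests whether a list of backups meets it. This is my own illustration under stated assumptions, not something from the presentation: the `BackupCopy` class, the `satisfies_3_2_1` function, and the example copies are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    label: str     # hypothetical description of the copy
    medium: str    # e.g. "laptop disk", "external drive"
    offsite: bool  # True if kept somewhere other than where you work


def satisfies_3_2_1(copies: list[BackupCopy]) -> bool:
    """Return True for at least 3 copies, on at least 2 media, with at least 1 offsite."""
    media = {copy.medium for copy in copies}
    has_offsite = any(copy.offsite for copy in copies)
    return len(copies) >= 3 and len(media) >= 2 and has_offsite


# Hypothetical example: three copies, two kinds of media, one kept offsite.
copies = [
    BackupCopy("working copy", "laptop disk", offsite=False),
    BackupCopy("backup at home", "external drive", offsite=False),
    BackupCopy("encrypted backup at the office", "external drive", offsite=True),
]

print(satisfies_3_2_1(copies))      # True
print(satisfies_3_2_1(copies[:2]))  # False: only two copies, none offsite
```

Note that the working copy counts as one of the three, so the rule really means your original plus two backups.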

And one last note: researchers and organizations are increasingly being asked to prove their research impact, which Brendle-Moczuk and Matthews said is most easily done with a DOI, or digital object identifier. A DOI is a persistent link that follows the data and can be used when citing them. In Canada, the National Science Library provides a service through DataCite Canada that is: "A data registration service that provides Canadian data centres and libraries with a mechanism for registering research data and assigning digital object identifiers (DOIs) to them. These identifiers allow research data to be findable, citable and accessible for replication and further use." So make sure to check this out if it may be applicable to you.

Good luck, fellow jedi researchers! I'm sure that you will plan and manage your data well after this post, or at least avoid making some very painful errors with your data. Hugs! 
