On Wednesday 15 February, we ran our first workshop/focus group with PhD students from the Doctoral Training Centre for Sustainable Chemical Technologies. This is the first of a series of posts summarising the outcomes of that event.
We had three aims for this session:
- To introduce the participants to data management planning and have them start writing their own data management plan (DMP);
- To better understand their current knowledge so that we can plan future training activities;
- To get feedback on what DMP template would be appropriate for PGR students.
We ran the session with 10 students in the 2010 cohort, who all started in October 2010 and are currently in the first year of their PhD proper, having completed an MRes in 2011. We also invited the 2009 cohort (in their second PhD year), of whom 3 volunteered.
The session consisted of an introductory presentation, given by Professor Matthew Davidson, followed by a hands-on session during which the students worked through a DMP template with support from Cathy Pink and me. Our colleagues Kara Jones and Katy Jordan from the library were also present, and made notes on what was discussed.
Data management definitions
Early on in the session, we split the students up into groups of 2–3 and asked them to discuss what they understood by a handful of common data management terms. Here's what they came up with:
- There was general consensus (as you might expect from a single-discipline group) that data is information gathered directly by experiment, survey, etc. for the purposes of research. With more thought, it became clear that ‘data’ isn’t a hard-edged concept: processing data can produce new data, metadata is also data, and so perhaps are the samples from which experimental data were derived.
- Metadata was described as the data behind the data you want to use: information that gives context and background details. It was noted that this is distinct from the data itself. Chemistry is relatively rare in having a strong history of using metadata, in the context of depositing crystallographic data.
- Secure storage
  - The students immediately identified the two sides of security: guaranteeing both that data is (and remains) accessible to those who create and use it, and that it cannot be accessed without permission. It was generally agreed that your required level of security depends on how sensitive your data is.
  - The most important aspect was seen as ensuring access for the researchers who created the data. Raw data was perceived as not being of much interest to third parties, but a need to better preserve and share experimental protocols was identified.
- Intellectual property
  - It was generally accepted that, for PGR students, the university owns their data and the intellectual property therein. We're hoping to clarify this with our legal team soon, as Bath is unusual in leaving ownership of "scholarly outputs" with the originators, and it would be useful to know whether we now define data as a scholarly output. Good data management practice was identified as one way to create a 'paper trail' to prove ownership of ideas in the event of a patent dispute.
Katy Jordan made an interesting comment in her notes:
"Listening in, it struck me quite forcibly that this session needs academics from the relevant department(s) to lead it. A good level of familiarity with the field, its processes, the department itself, and the way research is carried out, is required to make the session meaningful for the students."
It's occurred to me (and others) before that although the core skills of research data management are mostly discipline-independent, there is a strong need to provide "discipline-flavoured" training sessions, with relevant examples and expertise to ensure that the participants can relate to the content.
We'll be following up soon with more posts on the later part of the session, particularly a discussion of the DMP templates the students tried. Watch this space!