Work session 22 June 2017: FAIR
On 22 June 2017 the Landelijk Coördinatiepunt Research Data Management (LCRDM, the National Coordination Point for Research Data Management), together with the UKB Research Data Work Group (UKBwg RD), organized a work session about FAIR data.
Marcel Ras (NCDD, also a member of the LCRDM Financial Aspects Work Group) discussed sustainable data in the heritage sector. The term 'FAIR' is not used in this field. However, in the heritage sector it is important that data is visible (F, A), usable (I, R), and sustainable (R). The sector itself speaks of sustainable access. The Digital Heritage Network (NDE) drew up a “national strategy for digital heritage” in 2014. This strategy defines the framework for a broad collaboration aimed at ensuring that the Dutch digital collection is visible to end users. Better interconnection should improve the visibility of the collection as a whole and ensure that it remains accessible in the long term. More and more collections in the heritage sector are digitally accessible, but the different infrastructures that make them visible still need to be linked. To flesh out this national strategy, three work programmes have been set up to pursue projects aimed at realizing shared facilities. These work programmes are being carried out by the NDE.
The National Coalition for Digital Sustainability (NCDD) has taken major steps towards a broader, cross-domain collaboration to promote sustainable access. As part of the Digital Heritage Network (NDE), the NCDD is responsible for the Sustainable Digital Heritage work programme, which has drafted a catalogue of sustainable facilities, set up a wiki that lists elements of a sustainability policy, created a digital learning environment, and drawn up a cost model.
Jasmin Böhmer (TU Delft & 4TU Centre for Research Data; replacing Alastair Dunning, member of the LCRDM work group for Research Support and Advice) presented a research project that investigated how different repositories score against the FAIR principles. It turns out that discipline-specific repositories differ: in some disciplines extensive metadata is used, while in other discipline-specific repositories structured metadata is lacking. The researchers also noticed that some principles are formulated very specifically and are therefore easy to evaluate, while others are formulated broadly, which makes a straightforward assessment difficult. The researchers indicated that several quick steps can be taken to make datasets more FAIR, namely assigning a DOI and a usage license and adding structured metadata. Other points for improvement, such as supplementary information about the provenance of datasets and the development of community standards, will take more time.
Marjan Grootveld (DANS, also a member of the LCRDM Research Support Work Group) presented FAIR metrics to help evaluate datasets on the basis of the FAIR principles. With the aid of these metrics, a dataset's level of F, A, and I can be assessed. The combination of these three ratings results in a score for R. The tool was developed to see whether the FAIR principles for data could be combined with the 'Data Seal of Approval' guidelines for repositories: FAIR data in a Trustworthy Data Repository. This seems to be a viable goal because there are clear similarities between the two sets of guidelines.
A pilot version of the FAIRdat tool is currently available as a web form. The extent to which scoring can be automated still needs to be investigated, because concepts like “rich metadata” and “relevant attributes” are subjective and discipline-dependent. The tool can be used to determine the ‘FAIRness’ of a dataset and to display it, for example by marking a dataset in a repository with a FAIR badge.
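The way the three ratings combine into a score for R can be sketched in a few lines of Python. The report does not state the scale or the combination rule, so this sketch assumes ratings from 1 to 5 and takes R as the unweighted average of F, A, and I. The class name `FairRatings`, the function `badge`, and the badge format are hypothetical and are not taken from the FAIRdat tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FairRatings:
    """Hypothetical container for the three assessed ratings.

    Assumption: each rating is an integer from 1 to 5.
    """
    findable: int
    accessible: int
    interoperable: int

    def __post_init__(self) -> None:
        # Reject ratings outside the assumed 1-5 scale.
        for name in ("findable", "accessible", "interoperable"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")

    def reusable(self) -> float:
        # Assumption: R is the unweighted mean of F, A and I.
        return (self.findable + self.accessible + self.interoperable) / 3


def badge(ratings: FairRatings) -> str:
    """Format a hypothetical FAIR badge label for display in a repository."""
    return (
        f"F{ratings.findable} A{ratings.accessible} "
        f"I{ratings.interoperable} R{ratings.reusable():.1f}"
    )


if __name__ == "__main__":
    example = FairRatings(findable=4, accessible=5, interoperable=3)
    print(badge(example))
```

A weighted combination, or a rule in which the lowest rating caps R, would be equally easy to substitute; the average is used here only because it is the simplest reading of "combination".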
Luiz Bonino (DTL, also a member of the LCRDM Research Support Work Group) discussed the global acceptance of the FAIR principles. The principles have rapidly spawned a global movement and are applied by, among others, the NIH in the U.S. The ambitions that gave rise to the European Open Science Cloud (EOSC) have also spurred the EU to take action. The GO FAIR initiative has been set up to support the global adoption of the FAIR principles. Various tools are being developed to make data more FAIR, such as the Data Stewardship Wizard and the FAIRifier. Workshops and hackathons show researchers how they can make their data FAIR.
Luiz emphasized that the FAIR principles do not qualify as standards. Some principles relate to data, others to metadata, and yet others to infrastructure.
'FAIRness' must not be seen as a binary state but as a spectrum: different levels of FAIRness can be described. Experts are currently working on concrete and precise metrics for every FAIR principle.
Tessa Pronk (UU & UKB) described the obstacles one can encounter when re-using research data. Re-use is the reason for sharing research data and is therefore essential to the FAIR principles. Tessa's investigation approached re-use from the perspective of the user. The 3SA model (Share, Search, Select, Appraise, Assess, Adapt) was discussed and different challenges concerning the re-use of data were identified. For example, researchers are sometimes unaware that searching for data can be fruitful ('Search'). A multitude of sites or limited search functionality can make it difficult for researchers to select suitable datasets ('Select'). Deficient metadata can make it hard to determine the relevance of a dataset ('Appraise'). The data itself also needs to be evaluated ('Assess'). Once a relevant dataset has been found, the data may be incomplete or may need to be cleaned ('Adapt'). Using the 3SA model when re-using data can help a researcher remove these obstacles.
The presentations were followed by a discussion about ambitions for a consistent system that allows FAIR access to research data. The discussion took place in small groups, after which everyone reconvened to report back. On the basis of this discussion, a document was drawn up that can serve as input for the coalition assigned to write a policy framework for NPOS ambition 3.2.1.