Research Data Infrastructure Development: The Pros and Cons of User Involvement

FAIR use of data is one of the hot topics on the science policy agenda. To achieve FAIR data, or even publicly accessible data, appropriate research data infrastructures (RDIs) are needed. Various national, transnational and international information infrastructures or research data infrastructures are currently being conceptualized, implemented or further developed. Notable examples are the “National Open Science Cloud” in the Netherlands, the “Open Research Data Infrastructure” in the UK, the “Australian Research Data Cloud”, the German Research Data Infrastructure (NFDI) and the European Open Science Cloud (EOSC).

The mantra of user involvement

Nowadays, the development of such RDIs is accompanied by the claim that they address the needs of research communities and actual users through user involvement. There is a long history and a large body of research and literature on concepts such as user involvement or user participation in the design of IT systems (see references). By the term ‘user involvement’ in its broadest sense, we mean the explicit analysis and integration of both user behavior (in terms of interacting with an IT system) and user attitude or acceptance. We are aware that this understanding is coarse-grained; in this conceptual approach we aim to give the term a more intuitive meaning rather than the strictly operational one that empirical studies regularly require. In this sense, the RfII recommended in 2016 a ‘tight user involvement from the beginning’ as one of the guiding principles. The ultimate goal is to prevent the development of RDIs that do not meet users’ expectations or needs.

Consequently, the RfII co-event “If we build it, they will come” – ways of user involvement in information infrastructure development, taking place during the 11th RDA Plenary Meeting on 23 March, raises some very important questions. Among them is “How can users/research communities organize to shape the service portfolios they need?”. The event also aims to analyse the potentials and obstacles from three different perspectives:

  • Policy actors and funders
  • Infrastructure providers, in particular software developers, application operators, and system administrators, but also staff responsible for organizing and maintaining data management plans (DMPs)
  • User and research communities

User involvement – silver bullet for infrastructure development?

Although there is also some scepticism (Barki / Hartwig 1991), science policy assigns user involvement an important role in infrastructure development. However, how to involve users remains quite unspecified, informal and accidental. In general, users from research communities are integrated into such RDI projects through some kind of advisory body that is asked to give feedback. A closer look at such advisory bodies shows that these potential users are biased, as many of them already have a strong relation to information infrastructure development. This leaves open the question of whether they really represent the average user of their respective research community. In addition, many RDI projects take interdisciplinary approaches, which makes it even more difficult to represent the average user. Moreover, a quote attributed to Henry Ford reminds us that users cannot foresee every need they might benefit from: “If I had asked people what they wanted, they would have said faster horses”. This matters because some RDI projects claim that using the respective RDI will enable researchers to conduct different research in terms of questions, data, and tools.

Framing the golden mean of user involvement

The current situation might lead to an inefficient allocation of resources. If RDIs have to concentrate heavily on user involvement and thus invest resources in (difficult) user activation, these resources cannot be used for other purposes such as improving the services. To achieve an optimal balance between infrastructure development and user involvement, RDIs will need to consider carefully how difficult it will be to engage potential users in the development of the respective services, and which level of integration is both suitable and feasible. Based on our experience, we believe that the level of user involvement depends on the stage of the data lifecycle. Therefore, we suggest specifying user involvement along the data lifecycle, which reflects the different stages and use cases better than an overall approach.

For example, we assume that users are much more involved in services that help them analyse or publish their data than in services for describing or preserving it.

User involvement and data lifecycle from low (blue color) to high (red color)


ZBW (2018), based on German Federation for Biological Data (2018). GFBio Training Materials: Data Life Cycle Fact-Sheet: Data Life Cycle: Publish. Retrieved 14 Mar 2018.

Subsequently, it is possible to match this approach with different methods for involving users, such as jointly developing services, formulating use cases together or simply letting users answer questionnaires. The higher the involvement of users in the respective service, the more likely it is that RDIs can activate them. The lower the involvement, the more carefully the RDI provider should consider moderate or alternative ways of user integration.

Methods for appropriate user involvement

The following enumeration suggests a tentative portfolio of methods suitable for managing user involvement at some of the stages of the data lifecycle. It is explicitly intended to be completed or adapted by other readers or users:

  • Describe: Tools for metadata extraction and data dictionary generation (e.g., for database documentation) to be applied by supporting staff (e.g., data librarians)
  • Discover: User story, search portal and web interface for data discovery, interactive online feedback (“Did our suggestions meet your expectations?”), (rapid) prototyping with participating and evaluating users, quality assurance
  • Integrate: Questionnaire or feedback on experiences with handling the data, general reusability

Such an approach could help providers of RDIs to identify the adequate level and method of user integration for a particular stage of the research data lifecycle. It could also help to stimulate the further elaboration of DMPs and to explicitly address and integrate user groups when their feedback is needed.
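The tentative portfolio can also be written down as a small lookup table. The following Python sketch is illustrative only: the names `PORTFOLIO`, `involvement_level` and `suggest_methods` are hypothetical, the method lists are transcribed from the enumeration above, and the high/low grouping encodes only our assumption that analysing and publishing attract more user involvement than describing and preserving. Stages for which nothing is stated are deliberately left unspecified.

```python
# Methods per data lifecycle stage, transcribed from the enumeration above.
# The portfolio is tentative and meant to be extended by readers.
PORTFOLIO = {
    "describe": [
        "metadata extraction tools",
        "data dictionary generation",
    ],
    "discover": [
        "user story",
        "search portal and web interface for data discovery",
        "interactive online feedback",
        "(rapid) prototyping with participating and evaluating users",
        "quality assurance",
    ],
    "integrate": [
        "questionnaire or feedback on experiences with handling the data",
    ],
}

# Assumption from the text: involvement is higher for analysing and
# publishing than for describing and preserving. No numeric scale is implied.
HIGH_INVOLVEMENT = {"analyse", "publish"}
LOW_INVOLVEMENT = {"describe", "preserve"}


def involvement_level(stage: str) -> str:
    """Return 'high', 'low' or 'unspecified' for a lifecycle stage."""
    key = stage.strip().lower()
    if key in HIGH_INVOLVEMENT:
        return "high"
    if key in LOW_INVOLVEMENT:
        return "low"
    return "unspecified"


def suggest_methods(stage: str) -> list[str]:
    """Return the methods listed for a stage, or an empty list if none are listed."""
    return list(PORTFOLIO.get(stage.strip().lower(), []))
```

For example, `suggest_methods("Discover")` returns the five methods listed for data discovery, while `involvement_level("Preserve")` returns `"low"` and `suggest_methods("Preserve")` returns an empty list, since no methods are suggested for that stage yet.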

References:

The ZBW – Leibniz Information Centre for Economics is the world’s largest research infrastructure for economic literature, online as well as offline.