Libraries: Anchoring the future of software preservation
Libraries advance teaching, research, and learning by providing resources, enabling discovery, and offering expert guidance. As software source code becomes increasingly central to contemporary scholarship, libraries must support researchers who work with it. In this series of interviews, professionals share their approach to research software.
Imagine the Tower of Babel, but instead of chaos and confusion, it stands as a symbol of collaboration and understanding. This seemingly paradoxical image perfectly encapsulates the mission of the Ligue des Bibliothèques Européennes de Recherche (Liber).
The association brings together over 400 national, specialized, and university libraries from 40 countries. In July 2022, Julien Roche became president of Liber, after four years as vice-president. A library curator, Roche is the first French professional to hold this position. He serves as Director of the University Libraries and Learning Center, and Administrator of Research Data, Algorithms, and Codes (ADAC) at the University of Lille. Roche provides insights into Liber’s key initiatives and the evolutions that librarians are experiencing today.
Key takeaways:
“Source code and software are collections like any other, which need to be preserved, made accessible, and shared over the long term with ‘readers’ the world over. A real librarian’s job!”
“If new and emerging initiatives need singular skills and motivated pioneers, libraries inscribe their action on another, more perennial time scale, thus going beyond the question of individuals to anchor practice in an institutional operation.”
“Librarians are the professionals best placed to take up the subject of codes and software from the angle of reporting, dissemination, preservation and curation.”
What are the priorities of your mandate at Liber?
First, Liber’s scope of action isn’t limited to research libraries: it extends to public libraries with a research dimension, and national libraries. Liber members come from the countries of the Council of Europe. France and Germany provide the largest contingents of members.
A particular challenge of my term of office is to consider the plurality of countries represented within Liber, and hence their diversity: not all countries benefit from the same resources, nor are they at the same stage of progress in terms of libraries, documentation or open science. My role is to ensure that all members can find their place in a European approach that is, if not integrated, at least articulated. This is reflected in the composition of Liber’s Board of Directors, which brings together large and small institutions. One of the challenges of this term of office is also to “digest” our 2023-2027 strategy, which is based on three main pillars: engaged and trusted hubs, state-of-the-art services, and advancing open science. Beyond this strategy, Liber has greatly expanded its portfolio of activities in recent years, notably through the programming of events: some fifteen years ago, the annual congress brought together around 200 people. Today, this figure has more than doubled. Liber also offers master classes, a mid-year event (winter event), two-year seminars for managers aspiring to become library directors, “days” for directors, and webinars in collaboration with various players such as LA Referencia and UNESCO.
Finally, Liber’s reputation is strong and leads us to regularly work with other players in Europe, for example through initiatives such as the EOSC (European Open Science Cloud) or the ORE (Open Research Europe) platform, and beyond.
What are the strengths of libraries in advancing the role of software in the academic ecosystem?
In France, it’s undoubtedly open science, and in particular the second National Plan for Open Science (PNSO2), which brought software into the mainstream, expanding its reach beyond the traditional communities of computer scientists and other technical disciplines. This is how the subject was spotted by the libraries, which are still relatively unaware of it, unlike research data, which is currently the focus of much of our attention.
The advantages of libraries are the same as for other emerging topics in research support and open science. First and foremost, libraries take a long-term view of open science issues, “institutionalizing” them. While new and emerging initiatives often rely on singular skills and motivated pioneers, libraries operate on a more enduring timescale. They anchor practices within institutional operations, transcending individual efforts. And this is what codes and software need today: after a mainly militant phase, expansion and sustainability require the sustained mobilization of key players in open science, including libraries.
What’s more, software and data have a certain kinship. Indeed, non-specialists have long confused software with data. And yet, libraries are positioned as key data support services. Information professionals are involved at many levels: some are administrators of data, algorithms and codes, others coordinate data workshops, but all higher education and research documentation structures host research support services. Finally, and perhaps most importantly, librarians have a strong culture in the creation and curation of metadata and identifiers. And these are precisely the issues currently being addressed in the field of codes and software. Librarians are the professionals best placed to tackle the subject of codes and software from the angle of reporting, dissemination, preservation and curation. It’s from this angle that we can now interest the entire library community in code. In this respect, Software Heritage’s work with the CCSD and HAL is very welcome, as it’s the librarians who are the driving force behind HAL today.
Given their traditional role, librarians often face questions about their relevance in the context of software services for researchers. How can they effectively position themselves in this field?
Legitimacy is an afterthought; it cannot be decreed. You have to demonstrate your usefulness. In the early days of open access, libraries were not seen as pivotal services, whereas today they are. This legitimacy is currently being built up in the field of data. For example, the “Ateliers de la donnée” integrate or rely heavily on documentary services. Librarians are also in a good position to support researchers in adopting the multidisciplinary data repository Recherche Data Gouv.
Software and code are more recent concerns for research decision-makers. In France, we had to wait for PNSO2, and the subject remains largely unexplored. Software is not yet fully taken on board at the institutional level. In Lille, the digital master plan and the work on identifiers are currently being written. Work has also been carried out to establish the principles of governance for data, algorithms and research codes, which are currently being validated. In addition to my investment as ADAC in steering this work, libraries are very much involved in the actions undertaken. One of the axes of the plan is dedicated to source codes and software. The subject is therefore starting to be identified at the institutional level, and the players involved are starting to take it on board.
In the software sector, libraries are not necessarily destined to systematically become the departments responsible for these issues, but even in this configuration, they have a major role to play given their ability to institutionalize the subjects they take up. At the University of Lille, for example, an ADAC operational unit has been set up, drawing heavily on the skills of the libraries and our LORD Data Workshop. The Lille scheme is certainly not universal, but it’s virtuous and relevant in the sense that it brings together all the players, here around the libraries, who are now seizing the subject, as was the case for data a few years ago.
Raising awareness of software calls for a different approach from that adopted for data: while virtually everyone involved in research is aware of producing data, the same cannot be said of software. At present, specialists such as computer scientists are the most mobilized, but awareness is low if we move away from the most informed circles. Software is still too often treated as a disciplinary issue, a subject limited to IT researchers. Software forges are little known to non-developers, and software engineering tools are not well identified. And yet, one of the strengths of librarians is their ability to understand, and help others to understand, the issues surrounding code and the role of associated tools in other disciplines that use code without knowing it. This is undoubtedly another reason to mobilize libraries on this subject.
How can we change the perceptions of the different actors with whom academic libraries need to work when it comes to software?
It’s up to each university and each autonomous national research organization to determine the optimal organization for its needs. Within universities, there are certainly virgin territories for libraries to occupy. By “virgin,” I mean not taken care of in a reasoned way at the institutional level. This is the case, I think, when it comes to algorithms, codes, and software. We need to be proactive in raising awareness among decision-makers. It’s a question of identifying unmet needs and demonstrating how the library, if it decides to take them on, can contribute to meeting them.
Library services benefit from substantial budgetary and human resources, enabling them to develop their service offering in line with staff movements, funding opportunities, and changes in the training offered to professionals. In the fast-changing world of higher education and research, the challenge is to continually reorganize library missions, and therefore librarians’ profiles, to enable us to evolve our activity towards other sectors. In recent years, Europe’s major research libraries have repositioned their resources to focus on data. The same work lies ahead for software, for those libraries willing to take it on. This prioritization of activities is essential if we’re going to develop services sustainably, in a world where the evolution of library missions is inescapable.
Libraries face complex challenges in areas like research reproducibility and AI. How can they maintain clear and effective service offerings given the diversity of technical options?
In just a few decades, libraries have gone from offering services limited to physical collections and spaces to offering services covering a wide range of sectors. Today, the range of services on offer is broad, covering both physical and digital spaces. This is the challenge we face in university libraries: even after the COVID crisis, attendance statistics for university libraries rival those of major cultural institutions, but online services are also increasingly popular, and are absorbing a huge amount of resources: libraries must therefore continue their efforts in terms of both physical and digital services.
When a department’s spectrum of activities is extended, the question of its legitimacy in its new field of intervention always comes up for discussion at some point, and that’s a good thing. Ten years ago, libraries weren’t seen as legitimate when it came to data or education. Today, the question arises about codes and software, or even research management support. For example, librarians are well placed to contribute to and use decision-support tools, given their expertise in data standardization, metadata, and identifier management. But decision-makers need databases that are regularly updated, clean, documented, searchable and interoperable. Presenting bibliometrics from this angle helps to make the librarian’s contribution understandable outside his or her professional circle, and thus to legitimize the role of libraries in this new field of expertise. I’m sure the same will be true of codes and software, for libraries that want to take advantage of them.
Last but not least: Librarians train and self-train extensively and regularly. In a fast-changing professional environment, this is an undeniable asset, which feeds into the regular updating of the service offering, accompanying and sometimes even anticipating the needs of the scientific community, in a relevant, forward-looking approach.
How would you explain the value of Software Heritage to other library managers?
I’d like to link this to a subject at the heart of our identity as librarians: collections. What is a documentary collection? It’s an organized body of content, built up in a reasoned, long-term manner, rooted in heritage, supported by usage, and part of a forward-looking dynamic. Software Heritage is no different with its universal software library. Source codes and software are certainly collections like any others, to be preserved, made accessible, and shared over the long term with “readers” the world over. A real librarian’s job!
Software Heritage, a dedicated software infrastructure
Similar to traditional publications, software is a critical research output. Ensuring its preservation and proper citation is essential, aligning with the core mission of research libraries and archives.
• By drawing on the range of services developed by Software Heritage, you can provide your academic community with services designed specifically for software. Preserving and referencing software source codes, which are executable knowledge, is a complex task. Software Heritage is an infrastructure managed by specialists.
• You can also support Software Heritage financially and thus contribute to the development of a unique infrastructure specifically designed for this mission. Software Heritage is fully integrated into the European open science ecosystem. By joining the Archives & Libraries Interest Group (ALIG), you can benefit from support in extending your library’s field of action.
Up next
Stay tuned for more in our series of interviews with librarians.
• Understand the role of libraries in building the software pillar of the Open Science institutional charter at Sorbonne Paris Nord University.
• Discover how librarians from Grenoble Alpes University collaborate with research software engineers and researchers on software preservation.
• Learn how and why libraries should support open science infrastructures, with Cécile Swiatek Cassafieres from the University of Paris Nanterre.