November 13, 2024

Software Heritage and Zenodo integrate to safeguard research

Forget the image of dusty libraries and yellowing parchments. In today’s digital landscape, the effort to preserve knowledge takes place in server farms and code repositories. Now two digital archives, Zenodo and Software Heritage, have launched an integration aimed at safeguarding our shared scientific software legacy.

Funded by the EU’s FAIRCORE4EOSC project, these organizations have joined forces to create a seamless pipeline for researchers. 

Here’s how it works: Code deposited in Zenodo is automatically archived in Software Heritage, the world’s largest software source code archive. Researchers get a Digital Object Identifier (DOI) for easy citation, while Software Heritage computes a Software Hash Identifier (SWHID), a true code fingerprint that pins down the exact version used or mentioned, which is what reproducibility requires. All of this happens behind the scenes: researchers simply deposit code in Zenodo, and the rest is handled automatically.
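To make the "fingerprint" idea concrete, here is a minimal Python sketch of how a SWHID for a single file's content can be derived. It assumes the content-object form of the identifier (`swh:1:cnt:`), which uses the same hash Git computes for a blob. The function name `content_swhid` is purely illustrative, and the article does not say which SWHID object types the Zenodo integration itself displays.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Return the SWHID of a file's content (object type 'cnt').

    Illustrative helper, not part of any Zenodo or Software Heritage API.
    """
    # Content SWHIDs reuse Git's blob hashing: SHA-1 over the header
    # "blob <size in bytes>\0" followed by the raw file content.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"


print(content_swhid(b"hello world\n"))
```

Because the identifier is computed from the bytes themselves, changing a single character in the file yields a different SWHID, which is what lets a citation point to one exact version.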

Zenodo software record archived in Software Heritage, see bottom right corner.

Beyond the basics 

Zenodo’s upload form now offers software-specific fields, making it easier to categorize code. Records can also be exported in the CodeMeta and Citation File Format (CFF) formats, which streamlines citation workflows. Upcoming improvements focus on interoperability, allowing other repositories to join the software preservation movement.
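For readers unfamiliar with CodeMeta, the following Python sketch builds a small CodeMeta-style JSON-LD record. It is an assumption-laden illustration, not Zenodo's actual export: the helper `codemeta_record` is hypothetical, the selection of fields is ours, and the name, repository URL, and DOI are placeholders.

```python
import json


def codemeta_record(name: str, version: str, repository: str, doi: str) -> dict:
    """Build a minimal CodeMeta 2.0 style record (illustrative field selection)."""
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "version": version,
        "codeRepository": repository,
        "identifier": doi,
    }


# All values below are placeholders, not a real deposit.
record = codemeta_record(
    name="example-tool",
    version="1.0.0",
    repository="https://example.org/example-tool.git",
    doi="10.5281/zenodo.0000000",
)
print(json.dumps(record, indent=2))
```

A machine-readable record of this kind is what allows citation tools and other repositories to pick up a software deposit's metadata without manual re-entry.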

The integration between Zenodo and Software Heritage builds on the 2020 recommendations of the EOSC Scholarly Infrastructures for Research Software report, which set out to establish research software as a valuable scholarly output by tackling issues such as archiving, referencing, describing, and crediting software artifacts.

The corresponding software record in the Software Heritage archive.

“Software Heritage is taking over the heavy lifting of proactively harvesting and archiving all software source code with its full development history… It’s important that all scholarly repositories, which may be of varying sizes and addressing different institutional or disciplinary needs, properly interface with Software Heritage and offer researchers the additional functionalities they expect, and that research articles reference the archived version of the software.”

Looking ahead 

Though the core integration is up and running, further backend improvements are planned over the next six months to ensure seamless interoperability. The integration will also be built into InvenioRDM, making it easier for other repositories to join the Software Heritage network of partners.

This is more than just code archiving. It’s a commitment to the future of research. By ensuring the long-term survival of software, Zenodo and Software Heritage hope to equip researchers to stand on the shoulders of giants, in code form.

By O. Von Corven – Tolzmann, Don Heinrich; Alfred Hessel and Reuben Peiss. The Memory of Mankind. New Castle, DE: Oak Knoll Press, 2001, Public Domain https://commons.wikimedia.org/wiki/Category:Library_of_Alexandria#/media/File:Ancientlibraryalex.jpg

More about the partners 

It’s easy to say this integration was bound to happen: The name Zenodo comes from Zenodotus, the first librarian of the Library of Alexandria and considered the father of metadata; Software Heritage is often referred to as the “Library of Alexandria” of software. Interconnecting these major platforms is a crucial step forward for open science in general and source code preservation in particular.

Founded in 2016, Software Heritage is a non-profit organization on a mission to safeguard the very foundation of our digital age – source code. 

Software Heritage currently houses the world’s largest collection of publicly accessible source code, with nearly 18 billion unique source files from more than 282 million software projects as of January 2024.

Created to support European Commission-funded research, Zenodo has evolved into a global platform for sharing and preserving research data, software, and other artifacts. Developed by researchers for researchers, Zenodo aims to democratize open science by providing a barrier-free space for all.