Measuring the reach of CERN’s code with Software Heritage
Just as the European Organization for Nuclear Research (CERN) explores the fundamental particles of the universe, Software Heritage acts as a vast observatory for open-source software, cataloging its evolution and contributions across the globe. Software Heritage’s mission—supported by UNESCO and aligned with CERN’s dedication to open science—is to collect, preserve, and share all publicly available software for future generations.
The treasure hunt for CERN’s open-source contributions
At the heart of this partnership is the challenge of mapping CERN’s open-source contributions scattered across numerous platforms and repositories. It’s estimated that about 10% of the “official” open-source code is produced by CERN employees. The other 90% comes from visiting researchers who work together, devise solutions, finish projects, and publish papers and code before moving on. The decentralized nature of these contributions makes capturing the full scope of CERN’s impact a complex task.
“This project is a treasure hunt,” says Axel Naumann, Chair of CERN’s Open Source Program Office (OSPO). “We know some of the gems out there, but the question is, how many more are waiting to be discovered? By tracing our contributions through Software Heritage, we hope to gain a clearer picture of CERN’s true impact on the global open-source landscape.”
This is where Software Heritage’s advanced capabilities come into play. By using its comprehensive archive and tools like the Software Heritage Identifier (SWHID), the project aims to:
- Identify CERN-related projects: Unearth software projects that mention CERN or were developed by CERN-affiliated researchers.
- Track software lineage: Analyze how these projects have evolved over time, including forks, derivatives, and related contributions.
- Measure impact: Quantify the influence of CERN’s open-source software on the global community, its adoption in scientific research, and its broader contributions to technology.
CERN’s OSPO chose Software Heritage as a partner for this project in part because of the reach of its archive. With over 50 billion software artifacts secured through the SWHID specification, the archive enables traceability across the entire software ecosystem. Software Heritage offers the tools needed to trace CERN’s contributions, from code written decades ago to the latest innovations.
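To make the idea of a SWHID concrete, here is a minimal sketch of how an identifier for a single file's content can be derived. It relies on the fact that a content SWHID has the form swh:1:cnt: followed by the same SHA-1 digest Git computes for a blob. The function name content_swhid is hypothetical and chosen for illustration; this is not the tooling used in the CERN project, and identifiers for directories, revisions, and other object types are computed differently.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Return the SWHID of a file's content (hypothetical helper name).

    The digest is Git's blob hash: SHA-1 over the header
    b"blob <length>\\0" followed by the raw bytes.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"


if __name__ == "__main__":
    # Two byte-identical files yield the same identifier, wherever they are hosted.
    print(content_swhid(b"hello world\n"))
```

Because the identifier depends only on the bytes themselves, the same file can be recognized across forks and mirrors regardless of its hosting platform or filename, which is what makes this kind of cross-repository tracing feasible.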
“This partnership is not only a chance to document CERN’s legacy,” says Roberto Di Cosmo, Director of Software Heritage, “but also an opportunity to explore how open-source software accelerates scientific discovery and technological development worldwide.”
Measuring impact for the future
The bigger picture is about the public’s understanding of science. Many people fail to appreciate the significant impact of fundamental research labs like CERN on society as a whole, Naumann says. “They think we’re just burning money. CERN is not only an educational hub, but also a major producer of source code.”
The OSPO has recently hired a student researcher who will spend a year mapping these contributions. The project is more than just an academic exercise; it’s about uncovering the hidden influence of CERN’s open-source contributions on the world.
“It’s a creative solution for a problem that many people, many businesses, and many institutions have: ‘How do we measure our open-source impact?’ I think we’ve found a promising lead to answering this question,” Naumann says.
Stay tuned for updates on the investigation into CERN’s open-source contributions.